* Added backend class for SparseEncoder and also SentenceTransformersSparseTextEmbedder
* Added SentenceTransformersSparseDocumentEmbedder
* Created a separate _SentenceTransformersSparseEmbeddingBackendFactory and added tests
* Remove unused parameter
* Wrapped output into SparseEmbedding dataclass + fix tests
* Return correct SparseEmbedding, imports and tests
* fix fmt
* Style changes and fixes
* Added a test for embed function
* Added integration test and fixed some other tests
* Add lint fixes
* Fixed positional arguments
* fix types, simplify and more
* fix
* token fixes
* pydocs, small model in test, cache improvement
* try 3.9 for docs
* better to pin click
* release note
* small fix
---------
Co-authored-by: anakin87 <stefanofiorucci@gmail.com>
* modify Documents Classifiers and Extractors to not make in-place changes
* Add e2e test for NER
* Add unit test for NER
* fixes + refinements
---------
Co-authored-by: anakin87 <stefanofiorucci@gmail.com>
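A minimal sketch of the "no in-place changes" idea from the entries above: the component returns copies of its inputs instead of mutating them. The `Doc` dataclass and `classify` function are hypothetical stand-ins for illustration, not the Haystack classes, and the length-based label is only a placeholder for a real classifier.

```python
from dataclasses import dataclass, field, replace
from typing import Any


@dataclass
class Doc:
    """Hypothetical stand-in for a document object (not the Haystack class)."""

    content: str
    meta: dict[str, Any] = field(default_factory=dict)


def classify(documents: list[Doc]) -> list[Doc]:
    """Return new documents carrying a label, leaving the inputs untouched."""
    results = []
    for doc in documents:
        # Build a fresh meta dict so the caller's dict is never mutated.
        label = "long" if len(doc.content) > 10 else "short"
        new_meta = {**doc.meta, "label": label}
        # dataclasses.replace creates a new instance with the changed field.
        results.append(replace(doc, meta=new_meta))
    return results


original = [Doc("hello"), Doc("a much longer text")]
labelled = classify(original)
```

After the call, `original` still has empty `meta` dicts, so callers that reuse the same documents elsewhere see no side effects.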
* chore(lint): enforce and apply PEP 585 type hinting
* Run fmt fixes
* Fix all typing imports using some regex
* Fix all typing written in string in tests
* undo changes in the e2e tests
* make e2e test use list instead of List
* type fixes
* remove type:ignore
* pylint
* Remove typing from Usage example comments
* Remove typing from most of comments
* try to fix e2e tests on comm PRs
* fix
* Add typing.List to tests to adjust test compatibility
- test/components/agents/test_state_class.py
- test/components/converters/test_output_adapter.py
- test/components/joiners/test_list_joiner.py
* simplify pyproject
* improve relnote
---------
Co-authored-by: anakin87 <stefanofiorucci@gmail.com>
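For readers unfamiliar with PEP 585, the change above amounts to using the built-in collection types as generics instead of the `typing` aliases. The function below is a hypothetical illustration, not code from the repository, and assumes Python 3.9 or later.

```python
from typing import Optional

# Before PEP 585 the signature would have used typing.List and typing.Dict:
#     def group_by_length(words: List[str], ...) -> Dict[int, List[str]]:


def group_by_length(words: list[str], limit: Optional[int] = None) -> dict[int, list[str]]:
    """Group words by their length, optionally looking only at the first `limit` words."""
    groups: dict[int, list[str]] = {}
    for word in words[:limit]:
        groups.setdefault(len(word), []).append(word)
    return groups


grouped = group_by_length(["a", "bb", "cc"])
```

`Optional` still comes from `typing` here because the `X | None` syntax requires Python 3.10.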
* wip: fixing tests
* wip: fixing tests
* wip: fixing tests
* wip: fixing tests
* fixing circular imports
* decoupling resume and initial run() for agent
* adding release notes
* re-raising BreakPointException from pipeline.run()
* fixing imports
* refactor: Refactor suggestions for Pipeline breakpoints (#9614)
* Refactoring
* Start adding debug_path into Breakpoint class
* Fully move debug_path into Breakpoint dataclass
* Simplifications in pipeline run logic
* More simplification
* lint
* More simplification
* Updates
* Rename resume_state to pipeline_snapshot
* PR comments
* Missed renaming of state in a few more places
* feat: Add dataclasses to represent a `PipelineSnapshot` and refactored to use it (#9619)
* Refactor to use dataclasses for PipelineSnapshot and AgentSnapshot
* Fix integration tests
* Mypy
* Fix mypy
* Fix lint
* Refactor AgentSnapshot to only contain needed info
* Fix mypy
* More refactoring
* removing unused import
---------
Co-authored-by: David S. Batista <dsbatista@gmail.com>
* feat: saving include_outputs_from intermediate results to `PipelineState` object (#9629)
* saving intermediate component results from include_outputs_from into the PipelineSnapshot
* cleaning up
* fixing tests
* fixing tests
* extending tests
* Update haystack/dataclasses/breakpoints.py
Co-authored-by: Sebastian Husch Lee <10526848+sjrl@users.noreply.github.com>
* Update haystack/dataclasses/breakpoints.py
Co-authored-by: Sebastian Husch Lee <10526848+sjrl@users.noreply.github.com>
* linting
* moving intermediate results to pipeline state and adding pipeline outputs to state
* moving ordered_component_names and include_outputs_from to PipelineSnapshot
* moving original_input_data to PipelineSnapshot
* simplifying saving the intermediate results
* Update haystack/dataclasses/breakpoints.py
Co-authored-by: Sebastian Husch Lee <10526848+sjrl@users.noreply.github.com>
* Update haystack/dataclasses/breakpoints.py
Co-authored-by: Sebastian Husch Lee <10526848+sjrl@users.noreply.github.com>
* Update haystack/dataclasses/breakpoints.py
Co-authored-by: Sebastian Husch Lee <10526848+sjrl@users.noreply.github.com>
* Update haystack/dataclasses/breakpoints.py
Co-authored-by: Sebastian Husch Lee <10526848+sjrl@users.noreply.github.com>
---------
Co-authored-by: Sebastian Husch Lee <10526848+sjrl@users.noreply.github.com>
* linting
* cleaning up
* avoiding creating PipelineSnapshot for every component run
* removing unnecessary code
* Update checks in Agent to not unnecessarily create AgentSnapshot.
* Update haystack/components/agents/agent.py
Co-authored-by: Sebastian Husch Lee <10526848+sjrl@users.noreply.github.com>
* Update haystack/components/agents/agent.py
Co-authored-by: Sebastian Husch Lee <10526848+sjrl@users.noreply.github.com>
* cleaning up tests
* linting
---------
Co-authored-by: Sebastian Husch Lee <10526848+sjrl@users.noreply.github.com>
Co-authored-by: Sebastian Husch Lee <sjrl423@gmail.com>
* Fix types in test_run.py
* Get test_run.py to pass fmt-check
* Add test_run to mypy checks
* Update test folder to pass ruff linting
* Fix merge
* Fix HF tests
* Fix hf test
* Try to fix tests
* Another attempt
* minor fix
* fix SentenceTransformersDiversityRanker
* skip integration tests because the model is unavailable on HF inference
---------
Co-authored-by: anakin87 <stefanofiorucci@gmail.com>
* chore: fix deepset_sync.py for pylint
* check .github with ruff
* fix
* Update .github/utils/pyproject_to_requirements.py
Co-authored-by: Sebastian Husch Lee <10526848+sjrl@users.noreply.github.com>
---------
Co-authored-by: Sebastian Husch Lee <10526848+sjrl@users.noreply.github.com>
* chore: Make the Haystack core "type complete"
For libraries with a `py.typed` marker, it is [recommended][1] to
make all public interfaces "type complete", i.e. to explicitly
annotate all function parameters and return types. Doing so has the
following benefits:
- It maximizes the type information available to users and IDEs.
- It ensures that the argument and return types are the intended ones.
- It sidesteps differences in type inference between the different
type checker implementations.
This change takes a first step towards type completeness by enabling
the Mypy `disallow_incomplete_defs` option for the core modules
(excluding `haystack.components.*` and `haystack.testing.*`) and
fixing the resulting errors.
[1]: https://typing.python.org/en/latest/guides/libraries.html#how-much-of-my-library-needs-types
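As a rough illustration of what "type complete" means for a single function, the sketch below contrasts a partially annotated signature with a fully annotated one. The helper `is_fully_annotated` is a hypothetical runtime approximation written for this example; it is not how Mypy performs the check, and the two `scale_*` functions are not from the code base.

```python
import inspect
from typing import Any, Callable


def is_fully_annotated(func: Callable[..., Any]) -> bool:
    """Return True if every parameter and the return value carry an annotation."""
    sig = inspect.signature(func)
    params_ok = all(
        p.annotation is not inspect.Parameter.empty for p in sig.parameters.values()
    )
    return params_ok and sig.return_annotation is not inspect.Signature.empty


# Partially annotated: the kind of definition `disallow_incomplete_defs` reports.
def scale_incomplete(values: list[float], factor):
    return [v * factor for v in values]


# Fully annotated: parameters and return type are all explicit.
def scale_complete(values: list[float], factor: float) -> list[float]:
    return [v * factor for v in values]
```

Here `is_fully_annotated(scale_incomplete)` is false and `is_fully_annotated(scale_complete)` is true, while both functions behave identically at runtime.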
* chore: Add `python_version = 3.9` to Mypy config
This catches type constructs that are only supported in later Python
versions.
* Remove unused import
* try to fix linting
---------
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
* add token split_unit
* fix overlap with fallback
* reno
* mark as integration tests
* use type ignore instead of assert
* Update releasenotes/notes/recursive-splitter-token-df56428887ac45bd.yaml
Co-authored-by: David S. Batista <dsbatista@gmail.com>
---------
Co-authored-by: David S. Batista <dsbatista@gmail.com>
* Start adding support for passing callable to Azure components
* Add to chat version
* Fix test
* Add reno
* Add support to azure doc and text embedder
* Rename
* update llm metadata extractor
* Add tests for text embedder
* Update tests
* Remove unused fixture and import
* Update reno
* fix: use `is not` to compare to sentinel value
* chore: release notes
* Update releasenotes/notes/fix-component-checks-with-ambiguous-truth-values-949c447b3702e427.yaml
Co-authored-by: David S. Batista <dsbatista@gmail.com>
* fix: another sentinel value
* test: also test base class
* add pandas as test dependency
* format
* Trigger CI
* mark test with xfail strict=False
---------
Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
Co-authored-by: David S. Batista <dsbatista@gmail.com>
Co-authored-by: anakin87 <stefanofiorucci@gmail.com>