* initial experiments
* progress
* draft
* fix header
* fix linting
* lot more lazy inits
* fixes to main init
* linting
* small refinements
* header fix
* release note
* improve consistency
* test: make sure no extra modules are being imported due to `__init__` definitions
* extend release note with an example
* refactoring import test
* updating release notes
* casting .keys() to list
* reverting to list
* Update haystack/__init__.py
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
* fixing indent problem
* better comments
---------
Co-authored-by: David S. Batista <dsbatista@gmail.com>
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
* fix: use `is not` to compare to sentinel value
* chore: release notes
* Update releasenotes/notes/fix-component-checks-with-ambiguous-truth-values-949c447b3702e427.yaml
Co-authored-by: David S. Batista <dsbatista@gmail.com>
* fix: another sentinel value
* test: also test base class
* add pandas as test dependency
* format
* Trigger CI
* mark test with xfail strict=False
---------
Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
Co-authored-by: David S. Batista <dsbatista@gmail.com>
Co-authored-by: anakin87 <stefanofiorucci@gmail.com>
* feat: AsyncPipeline that can schedule components to run concurrently (#8812)
* add component checks
* pipeline should run deterministically
* add FIFOQueue
* add agent tests
* add order dependent tests
* run new tests
* remove code that is not needed
* test: intermediate from cycle outputs are available outside cycle
* add tests for component checks (Claude)
* adapt tests for component checks (o1 review)
* chore: format
* remove tests that aren't needed anymore
* add _calculate_priority tests
* revert accidental change in pyproject.toml
* test format conversion
* adapt to naming convention
* chore: proper docstrings and type hints for PQ
* format
* add more unit tests
* rm unneeded comments
* test input consumption
* lint
* fix: docstrings
* lint
* format
* format
* fix license header
* fix license header
* add component run tests
* fix: pass correct input format to tracing
* fix types
* format
* format
* types
* add defaults from Socket instead of signature
- otherwise components with dynamic inputs would fail
* fix test names
* still wait for optional inputs on greedy variadic sockets
- mirrors previous behavior
* fix format
* wip: warn for ambiguous running order
* wip: alternative warning
* fix license header
* make code more readable
Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
* Introduce content tracing to a behavioral test
* Fixing linting
* Remove debug print statements
* Fix tracer tests
* remove print
* test: test for component inputs
* test: remove testing for run order
* chore: update component checks from experimental
* chore: update pipeline and base from experimental
* refactor: remove unused method
* refactor: remove unused method
* refactor: outdated comment
* refactor: inputs state is updated as side effect
- to prepare for AsyncPipeline implementation
* format
* test: add file conversion test
* format
* fix: original implementation deepcopies outputs
* lint
* fix: from_dict was updated
* fix: format
* fix: test
* test: add test for thread safety
* remove unused imports
* format
* test: FIFOPriorityQueue
* chore: add release note
* feat: add AsyncPipeline
* chore: Add release notes
* fix: format
* debug: switch run order to debug ubuntu and windows tests
* fix: consider priorities of other components while waiting for DEFER
* refactor: simplify code
* fix: resolve merge conflict with mermaid changes
* fix: format
* fix: remove unused import
* refactor: rename to avoid accidental conflicts
* fix: track pipeline type
* fix: and extend test
* fix: format
* style: sort alphabetically
* Update test/core/pipeline/features/conftest.py
Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
* Update test/core/pipeline/features/conftest.py
Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
* Update releasenotes/notes/feat-async-pipeline-338856a142e1318c.yaml
* fix: indentation, do not close loop
* fix: use asyncio.run
* fix: format
---------
Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
Co-authored-by: David S. Batista <dsbatista@gmail.com>
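The FIFO priority queue mentioned in the commits above could look roughly like the following sketch. It assumes the goal is deterministic ordering: lower priority values are served first, and items with equal priority come out in insertion order. The class name and method names are illustrative and may not match the actual implementation.

```python
import heapq
from itertools import count
from typing import Any, List, Tuple


class FIFOPriorityQueueSketch:
    """Priority queue that breaks priority ties by insertion order (illustrative only)."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, Any]] = []
        # Monotonic counter used as a tie-breaker so equal priorities stay FIFO
        # and the queued items themselves are never compared.
        self._counter = count()

    def push(self, item: Any, priority: int) -> None:
        heapq.heappush(self._heap, (priority, next(self._counter), item))

    def pop(self) -> Tuple[int, Any]:
        if not self._heap:
            raise IndexError("pop from empty queue")
        priority, _, item = heapq.heappop(self._heap)
        return priority, item

    def __len__(self) -> int:
        return len(self._heap)
```

Because the counter is part of the heap entry, two runs that push the same items in the same order always pop them in the same order, which is the property a deterministic scheduler needs.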
* updated changes for refactoring evaluations without pandas package
* added release notes for eval_run_result.py for refactoring EvaluationRunResult to work without pandas
* wip: cleaning and refactoring
* removing BaseEvaluationRunResult
* wip: fixing tests
* fixing tests and docstrings
* updating release notes
* fixing typing
* pylint fix
* adding deprecation warning
* fixing tests
* fixing types consistency
* adding stacklevel=2 to warning messages
* fixing docstrings
* fixing docstrings
* updating release notes
---------
Co-authored-by: mathislucka <mathis.lucka@gmail.com>
Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
Co-authored-by: David S. Batista <dsbatista@gmail.com>
* initial rough draft
* expose backend instead of extracting from model_kwargs
* explicitly set backend model path
* add reno
* expose backend for ST diversity backend
* add dtype tests and expose kwargs to ST ranker for backend parameters
* skip dtype tests as torch isn't compiled with cuda
* add new openvino dependency release, unskip tests
* resolve suggestion
* mock calls, turn integrations into unit tests
* remove unnecessary test dependencies
* Look through all streaming chunks for tool calls
* Add reno note
* mypy fixes
* Improve robustness
* Don't concatenate, use the last value
* typing
* Update releasenotes/notes/improve-tool-call-chunk-search-986474e814af17a7.yaml
Co-authored-by: David S. Batista <dsbatista@gmail.com>
* Small refactoring
* isort
---------
Co-authored-by: David S. Batista <dsbatista@gmail.com>
* CSV Document Splitter
* Add license header
* Add newline
* Add to docs
* Add lineterminator
* Updated csv splitter to allow user to specify to split by row, column or both
* Adding more tests
* Column tests
* Some refactoring to remove incorrect dropna call
* Fix
* More complicated test
* Adding more relevant metadata to match what's provided in our other splitters
* value error tests
* Fix mypy
* Docstring updates
* Add skip_blank_lines=False
* Add to dict test
* More from and to dict tests
* Fixes
* Move dict creation outside of for loop
* Initial OpenAPIConnector
* Add reno note
* Format
* Add headers
* Add test dep
* Use haystack logger
* Fix test
* Minor fix, spin CI
* Update reno release note format
* Add to docs, pydocs improvements
* add component checks
* pipeline should run deterministically
* add FIFOQueue
* add agent tests
* add order dependent tests
* run new tests
* remove code that is not needed
* test: intermediate from cycle outputs are available outside cycle
* add tests for component checks (Claude)
* adapt tests for component checks (o1 review)
* chore: format
* remove tests that aren't needed anymore
* add _calculate_priority tests
* revert accidental change in pyproject.toml
* test format conversion
* adapt to naming convention
* chore: proper docstrings and type hints for PQ
* format
* add more unit tests
* rm unneeded comments
* test input consumption
* lint
* fix: docstrings
* lint
* format
* format
* fix license header
* fix license header
* add component run tests
* fix: pass correct input format to tracing
* fix types
* format
* format
* types
* add defaults from Socket instead of signature
- otherwise components with dynamic inputs would fail
* fix test names
* still wait for optional inputs on greedy variadic sockets
- mirrors previous behavior
* fix format
* wip: warn for ambiguous running order
* wip: alternative warning
* fix license header
* make code more readable
Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
* Introduce content tracing to a behavioral test
* Fixing linting
* Remove debug print statements
* Fix tracer tests
* remove print
* test: test for component inputs
* test: remove testing for run order
* chore: update component checks from experimental
* chore: update pipeline and base from experimental
* refactor: remove unused method
* refactor: remove unused method
* refactor: outdated comment
* refactor: inputs state is updated as side effect
- to prepare for AsyncPipeline implementation
* format
* test: add file conversion test
* format
* fix: original implementation deepcopies outputs
* lint
* fix: from_dict was updated
* fix: format
* fix: test
* test: add test for thread safety
* remove unused imports
* format
* test: FIFOPriorityQueue
* chore: add release note
* feat: add AsyncPipeline
* chore: Add release notes
* fix: format
* debug: switch run order to debug ubuntu and windows tests
* fix: consider priorities of other components while waiting for DEFER
* refactor: simplify code
* fix: resolve merge conflict with mermaid changes
* fix: format
* fix: remove unused import
* refactor: rename to avoid accidental conflicts
* fix: track pipeline type
* fix: and extend test
* fix: format
* style: sort alphabetically
* Update test/core/pipeline/features/conftest.py
Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
* Update test/core/pipeline/features/conftest.py
Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
* Update releasenotes/notes/feat-async-pipeline-338856a142e1318c.yaml
* fix: indentation, do not close loop
* fix: use asyncio.run
* fix: format
---------
Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
Co-authored-by: David S. Batista <dsbatista@gmail.com>
* Initial commit for csv cleaner
* Add release notes
* Update lineterminator
* Update releasenotes/notes/csv-document-cleaner-8eca67e884684c56.yaml
Co-authored-by: David S. Batista <dsbatista@gmail.com>
* alphabetize
* Use lazy import
* Some refactoring
* Some refactoring
---------
Co-authored-by: David S. Batista <dsbatista@gmail.com>
* add component checks
* pipeline should run deterministically
* add FIFOQueue
* add agent tests
* add order dependent tests
* run new tests
* remove code that is not needed
* test: intermediate from cycle outputs are available outside cycle
* add tests for component checks (Claude)
* adapt tests for component checks (o1 review)
* chore: format
* remove tests that aren't needed anymore
* add _calculate_priority tests
* revert accidental change in pyproject.toml
* test format conversion
* adapt to naming convention
* chore: proper docstrings and type hints for PQ
* format
* add more unit tests
* rm unneeded comments
* test input consumption
* lint
* fix: docstrings
* lint
* format
* format
* fix license header
* fix license header
* add component run tests
* fix: pass correct input format to tracing
* fix types
* format
* format
* types
* add defaults from Socket instead of signature
- otherwise components with dynamic inputs would fail
* fix test names
* still wait for optional inputs on greedy variadic sockets
- mirrors previous behavior
* fix format
* wip: warn for ambiguous running order
* wip: alternative warning
* fix license header
* make code more readable
Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
* Introduce content tracing to a behavioral test
* Fixing linting
* Remove debug print statements
* Fix tracer tests
* remove print
* test: test for component inputs
* test: remove testing for run order
* chore: update component checks from experimental
* chore: update pipeline and base from experimental
* refactor: remove unused method
* refactor: remove unused method
* refactor: outdated comment
* refactor: inputs state is updated as side effect
- to prepare for AsyncPipeline implementation
* format
* test: add file conversion test
* format
* fix: original implementation deepcopies outputs
* lint
* fix: from_dict was updated
* fix: format
* fix: test
* test: add test for thread safety
* remove unused imports
* format
* test: FIFOPriorityQueue
* chore: add release note
* fix: resolve merge conflict with mermaid changes
* fix: format
* fix: remove unused import
* refactor: rename to avoid accidental conflicts
* chore: remove unused inputs, add missing license header
* chore: extend release notes
* Update releasenotes/notes/fix-pipeline-run-2fefeafc705a6d91.yaml
Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
* fix: format
* fix: format
* Update release note
---------
Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
Co-authored-by: David S. Batista <dsbatista@gmail.com>
* feat: SentenceTransformersDocumentEmbedder and SentenceTransformersTextEmbedder can accept and pass any arguments to SentenceTransformer.encode
* refactor: encode_kwargs parameter of SentenceTransformersDocumentEmbedder and SentenceTransformersTextEmbedder made to be the last positional parameter for backward compatibility reasons
* docs: added explanation for encode_kwargs in SentenceTransformersTextEmbedder and SentenceTransformersDocumentEmbedder
* test: added tests for encode_kwargs in SentenceTransformersTextEmbedder and SentenceTransformersDocumentEmbedder
* doc: removed empty lines from docstrings of SentenceTransformersTextEmbedder and SentenceTransformersDocumentEmbedder
* refactor: encode_kwargs parameter of SentenceTransformersDocumentEmbedder and SentenceTransformersTextEmbedder made to be the last positional parameter for backward compatibility (part II.)
* HF API Embedders: refactoring
* rename variables
* rm leftovers
* rm pin
* rm unused import
* relnote
* warning with truncate/normalize and serverless inference API
* test that warnings are raised
* compress graph data to support pako endpoint
* support mermaid.ink parameters and custom servers
* don't try to resolve conflicts with the github web ui...
* avoid double graph copy
* fixing typing, improving docstrings and release notes
* reverting type
* nit - force type checker no cache
* nit - force type checker no cache
---------
Co-authored-by: Ulises M <ulises@lbux.org>
Co-authored-by: Ulises M <30765968+lbux@users.noreply.github.com>
* fix: callables can be deserialized from fully qualified import path
* fix: license header
* fix: format
* fix: types
* fix? types
* test: extend test case
* format
* add release notes
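Resolving a callable from a fully qualified import path, as described in the fix above, can be sketched as follows. This is a simplified illustration: it tries the longest importable module prefix first and resolves the remaining parts as attributes, so paths that point at class methods also work. The function name `deserialize_callable_sketch` is hypothetical and not the library's API.

```python
import importlib
from typing import Callable


def deserialize_callable_sketch(path: str) -> Callable:
    """Resolve a dotted path such as 'package.module.Class.method' to a callable."""
    parts = path.split(".")
    # Try the longest module prefix first, then shorter ones.
    for split_at in range(len(parts) - 1, 0, -1):
        module_name = ".".join(parts[:split_at])
        try:
            obj = importlib.import_module(module_name)
        except ImportError:
            continue  # prefix is not a module; try a shorter one
        try:
            # Walk the remaining parts as attributes (class, then method, ...).
            for attr in parts[split_at:]:
                obj = getattr(obj, attr)
        except AttributeError:
            continue
        if callable(obj):
            return obj
    raise ValueError(f"Could not resolve a callable from '{path}'")
```

For example, `deserialize_callable_sketch("os.path.join")` returns the `join` function of the `os.path` module, and `"collections.OrderedDict.fromkeys"` resolves to the class method because `collections.OrderedDict` is not importable as a module.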
* updated DocumentSplitter
issue #8741
* release note
* updated DocumentSplitter
in the _create_docs_from_splits function, initialize a new variable copied_meta instead of overwriting meta
* added test
test_duplicate_pages_get_different_doc_id
* fix fmt
---------
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
* initial import
* adding double new lines between container_texts so that passages can be detected
* reducing type specification to avoid import error
* adding release notes
* renaming variable