* First rough implementation of refactored run
* Further improve run logic
* Properly handle variadic input in run
* Further work
* Enhance names and add more documentation
* Fix issue with output distribution
* This works
* Enhance run comments
* Mark Multiplexer as greedy
* Remove MergeLoop in favour of Multiplexer in tests
* Remove FirstIntSelector in favour of Multiplexer
* Handle corner case when waiting for input is stuck
* Remove unused import
* Handle mutable input data in run and misbehaving components
* Handle run input validation
* Test validation
* Fix pylint
* Fix mypy
* Call warm_up in run to fix tests
* feat: add split by page to DocumentSplitter
* added test case and the suggested changes
* Update document_splitter.py
* Update haystack/components/preprocessors/document_splitter.py
* Update test_document_splitter.py
---------
Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
* Add weight and ranking_mode as params to run for easier experimentation
* renaming of metadata to meta
* Use logger.warning instead of warnings
* Add another unit test
* Add support for sort_order and fix formatting of error messages
* Make MetaFieldRanker more robust. Doesn't crash pipeline if some Documents are missing keys.
* Don't print same warning message twice
* Add another test
* Making MetaFieldRanker more robust
* Move up if return statement to earlier in the function
* Setting up infer_type
* Remove infer_type for now
* Release notes
* Add init file
* Update releasenotes/notes/metafieldranker_sort-order_refactor-2000d89dc40dc15a.yaml
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
---------
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
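
The MetaFieldRanker changes above (sort_order support, tolerating Documents that lack the meta key, warning through a logger) can be illustrated with a minimal sketch. It uses plain dictionaries rather than Haystack `Document` objects; the function name `rank_by_meta_field` and the choice to place Documents without the key at the end are assumptions made for illustration, not the component's actual implementation.

```python
import logging
from typing import Any, Dict, List

logger = logging.getLogger(__name__)


def rank_by_meta_field(
    docs: List[Dict[str, Any]], meta_field: str, sort_order: str = "descending"
) -> List[Dict[str, Any]]:
    """Hypothetical helper: sort docs by a meta field without crashing on missing keys."""
    if sort_order not in ("ascending", "descending"):
        raise ValueError(
            f"sort_order must be 'ascending' or 'descending', got '{sort_order}'"
        )

    with_field = [d for d in docs if meta_field in d.get("meta", {})]
    without_field = [d for d in docs if meta_field not in d.get("meta", {})]

    if without_field:
        # One warning for the whole batch instead of one per document.
        logger.warning(
            "%d document(s) lack meta field '%s'; placing them last.",
            len(without_field),
            meta_field,
        )

    ranked = sorted(
        with_field,
        key=lambda d: d["meta"][meta_field],
        reverse=(sort_order == "descending"),
    )
    # Assumption: documents without the field keep their input order at the end.
    return ranked + without_field
```

Emitting the warning once per call, rather than once per Document, matches the intent of "Don't print same warning message twice".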
* rename model parameter and internal model attribute in ExtractiveReader
* fix tests for ExtractiveReader
* fix e2e
* reno
* another fix
* review feedback
* Update releasenotes/notes/rename-model-param-reader-b8cbb0d638e3b8c2.yaml
* fix!: `InMemoryBM25Retriever` no longer returns documents that have a score of 0.0
Also update tests to accommodate the new behavior.
* Remove superfluous code
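
The behavior change in the `InMemoryBM25Retriever` fix above can be sketched as follows. This is not the retriever's code: `top_k_nonzero` is a hypothetical helper that takes precomputed `(document, score)` pairs and only shows the filtering step, assuming the filter runs before the top-k cut.

```python
from typing import List, Tuple


def top_k_nonzero(
    scored_docs: List[Tuple[str, float]], top_k: int
) -> List[Tuple[str, float]]:
    """Hypothetical helper: drop documents scoring exactly 0.0, then keep the best top_k."""
    relevant = [(doc, score) for doc, score in scored_docs if score != 0.0]
    relevant.sort(key=lambda pair: pair[1], reverse=True)
    return relevant[:top_k]
```

A consequence worth noting for callers: the result can contain fewer than `top_k` entries when few documents match the query at all.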
* rename model parameter in the openai doc embedder
* fix tests for openai doc embedder
* rename model parameter in the openai text embedder
* fix tests for openai text embedder
* rename model parameter in the st doc embedder
* fix tests for st doc embedder
* rename model parameter in the st backend
* fix tests for st backend
* rename model parameter in the st text embedder
* fix tests for st text embedder
* fix docstring
* fix pipeline utils
* fix e2e
* reno
* fix the indexing pipeline _create_embedder function
* fix e2e eval rag pipeline
* pytest
* rename model parameter in local transcriber
* fix tests for local transcriber
* rename model parameter in remote transcriber
* fix tests for remote transcriber
* reno
---------
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
* first draft for ranker
* same for the reader
* consider also bnb_4bit_compute_dtype
* dtype serialization in hugging_face_local_generator
* add release note
* address dtype defined in huggingface_pipeline_kwargs
* test quantization options in reader
* fix
* serialize quantization_config
* test quantization_config serialization
* address feedback
* fix typo
* feat: Add `NamedEntityExtractor` component
This component accepts a list of `Document`s which it annotates with named entities. The annotations are stored in the `meta` dictionary of each `Document` under a specific key.
The component currently supports two backends for the annotation models: Hugging Face `transformers` and spaCy.
* Address comments
* Expand release note
* Add the `[torch]` extra package specifier to the lazy import
* Remove dead code
---------
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
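
The annotation flow described for `NamedEntityExtractor` (annotate each `Document`, store the result in its `meta` dictionary under a specific key) can be sketched without either backend. Everything here is a stand-in: `Doc` replaces the Haystack `Document` class, `toy_annotate` replaces a real `transformers` or spaCy model, and the key name `named_entities` is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Doc:
    """Stand-in for a Document: text content plus a free-form meta dictionary."""

    content: str
    meta: Dict[str, Any] = field(default_factory=dict)


def toy_annotate(text: str) -> List[Dict[str, Any]]:
    """Toy backend: treat every capitalized word as an entity with character offsets."""
    entities = []
    pos = 0
    for word in text.split():
        start = text.index(word, pos)
        pos = start + len(word)
        if word[0].isupper():
            entities.append({"entity": "MISC", "start": start, "end": pos})
    return entities


def annotate_documents(docs: List[Doc], key: str = "named_entities") -> List[Doc]:
    """Store each document's annotations in its meta dictionary under `key`."""
    for doc in docs:
        doc.meta[key] = toy_annotate(doc.content)
    return docs
```

Keeping the annotations in `meta` leaves the document content untouched, so downstream components that ignore the key are unaffected.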
* track default value in sockets
* remove dead code
* include default value in socket description
* add unit test
* add relnote
* unused import
* clarify
* Add scale_score functionality to the TransformersSimilarityRanker
* Updated test to check scores
* Use pytest approx when comparing floats
* Updated how scale score works and added calibration factor. Started to add score threshold.
* Add support for score_threshold
* Add some parameters to the run method
* Add release notes
* Fix mypy
* Be more tolerant on the score values
* Adding unit test for scale_score=False
* Add unit test for score threshold
* Update tests
* Rename test
* Fix typo
* PR comments
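
The scale_score, calibration factor and score_threshold options added to the TransformersSimilarityRanker above can be sketched in plain Python. The commits do not state the scaling function; this sketch assumes a sigmoid applied to the raw score multiplied by the calibration factor, and `scale_and_filter` is a hypothetical name.

```python
import math
from typing import List, Optional


def _sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def scale_and_filter(
    raw_scores: List[float],
    calibration_factor: float = 1.0,
    scale_score: bool = True,
    score_threshold: Optional[float] = None,
) -> List[float]:
    """Hypothetical helper: optionally squash scores into (0, 1), then drop low ones."""
    if scale_score:
        # Assumption: scaling is sigmoid(raw_score * calibration_factor).
        scores = [_sigmoid(s * calibration_factor) for s in raw_scores]
    else:
        scores = list(raw_scores)
    if score_threshold is not None:
        scores = [s for s in scores if s >= score_threshold]
    return scores
```

With `scale_score=False` the threshold applies to the raw scores, so a threshold chosen for the (0, 1) range would need to be revisited.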
* Handle tools parameter in OpenAIChatGenerator
* Handle tools/functions parameter in OpenAIChatGenerator streaming mode
* Adjust OpenAPIServiceConnector to handle tools parameter
* We never deal with functions/tools in non-chat generator
* Add release note
* replace metadata with meta in tests/examples
* do not touch already broken e2e tests
* Revert "do not touch already broken e2e tests"
This reverts commit 1f911920d98954b57daacfe8d8ed02fd77d136db.