* fix: Update device deserialization for SentenceTransformersTextEmbedder
* Add unit test
* Fix unit test
* Make same change to doc embedder
* Add release notes
* Add same change to Diversity Ranker and Named Entity Extractor
* Add unit test
* Add the same for whisper local
* Update release notes
* incorporating better bm25 impl without breaking interface
* all three bm25 algos
* 1. setting algo post-init not allowed; 2. remove extra underscore for naming consistency; 3. remove unused import
* 1. rename attribute name for IDF computation 2. organize document statistics as a dataclass instead of tuple to improve readability
* fix score type initialization (int -> float) to pass mypy check
* release note included
* fixing linting issues and mypy
* fixing tests
* removing heapq import and cleaning up logging
* changing indexing order
* adding more tests
* increasing tests
* removing rank_bm25 from pyproject.toml
---------
Co-authored-by: David S. Batista <dsbatista@gmail.com>
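The BM25 commits above mention an IDF computation, per-document statistics held in a dataclass instead of a tuple, and float score initialization. As a rough illustration only, here is a minimal Okapi-style scoring sketch. The names `DocStats` and `bm25_okapi`, the defaults `k1=1.5` and `b=0.75`, and the non-negative IDF variant `log(1 + (N - df + 0.5) / (df + 0.5))` are assumptions for this example, not the implementation merged in this change.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DocStats:
    """Per-document statistics (hypothetical name), a dataclass rather than a tuple."""

    freq: Dict[str, int]  # term -> number of occurrences in this document
    doc_len: int  # number of tokens in this document


def bm25_okapi(query: List[str], docs: List[List[str]], k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Score each tokenized document against a tokenized query (Okapi-style sketch)."""
    stats = [DocStats(freq=dict(Counter(d)), doc_len=len(d)) for d in docs]
    n_docs = len(stats)
    if n_docs == 0:
        return []
    # Fall back to 1.0 so that a corpus of empty documents does not divide by zero.
    avg_len = sum(s.doc_len for s in stats) / n_docs or 1.0
    scores = [0.0] * n_docs  # initialized as floats, not ints
    for term in query:
        df = sum(1 for s in stats if term in s.freq)  # document frequency
        if df == 0:
            continue
        # Assumed IDF variant: adding 1 inside the log keeps the value non-negative.
        idf = math.log(1.0 + (n_docs - df + 0.5) / (df + 0.5))
        for i, s in enumerate(stats):
            tf = s.freq.get(term, 0)
            denom = tf + k1 * (1.0 - b + b * s.doc_len / avg_len)
            scores[i] += idf * tf * (k1 + 1.0) / denom
    return scores
```

The other two variants named in the commits would differ in how the term-frequency part is normalized; they are not shown here.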
* Update huggingface_hub classes used after library upgrade
* Fix chat tests
* Update lazy import guard and other references to huggingface_hub>=0.23.0
* In huggingface_hub 0.23.0 TextGenerationOutput property details is now optional
* More fixes
* Add reno note
* calculate page number of answer and add to meta
* fix mypy, add reno
* add test
* simplify unit test
* update release note
* undo @patch updates
* extend tests, check page_number type
* Initial commit pdfminer converter
* Revert back naming of argument all_text per pdfminer documentation
* Add the component decorator
* Add release notes
* Reformat code with black
* Remove LTPage and comments
* Update dependencies in pyproject.toml
* Added some tests and incorporated reference doc in docstring
* Added some tests and incorporated reference doc in docstring
* ci: trigger separate workflow
* ci: temporary use current branch
* ci: fix workflow name
* ci: try with same job name
* ci: try with dispatch
* Revert "ci: try with dispatch"
This reverts commit bd66e56c0697ae97fc2599eebaceff417d9be65c.
* Revert "ci: try with same job name"
This reverts commit 9e2ae5b402758c14a9f812c2e06f820bd3ece767.
* ci: try with workflow call in both cases
* ci: introduce change to trigger CI
* Revert "ci: introduce change to trigger CI"
This reverts commit e3ec07c5e26f114364babea69535183253c801b7.
* ci: add name
* Revert "Revert "ci: introduce change to trigger CI""
This reverts commit 6718585fd24069112e0f773e010056e1d96e3eee.
* ci: improve naming
* ci: further improve naming
* Unset reusable workflow version and use relative path
* Remove CI trigger
---------
Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>
* Add the implementation for page counting used in the v1.25.x branch. It should work as expected in issue #6705.
* Add tests that reflect the desired behaviour. This behaviour is inferred from the one it had on Haystack 1.x
Solve some minor bugs spotted by tests.
* Update docstrings.
* Add reno.
* Update haystack/components/preprocessors/document_splitter.py
Update docstring from suggestion
Co-authored-by: David S. Batista <dsbatista@gmail.com>
* solve suggestion to improve readability
* fragment tests
* Update haystack/components/preprocessors/document_splitter.py
Co-authored-by: David S. Batista <dsbatista@gmail.com>
* Update .gitignore
* Update .gitignore
* Update add-page-number-to-document-splitter-162e9dc7443575f0.yaml
* blackening
---------
Co-authored-by: David S. Batista <dsbatista@gmail.com>
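The page counting added to the document splitter above can be pictured with a small sketch. It assumes that pages are separated by form feed characters (`\f`) and that splitting is done by words; both are assumptions for illustration, and `split_with_page_numbers` is a hypothetical helper, not the code in `document_splitter.py`.

```python
from typing import List, Tuple


def split_with_page_numbers(text: str, split_length: int) -> List[Tuple[str, int]]:
    """Split text into chunks of `split_length` words, each tagged with its 1-based start page."""
    if split_length <= 0:
        raise ValueError("split_length must be greater than 0")
    units = text.split(" ")
    chunks: List[Tuple[str, int]] = []
    page = 1
    for start in range(0, len(units), split_length):
        chunk = " ".join(units[start : start + split_length])
        chunks.append((chunk, page))
        # Every form feed inside this chunk moves the next chunk one page further.
        page += chunk.count("\f")
    return chunks
```

Each chunk is tagged with the page on which it starts, so a chunk that spans a page break keeps the number of the earlier page.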
* initial import
* wip
* cleaning up tests
* fixing tests
* adding context relevance
* reverting some wrong changes due to a PyCharm error in refactoring
* building eval pipeline only once
* handling mypy issues
The new `EvaluationRunResult` has slightly different semantics: it separates the previous `data` parameter into `inputs` and `results`, and expects aggregate scores to be provided in the latter.
* adding missing docstrings
* adding missing docstrings
* Update haystack/dataclasses/answer.py
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
* reverting some docstrings due to pylint issue, adding a noqa for ruff
* reverting some docstrings due to pylint issue, adding a noqa for ruff
---------
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
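To illustrate the `inputs`/`results` split described for `EvaluationRunResult` above, here is a stand-in sketch. `RunResultSketch` is a hypothetical class, and the `score` and `individual_scores` keys and the `run_name` field are assumptions for this example; this is not the Haystack class or its exact signature.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class RunResultSketch:
    """Hypothetical stand-in: pipeline inputs and metric results are kept apart."""

    run_name: str
    inputs: Dict[str, List[Any]]  # what was fed to the evaluation, column by column
    results: Dict[str, Dict[str, Any]]  # one entry per metric, carrying its aggregate score

    def __post_init__(self) -> None:
        # The aggregate score is expected inside `results`, not computed here.
        for metric, output in self.results.items():
            if "score" not in output:
                raise ValueError(f"Aggregate score missing for metric '{metric}'")

    def aggregate_scores(self) -> Dict[str, float]:
        """Return the aggregate score of every metric as a float."""
        return {metric: float(output["score"]) for metric, output in self.results.items()}


run = RunResultSketch(
    run_name="demo",
    inputs={"question": ["q1", "q2"]},
    results={"exact_match": {"score": 0.5, "individual_scores": [1, 0]}},
)
print(run.aggregate_scores())
```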