Stefano Fiorucci
d90b0de124
Update README.md ( #6850 )
2024-01-30 10:03:01 +01:00
Silvano Cerza
f5e61338ba
chore: Remove all mentions of Canals ( #6844 )
...
* Remove unnecessary Connection class
* Remove all mentions of canals
* Add release notes
2024-01-29 17:26:11 +01:00
Silvano Cerza
9211f535b6
Remove unnecessary Connection class ( #6842 )
2024-01-29 17:25:52 +01:00
Silvano Cerza
b1ec32dae0
Simplify Pipeline.__eq__ logic ( #6840 )
2024-01-29 14:54:46 +01:00
Massimiliano Pippi
acf4cd502f
refact: Rename helper function ( #6831 )
...
* change function name
* add api docs
* release notes
2024-01-26 16:00:02 +01:00
Madeesh Kannan
fdf844f762
fix: Fix missing format string prefixes in pipeline.py ( #6834 )
2024-01-26 15:31:56 +01:00
Ashwin Mathur
7217f9d9f0
feat: Add F1 metric ( #6822 )
...
* Add F1 metric
* Add release notes
2024-01-26 11:04:43 +01:00
Stefano Fiorucci
b176750532
improve reno config ( #6827 )
2024-01-26 09:47:52 +01:00
Sebastian Husch Lee
3bea3b1714
feat: Add query and document prefix options for the TransformerSimilarityRanker ( #6826 )
...
* Add query and doc prefix
* Fix some tests
* add release notes
2024-01-25 15:29:19 +01:00
Rob Pasternak
7358b910d7
feat: Weights and score normalization for DocumentJoiner with reciprocal rank fusion ( #6735 )
...
* Add weighting and score normalization for DocumentJoiner w/ reciprocal rank fusion (fix trailing whitespace)
* Add release notes
* Add unit test
* Update release note
---------
Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
2024-01-24 15:45:53 +01:00
Vladimir Blagojevic
6e86f4e26a
Update embedding integration tests ( #6823 )
2024-01-24 15:22:47 +01:00
Vladimir Blagojevic
c47b82c54f
Remove pipeline_utils package and dependent code ( #6806 )
2024-01-23 18:40:43 +01:00
Massimiliano Pippi
4efe40664c
use haystack-pydoc-tools package instead of local code ( #6818 )
2024-01-23 18:28:52 +01:00
Tuana Çelik
1825140654
Readme updates ( #6817 )
...
* add info on dD
* fix
* Update README.md
* make tip box
* move location
2024-01-23 15:29:36 +01:00
Daria Fokina
6d8f369e9d
chore: mention cookbook repo in README ( #6814 )
...
* readme update
* formatting fix
* format2
2024-01-23 14:06:34 +01:00
Daria Fokina
5d300a7356
add missing components to docs ( #6813 )
2024-01-23 14:03:15 +01:00
Massimiliano Pippi
df2a23dfa5
chore: cleanup unused code ( #6804 )
...
* remove validation module
* remove unused code
* adjust imports
* sort imports
2024-01-23 13:20:53 +01:00
Massimiliano Pippi
f44f123b3f
chore: mention integrations in the README ( #6805 )
...
* mention integrations in the README
* Apply suggestions from code review
Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
* Update README.md
Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
---------
Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
Co-authored-by: Tuana Çelik <tuana.celik@deepset.ai>
2024-01-22 15:38:00 +01:00
Madeesh Kannan
5c8feeac6a
proposal: Integration of 3rd party evaluation frameworks ( #6784 )
...
* proposal: Integration of 3rd party evaluation frameworks
* Add note about previous eval proposal
2024-01-22 12:35:27 +01:00
Ashwin Mathur
a238c6dd51
feat: Add Exact Match metric ( #6696 )
...
* Add exact match metric
* Add release notes
* Cleanup comments in test_eval_exact_match.py
* Create separate preprocessing function; Add output_key parameter
* Update release note
---------
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2024-01-22 09:57:04 +01:00
Daria Fokina
8a08ab52e1
add telemetry overview ( #6785 )
2024-01-19 16:41:28 +01:00
Augustin Chan
cad30b039a
add .haystack_debug to .gitignore ( #6782 )
2024-01-19 10:44:35 +01:00
Vladimir Blagojevic
f47439c2a2
Use forward references for type hints, avoid NameError ( #6780 )
2024-01-18 18:46:59 +01:00
Silvano Cerza
d4f6531c52
feat: Refactor Pipeline.run() ( #6729 )
...
* First rough implementation of refactored run
* Further improve run logic
* Properly handle variadic input in run
* Further work
* Enhance names and add more documentation
* Fix issue with output distribution
* This works
* Enhance run comments
* Mark Multiplexer as greedy
* Remove MergeLoop in favour of Multiplexer in tests
* Remove FirstIntSelector in favour of Multiplexer
* Handle corner case when waiting for input is stuck
* Remove unused import
* Handle mutable input data in run and misbehaving components
* Handle run input validation
* Test validation
* Fix pylint
* Fix mypy
* Call warm_up in run to fix tests
2024-01-18 17:53:47 +01:00
Vladimir Blagojevic
40a8b2b4a9
Move import to lazy import section ( #6778 )
2024-01-18 17:34:55 +01:00
Vladimir Blagojevic
0b177b3bc6
feat: Improve OpenAPIServiceConnector service response serialization ( #6772 )
...
* Better service response json -> str serialization
* Add unit test
2024-01-18 16:49:48 +01:00
Vladimir Blagojevic
fea1428e84
feat: Add HuggingFaceLocalChatGenerator ( #6751 )
2024-01-18 15:53:12 +01:00
dependabot[bot]
8d65a8630b
chore(deps): bump tj-actions/changed-files from 41 to 42 ( #6774 )
...
Bumps [tj-actions/changed-files](https://github.com/tj-actions/changed-files ) from 41 to 42.
- [Release notes](https://github.com/tj-actions/changed-files/releases )
- [Changelog](https://github.com/tj-actions/changed-files/blob/main/HISTORY.md )
- [Commits](https://github.com/tj-actions/changed-files/compare/v41...v42 )
---
updated-dependencies:
- dependency-name: tj-actions/changed-files
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-01-18 15:24:26 +01:00
dependabot[bot]
ac353c4652
chore(deps): bump actions/cache from 3 to 4 ( #6775 )
...
Bumps [actions/cache](https://github.com/actions/cache ) from 3 to 4.
- [Release notes](https://github.com/actions/cache/releases )
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md )
- [Commits](https://github.com/actions/cache/compare/v3...v4 )
---
updated-dependencies:
- dependency-name: actions/cache
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-01-18 15:21:31 +01:00
Silvano Cerza
8079501925
Speed up Document dataclass import ( #6767 )
2024-01-18 15:18:02 +01:00
Silvano Cerza
1c76aa07bb
Fix __version__ handling ( #6765 )
2024-01-18 11:11:08 +01:00
Madeesh Kannan
5d66d040cc
feat: Add serde methods to HTMLToDocument ( #6758 )
2024-01-18 10:02:01 +01:00
Sebastian Husch Lee
c0b67432e4
feat: Add page breaks to default PDF to Document converter ( #6755 )
...
* Speedup tests for PyPDFToDocument
* Added unit test and removed skipping of empty pages
* add release note
* Add back some integration marks
2024-01-18 08:54:59 +01:00
Madeesh Kannan
eaec5bfe4a
refactor: Move HF-specific model serde code to a new submodule. ( #6754 )
...
* refactor: Move HF-specific model serde code to a new submodule.
* Remove unused import
2024-01-17 18:00:16 +01:00
Julian Risch
d1bdb8c63d
chore: bump Haystack version to beta5 ( #6757 )
v2.0.0-beta.5
2024-01-17 17:28:36 +01:00
sahusiddharth
a7ac4edd07
feat: added split by page to DocumentSplitter ( #6753 )
...
* feat-added-split-by-page-to-DocumentSplitter
* added test case and the suggested changes
* Update document_splitter.py
* Update haystack/components/preprocessors/document_splitter.py
* Update test_document_splitter.py
---------
Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
2024-01-17 15:36:29 +01:00
Madeesh Kannan
6a1514550e
test: Update E2E tests to use Pipeline.dump/load ( #6756 )
2024-01-17 15:09:27 +01:00
Vladimir Blagojevic
88191e74bf
chore: Fix lazy import in HuggingFaceLocalGenerator ( #6752 )
...
* Fix lazy import in HuggingFaceLocalGenerator
* Fix pylint
* Import fix after merge
2024-01-17 14:32:03 +01:00
Madeesh Kannan
7376838922
feat!: Framework-agnostic device management ( #6748 )
...
* feat: Framework-agnostic device management
* Add release note
* Linting
* Fix test
* Add `first_device` property, expand release notes, validate `ComponentDevice` state
2024-01-17 10:41:34 +01:00
ZanSara
b8b8b5d5c6
feat!: rename model_name_or_path to model in NamedEntityExtractor ( #6744 )
...
* rename model_name_or_path to simply model
* fix tests
* reno
2024-01-16 15:32:48 +01:00
ZanSara
909c1eb023
fix a few docstrings ( #6743 )
2024-01-16 13:56:16 +01:00
Madeesh Kannan
d6cafeaff3
test: Rename RAG E2E test file ( #6750 )
...
Prior to this change, this broke `pytest` workflows in VSCode due to identical test names in this file and the integration/unit test file.
2024-01-16 13:40:22 +01:00
Sebastian Husch Lee
20f04f6054
feat: MetaFieldRanker update ( #6742 )
...
* Add weight and ranking_mode as params to run for easier experimentation
* renaming of metadata to meta
* Use logger.warning instead of warnings
* Add another unit test
* Add support for sort_order and fix formatting of error messages
* Make MetaFieldRanker more robust. Doesn't crash pipeline if some Documents are missing keys.
* Don't print same warning message twice
* Add another test
* Making MetaFieldRanker more robust
* Move up if return statement to earlier in the function
* Setting up infer_type
* Remove infer_type for now
* Release notes
* Add init file
* Update releasenotes/notes/metafieldranker_sort-order_refactor-2000d89dc40dc15a.yaml
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
---------
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2024-01-16 08:52:58 +01:00
Vladimir Blagojevic
8cafff0645
refactor: Extract HF stop words handling in hf_utils.py ( #6745 )
...
* Move StopWordsCriteria to hf_utils.py
* Raise ValueError for invalid StopWordsCriteria tokenizer
* StopWordsCriteria, make sure padding token exists
* Use proper torch types
* Update unit tests
2024-01-15 17:42:29 +01:00
ZanSara
96c0b59aaa
feat!: Rename model_name_or_path to model in ExtractiveReader ( #6736 )
...
* rename model parameter and internal model attribute in ExtractiveReader
* fix tests for ExtractiveReader
* fix e2e
* reno
* another fix
* review feedback
* Update releasenotes/notes/rename-model-param-reader-b8cbb0d638e3b8c2.yaml
2024-01-15 14:48:33 +01:00
ZanSara
b236ea49e3
fix: hybrid pipeline e2e test ( #6740 )
...
* fix hybrid pipeline e2e test
* warmup
* write to the right docstore
2024-01-15 14:20:02 +01:00
Stefano Fiorucci
8eba053dbc
fix pipeline test ( #6741 )
2024-01-15 13:59:11 +01:00
ZanSara
24afc2a7fc
feat: Highlight optional connections in Pipeline.draw() ( #6724 )
...
* highlight optional connections in Pipeline.draw()
* reno
2024-01-15 12:18:51 +01:00
Madeesh Kannan
a5189dd035
fix!: InMemoryBM25Retriever no longer returns documents that have a score of 0.0 ( #6717 )
...
* fix!: `InMemoryBM25Retriever` no longer returns documents that have a score of 0.0
Also update tests to accommodate the new behavior.
* Remove superfluous code
2024-01-12 17:50:55 +01:00
Madeesh Kannan
4647f2a506
fix: ComponentMeta.__call__ handles keyword- and positional-only parameters correctly ( #6701 )
...
* fix: `ComponentMeta.__call__` handles keyword- and positional-only parameters correctly
* Update release note
2024-01-12 17:16:03 +01:00