sahusiddharth
3bd6ba93ca
feat:Add dimensions parameter to OpenAI Embedders to fully support th… ( #6841 )
...
* feat:Add dimensions parameter to OpenAI Embedders to fully support the new models
* fixed linting
* changed != None to is not None
2024-02-05 16:20:46 +01:00
Bilge Yücel
0fbb0655f0
Create breaking-change-proposal.md issue template ( #6892 )
...
* Create breaking-change-proposal.md issue template
* Update breaking-change-proposal.md
2024-02-05 14:54:33 +01:00
Madeesh Kannan
c3a9dac196
chore: Tick version to 2.0.0-beta.6 ( #6914 )
v2.0.0-beta.6
2024-02-05 13:51:45 +01:00
Madeesh Kannan
27d1af3068
feat!: Use `Secret` for passing authentication secrets to components ( #6887 )
...
* feat!: Use `Secret` for passing authentication secrets to components
* Add comment to clarify type ignore
2024-02-05 13:17:01 +01:00
Ashwin Mathur
393a7993c3
feat: Add Semantic Answer Similarity metric ( #6877 )
...
* Add SAS metric
* Add release notes
* Round similarity scores for precision consistency
* Add tolerance to tests
* Update haystack/evaluation/eval.py
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* Add types for preprocess_text; Add additional types for f1 and em methods
---------
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2024-02-02 17:07:52 +01:00
Silvano Cerza
461556cca2
fix: Fix language servers never working with Components ( #6893 )
...
* Fix language servers never working with Components
* Add release notes
2024-02-02 15:59:05 +01:00
Massimiliano Pippi
27d0b28d06
chore: rename categories in the API docs ( #6885 )
...
* rename API categories
* fix
* update slugs
* rename files for consistency
* fix category ID
* try getting the right version
2024-02-01 16:47:26 +01:00
ZanSara
1039c73553
feat: Allow setting metadata for `ByteStream` when created from file or from string ( #6857 )
...
* add params
* add tests
* reno
* add default
* defreeze
2024-02-01 12:50:11 +01:00
ZanSara
9af6c7e442
add some tolerance to Roberta test ( #6880 )
2024-01-31 17:19:07 +01:00
Madeesh Kannan
b772c1127c
feat: Implement `Secret` for structured authentication ( #6855 )
...
* feat: Implement `AuthPolicy` for structured authentication
* Rename `AuthPolicy` to `Secret`
* Update release notes, fix typo
2024-01-31 12:51:14 +01:00
Stefano Fiorucci
537107ba6e
ci: bump transformers to 4.37.2 in test_requirements ( #6848 )
...
* bump transformers to 4.37.1 in test_requirements
* use 4.37.2
2024-01-30 17:19:22 +01:00
Sebastian Husch Lee
ceda4cd655
feat: Add support for `device_map` ( #6679 )
...
* Getting device_map working to support 8bit loading and multi device inference
* Update to take account the device specified by the user
* add release notes
* Add device_map support for ExtractiveReader
* Update test
* Update to model that doesn't have issues
* Update test
* Update pytest approx
* Update release notes
* Start supporting device map
* Update ExtractiveReader to use new ComponentDevice
* Update similarity ranker to follow extractive reader implementation
* Fixing pylint
* Make mypy mostly happy
* Add new unit test to test device_map
* Adding unit tests
* Some refactoring
* Add more tests
* Add more tests
* Add another unit test
* Update first_device property to return a ComponentDevice to be able to use the to methods
* Updating tests for test_device
* Update tests and now explicitly modify device_map in model_kwargs
* Update haystack/utils/hf.py
Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
* Make mypy happy
* mypy
* Remove unneeded optional flag
* Update ExtractiveReader with new logic
* Update ranker to follow new logic
* Removing unneeded code
* Make mypy happy
* fxi pylint
* Fix test
* Adding unit tests for device_map="auto"
* Add unit tests for ranker
* PR comments
* Make util method
* Adding unit tests
* Fix type annotation
* Fix pylint
* Fix test
---------
Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
2024-01-30 13:47:57 +01:00
Silvano Cerza
76d324a149
feat: Change `Pipeline.add_component` to fail when reusing `Component` instances ( #6847 )
...
* Change Pipeline.add_component to fail when reusing Component instances
* Change variable name and store Pipeline instance in it
* Fix tests
2024-01-30 11:15:26 +01:00
Stefano Fiorucci
d90b0de124
Update README.md ( #6850 )
2024-01-30 10:03:01 +01:00
Silvano Cerza
f5e61338ba
chore: Remove all mentions of Canals ( #6844 )
...
* Remove unnecessary Connection class
* Remove all mentions of canals
* Add release notes
2024-01-29 17:26:11 +01:00
Silvano Cerza
9211f535b6
Remove unnecessary Connection class ( #6842 )
2024-01-29 17:25:52 +01:00
Silvano Cerza
b1ec32dae0
Simplify Pipeline.__eq__ logic ( #6840 )
2024-01-29 14:54:46 +01:00
Massimiliano Pippi
acf4cd502f
refact: Rename helper function ( #6831 )
...
* change function name
* add api docs
* release notes
2024-01-26 16:00:02 +01:00
Madeesh Kannan
fdf844f762
fix: Fix missing format string prefixes in `pipeline.py` ( #6834 )
2024-01-26 15:31:56 +01:00
Ashwin Mathur
7217f9d9f0
feat: Add F1 metric ( #6822 )
...
* Add F1 metric
* Add release notes
2024-01-26 11:04:43 +01:00
Stefano Fiorucci
b176750532
improve reno config ( #6827 )
2024-01-26 09:47:52 +01:00
Sebastian Husch Lee
3bea3b1714
feat: Add query and document prefix options for the TransformerSimilarityRanker ( #6826 )
...
* Add query and doc prefix
* Fix some tests
* add release notes
2024-01-25 15:29:19 +01:00
Rob Pasternak
7358b910d7
feat: Weights and score normalization for DocumentJoiner with reciprocal rank fusion ( #6735 )
...
* Add weighting and score normalization for DocumentJoiner w/ reciprocal rank fusion (fix trailing whitespace)
* Add release notes
* Add unit test
* Update release note
---------
Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
2024-01-24 15:45:53 +01:00
Vladimir Blagojevic
6e86f4e26a
Update embedding integration tests ( #6823 )
2024-01-24 15:22:47 +01:00
Vladimir Blagojevic
c47b82c54f
Remove pipeline_utils package and dependent code ( #6806 )
2024-01-23 18:40:43 +01:00
Massimiliano Pippi
4efe40664c
use haystack-pydoc-tools package instead of local code ( #6818 )
2024-01-23 18:28:52 +01:00
Tuana Çelik
1825140654
Readme updates ( #6817 )
...
* add info on dD
* fix
* Update README.md
* make tip box
* move location
2024-01-23 15:29:36 +01:00
Daria Fokina
6d8f369e9d
chore: mention cookbook repo in README ( #6814 )
...
* readme update
* formatting fix
* format2
2024-01-23 14:06:34 +01:00
Daria Fokina
5d300a7356
add missing components to docs ( #6813 )
2024-01-23 14:03:15 +01:00
Massimiliano Pippi
df2a23dfa5
chore: cleanup unused code ( #6804 )
...
* remove validation module
* remove unused code
* adjust imports
* sort imports
2024-01-23 13:20:53 +01:00
Massimiliano Pippi
f44f123b3f
chore: mention integrations in the README ( #6805 )
...
* mention integrations in the README
* Apply suggestions from code review
Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
* Update README.md
Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
---------
Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
Co-authored-by: Tuana Çelik <tuana.celik@deepset.ai>
2024-01-22 15:38:00 +01:00
Madeesh Kannan
5c8feeac6a
proposal: Integration of 3rd party evaluation frameworks ( #6784 )
...
* proposal: Integration of 3rd party evaluation frameworks
* Add note about previous eval proposal
2024-01-22 12:35:27 +01:00
Ashwin Mathur
a238c6dd51
feat: Add Exact Match metric ( #6696 )
...
* Add exact match metric
* Add release notes
* Cleanup comments in test_eval_exact_match.py
* Create separate preprocessing function; Add output_key parameter
* Update release note
---------
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2024-01-22 09:57:04 +01:00
Daria Fokina
8a08ab52e1
add telemetry overview ( #6785 )
2024-01-19 16:41:28 +01:00
Augustin Chan
cad30b039a
add .haystack_debug to .gitignore ( #6782 )
2024-01-19 10:44:35 +01:00
Vladimir Blagojevic
f47439c2a2
Use forward references for type hints, avoid NameError ( #6780 )
2024-01-18 18:46:59 +01:00
Silvano Cerza
d4f6531c52
feat: Refactor `Pipeline.run()` ( #6729 )
...
* First rough implementation of refactored run
* Further improve run logic
* Properly handle variadic input in run
* Further work
* Enhance names and add more documentation
* Fix issue with output distribution
* This works
* Enhance run comments
* Mark Multiplexer as greedy
* Remove MergeLoop in favour of Multiplexer in tests
* Remove FirstIntSelector in favour of Multiplexer
* Handle corner when waiting for input is stuck
* Remove unused import
* Handle mutable input data in run and misbehaving components
* Handle run input validation
* Test validation
* Fix pylint
* Fix mypy
* Call warm_up in run to fix tests
2024-01-18 17:53:47 +01:00
Vladimir Blagojevic
40a8b2b4a9
Move import to lazy import section ( #6778 )
2024-01-18 17:34:55 +01:00
Vladimir Blagojevic
0b177b3bc6
feat: Improve OpenAPIServiceConnector service response serialization ( #6772 )
...
* Better service response json -> str serialization
* Add unit test
2024-01-18 16:49:48 +01:00
Vladimir Blagojevic
fea1428e84
feat: Add `HuggingFaceLocalChatGenerator` ( #6751 )
2024-01-18 15:53:12 +01:00
dependabot[bot]
8d65a8630b
chore(deps): bump tj-actions/changed-files from 41 to 42 ( #6774 )
...
Bumps [tj-actions/changed-files](https://github.com/tj-actions/changed-files ) from 41 to 42.
- [Release notes](https://github.com/tj-actions/changed-files/releases )
- [Changelog](https://github.com/tj-actions/changed-files/blob/main/HISTORY.md )
- [Commits](https://github.com/tj-actions/changed-files/compare/v41...v42 )
---
updated-dependencies:
- dependency-name: tj-actions/changed-files
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-01-18 15:24:26 +01:00
dependabot[bot]
ac353c4652
chore(deps): bump actions/cache from 3 to 4 ( #6775 )
...
Bumps [actions/cache](https://github.com/actions/cache ) from 3 to 4.
- [Release notes](https://github.com/actions/cache/releases )
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md )
- [Commits](https://github.com/actions/cache/compare/v3...v4 )
---
updated-dependencies:
- dependency-name: actions/cache
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-01-18 15:21:31 +01:00
Silvano Cerza
8079501925
Speed up Document dataclass import ( #6767 )
2024-01-18 15:18:02 +01:00
Silvano Cerza
1c76aa07bb
Fix __version__ handling ( #6765 )
2024-01-18 11:11:08 +01:00
Madeesh Kannan
5d66d040cc
feat: Add serde methods to `HTMLToDocument` ( #6758 )
2024-01-18 10:02:01 +01:00
Sebastian Husch Lee
c0b67432e4
feat: Add page breaks to default PDF to Document converter ( #6755 )
...
* Speedup tests for PyPDFToDocument
* Added unit test and removed skipping of empty pages
* add release note
* Add back some integration marks
2024-01-18 08:54:59 +01:00
Madeesh Kannan
eaec5bfe4a
refactor: Move HF-specific model serde code to a new submodule. ( #6754 )
...
* refactor: Move HF-specific model serde code to a new submodule.
* Remove unused import
2024-01-17 18:00:16 +01:00
Julian Risch
d1bdb8c63d
chore: bump Haystack version to beta5 ( #6757 )
v2.0.0-beta.5
2024-01-17 17:28:36 +01:00
sahusiddharth
a7ac4edd07
feat: added split by page to `DocumentSplitter` ( #6753 )
...
* feat-added-split-by-page-to-DocumentSplitter
* added test case and the suggested changes
* Update document_splitter.py
* Update haystack/components/preprocessors/document_splitter.py
* Update test_document_splitter.py
---------
Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
2024-01-17 15:36:29 +01:00
Madeesh Kannan
6a1514550e
test: Update E2E tests to use `Pipeline.dump/load` ( #6756 )
2024-01-17 15:09:27 +01:00