Silvano Cerza
53b77dda6c
Move tests for write_documents from DocumentStoreBaseTests to separate class ( #6334 )
2023-11-17 19:25:16 +01:00
Silvano Cerza
326f51df9d
Move tests for count_document from DocumentStoreBaseTests to separate class ( #6332 )
2023-11-17 18:00:11 +01:00
Stefano Fiorucci
68be0d7f2c
refactor: improve Document representation ( #6333 )
...
* new repr
* reno
2023-11-17 17:49:00 +01:00
Silvano Cerza
5184481e50
refactor: Remove unnecessary method to compare list of Documents in DocumentStoreBaseTests ( #6324 )
...
* Change Document.__eq__ to compare all fields
* Remove unnecessary method to compare list of Documents in DocumentStoreBaseTests
2023-11-17 17:03:16 +01:00
ZanSara
e888852aec
Standardize TextFileToDocument ( #6232 )
...
* simplify textfiletodocument
* fix error handling and tests
* stray print
* reno
* streams->sources
* reno
* feedback
* test
* fix tests
2023-11-17 15:39:39 +01:00
Silvano Cerza
c26a932423
Change preview tests to run all tests except integration ones ( #6325 )
2023-11-17 15:33:43 +01:00
ZanSara
dfc1d452bb
feat: upgrade canals to 0.10.1 ( #6309 )
...
* upgrade canals
* reno
* trigger preview e2e
* bump canals
* fix decorator
* fix test
* test factory
* tests inmemory
* tests writer
* test audio
* tests builders
* tests caching
* tests embedders
* tests converters
* tests generators
* tests rankers
* tests retrievers
* fix pipeline and telemetry tests
* remove trigger
2023-11-17 14:46:23 +01:00
Vladimir Blagojevic
21bcfe76fb
Convert function call JSON payload to str ( #6277 )
2023-11-17 14:45:15 +01:00
Stefano Fiorucci
dd6e35d675
build: upgrade to transformers==4.35.2 ( #6322 )
...
* upgrade transformers to 4.35.2
* reno
2023-11-17 10:12:34 +01:00
Julian Risch
1c85e44156
test: Add langdetect installation to e2e tests ( #6327 )
...
* Add langdetect installation to e2e tests
* compare doc content and id only
2023-11-17 10:12:05 +01:00
Silvano Cerza
6dda6e5b2d
Change Document.__eq__ to compare all fields ( #6323 )
2023-11-16 17:17:43 +01:00
Massimiliano Pippi
ff3165b8b8
fix: fix un-flattening of metadata ( #6318 )
...
* fix un-flattening of metadata
* test should pass
* add relnote
* change policy: raise an error if both meta and keys are passed
* Update document.py
* support python 3.8
* adjust wording in the error message
2023-11-16 17:10:53 +01:00
Julian Risch
34ecff1d19
build: Upgrade openai-whisper and re-introduce audio extra ( #6319 )
...
* upgrade openai-whisper and re-introduce audio extra
* add audio extra to
2023-11-16 15:04:50 +01:00
Julian Risch
8b092a90c0
test: Add MetadataRouter to preprocessing pipeline in e2e test ( #6321 )
...
* add MetadataRouter to preprocessing pipeline
* replace mimetype check with language check
2023-11-16 11:22:37 +01:00
x110
c4cfe6cb90
fix: Load additional fields from SQUAD-format file to meta field for labels #5978 ( #6301 )
...
* Load additional fields from SQUAD-format file to meta field for labels
* added a test function
* rewritten test using pytest
* added release notes
* improve release note
* clean up test
---------
Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
2023-11-16 10:44:51 +01:00
Stefano Fiorucci
c691412652
chore: improve the error messages of LazyImport ( #6316 )
...
* improve lazy import error messages
* revert changes to adaptive_model
2023-11-16 09:02:12 +01:00
Agnieszka Marzec
414cbcfd92
Update docstrings ( #6297 )
2023-11-15 17:54:10 +01:00
Stefano Fiorucci
f74f034549
fix pydoc config ( #6313 )
2023-11-15 15:32:29 +01:00
Vivek Silimkhan
f998bf4a4f
feat: add Amazon Bedrock support ( #6226 )
...
* Add Bedrock
* Update supported models for Bedrock
* Fix supports and add extract response in Bedrock
* fix errors imports
* improve and refactor supports
* fix install
* fix mypy
* fix pylint
* fix existing tests
* Added Anthropic Bedrock
* fix tests
* fix sagemaker tests
* add default prompt handler, constructor and supports tests
* more tests
* invoke refactoring
* refactor model_kwargs
* fix mypy
* lstrip responses
* Add streaming support
* bump boto3 version
* add class docstrings, better exception names
* fix layer name
* add tests for anthropic and cohere model adapters
* update cohere params
* update ai21 args and add tests
* support cohere command light model
* add titan tests
* better class names
* support meta llama 2 model
* fix streaming support
* more future-proof model adapter selection
* fix import
* fix mypy
* fix pylint for preview
* add tests for streaming
* add release notes
* Apply suggestions from code review
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
* fix format
* fix tests after msg changes
* fix streaming for cohere
---------
Co-authored-by: tstadel <60758086+tstadel@users.noreply.github.com>
Co-authored-by: tstadel <thomas.stadelmann@deepset.ai>
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
2023-11-15 13:26:29 +01:00
Julian Risch
08ec492039
refactor!: Remove routing from DocumentLanguageClassifier and rename TextLanguageClassifier ( #6307 )
...
* remove routing from DocumentLanguageClassifier
* fix MetadataRouter typo
2023-11-15 13:10:07 +01:00
Julian Risch
5295b40def
docs: Reader returns top_k+1 answers if no_answer is enabled
2023-11-15 10:20:21 +01:00
Julian Risch
807cd6d139
chore: Add MetaFieldRanker to rankers __init__.py
2023-11-14 13:53:58 +01:00
Ashwin Mathur
4e4d5eb3e2
feat!: Remove unused query parameter from MetaFieldRanker ( #6300 )
...
* Remove unused query parameter from MetaFieldRanker
* Add release notes
2023-11-14 12:33:38 +01:00
Daria Fokina
34136382c1
docs: 2.0 API reference ( #6262 )
...
* docs: 2.0 API reference
* add builders and generators
* classifiers file path
2023-11-14 10:12:28 +01:00
Tuana Çelik
b8fdb880f9
Update docstring in html.py ( #6279 )
...
The explanation of 'sources' is inadequate especially because this is probably going to be most used with `LinkContentFetcher` that returns `List[ByteStream]`
2023-11-13 12:32:25 +01:00
Stefano Fiorucci
f708cf6056
refactor!: set scale_score default value to False ( #6276 )
...
* set default scale_score to False
* release note
2023-11-13 11:59:18 +01:00
Silvano Cerza
8e7ce208fc
Fix Document init when passing non existing fields ( #6286 )
...
* Fix Document init when passing non existing fields
* Update releasenotes/notes/fix-document-init-09c1cbb14202be7d.yaml
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
* Fix linting
---------
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-11-13 11:42:42 +01:00
Tuana Çelik
bf637e9c7e
Update transformers_similarity.py ( #6280 )
...
Fixing the Document examples
2023-11-13 09:35:00 +01:00
Stefano Fiorucci
92a8704de4
mypy ignore specific errors ( #6278 )
2023-11-10 18:10:38 +01:00
Massimiliano Pippi
1b63cfc8b3
fix: make types work without installing pypdf ( #6269 )
...
* make types work without installing pypdf
* make pylint happy, keep pyright happy, hope mypy doesn't care
2023-11-09 20:02:22 +01:00
Vladimir Blagojevic
b4d8d1c904
feat: Add custom conversion callable to PyPDFToDocument - Haystack 2.x ( #6258 )
...
* Allow user specified converter hook
* Add a release note
* More unit tests
* PR review - Massi, use protocol as converter
2023-11-09 17:35:33 +01:00
Agnieszka Marzec
1046bebbe0
Docs: Update docstrings lg ( #6260 )
...
* Update docstrings lg
* Update test_in_memory_bm25_retriever.py
* Update test_in_memory_embedding_retriever.py
---------
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-11-09 17:34:52 +01:00
Tuana Çelik
3be6ec7840
Update openai.py ( #6263 )
2023-11-09 17:33:35 +01:00
Silvano Cerza
73e2843cf1
Fix deprecation warning when calling Document.from_dict() ( #6267 )
2023-11-09 16:50:06 +01:00
Stefano Fiorucci
f95937b0ce
chore: move HuggingFaceLocalGenerator to the generators directory ( #6264 )
...
* move HuggingFaceLocalGenerator to right directory
* fix tests
2023-11-09 15:59:23 +01:00
Stefano Fiorucci
2b3c77e41d
fix: make JoinDocuments correctly handle duplicate documents w null scores ( #6261 )
...
* fix error with null values
* release note
* simplify
2023-11-09 14:28:56 +01:00
Domenico
676da681d0
feat: MetaField Ranker ( #6189 )
...
* proposal: meta field ranker
* Apply suggestions from code review
Co-authored-by: ZanSara <sarazanzo94@gmail.com>
* update proposal filename
* feat: add metafield ranker
* fix docstrings
* remove proposal file from pr
* add release notes
* update code according to new Document class
* separate loops for each ranking mode in __merge_scores
* change error type in init and new tests for linear score warning
* docstring upd
---------
Co-authored-by: ZanSara <sarazanzo94@gmail.com>
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2023-11-09 12:20:41 +01:00
Sebastian Husch Lee
71d0d92ea2
feat: Add model_kwargs to ExtractiveReader to impact model loading ( #6257 )
...
* Add ability to pass model_kwargs to AutoModelForQuestionAnswering
* Add testing for new model_kwargs
* Add spacing
* Add release notes
* Update haystack/preview/components/readers/extractive.py
Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
* Make changes suggested by Stefano
---------
Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
2023-11-09 11:25:22 +01:00
Vladimir Blagojevic
cd429a73cd
feat: Add GPTChatGenerator to Haystack 2.x ( #6212 )
...
* Add GPTChatGenerator
* Apply lessons from previous PR
* PR review - Stefano
2023-11-09 10:45:41 +01:00
Daria Fokina
08e211f9d6
docs: fix whisper_local indentations docstrings ( #6209 )
...
* whisper_local indentations
* Update whisper_local.py
* fix param docstrings
---------
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-11-08 18:15:39 +01:00
Stefano Fiorucci
72cbf3ee0b
fix: replace haystack.lazy_imports with haystack.preview.lazy_imports ( #6255 )
...
* lazy import transformers in tgi
* fix pylint
* fix wrong import
2023-11-08 17:33:07 +01:00
Massimiliano Pippi
f019896335
ci: Generate release notes in a Github workflow ( #6211 )
...
* first try
* Update config.yaml
* Update github_release.yml
* set the rc0 tag more explicitly
2023-11-08 12:29:37 +01:00
jambudipa
2f118e857c
feat: add tokenization details for gpt-4-1106-preview ( #6250 )
...
* feat: add tokenization details for gpt-4-1106-preview
* update max_tokens value
* reno
---------
Co-authored-by: jambudipa <mark.norgate@ext.ons.gov.uk>
Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>
2023-11-08 12:04:08 +01:00
Massimiliano Pippi
58e357148e
ci: tag when branching off for a release ( #6206 )
...
* tag when branching off
* change minor bump workflow
* Update .github/workflows/minor_version_release.yml
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* Update minor_version_release.yml
---------
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-11-08 11:06:45 +01:00
Silvano Cerza
bf884094d1
refactor: Change Document.blob type and remove mime_type field ( #6249 )
...
* Change Document.blob type and remove mime_type field
* Add release notes
* Remove mime_type from Document docstring
2023-11-08 10:35:17 +01:00
Stefano Fiorucci
e2881e2ad3
fix: lazy import transformers in TGI Generators ( #6252 )
...
* lazy import transformers in tgi
* fix pylint
2023-11-08 00:09:42 +01:00
Vladimir Blagojevic
5497ca2a45
feat: Adapt GPTGenerator to use str input/output format in Haystack 2.x ( #6214 )
...
* Adapt GPTGenerator to string input/output
* Finishing touches
* punctuation upd
* PR feedback
* Small naming fixes
* Update haystack/preview/components/generators/openai.py
Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
* Update class pydoc with a printed response
---------
Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2023-11-07 18:00:43 +01:00
Silvano Cerza
6c5bfe3da4
Update README links and badges ( #6248 )
2023-11-07 16:53:17 +01:00
Stefano Fiorucci
9b76acb165
pin openai<1 ( #6244 )
2023-11-06 18:11:41 +01:00
Stefano Fiorucci
982ac3df01
fix: fix failing e2e test (after moving classifiers) ( #6243 )
...
* mv classifiers
* release note
* fix e2e test
2023-11-06 17:08:20 +01:00