Silvano Cerza
2acc41ea85
Add PromptBuilder ( #5713 )
...
* Add PromptBuilder
* Update release note
* Add test
2023-09-05 12:22:21 +02:00
bogdankostic
a5b815690e
feat: Add AnswersBuilder component (2.0) ( #5701 )
...
* Add AnswersBuilder
* Add tests for AnswersBuilder
* Add release note
* PR feedback
* Fix mypy
* Remove redundant check for number of groups
* docstrings upd
---------
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2023-09-04 21:16:20 +02:00
ZanSara
c5369a39ef
upgrade canals ( #5708 )
2023-09-04 14:55:05 +02:00
ZanSara
7886284d4e
chore: fix mypy failure ( #5707 )
...
* mypy
* add comment on type ignore
2023-09-04 12:08:59 +02:00
Massimiliano Pippi
24b8cfb1c7
Update 3558-embedding_retriever.md ( #5705 )
2023-09-04 11:28:51 +02:00
bogdankostic
11440395f4
fix: Set model_max_length in the Tokenizer of DefaultPromptHandler ( #5596 )
...
* Set model_max_length in tokenizer in prompt handler
* Add release note
2023-09-01 11:48:41 +02:00
bogdankostic
67da275ae0
Rename question to query in Answer dataclass (2.0) ( #5699 )
2023-09-01 10:37:56 +02:00
ZanSara
5f1256ac7e
feat: generators (2.0) ( #5690 )
...
* add generators module
* add tests for module helper
* reno
* add another test
* move into openai
* improve tests
2023-08-31 17:33:12 +02:00
Vladimir Blagojevic
6787ad2435
fix: Improve imports for new rankers ( #5696 )
...
* Proper imports for new rankers
* Small fix
2023-08-31 13:33:29 +02:00
Alexander
55b10a3868
Update squad_to_dpr.py ( #5689 )
...
Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
2023-08-30 20:39:14 +03:00
Tuana Çelik
1a872a7841
update description for pypi ( #5687 )
2023-08-30 15:29:12 +02:00
github-actions[bot]
88318bfdb5
Bump unstable version ( #5686 )
...
* Update unstable version
* Bump to gooooo
---------
Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
2023-08-30 15:27:50 +02:00
ZanSara
ce06268990
test: fix e2e test failures ( #5685 )
...
* fix test errors
* fix pipeline yaml
* disable cache
* fix errors
* remove stray fixture
2023-08-30 12:24:03 +02:00
ZanSara
1709be162c
auto trigger e2e workflow on PRs that affect it ( #5684 )
2023-08-30 10:25:47 +02:00
Fanli Lin
40d9f34e68
feat: enable passing use_fast to the underlying transformers' pipeline ( #5655 )
...
* copy instead of deepcopy
* fix pylint
* add use_fast
* add release note
* remove irrelevant changes
* black fix
* fix bug
* black
* bug fix
2023-08-30 10:25:18 +02:00
ZanSara
b1daa7c647
chore: migrate to canals==0.7.0 ( #5647 )
...
* add default_to_dict and default_from_dict placeholders to ease migration to canals 0.7.0
* canals==0.7.0
* whisper components
* add to_dict/from_dict stubs
* import serialization methods in init to hide canals imports
* reno
* export DeserializationError too
* Update haystack/preview/__init__.py
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* serialization methods for LocalWhisperTranscriber (#5648 )
* chore: serialization methods for `FileExtensionClassifier` (#5651 )
* serialization methods for FileExtensionClassifier
* Update test_file_classifier.py
* chore: serialization methods for `SentenceTransformersDocumentEmbedder` (#5652 )
* serialization methods for SentenceTransformersDocumentEmbedder
* fix device management
* serialization methods for SentenceTransformersTextEmbedder (#5653 )
* serialization methods for TextFileToDocument (#5654 )
* chore: serialization methods for `RemoteWhisperTranscriber` (#5650 )
* serialization methods for RemoteWhisperTranscriber
* remove patches
* Add default to_dict and from_dict in document stores built with factory (#5674 )
* fix tests (#5671 )
* chore: simplify serialization methods for `MemoryDocumentStore` (#5667 )
* simplify serialization for MemoryDocumentStore
* remove redundant tests
* pylint
* chore: serialization methods for `MemoryRetriever` (#5663 )
* serialization method for MemoryRetriever
* more tests
* remove hash from default_document_store_to_dict
* remove diff in factory.py
* chore: serialization methods for `DocumentWriter` (#5661 )
* serialization methods for DocumentWriter
* more tests
* use factory
* black
---------
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-08-29 18:15:07 +02:00
Silvano Cerza
a613b1b7f5
Format crawler.py
2023-08-29 17:54:30 +02:00
Vladimir Blagojevic
a9b8fd9658
Move WebRetriever's new init parameter to last parameter position ( #5673 )
2023-08-29 17:46:12 +02:00
Daria Fokina
fbc1951e74
Update crawler.py ( #5610 )
2023-08-29 16:46:19 +02:00
Vladimir Blagojevic
e5e7bb9654
feat: Allow WebRetriever to use custom LinkContentFetcher ( #5662 )
...
* Allow use of custom LinkContentFetcher
* Add release note
2023-08-29 15:46:48 +02:00
bogdankostic
07c85905f3
fix: Change use_auth_token to token in TransformersQueryClassifier ( #5659 )
2023-08-29 15:21:25 +02:00
Bilge Yücel
ee13125e06
Add information about preview module ( #5643 )
...
* Add information about `preview` module
* Add discussion link
* Improve text
2023-08-29 15:57:57 +03:00
Vladimir Blagojevic
1f7c7b716a
Update release note for #5526 ( #5664 )
2023-08-29 14:25:52 +02:00
Julian Risch
fa81c611e8
build: Upgrade transformers to v4.32.1 ( #5658 )
...
* upgrade transformers to 4.32.1
* added release notes
* upgrade transformers version also for inference extra
2023-08-29 13:46:00 +02:00
Vladimir Blagojevic
791f322a94
Unpin safetensors ( #5657 )
2023-08-29 13:12:11 +02:00
ZanSara
5985b6d358
chore: refactor pipeline tests for e2e testing ( #5576 )
...
* enable pipeline folder in e2e
* merge standard pipeline tests with standard pipeline batch tests
* merge summarization tests into standard pipelines tests
* Update test_standard_pipelines.py
* black
2023-08-29 11:22:39 +02:00
Vladimir Blagojevic
f13b37db24
fix: LinkContentFetcher - when no content retrieved (i.e. request blocked), default to snippet text ( #5656 )
...
* When no content retrieved (i.e. request blocked), default to snippet
* Add release note
2023-08-29 10:57:47 +02:00
Vladimir Blagojevic
2118f68769
feat: Add domain scoping to WebRetriever ( #5587 )
...
* WebSearch: add allowed_domains scoped search
* Add talk to website example
* Add release note
* Add allowed_domains to WebSearch
* Minor fix
---------
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-08-28 20:02:02 +02:00
Massimiliano Pippi
81f3aaf3e5
Add coverage badge ( #5634 )
2023-08-28 18:30:01 +02:00
ZanSara
55235b09ff
remove self.warm_up() ( #5644 )
2023-08-28 17:38:56 +02:00
Stefano Fiorucci
72fe4fc57b
feat: SentenceTransformersDocumentEmbedder ( #5606 )
...
* first draft
* incorporate feedback
* some unit tests
* release notes
* real release notes
* refactored to use a factory class
* allow forcing fresh instances
* first draft
* Update haystack/preview/embedding_backends/sentence_transformers_backend.py
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
* simplify implementation and tests
* add embed_meta_fields implementation
* lg update
* improve meta data embedding; tests
* support non-string metadata
* make factory private
* change return type; improve tests
* warm_up not called in run
* fix typing
* rm unused import
* Remove base test class
* black
---------
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>
2023-08-28 16:23:41 +02:00
Stefano Fiorucci
89c1813d9f
feat: SentenceTransformersTextEmbedder ( #5600 )
...
* first draft
* incorporate feedback
* some unit tests
* release notes
* real release notes
* first draft
* refactored to use a factory class
* adapt to new ST Embedding Backend implementation
* allow forcing fresh instances
* add tests
* release notes
* fix typo
* little improvements in tests
* Update haystack/preview/embedding_backends/sentence_transformers_backend.py
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
* simplify implementation and tests
* lg update
* input check
* better error message
* make factory private
* change return type; improve tests
* warm_up not called in run
* warm_up not called in run
* rm unused import; default model
* fix typing
* rm unused import
* Remove BaseTestComponent
* black
---------
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>
2023-08-28 16:23:26 +02:00
Stefano Fiorucci
35dfe47186
feat: SentenceTransformersEmbeddingBackend (v2) ( #5572 )
...
* first draft
* incorporate feedback
* some unit tests
* release notes
* real release notes
* refactored to use a factory class
* allow forcing fresh instances
* Update haystack/preview/embedding_backends/sentence_transformers_backend.py
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
* simplify implementation and tests
* make factory private
* change return type; improve tests
* fix typing
* rm unused import
---------
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>
2023-08-28 12:32:37 +02:00
ZanSara
4dda25d67c
proposal: LLM support in Haystack 2.0 ( #5540 )
...
* Add proposal
* add pr number
* file name
* clarify input of LLM component
* promptbuilder is tokenizer-aware
* typo
* feedback
* streaming
* Chat API
2023-08-28 10:33:07 +02:00
Silvano Cerza
444edce126
Add workflow to trigger preview package release ( #5631 )
2023-08-25 17:10:28 +02:00
Stefano Fiorucci
8342b6a457
upgrade transformers ( #5619 )
2023-08-25 16:38:34 +02:00
totto
7c7a486014
fix: in a containerized environment (like AWS ECS) there is a file wr… ( #5499 )
...
* fix: in a containerized environment (like AWS ECS) there is a file write permission error: PermissionError: [Errno 13] Permission denied: 'feedback_squad_direct.json'. catch this error.
hint: future solution similar to FILE_UPLOAD_PATH to provide a writeable path in a container.
(cherry picked from commit c54ab7ed2d487e4391c0391be7c3e268ae525507)
* fix linter error: don't use f string in logger message
* reformat
* fix: pylint requires using % in logging message
2023-08-25 13:32:29 +02:00
Silvano Cerza
cb894061f7
Add terminate-runner job in benchmarks.yml ( #5611 )
2023-08-25 10:14:39 +02:00
Silvano Cerza
66f615a3a4
Remove BaseTestComponent ( #5613 )
...
* Remove BaseTestComponent
* Add release notes
2023-08-23 17:03:37 +02:00
Silvano Cerza
d5599df029
Fix release notes ( #5599 )
2023-08-18 17:59:07 +02:00
Silvano Cerza
b53fad4c4f
Add missing integration tests to catch-all required step in tests.yml ( #5598 )
2023-08-18 17:58:26 +02:00
Silvano Cerza
03ebef7219
Remove DocumentStoreAwareMixin ( #5585 )
...
* Remove Pipeline
* Add release notes
* Enhance imports
* Update release note
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
* Remove Pipeline tests
* Remove DocumentStoreAwareMixin
* Add release notes
* Remove DocumentStoreAwareMixin from __all__
---------
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-08-18 17:56:24 +02:00
Silvano Cerza
4ef813fc8a
Remove specialised Pipeline ( #5584 )
...
* Remove Pipeline
* Add release notes
* Enhance imports
* Update release note
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
* Remove Pipeline tests
---------
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-08-18 17:48:13 +02:00
Silvano Cerza
72e0a588db
Rework DocumentWriter ( #5583 )
...
* Remove DocumentStoreAwareMixin from DocumentWriter
* Add release notes
2023-08-18 17:03:17 +02:00
Silvano Cerza
4bc68cbc2f
Rework MemoryRetriever ( #5582 )
...
* Remove DocumentStoreAwareMixin from MemoryRetriever
* Add release notes
* Update an article
---------
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2023-08-18 16:33:35 +02:00
Massimiliano Pippi
011baf492f
leftover from #5580 ( #5593 )
2023-08-18 12:53:40 +02:00
Massimiliano Pippi
7e633c6b0c
chore: change import paths under preview ( #5592 )
...
* fix import paths
* add release notes
2023-08-18 12:53:25 +02:00
Massimiliano Pippi
39a1f61326
chore: improve error message in FileExtensionClassifier ( #5590 )
...
* output an actionable error
* add release note
* fix matching in raised error
* fix release note category
2023-08-18 12:28:55 +02:00
Stefano Fiorucci
aa8da40820
chore: add preview section to release notes ( #5591 )
...
* add preview section to reno config and update existing notes
* Empty commit to trigger CLA
2023-08-18 09:59:01 +02:00
Vladimir Blagojevic
da67700318
Rename web_lfqa_improved and update questions ( #5588 )
2023-08-17 17:10:49 +02:00