3174 Commits

Author SHA1 Message Date
Silvano Cerza
2acc41ea85
Add PromptBuilder (#5713)
* Add PromptBuilder

* Update release note

* Add test
2023-09-05 12:22:21 +02:00
bogdankostic
a5b815690e
feat: Add AnswersBuilder component (2.0) (#5701)
* Add AnswersBuilder

* Add tests for AnswersBuilder

* Add release note

* PR feedback

* Fix mypy

* Remove redundant check for number of groups

* docstrings upd

---------

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2023-09-04 21:16:20 +02:00
ZanSara
c5369a39ef
upgrade canals (#5708) 2023-09-04 14:55:05 +02:00
ZanSara
7886284d4e
chore: fix mypy failure (#5707)
* mypy

* add comment on type ignore
2023-09-04 12:08:59 +02:00
Massimiliano Pippi
24b8cfb1c7
Update 3558-embedding_retriever.md (#5705) 2023-09-04 11:28:51 +02:00
bogdankostic
11440395f4
fix: Set model_max_length in the Tokenizer of DefaultPromptHandler (#5596)
* Set model_max_length in tokenizer in prompt handler

* Add release note
2023-09-01 11:48:41 +02:00
bogdankostic
67da275ae0
Rename question to query in Answer dataclass (2.0) (#5699) 2023-09-01 10:37:56 +02:00
ZanSara
5f1256ac7e
feat: generators (2.0) (#5690)
* add generators module

* add tests for module helper

* reno

* add another test

* move into openai

* improve tests
2023-08-31 17:33:12 +02:00
Vladimir Blagojevic
6787ad2435
fix: Improve imports for new rankers (#5696)
* Proper imports for new rankers

* Small fix
2023-08-31 13:33:29 +02:00
Alexander
55b10a3868
Update squad_to_dpr.py (#5689)
Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
2023-08-30 20:39:14 +03:00
Tuana Çelik
1a872a7841
update description for pypi (#5687) 2023-08-30 15:29:12 +02:00
github-actions[bot]
88318bfdb5
Bump unstable version (#5686)
* Update unstable version

* Bump to gooooo

---------
Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
2023-08-30 15:27:50 +02:00
ZanSara
ce06268990
test: fix e2e test failures (#5685)
* fix test errors

* fix pipeline yaml

* disable cache

* fix errors

* remove stray fixture
2023-08-30 12:24:03 +02:00
ZanSara
1709be162c
auto trigger e2e workflow on PRs that affect it (#5684) 2023-08-30 10:25:47 +02:00
Fanli Lin
40d9f34e68
feat: enable passing use_fast to the underlying transformers' pipeline (#5655)
* copy instead of deepcopy

* fix pylint

* add use_fast

* add release note

* remove irrelevant changes

* black fix

* fix bug

* black

* bug fix
2023-08-30 10:25:18 +02:00
ZanSara
b1daa7c647
chore: migrate to canals==0.7.0 (#5647)
* add default_to_dict and default_from_dict placeholders to ease migration to canals 0.7.0

* canals==0.7.0

* whisper components

* add to_dict/from_dict stubs

* import serialization methods in init to hide canals imports

* reno

* export deserializationerror too

* Update haystack/preview/__init__.py

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* serialization methods for LocalWhisperTranscriber (#5648)

* chore: serialization methods for `FileExtensionClassifier` (#5651)

* serialization methods for FileExtensionClassifier

* Update test_file_classifier.py

* chore: serialization methods for `SentenceTransformersDocumentEmbedder` (#5652)

* serialization methods for SentenceTransformersDocumentEmbedder

* fix device management

* serialization methods for SentenceTransformersTextEmbedder (#5653)

* serialization methods for TextFileToDocument (#5654)

* chore: serialization methods for `RemoteWhisperTranscriber` (#5650)

* serialization methods for RemoteWhisperTranscriber

* remove patches

* Add default to_dict and from_dict in document stores built with factory (#5674)

* fix tests (#5671)

* chore: simplify serialization methods for `MemoryDocumentStore` (#5667)

* simplify serialization for MemoryDocumentStore

* remove redundant tests

* pylint

* chore: serialization methods for `MemoryRetriever` (#5663)

* serialization method for MemoryRetriever

* more tests

* remove hash from default_document_store_to_dict

* remove diff in factory.py

* chore: serialization methods for `DocumentWriter` (#5661)

* serialization methods for DocumentWriter

* more tests

* use factory

* black

---------

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-08-29 18:15:07 +02:00
Silvano Cerza
a613b1b7f5
Format crawler.py 2023-08-29 17:54:30 +02:00
Vladimir Blagojevic
a9b8fd9658
Move WebRetriever's new init parameter to last parameter position (#5673) 2023-08-29 17:46:12 +02:00
Daria Fokina
fbc1951e74
Update crawler.py (#5610) 2023-08-29 16:46:19 +02:00
Vladimir Blagojevic
e5e7bb9654
feat: Allow WebRetriever to use custom LinkContentFetcher (#5662)
* Allow use of custom LinkContentFetcher

* Add release note
2023-08-29 15:46:48 +02:00
bogdankostic
07c85905f3
fix: Change use_auth_token to token in TransformersQueryClassifier (#5659) 2023-08-29 15:21:25 +02:00
Bilge Yücel
ee13125e06
Add information about preview module (#5643)
* Add information about `preview` module

* Add discussion link

* Improve text
2023-08-29 15:57:57 +03:00
Vladimir Blagojevic
1f7c7b716a
Update release note for #5526 (#5664) 2023-08-29 14:25:52 +02:00
Julian Risch
fa81c611e8
build: Upgrade transformers to v4.32.1 (#5658)
* upgrade transformers to 4.32.1

* added release notes

* upgrade transformers version also for inference extra
2023-08-29 13:46:00 +02:00
Vladimir Blagojevic
791f322a94
Unpin safetensors (#5657) 2023-08-29 13:12:11 +02:00
ZanSara
5985b6d358
chore: refactor pipeline tests for e2e testing (#5576)
* enable pipeline folder in e2e

* merge standard pipeline tests with standard pipeline batch tests

* merge summarization tests into standard pipelines tests

* Update test_standard_pipelines.py

* black
2023-08-29 11:22:39 +02:00
Vladimir Blagojevic
f13b37db24
fix: LinkContentFetcher - when no content retrieved (i.e. request blocked), default to snippet text (#5656)
* When no content retrieved (i.e. request blocked), default to snippet

* Add release note
2023-08-29 10:57:47 +02:00
Vladimir Blagojevic
2118f68769
feat: Add domain scoping to WebRetriever (#5587)
* WebSearch: add allowed_domains scoped search

* Add talk to website example

* Add release note

* Add allowed_domains to WebSearch

* Minor fix

---------

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-08-28 20:02:02 +02:00
Massimiliano Pippi
81f3aaf3e5
Add coverage badge (#5634) 2023-08-28 18:30:01 +02:00
ZanSara
55235b09ff
remove self.warm_up() (#5644) 2023-08-28 17:38:56 +02:00
Stefano Fiorucci
72fe4fc57b
feat: SentenceTransformersDocumentEmbedder (#5606)
* first draft

* incorporate feedback

* some unit tests

* release notes

* real release notes

* refactored to use a factory class

* allow forcing fresh instances

* first draft

* Update haystack/preview/embedding_backends/sentence_transformers_backend.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* simplify implementation and tests

* add embed_meta_fields implementation

* lg update

* improve meta data embedding; tests

* support non-string metadata

* make factory private

* change return type; improve tests

* warm_up not called in run

* fix typing

* rm unused import

* Remove base test class

* black

---------

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>
2023-08-28 16:23:41 +02:00
Stefano Fiorucci
89c1813d9f
feat: SentenceTransformersTextEmbedder (#5600)
* first draft

* incorporate feedback

* some unit tests

* release notes

* real release notes

* first draft

* refactored to use a factory class

* adapt to new ST Embedding Backend implementation

* allow forcing fresh instances

* add tests

* release notes

* fix typo

* little improvements in tests

* Update haystack/preview/embedding_backends/sentence_transformers_backend.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* simplify implementation and tests

* lg update

* input check

* better error message

* make factory private

* change return type; improve tests

* warm_up not called in run

* warm_up not called in run

* rm unused import; default model

* fix typing

* rm unused import

* Remove BaseTestComponent

* black

---------

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>
2023-08-28 16:23:26 +02:00
Stefano Fiorucci
35dfe47186
feat: SentenceTransformersEmbeddingBackend (v2) (#5572)
* first draft

* incorporate feedback

* some unit tests

* release notes

* real release notes

* refactored to use a factory class

* allow forcing fresh instances

* Update haystack/preview/embedding_backends/sentence_transformers_backend.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* simplify implementation and tests

* make factory private

* change return type; improve tests

* fix typing

* rm unused import

---------

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>
2023-08-28 12:32:37 +02:00
ZanSara
4dda25d67c
proposal: LLM support in Haystack 2.0 (#5540)
* Add proposal

* add pr number

* file name

* clarify input of LLM component

* promptbuilder is tokenizer-aware

* typo

* feedback

* streaming

* Chat API
2023-08-28 10:33:07 +02:00
Silvano Cerza
444edce126
Add workflow to trigger preview package release (#5631) 2023-08-25 17:10:28 +02:00
Stefano Fiorucci
8342b6a457
upgrade transformers (#5619) 2023-08-25 16:38:34 +02:00
totto
7c7a486014
fix: in a containerized environment (like AWS ECS) there is a file wr… (#5499)
* fix: in a containerized environment (like AWS ECS) there is a file write permission error: PermissionError: [Errno 13] Permission denied: 'feedback_squad_direct.json'. catch this error.
hint: future solution similar to FILE_UPLOAD_PATH to provide a writeable path in a container.

(cherry picked from commit c54ab7ed2d487e4391c0391be7c3e268ae525507)

* fix linter error: don't use f-string in logger message

* reformat

* fix: pylint requires using % in logging message
2023-08-25 13:32:29 +02:00
Silvano Cerza
cb894061f7
Add terminate-runner job in benchmarks.yml (#5611) 2023-08-25 10:14:39 +02:00
Silvano Cerza
66f615a3a4
Remove BaseTestComponent (#5613)
* Remove BaseTestComponent

* Add release notes
2023-08-23 17:03:37 +02:00
Silvano Cerza
d5599df029
Fix release notes (#5599) 2023-08-18 17:59:07 +02:00
Silvano Cerza
b53fad4c4f
Add missing integration tests to catch-all required step in tests.yml (#5598) 2023-08-18 17:58:26 +02:00
Silvano Cerza
03ebef7219
Remove DocumentStoreAwareMixin (#5585)
* Remove Pipeline

* Add release notes

* Enhance imports

* Update release note

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>

* Remove Pipeline tests

* Remove DocumentStoreAwareMixin

* Add release notes

* Remove DocumentStoreAwareMixin from __all__

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-08-18 17:56:24 +02:00
Silvano Cerza
4ef813fc8a
Remove specialised Pipeline (#5584)
* Remove Pipeline

* Add release notes

* Enhance imports

* Update release note

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>

* Remove Pipeline tests

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-08-18 17:48:13 +02:00
Silvano Cerza
72e0a588db
Rework DocumentWriter (#5583)
* Remove DocumentStoreAwareMixin from DocumentWriter

* Add release notes
2023-08-18 17:03:17 +02:00
Silvano Cerza
4bc68cbc2f
Rework MemoryRetriever (#5582)
* Remove DocumentStoreAwareMixin from MemoryRetriever

* Add release notes

* Update an article

---------

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2023-08-18 16:33:35 +02:00
Massimiliano Pippi
011baf492f
leftover from #5580 (#5593) 2023-08-18 12:53:40 +02:00
Massimiliano Pippi
7e633c6b0c
chore: change import paths under preview (#5592)
* fix import paths

* add release notes
2023-08-18 12:53:25 +02:00
Massimiliano Pippi
39a1f61326
chore: improve error message in FileExtensionClassifier (#5590)
* output an actionable error

* add release note

* fix matching in raised error

* fix release note category
2023-08-18 12:28:55 +02:00
Stefano Fiorucci
aa8da40820
chore: add preview section to release notes (#5591)
* add preview section to reno config and update existing notes

* Empty commit to trigger CLA
2023-08-18 09:59:01 +02:00
Vladimir Blagojevic
da67700318
Rename web_lfqa_improved and update questions (#5588) 2023-08-17 17:10:49 +02:00