ZanSara
94c5d6d216
feat: make GPT35Generator non batch ( #5764 )
...
* make gpt35generator not batch
* fix tests
* review feedback
* mypy
2023-09-12 18:19:28 +02:00
Christian Clauss
6846448bac
pylint: Set limits on code complexity ( #5771 )
2023-09-12 18:13:23 +02:00
ZanSara
24c42b1e03
fix tests ( #5773 )
2023-09-12 17:41:08 +02:00
ZanSara
7194343458
remove test ( #5753 )
2023-09-12 16:04:36 +02:00
ZanSara
869f69d0d1
fix: temporary pin tiktoken ( #5774 )
...
* exclude breaking tiktoken version
* exclude breaking tiktoken version
2023-09-12 14:35:52 +02:00
Christian Clauss
23f7308bec
ci: pre-commit autoupdate ( #5777 )
2023-09-12 14:34:41 +02:00
Christian Clauss
45cc40bf51
linting.yml: Upgrade GitHub Actions ( #5752 )
2023-09-11 20:49:20 +02:00
ZanSara
6e70d403f8
feat: Improve Document for Haystack 2.0 ( #5738 )
...
* initial draft
* tests
* add proposal
* proposal number
* reno
* fix tests and usage of content and content_type
* update branch & fix more tests
* mypy
* add docstring
* fix more tests
* review feedback
* improve __str__
* Apply suggestions from code review
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
* Update haystack/preview/dataclasses/document.py
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
* improve __str__
* fix tests
* fix more tests
* Update haystack/preview/document_stores/memory/document_store.py
---------
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2023-09-11 17:40:00 +02:00
Stefano Fiorucci
2edf85f739
MemoryEmbeddingRetriever (2.0) (#5726 )
...
* MemoryDocumentStore - Embedding retrieval draft
* add release notes
* fix mypy
* better comment
* improve return_embeddings handling
* MemoryEmbeddingRetriever - first draft
* address PR comments
* release note
* update docstrings
* update docstrings
* incorporated feedback
* add return_embedding to __init__
* rm leftover docstring
---------
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2023-09-08 15:52:48 +02:00
Stefano Fiorucci
d860a5c604
make tests more robust ( #5747 )
2023-09-08 15:50:56 +02:00
Tuana Çelik
b5987a6d8d
Update web.py ( #5742 )
...
Fixing the api docs for webretriever.
2023-09-08 09:06:14 +02:00
Stefano Fiorucci
b7bea3ae9c
MemoryDocumentStore - Embedding retrieval (2.0) (#5715 )
...
* MemoryDocumentStore - Embedding retrieval draft
* add release notes
* fix mypy
* better comment
* improve return_embeddings handling
* address PR comments
* update docstrings
* incorporated feedback
---------
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2023-09-07 15:44:07 +02:00
bogdankostic
71852c7b06
Fix output of AnswerBuilder ( #5737 )
2023-09-07 12:54:24 +02:00
ZanSara
7abd73419f
fix remote whisper tests ( #5732 )
2023-09-07 10:53:29 +02:00
bogdankostic
42b6954aa5
docs: Remove mention of hosted annotation tool ( #5735 )
2023-09-07 10:40:31 +02:00
ZanSara
63cbde7287
feat: GPT35Generator ( #5714 )
...
* chatgpt backend
* fix tests
* reno
* remove print
* helpers tests
* add chatgpt generator
* use openai sdk
* remove backend
* tests are broken
* fix tests
* stray param
* move _check_troncated_answers into the class
* wrong import
* rename function
* typo in test
* add openai deps
* mypy
* improve system prompt docstring
* typos update
* Update haystack/preview/components/generators/openai/chatgpt.py
* pylint
* Update haystack/preview/components/generators/openai/chatgpt.py
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* Update haystack/preview/components/generators/openai/chatgpt.py
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* Update haystack/preview/components/generators/openai/chatgpt.py
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* review feedback
* fix tests
* review feedback
* reno
* remove tenacity mock
* gpt35generator
* fix naming
* remove stray references to chatgpt
* fix e2e
* Update releasenotes/notes/chatgpt-llm-generator-d043532654efe684.yaml
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
* add another test
* test wrong model name
* review feedback
---------
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-09-07 10:06:57 +02:00
Vladimir Blagojevic
c5edb45c10
feat: Add SerperDevWebSearch Haystack 2.0 component ( #5712 )
...
* Add SerperDev
* Add release note
* PR Feedback
* Simplify, remove one-liner
* Update haystack/preview/components/websearch/serper_dev.py
Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>
* Update haystack/preview/components/websearch/serper_dev.py
Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>
* Fix formatting
* PR feedback
* Fix tests
* Function rename
* Remove scoring, update tests
* PR feedback
* Fix return
* small adjustments
* fix tests
* add e2e test
* fix release notes
* fix tests
* fix e2e
---------
Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>
2023-09-06 17:31:42 +02:00
ZanSara
0bbc219a59
chore: enable e2e preview tests ( #5730 )
...
* enable e2e preview tests
* fix transcriber test
* quotes
* add missing dep
* missing comma
* ffmpeg
2023-09-06 16:48:45 +02:00
Timo Moeller
d048bb5352
docs: Add minimal getting started code to showcase haystack + RAG ( #5578 )
...
* init
* Change question
* Add TODO comment
* Addressing feedback
* Add local folder option. Move additional functions inside haystack.utils for easier imports
* Apply Daria's review suggestions
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
* Add integration test
* change string formatting
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* Add outputparser to HF
* Exclude anthropic test
---------
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-09-06 12:14:08 +02:00
ZanSara
10d6886255
chore: move PromptBuilder in builders ( #5729 )
2023-09-06 11:52:21 +02:00
Timo Moeller
d540883469
Add api keys to CI workflows ( #5722 )
2023-09-05 16:21:17 +02:00
Agnieszka Marzec
5d2a7534a0
Correct the number of tokens ( #5548 )
...
As per https://discord.com/channels/954421988141711382/1136952298740920341/1138936382467866694
2023-09-05 15:07:45 +02:00
bogdankostic
639f7cf888
chore: Rename AnswersBuilder to AnswerBuilder ( #5720 )
...
* Add AnswersBuilder
* Add tests for AnswersBuilder
* Add release note
* PR feedback
* Fix mypy
* Remove redundant check for number of groups
* Rename AnswersBuilder to AnswerBuilder
* Update test/preview/components/builders/test_answer_builder.py
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
* Rename reno file
---------
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2023-09-05 14:34:22 +02:00
Silvano Cerza
2acc41ea85
Add PromptBuilder ( #5713 )
...
* Add PromptBuilder
* Update release note
* Add test
2023-09-05 12:22:21 +02:00
bogdankostic
a5b815690e
feat: Add AnswersBuilder component (2.0) ( #5701 )
...
* Add AnswersBuilder
* Add tests for AnswersBuilder
* Add release note
* PR feedback
* Fix mypy
* Remove redundant check for number of groups
* docstrings upd
---------
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2023-09-04 21:16:20 +02:00
ZanSara
c5369a39ef
upgrade canals ( #5708 )
2023-09-04 14:55:05 +02:00
ZanSara
7886284d4e
chore: fix mypy failure ( #5707 )
...
* mypy
* add comment on type ignore
2023-09-04 12:08:59 +02:00
Massimiliano Pippi
24b8cfb1c7
Update 3558-embedding_retriever.md ( #5705 )
2023-09-04 11:28:51 +02:00
bogdankostic
11440395f4
fix: Set model_max_length in the Tokenizer of DefaultPromptHandler ( #5596 )
...
* Set model_max_length in tokenizer in prompt handler
* Add release note
2023-09-01 11:48:41 +02:00
bogdankostic
67da275ae0
Rename question to query in Answer dataclass (2.0) ( #5699 )
2023-09-01 10:37:56 +02:00
ZanSara
5f1256ac7e
feat: generators (2.0) ( #5690 )
...
* add generators module
* add tests for module helper
* reno
* add another test
* move into openai
* improve tests
2023-08-31 17:33:12 +02:00
Vladimir Blagojevic
6787ad2435
fix: Improve imports for new rankers ( #5696 )
...
* Proper imports for new rankers
* Small fix
2023-08-31 13:33:29 +02:00
Alexander
55b10a3868
Update squad_to_dpr.py ( #5689 )
...
Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
2023-08-30 20:39:14 +03:00
Tuana Çelik
1a872a7841
update description for pypi ( #5687 )
2023-08-30 15:29:12 +02:00
github-actions[bot]
88318bfdb5
Bump unstable version ( #5686 )
...
* Update unstable version
* Bump to gooooo
---------
Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
2023-08-30 15:27:50 +02:00
ZanSara
ce06268990
test: fix e2e test failures ( #5685 )
...
* fix test errors
* fix pipeline yaml
* disable cache
* fix errors
* remove stray fixture
2023-08-30 12:24:03 +02:00
ZanSara
1709be162c
auto trigger e2e workflow on PRs that affect it ( #5684 )
2023-08-30 10:25:47 +02:00
Fanli Lin
40d9f34e68
feat: enable passing use_fast to the underlying transformers' pipeline ( #5655 )
...
* copy instead of deepcopy
* fix pylint
* add use_fast
* add release note
* remove irrelevant changes
* black fix
* fix bug
* black
* bug fix
2023-08-30 10:25:18 +02:00
ZanSara
b1daa7c647
chore: migrate to canals==0.7.0 ( #5647 )
...
* add default_to_dict and default_from_dict placeholders to ease migration to canals 0.7.0
* canals==0.7.0
* whisper components
* add to_dict/from_dict stubs
* import serialization methods in init to hide canals imports
* reno
* export deserializationerror too
* Update haystack/preview/__init__.py
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* serialization methods for LocalWhisperTranscriber (#5648 )
* chore: serialization methods for `FileExtensionClassifier` (#5651 )
* serialization methods for FileExtensionClassifier
* Update test_file_classifier.py
* chore: serialization methods for `SentenceTransformersDocumentEmbedder` (#5652 )
* serialization methods for SentenceTransformersDocumentEmbedder
* fix device management
* serialization methods for SentenceTransformersTextEmbedder (#5653 )
* serialization methods for TextFileToDocument (#5654 )
* chore: serialization methods for `RemoteWhisperTranscriber` (#5650 )
* serialization methods for RemoteWhisperTranscriber
* remove patches
* Add default to_dict and from_dict in document stores built with factory (#5674 )
* fix tests (#5671 )
* chore: simplify serialization methods for `MemoryDocumentStore` (#5667 )
* simplify serialization for MemoryDocumentStore
* remove redundant tests
* pylint
* chore: serialization methods for `MemoryRetriever` (#5663 )
* serialization method for MemoryRetriever
* more tests
* remove hash from default_document_store_to_dict
* remove diff in factory.py
* chore: serialization methods for `DocumentWriter` (#5661 )
* serialization methods for DocumentWriter
* more tests
* use factory
* black
---------
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-08-29 18:15:07 +02:00
Silvano Cerza
a613b1b7f5
Format crawler.py
2023-08-29 17:54:30 +02:00
Vladimir Blagojevic
a9b8fd9658
Move WebRetriever's new init parameter to last parameter position ( #5673 )
2023-08-29 17:46:12 +02:00
Daria Fokina
fbc1951e74
Update crawler.py ( #5610 )
2023-08-29 16:46:19 +02:00
Vladimir Blagojevic
e5e7bb9654
feat: Allow WebRetriever to use custom LinkContentFetcher ( #5662 )
...
* Allow use of custom LinkContentFetcher
* Add release note
2023-08-29 15:46:48 +02:00
bogdankostic
07c85905f3
fix: Change use_auth_token to token in TransformersQueryClassifier ( #5659 )
2023-08-29 15:21:25 +02:00
Bilge Yücel
ee13125e06
Add information about preview module ( #5643 )
...
* Add information about `preview` module
* Add discussion link
* Improve text
2023-08-29 15:57:57 +03:00
Vladimir Blagojevic
1f7c7b716a
Update release note for #5526 ( #5664 )
2023-08-29 14:25:52 +02:00
Julian Risch
fa81c611e8
build: Upgrade transformers to v4.32.1 ( #5658 )
...
* upgrade transformers to 4.32.1
* added release notes
* upgrade transformers version also for inference extra
2023-08-29 13:46:00 +02:00
Vladimir Blagojevic
791f322a94
Unpin safetensors ( #5657 )
2023-08-29 13:12:11 +02:00
ZanSara
5985b6d358
chore: refactor pipeline tests for e2e testing ( #5576 )
...
* enable pipeline folder in e2e
* merge standard pipeline tests with standard pipeline batch tests
* merge summarization tests into standard pipelines tests
* Update test_standard_pipelines.py
* black
2023-08-29 11:22:39 +02:00
Vladimir Blagojevic
f13b37db24
fix: LinkContentFetcher - when no content retrieved (i.e. request blocked), default to snippet text ( #5656 )
...
* When no content retrieved (i.e. request blocked), default to snippet
* Add release note
2023-08-29 10:57:47 +02:00