238 Commits

Author SHA1 Message Date
Christian Clauss
30ca042370
ci: Use ruff in pre-commit to further limit code complexity (#5783)
* ci: Use ruff in pre-commit to further limit complexity

* Delete releasenotes/notes/ruff-4d2504d362035166.yaml

---------

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-09-13 15:18:16 +02:00
Shantanu
027980358a
Use newer tiktoken (#5785)
* Use newer tiktoken

* reno

---------

Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>
2023-09-13 15:11:21 +02:00
Julian Risch
4ae0924ea0
feat!: Remove SklearnQueryClassifier (#5779)
* remove SklearnQueryClassifier

* reno
2023-09-13 12:55:33 +02:00
Christian Clauss
6846448bac
pylint: Set limits on code complexity (#5771) 2023-09-12 18:13:23 +02:00
ZanSara
869f69d0d1
fix: temporary pin tiktoken (#5774)
* exclude breaking tiktoken version

* exclude breaking tiktoken version
2023-09-12 14:35:52 +02:00
ZanSara
63cbde7287
feat: GPT35Generator (#5714)
* chatgpt backend

* fix tests

* reno

* remove print

* helpers tests

* add chatgpt generator

* use openai sdk

* remove backend

* tests are broken

* fix tests

* stray param

* move _check_troncated_answers into the class

* wrong import

* rename function

* typo in test

* add openai deps

* mypy

* improve system prompt docstring

* typos update

* Update haystack/preview/components/generators/openai/chatgpt.py

* pylint

* Update haystack/preview/components/generators/openai/chatgpt.py

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* Update haystack/preview/components/generators/openai/chatgpt.py

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* Update haystack/preview/components/generators/openai/chatgpt.py

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* review feedback

* fix tests

* review feedback

* reno

* remove tenacity mock

* gpt35generator

* fix naming

* remove stray references to chatgpt

* fix e2e

* Update releasenotes/notes/chatgpt-llm-generator-d043532654efe684.yaml

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* add another test

* test wrong model name

* review feedback

---------

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-09-07 10:06:57 +02:00
ZanSara
0bbc219a59
chore: enable e2e preview tests (#5730)
* enable e2e preview tests

* fix transcriber test

* quotes

* add missing dep

* missing comma

* ffmpeg
2023-09-06 16:48:45 +02:00
Silvano Cerza
2acc41ea85
Add PromptBuilder (#5713)
* Add PromptBuilder

* Update release note

* Add test
2023-09-05 12:22:21 +02:00
ZanSara
c5369a39ef
upgrade canals (#5708) 2023-09-04 14:55:05 +02:00
Tuana Çelik
1a872a7841
update description for pypi (#5687) 2023-08-30 15:29:12 +02:00
ZanSara
b1daa7c647
chore: migrate to canals==0.7.0 (#5647)
* add default_to_dict and default_from_dict placeholders to ease migration to canals 0.7.0

* canals==0.7.0

* whisper components

* add to_dict/from_dict stubs

* import serialization methods in init to hide canals imports

* reno

* export deserializationerror too

* Update haystack/preview/__init__.py

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* serialization methods for LocalWhisperTranscriber (#5648)

* chore: serialization methods for `FileExtensionClassifier` (#5651)

* serialization methods for FileExtensionClassifier

* Update test_file_classifier.py

* chore: serialization methods for `SentenceTransformersDocumentEmbedder` (#5652)

* serialization methods for SentenceTransformersDocumentEmbedder

* fix device management

* serialization methods for SentenceTransformersTextEmbedder (#5653)

* serialization methods for TextFileToDocument (#5654)

* chore: serialization methods for `RemoteWhisperTranscriber` (#5650)

* serialization methods for RemoteWhisperTranscriber

* remove patches

* Add default to_dict and from_dict in document stores built with factory (#5674)

* fix tests (#5671)

* chore: simplify serialization methods for `MemoryDocumentStore` (#5667)

* simplify serialization for MemoryDocumentStore

* remove redundant tests

* pylint

* chore: serialization methods for `MemoryRetriever` (#5663)

* serialization method for MemoryRetriever

* more tests

* remove hash from default_document_store_to_dict

* remove diff in factory.py

* chore: serialization methods for `DocumentWriter` (#5661)

* serialization methods for DocumentWriter

* more tests

* use factory

* black

---------

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-08-29 18:15:07 +02:00
Julian Risch
fa81c611e8
build: Upgrade transformers to v4.32.1 (#5658)
* upgrade transformers to 4.32.1

* added release notes

* upgrade transformers version also for inference extra
2023-08-29 13:46:00 +02:00
Vladimir Blagojevic
791f322a94
Unpin safetensors (#5657) 2023-08-29 13:12:11 +02:00
Stefano Fiorucci
8342b6a457
upgrade transformers (#5619) 2023-08-25 16:38:34 +02:00
Silvano Cerza
bb7af3827d
Update canals to 0.5.0 (#5564)
* Update canals to 0.5.0

* Fix RemoteWhisperTranscriber serialisation
2023-08-14 20:08:34 +02:00
ZanSara
5ca4874df9
Migrate existing v2 components to Canals 0.4.0 (#5532)
* pin canals==0.4.0

* update audio components

* allow audio components to receive whisper_params in init too

* migrating memoryretriever

* migrate memoryretriever

* migrate TextFileToDocument

* fix TextFileToDocument tests

* fix pipeline tests

* fix defaults management

* reno

* inverted assignments

* Simplify release notes

---------

Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>
2023-08-09 15:51:32 +02:00
Stefano Fiorucci
30e6c7ac43
build: pin safetensors (#5528)
* pin safetensors

* rm unneeded optional pin
2023-08-08 18:05:56 +02:00
Stefano Fiorucci
3f472995bb
refactor: update Crawler to support selenium>=4.11.0 and simplify it (#5515)
* refactor crawler

* rm unused imports

* release notes!

* rm outdated mock
2023-08-08 15:13:22 +02:00
Massimiliano Pippi
c079576a87
chore: move base test class into haystack core (#5509)
* move base test class into haystack core

* fix linter

* do not compute coverage of testing code
2023-08-04 12:42:13 +02:00
bogdankostic
97e4522a83
build: Remove upper bound for weaviate client (#5486)
* Set upper bound for boto3 and botocore versions

* Set lower bound for weaviate client

* Remove upper bound for version from weaviate

* Add release note

* Update release note

* Remove release note
2023-08-02 11:08:50 +02:00
Silvano Cerza
9ab6298f1d
build: Unpin mlflow, constrain dulwich and botocore (#5441)
* Unpin mlflow

* Pin dulwich

* Pin botocore
2023-07-26 12:59:16 +02:00
Massimiliano Pippi
363f3edbf7
feat: add reno to manage release notes (#5397)
* first draft

* add release notes

* remove old settings

* add reno usage instructions

* page the docs team when release notes are added

* add reno to the dev dependencies

* Apply suggestions from code review

Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

---------

Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2023-07-24 17:02:46 +02:00
Stefano Fiorucci
1706b662db
build: upgrade transformers to v4.31.0 (#5391)
* Update transformers

* fix the forgotten pin
2023-07-21 09:30:03 +02:00
ZanSara
8f3fe85878
feat: extend pipeline.add_component to support stores (#5261)
* add protocol and adapt pipeline

* change API in pipeline.add_component

* adapt pipeline tests

* adapt memoryretriever

* additional checks

* separate protocol and mixin

* review feedback & update tests

* pylint

* Update haystack/preview/document_stores/protocols.py

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* Update haystack/preview/document_stores/memory/document_store.py

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* docstring of Store

* adapt memorydocumentstore

* fix tests

* remove direct inheritance

* pylint

* Update haystack/preview/document_stores/mixins.py

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* Update test/preview/components/retrievers/test_memory_retriever.py

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* Update test/preview/components/retrievers/test_memory_retriever.py

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* Update test/preview/components/retrievers/test_memory_retriever.py

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* Update test/preview/components/retrievers/test_memory_retriever.py

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* Update test/preview/components/retrievers/test_memory_retriever.py

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* test names

* revert suggestion

* private self._stores

* move asserts out

* remove protocols

* review feedback

* review feedback

* fix tests

* mypy

* review feedback

* fix tests & other details

* naming

* mypy

* fix tests

* typing

* partial review feedback

* move .store to input dataclass

* Revert "move .store to input dataclass"

This reverts commit 53f624b99f3414c89d5134711725b31bd94ef77a.

* disable reusing components with stores

* disable sharing components with docstores

* Update mixins.py

* black

* upgrade canals & fix tests

---------

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-07-17 15:06:19 +02:00
ZanSara
7848f00d01
feat: upgrade canals in preview (#5344)
* upgrade nodes

* linting
2023-07-13 12:30:49 +02:00
Stefano Fiorucci
8750d92763
pin scikit-learn>=1.3.0 (#5322) 2023-07-13 11:11:28 +02:00
bogdankostic
86d1fb5e1c
build: Add elasticsearch7 and elasticsearch8 extra (#5296) 2023-07-10 09:59:51 +02:00
Massimiliano Pippi
aee862833e
Update pyproject.toml (#5244) 2023-06-30 19:44:08 +02:00
Massimiliano Pippi
037e4f24ce
refactor: add a new Document Store supporting Elasticsearch 8 (#5231)
* introduce es8

* prepare tests

* fix unit tests

* adjust tests

* install elastic_transport package

* make mypy happy

* fix opensearch tests
2023-06-29 16:40:10 +02:00
Malte Pietsch
c9179ed0eb
feat: enable LLMs hosted via AWS SageMaker in PromptNode (#5155)
* Add SageMakerInvocationLayer
---------

Co-authored-by: oryx1729 <78848855+oryx1729@users.noreply.github.com>
Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
2023-06-23 15:33:20 +02:00
Julian Risch
30fdf2b5df
feat!: Add extra for inference dependencies such as torch (#5147)
* feat!: add extra for inference dependencies such as torch

* add inference extra to 'all' and 'all-gpu' extra

* install inference extra in selected integration tests

* import LazyImport

* review feedback

* add import error messages and update readme

* remove extra dot
2023-06-20 09:54:10 +02:00
Julian Risch
b2b4ccdb87
build: Upgrade transformers to v4.30.1 (#5120) 2023-06-13 10:13:39 +02:00
Julian Risch
72fe43a7cc
build: Move Azure's Form Recognizer dependency to extras (#5096)
* build: Move Azure's Form Recognizer dependency to extras

* try catch imports for AzureConverter

* assign None to failed imports

* use lazy import

* use forward reference in type hints
2023-06-12 12:23:32 +02:00
ZanSara
52e7a77595
feat: introduce lazy_import (#5084)
* generalimport -> lazy-imports

* remove generalimport

* fix pdftotextconverter import check

* customize error messages

* pylint

* fix sql.py

* pylint

* Update haystack/document_stores/sql.py

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>

* make contextmanager less verbose

* do not catch syntax errors

* review feedback

* Update haystack/nodes/file_converter/pdf.py

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-06-08 12:11:38 +02:00
bogdankostic
eca8f66ffa
build: Pin mlflow (#5094) 2023-06-07 11:24:01 +02:00
Silvano Cerza
ffe7b2af9a
Update prompthub-py (#5061)
Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>
2023-06-05 17:03:51 +02:00
ZanSara
5f6d161cfe
pin generalimport (#5074) 2023-06-05 10:29:51 +02:00
ZanSara
89de76d5fe
feat: move cli out from preview (#5055)
* move cli from preview

* readme

* review feedback

* test mocks & import paths

* import path
2023-05-31 18:34:14 +02:00
ZanSara
6249e65bc8
feat: prompts caching from PromptHub (#5048)
* split up prompttemplate init

* caching

* docstring

* add platformdirs

* use user_data_dir

* fix tests

* add tests

* pylint

* mypy
2023-05-30 16:55:48 +02:00
ZanSara
76a6eefe5e
pin prompthub (#5045) 2023-05-30 15:36:13 +02:00
Silvano Cerza
ba06bc4805
Unpin typing_extensions and remove all its uses (#5040) 2023-05-29 15:31:34 +02:00
Julian Risch
2ede4d1d1d
build: Remove dill dependency (#4985)
* remove dill dependency

* remove dill from .toml
2023-05-26 17:50:55 +02:00
Julian Risch
75fb6db4d5
build: Install protobuf via transformers extra sentencepiece (#4989) 2023-05-26 11:31:28 +02:00
ZanSara
949b1b63b3
PromptHub integration in PromptNode (#4879)
* initial integration

* upgrade of prompthub

* fix get_prompt_template

* feedback

* add prompthub-py to dependencies

* tests

* mypy

* stray changes

* review feedback

* missing init

* fix test

* move logic in prompttemplate

* linting

* bugfixes

* fix unit tests

* fix cache

* simplify prompttemplate init

* remove unused function

* removing wrong params

* try remove all instances of prompt names

* more tests

* fix agent tests

* more tests

* fix tests

* pylint

* comma

* black

* fix test

* docstring

* review feedback

* review feedback

* fix mocks

* mypy

* fix mocks

* fix reference to missing templates

* feedback

* remove direct references to default template var

* tests

* Update haystack/nodes/prompt/prompt_node.py

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

---------

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-05-23 15:22:58 +02:00
Julian Risch
9e4feb6bed
build: Remove tiktoken alternative (#4991)
* remove conditional statements from tiktoken

* remove count_openai_tokens method
2023-05-23 13:05:30 +02:00
Silvano Cerza
afc342b5fe
Pin typing_extensions to fix Pydantic issue (#4987) 2023-05-23 11:08:50 +02:00
ZanSara
516db4cb52
RemoteWhisperTranscriber (v2) (#4910)
* original-component

* stub

* fix implementation

* fix tests

* review feedback

* review feedback

* upgrade canals

* upgrade canals

* upgrade canals to fix pipeline test

* remove requests_with_retry

* feedback
2023-05-22 16:02:58 +02:00
Massimiliano Pippi
8228081e7a
chore: leftovers from removing knowledge graph support (#4974)
* leftovers from removing knowledge graph support

* more leftovers
2023-05-22 10:03:51 +02:00
Massimiliano Pippi
4974bf7ab3
chore: remove deprecated MilvusDocumentStore (#4951)
* remove deprecated MilvusDocumentStore

* remove leftovers

* fix pylint
2023-05-19 16:37:38 +02:00
Massimiliano Pippi
df55ec5e61
Pin Weaviate client (#4952) 2023-05-18 12:22:16 +02:00