3803 Commits

Author SHA1 Message Date
Vladimir Blagojevic
37cadd702a
fix: Make sure summary memory is cumulative (#4932)
* Fix summary memory not being cumulative

* PR feedback - Julian
2023-05-16 13:35:19 +02:00
Stefano Fiorucci
6e0000732d
feat: add BLIP support in TransformersImageToText (#4912)
* add blip support

* fix typo

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

---------

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-05-16 10:57:41 +02:00
Vladimir Blagojevic
4c9843017c
feat: Add agent memory (#4829) 2023-05-15 18:08:44 +02:00
Julian Risch
d4bbde2d9d
build: Upgrade transformers to 4.29.1 (#4886)
* Upgrade transformers to 4.29.0

* Upgrade transformers to 4.29.1
2023-05-15 17:11:17 +02:00
Ben Heckmann
099d0deb86
fix: Dynamic max_answers for SquadProcessor (fixes IndexError when max_answers is less than the number of answers in the dataset) (#4817)
* #4320 implemented dynamic max_answers for SquadProcessor, fixed IndexError when max_answers is less than the number of answers in the dataset

* #4320 added two unit tests for dataset_from_dicts testing default and manual max_answers

* apply suggestions from code review

Co-authored-by: bogdankostic <bogdankostic@web.de>

* simplify comment, fix mypy & pylint errors, fix old test

* adjust max_answers to each dataset individually

---------

Co-authored-by: bogdankostic <bogdankostic@web.de>
2023-05-15 14:34:23 +02:00
ZanSara
8fbfca9ebb
fix: Document v2 JSON serialization (#4863)
* fix json serialization

* add missing markers

* pylint

* fix decoder bug

* pylint

* add some more tests

* linting & windows

* windows

* windows

* windows paths again
2023-05-15 11:39:04 +02:00
ZanSara
bffe2d8c19
add base test class (#4908) 2023-05-15 10:36:55 +02:00
Farzad E
6eb251d1f0
fix: Support for gpt-4-32k (#4825)
* Add step to look up tokenizers by prefix in openai_utils

* Updated tiktoken min version + openai_utils test

* Added test case for GPT-4 and Azure model naming

* Broken down tests

* Added default case

---------

Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>
2023-05-12 19:02:12 +02:00
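
The prefix lookup described in the commit above could look roughly like the sketch below. It is an illustration, not the code from the commit: the table names, the `tokenizer_for` function, and the `gpt2` fallback are all assumptions made for the example. Only the idea (exact match first, then prefix match, then a default) comes from the commit message.

```python
from typing import Dict

# Hypothetical tables for illustration; the real mapping lives in openai_utils.
EXACT_TOKENIZERS: Dict[str, str] = {
    "gpt-4": "cl100k_base",
    "gpt-3.5-turbo": "cl100k_base",
}
PREFIX_TOKENIZERS: Dict[str, str] = {
    "gpt-4-": "cl100k_base",
    "gpt-3.5-turbo-": "cl100k_base",
}
DEFAULT_TOKENIZER = "gpt2"  # assumed fallback for the "default case"


def tokenizer_for(model_name: str) -> str:
    """Return a tokenizer name: exact match, then longest prefix, then default."""
    if model_name in EXACT_TOKENIZERS:
        return EXACT_TOKENIZERS[model_name]
    # Try longer prefixes first so the most specific entry wins.
    for prefix in sorted(PREFIX_TOKENIZERS, key=len, reverse=True):
        if model_name.startswith(prefix):
            return PREFIX_TOKENIZERS[prefix]
    return DEFAULT_TOKENIZER
```

With such a table, a name like `gpt-4-32k` resolves through the `gpt-4-` prefix without needing its own entry.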
bogdankostic
179e9cea08
feat: Send pipeline config hash every 100 runs (#4884)
* Add since_last_run property

* Revert "Add since_last_run property"

This reverts commit c1c907ef58a696a97d964fb9c45fbee0c80365aa.

* Send pipeline config hash for each run

* Send event every 100 runs

* Merge branch 'main' into telemetry_since_last_run

* PR review

* Move constant
2023-05-12 18:51:26 +02:00
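
As a rough sketch of the "send event every 100 runs" behaviour named in the commit above: the class, the constant name, the event payload, and the use of SHA-256 are assumptions for illustration, not the telemetry code from the commit. The sketch also assumes the event fires on every 100th run rather than on the first one.

```python
import hashlib
import json
from typing import Any, Callable, Dict

RUNS_BETWEEN_EVENTS = 100  # hypothetical constant name


def config_hash(config: Dict[str, Any]) -> str:
    """Hash a pipeline config deterministically (keys sorted before hashing)."""
    payload = json.dumps(config, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class RunCounter:
    """Counts pipeline runs and emits one event per RUNS_BETWEEN_EVENTS runs."""

    def __init__(self, config: Dict[str, Any], send_event: Callable[[Dict[str, Any]], None]):
        self._hash = config_hash(config)
        self._send_event = send_event
        self._runs = 0

    def record_run(self) -> None:
        self._runs += 1
        if self._runs % RUNS_BETWEEN_EVENTS == 0:
            self._send_event({"config_hash": self._hash, "runs": self._runs})
```

Passing the sender in as a callable keeps the counter easy to test without any network access.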
Vladimir Blagojevic
73380b194a
feat: Add Cohere PromptNode invocation layer (#4827)
* Add CohereInvocationLayer
---------

Co-authored-by: bogdankostic <bogdankostic@web.de>
2023-05-12 17:50:09 +02:00
Silvano Cerza
7e2b824bea
Add Datadog in Who Uses Haystack README.md section (#4894) 2023-05-12 11:37:22 +02:00
Massimiliano Pippi
428096733d
ci: add a job to vet license of direct dependencies only (#4885)
* add conversion script

* run job in CI

* typo

* invoke python

* install toml

* fix pylint error

* more exclusions

* add toml to dev dependencies

* fix exclusions list

* fix mypy and remove test clause
2023-05-12 11:20:48 +02:00
Massimiliano Pippi
d322beed6c
build: do not install 'dev' extras with 'all' (#4888)
* do not install 'dev' with 'all'

* some fixes around
2023-05-11 19:24:47 +02:00
ZanSara
618699eb52
fix: improve Document comparison (v2) (#4860)
* don't compare on content directly, use id as proxy

* stray change

* add more tests

* fix tests

* pylint

* black

* review feedback

* fix tests
2023-05-11 18:28:56 +02:00
Silvano Cerza
6c84a05d98
Upload coverage only if all unit tests pass (#4874) 2023-05-11 14:29:44 +02:00
Silvano Cerza
98947e4c3c
feat: Add Anthropic invocation layer (#4818)
* feat: Add Anthropic Claude Invocation Layer

* feat: Add AnthropicClaude Invocation Layer

* fix: Permission changes

* fix: Permission changes

* Move anthropic utils in anthropic invocation layer file

* Rework method to post data

* Simplify invoke

* Simplify supports classmethod

* Remove unnecessary functions

* Use always same tokenizer

* Add module import

* Rename some members and kwargs

* Add tests

* Fix _post not handling HTTPError

* Fix handling of streamed response

* Fix kwargs handling

* Update tests

* Update supports to be generic

* Fix failing test

* Use correct tokenizer and fix tests

* Update lg

* Fix mypy issue

* Move requests-cache from dev to base dependencies

* Fix failing test

* Handle all stop words use cases

---------

Co-authored-by: recrudesce <recrudesce@gmail.com>
Co-authored-by: agnieszka-m <amarzec13@gmail.com>
2023-05-11 10:14:33 +02:00
ZanSara
3a6db68408
feat: allow filtering documents on all fields (v2) (#4773)
* extend tests

* remove stray test

* pylint

* mypy

* review feedback

* fix tests

* fix last tests

* remove comment

* remove print statement

* pylint

* add flatten test

* remove direct access/direct write in docstore tests

* fix tests
2023-05-10 16:33:47 +02:00
Massimiliano Pippi
c619aa29ec
ci: add new license checker (#4779)
* try

* add exclusions

* fix vanilla distribution

* use different requirements files

* fix comments and file name

* try with a recent version of pip

* use cpu version of torch

* try

* again

* exclude nvidia libraries

* revert old change

* send report to FOSSA

* add gpu section

* display job names

* remove FOSSA check

* send complete report to FOSSA

* removed FIXME
2023-05-10 16:33:08 +02:00
Sebastian
eff420cce0
test: Update unit tests for schema (#4835)
* Updated text_label tests to match table_label tests. Also added answer text as part of the Answer.__eq__ comparison.

* Updated text document unit tests to match ones from table docs

* Converting text answer unit tests to match table answer

* Update some document tests

* Minor update

* Separating unit tests
2023-05-10 16:16:45 +02:00
ZanSara
6a7d31fb5b
chore: remove optional imports in v2 (#4855)
* remove optional imports in v2

* unused import
2023-05-10 14:02:18 +02:00
ZanSara
9cb153d0f4
fix: add unit markers to several v2 tests (#4851)
* add markers

* remove stray marker
2023-05-10 13:46:13 +02:00
ZanSara
611b09b6c0
pin canals (#4853) 2023-05-10 13:45:57 +02:00
Silvano Cerza
06193e08b1
Add missing unit tests topics to coverage upload step (#4873) 2023-05-10 12:51:52 +02:00
Daria Fokina
7ef6bd8373
[Docs] Hide api classes in prompt_node (#4869)
* hide api classes in prompt_node

* add jsonconverter to docs
2023-05-10 10:56:46 +02:00
Silvano Cerza
f12e5a0127
fix: Fix missing error in openai_request retry strategy (#4802)
* Fix missing error in openai_request retry strategy

* Correctly handle OpenAIUnauthorizedError

Co-authored-by: bogdankostic <bogdankostic@web.de>

---------

Co-authored-by: bogdankostic <bogdankostic@web.de>
2023-05-10 10:31:07 +02:00
ZanSara
c734c58b4b
skip flaky test (#4846) 2023-05-09 20:26:59 +02:00
Daria Fokina
dbbdc5464a
docs: fix Prompt_Node and Pipelines API reference (#4858)
* fix: Prompt_Node and Pipelines API reference

* deleted invocation layers
2023-05-09 19:04:55 +02:00
ZanSara
28463e38e5
multi-os dep checker (#4845) 2023-05-09 11:46:53 +02:00
Sebastian
707f1c3546
Add modeling to unit tests so we can get coverage for it (#4809)
* Add modeling to unit tests so we can get coverage for it

* fix unit tests

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-05-08 19:05:21 +02:00
ZanSara
28260c5c3f
feat: introduce generalimport (#4662)
* introduce generalimport

* pylint

* fix optional deps typing for schema

* leftover

* typo

* typing with faiss

* make Base generation optional too

* handle sqlalchemy

* (almost) all imports are optional

* TO REMOVE hijacking CI for tests

* some deps are actually needed

* get feature branch in CI

* get feature branch in CI

* fix array_equal

* pylint

* pandas also required

* improve imports.yml

* fix SquadData

* fix SquadData again

* generalimport imports list

* Update haystack/utils/openai_utils.py

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* Update haystack/utils/openai_utils.py

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* review feedback

* remove todos

* reference main release

* pylint

* circular import

* review feedback

* move is_imported in init

* pylint

---------

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-05-08 15:20:10 +02:00
bogdankostic
5b2ef2afd6
Revert "refactor!: Deprecate name param in PromptTemplate and introduce template_name instead (#4810)" (#4834)
This reverts commit f660f41c0615e6b3064ef3e321f1e5a295fafc1b.
2023-05-08 11:31:04 +02:00
ZanSara
6e982e9283
fix: preserve root_node in JoinNode's output (#4820)
* preserve root_node and add tests

* Added if statement to fix failing tests

---------

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
Co-authored-by: Sebastian Husch Lee <sjrl423@gmail.com>
2023-05-08 10:17:36 +02:00
bogdankostic
f660f41c06
refactor!: Deprecate name param in PromptTemplate and introduce template_name instead (#4810)
* Deprecate name parameter

* Adapt existing tests and uses of PromptTemplate

* Move parameter `name` to end

* Adapt existing tests

* lg update

---------

Co-authored-by: Darja Fokina <daria.f93@gmail.com>
2023-05-08 10:12:29 +02:00
Philip May
2ff8b0ddd0
fix: str issues in squad_to_dpr (#4826)
* fix #4754

* fix #4753

* run black formatting

---------

Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2023-05-07 19:50:23 +02:00
Silvano Cerza
705a2c025f
Update preview Pipelines following Canals changes (#4821) 2023-05-05 19:47:32 +02:00
bogdankostic
43509c88bf
fix: Add support for _split_overlap meta to Pinecone and dict metadata in general to Weaviate (#4805)
* Add support for dicts to Weaviate

* Add support for _split_overlap to Pinecone

* Add tests

* Fix Pylint

* Fix Pylint

* Fix test

* Implement PR feedback
2023-05-05 11:20:21 +02:00
Massimiliano Pippi
d8dc0d7403
chore: move custom linter to a separate package (#4790)
* move custom linter to its own package

* install the custom linter

* fix formatting

* drop python 3.7
2023-05-04 15:49:26 +02:00
Vladimir Blagojevic
8091ced8d5
refactor: Extract ToolsManager, add it to Agent by composition (#4794)
* Extract ToolsManager, add it to Agent by the composition
* PR feedback Massi
---------
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
Co-authored-by: Darja Fokina <daria.f93@gmail.com>
2023-05-03 16:45:40 +02:00
Silvano Cerza
4cc69eeb29
Fix release_docs.py to create docs with correct version (#4803) 2023-05-03 15:04:50 +02:00
Silvano Cerza
9b67611169
Add others folder to unit test job (#4800) 2023-05-03 10:47:21 +02:00
Sebastian
a67ca289db
refactor: Update schema objects to handle Dataframes in to_{dict,json} and from_{dict,json} (#4747)
* Adding support for table Documents when serializing Labels in Haystack

* Fix table label equality test

* Add serialization support and __eq__ support for table answers

* Made convenience functions for converting dataframes. Added some TODOs. Expanded schema tests for table labels. Updated Multilabel to not convert Dataframes into strings.

* get Answer and Label to_json working with DataFrame

* Fix from_dict method of Label

* Use Dict and remove unnecessary if check

* Using pydantic instead of builtins for type detection

* Update haystack/schema.py

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* Update haystack/schema.py

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* Update haystack/schema.py

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* Separated table label equivalency tests and added pytest.mark.unit


* Added unit test for _dict_factory

* Using more descriptive variable names

* Adding json files to test to_json and from_json functions

* Added sample files for tests

---------

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-05-03 09:42:07 +02:00
ZanSara
a9ec954c45
bug: fix filtering in MemoryDocumentStore (v2) (#4768)
* fix filtering bug

* pylint

* improve asserts
2023-05-03 09:33:12 +02:00
Pouyan
75ff768c21
Pouyanpi/feat/search engine/providers/google api (#4722)
* feat: implement google api search engine provider

Signed-off-by: Pouyan <prezakhanipr@gmail.com>

---------

Signed-off-by: Pouyan <prezakhanipr@gmail.com>
2023-05-02 17:09:17 +02:00
yuanwu2017
c88bc19791
fix: load the local finetuning model from pipeline YAML (#4729) (#4760)
When a local model is used in a pipeline YAML, the PromptModel cannot
select the HFLocalInvocationLayer, because get_task does not support
offline models.

Local model usage:
  Add the task_name parameter to model_kwargs for the local model, for
  example text-generation or text2text-generation.

- name: PModel
  type: PromptModel
  params:
    model_name_or_path: /local_model_path
    model_kwargs:
      task_name: text-generation
- name: Prompter
  params:
    model_name_or_path: PModel
    default_prompt_template: question-answering
  type: PromptNode

Signed-off-by: yuanwu <yuan.wu@intel.com>
2023-05-02 17:04:42 +02:00
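
The reason task_name helps for a local path in the commit above can be sketched as follows. This is a hypothetical helper, not Haystack's code: `resolve_task` and its `hub_lookup` parameter are names invented for the example. The sketch assumes that without an explicit task the only option is a remote lookup, which a local path cannot satisfy.

```python
from typing import Any, Callable, Dict, Optional


def resolve_task(
    model_name_or_path: str,
    model_kwargs: Optional[Dict[str, Any]] = None,
    hub_lookup: Optional[Callable[[str], str]] = None,
) -> str:
    """Prefer an explicit task_name; fall back to a remote lookup if one is available."""
    kwargs = model_kwargs or {}
    task = kwargs.get("task_name")
    if task:
        # Explicit task: no remote call is needed, so local paths work offline.
        return task
    if hub_lookup is None:
        raise ValueError(
            f"No task_name given for {model_name_or_path!r} and no remote lookup available"
        )
    return hub_lookup(model_name_or_path)
```

Called with the values from the YAML above, `resolve_task("/local_model_path", {"task_name": "text-generation"})` returns the task directly and never touches the lookup.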
duffn
479092e3c1
bug: (rest_api) remove full logging of overwritten env variables (#4791)
* bug: (rest_api) remove logging of overwritten env variables

* Update haystack/pipelines/config.py

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* Update test

---------

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-05-02 16:48:19 +02:00
Vladimir Blagojevic
1e9f4c1d50
feat: Add HF local runtime token streaming support (#4652)
* Add HF local runtime token streaming support

* Add stream and stream_handler as model kwargs

* Improve HF streaming unit tests
2023-05-02 12:50:20 +02:00
Mayank Jobanputra
dcf3ddddff
Added deprecation tests for seq2seq generator and RAG Generator (#4782) 2023-05-02 13:30:22 +05:30
Mayank Jobanputra
896eb6a2ea
chore: fixed reader loading test for hf-hub starting 0.14.0 (#4607)
* fixed test base for hub 0.13.3

* check if test succeed from branch

* 2nd check if test succeed from branch

* removed dependency changes

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-05-02 08:22:44 +02:00
ZanSara
b60d9a2cbf
test: move several modeling tests in e2e/ (#4308)
* no dpr test seems worth mocking

* move distillation tests

* pylint

* mypy

* pylint

* move feature_extraction tests as well

* move feature_extraction tests as well

* merge feature extractor suites

* get_language_model tests and adaptive model tests

* duplicate test

* moving fixtures

* mypy

* mypy-again

* trigger

* un-mock integration test

* review feedback

* feedback

* pylint
2023-04-28 17:08:41 +02:00
Bilge Yücel
5a17a40685
fix: update ImportError for 'metrics' dependency (#4778) 2023-04-28 13:42:51 +02:00