26 Commits

Author SHA1 Message Date
Stefano Fiorucci
cc70b4b613
deprecation (#5954) 2023-10-03 12:48:06 +02:00
Christian Clauss
9405eb90ee
ci: Fix invalid escape sequences in Python code (#5802)
* ci: Use ruff in pre-commit to further limit complexity

* Fix invalid escape sequences in Python code

* Delete releasenotes/notes/ruff-4d2504d362035166.yaml
2023-09-14 16:42:48 +02:00
Stefano Fiorucci
90ff3817e7
feat: support OpenAI-Organization for authentication (#5292)
* add openai_organization to invocation layer, generator and retriever

* added tests
2023-07-07 12:02:21 +02:00
bogdankostic
0697f5c63e
fix: Support isolated node eval in run_batch in Generators (#5291)
* Add isolated node eval to BaseGenerator's run_batch

* Add unit tests
2023-07-07 10:32:43 +02:00
Stefano Fiorucci
637433841e
chore: remove deprecated Seq2SeqGenerator and RAGenerator (#5180)
* first draft of removal

* more removals

* don't download unused models
2023-06-21 16:38:45 +02:00
ZanSara
65cdf36d72
chore: block all HTTP requests in CI (#5088) 2023-06-13 14:52:24 +02:00
Michael Feil
6ea8ae01a2
feat: Allow setting custom api_base for OpenAI nodes (#5033)
* add changes for api_base

* format retriever

* Update haystack/nodes/retriever/dense.py

Co-authored-by: bogdankostic <bogdankostic@web.de>

* Update haystack/nodes/audio/whisper_transcriber.py

Co-authored-by: bogdankostic <bogdankostic@web.de>

* Update haystack/preview/components/audio/whisper_remote.py

Co-authored-by: bogdankostic <bogdankostic@web.de>

* Update haystack/nodes/answer_generator/openai.py

Co-authored-by: bogdankostic <bogdankostic@web.de>

* Update test_retriever.py

* Update test_whisper_remote.py

* Update test_generator.py

* Update test_retriever.py

* reformat with black

* Update haystack/nodes/prompt/invocation_layer/chatgpt.py

Co-authored-by: Daria Fokina <daria.f93@gmail.com>

* Add unit tests

* apply docstring suggestions

---------

Co-authored-by: bogdankostic <bogdankostic@web.de>
Co-authored-by: michaelfeil <me@michaelfeil.eu>
Co-authored-by: Daria Fokina <daria.f93@gmail.com>
2023-06-05 11:32:06 +02:00
Massimiliano Pippi
929b8d1fb0
ci: run Elasticsearch 8.6 in compatibility mode (#3853)
* bump ES version in CI

disable ssl

wait for service to start

set env vars

do not use choco to install ES

re-enable jobs deps

skip test on windows CI because of OOM

allocate more memory for ES

uniform ES installation and use default heap size

skip tests causing OOM

increase job timeout

restore memory limit for ES8

* Use latest elasticsearch version
2023-05-24 18:53:54 +02:00
ZanSara
949b1b63b3
PromptHub integration in PromptNode (#4879)
* initial integration

* upgrade of prompthub

* fix get_prompt_template

* feedback

* add prompthub-py to dependencies

* tests

* mypy

* stray changes

* review feedback

* missing init

* fix test

* move logic in prompttemplate

* linting

* bugfixes

* fix unit tests

* fix cache

* simplify prompttemplate init

* remove unused function

* removing wrong params

* try remove all instances of prompt names

* more tests

* fix agent tests

* more tests

* fix tests

* pylint

* comma

* black

* fix test

* docstring

* review feedback

* review feedback

* fix mocks

* mypy

* fix mocks

* fix reference to missing templates

* feedback

* remove direct references to default template var

* tests

* Update haystack/nodes/prompt/prompt_node.py

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

---------

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-05-23 15:22:58 +02:00
Vladimir Blagojevic
5d7ee2e5e6
feat: Add max_tokens to BaseGenerator params (#4168)
* Add max_tokens to BaseGenerator params

* Make mypy happy

* Rebase and resolve conflicts

* Fix signature issues

* Update lg

* Add a mocked unit test method

* end-of-file-fixer corrected file

* Convert to unit test

* Mark test as integration

* make the test unit

---------

Co-authored-by: agnieszka-m <amarzec13@gmail.com>
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-05-18 15:19:29 +02:00
bogdankostic
5b2ef2afd6
Revert "refactor!: Deprecate name param in PromptTemplate and introduce template_name instead (#4810)" (#4834)
This reverts commit f660f41c0615e6b3064ef3e321f1e5a295fafc1b.
2023-05-08 11:31:04 +02:00
bogdankostic
f660f41c06
refactor!: Deprecate name param in PromptTemplate and introduce template_name instead (#4810)
* Deprecate name parameter

* Adapt existing tests and uses of PromptTemplate

* Move parameter `name` to end

* Adapt existing tests

* lg update

---------

Co-authored-by: Darja Fokina <daria.f93@gmail.com>
2023-05-08 10:12:29 +02:00
Mayank Jobanputra
dcf3ddddff
Added deprecation tests for seq2seq generator and RAG Generator (#4782) 2023-05-02 13:30:22 +05:30
Stefano Fiorucci
57f87e24a3
refactor: OpenAIAnswerGenerator - avoid tokenizing all documents several times (#4504) 2023-03-29 22:38:27 +02:00
tstadel
382ca8094e
feat: PromptTemplate extensions (#4378)
* use outputshapers in prompttemplate

* fix pylint

* first iteration on regex

* implement new promptnode syntax based on f-strings

* finish fstring implementation

* add additional tests

* add security tests

* fix mypy

* fix pylint

* fix test_prompt_templates

* fix test_prompt_template_repr

* fix test_prompt_node_with_custom_invocation_layer

* fix test_invalid_template

* more security tests

* fix test_complex_pipeline_with_all_features

* fix agent tests

* refactor get_prompt_template

* fix test_prompt_template_syntax_parser

* fix test_complex_pipeline_with_all_features

* allow functions in comprehensions

* break out of fstring test

* fix additional tests

* mark new tests as unit tests

* fix agents tests

* convert missing templates

* proper use of get_prompt_template

* refactor and add docstrings

* fix tests

* fix pylint

* fix agents test

* fix tests

* refactor globals

* make allowed functions configurable via env variable

* better dummy variable

* fix special alias

* don't replace special char variables

* more special chars, better docstrings

* cherrypick fix audio tests

* fix test

* rework shapers

* fix pylint

* fix tests

* add new templates

* add reference parsing

* add more shaper tests

* add tests for join and to_string

* fix pylint

* fix pylint

* fix pylint for real

* auto fill shaper function params

* fix reference parsing for multiple references

* fix output variable inference

* consolidate qa prompt template output and make shaper work per-document

* fix types after merge

* introduce output_parser

* fix tests

* better docstring

* rename RegexAnswerParser to AnswerParser

* better docstrings
2023-03-27 12:14:11 +02:00
Vladimir Blagojevic
79bf25aaea
feat: Add Azure as OpenAI endpoint (#4170)
* Add Azure as OpenAI endpoint
---------

Co-authored-by: Sebastian Lee <sebastian.lee@deepset.ai>
2023-03-02 09:55:09 +01:00
ZanSara
165a0a5faa
test: mock all Translator tests and move one to e2e (#4290)
* mock all translator tests and move one to e2e

* typo

* extract pipeline tests using translator

* remove duplicate test

* move generator test in e2e

* Update e2e/pipelines/test_extractive_qa.py

* pytest.mark.unit

* black

* remove model name as well

* remove unused fixture

* rename original and improve pipeline tests

* fixes

* pylint
2023-03-01 14:52:05 +01:00
Sebastian
2bedb80ba5
Fix for custom template in OpenAIAnswerGenerator (#4220) 2023-02-21 13:35:17 +01:00
Sebastian
75ef959678
feat: Update OpenAIAnswerGenerator defaults and with learnings from PromptNode (#4038)
* added instruction_prompt and update defaults

* Change back max_tokens

* Code formatting

* Starting to update instruction_prompt to be a PromptTemplate

* Using PromptTemplate in OpenAIAnswerGenerator

* Removed hardcoded value

* pylint and make examples and examples_context optional prompt parameters

* Added new test for when prompt length goes past max token limit

* Improve doc strings.

* Make "text-davinci-003" the new default model

* Renaming variable to prompt_template and name to question-answering-with-examples

* Reduced repetitive code.

* Added some comments to explain key logic for future debuggers

* Update docs for max_tokens and increase defaul

* Updating variable name to prompt_template and docs.

* Updated test and handled Answer case where no documents are used.

* Slight update to docs.

* Adding more doc strings

* lg updates

* Blackify

---------

Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
Co-authored-by: agnieszka-m <amarzec13@gmail.com>
2023-02-12 00:08:07 +01:00
tstadel
92c58cfda1
feat: Support multiple document_ids in Answer object (for generative QA) (#4062)
* initial version without shapers

* set document_ids for BaseGenerator

* introduce question-answering-with-references template

* better prompt

* make PromptTemplate control output_variable

* update schema

* fix add_doc_meta_data_to_answer

* Revert "fix add_doc_meta_data_to_answer"

This reverts commit b994db423ad8272c140ce2b785cf359d55383ff9.

* fix add_doc_meta_data_to_answer

* fix eval

* fix pylint

* fix pinecone

* fix other tests

* fix test

* fix flaky test

* Revert "fix flaky test"

This reverts commit 7ab04275ffaaaca96b4477325ba05d5f34d38775.

* adjust docstrings

* make Label loading backward-compatible

* fix Label backward compatibility for pinecone

* fix Label backward compatibility for search engines

* fix Label backward compatibility for deepset Cloud

* fix tests

* fix None issue

* fix test_write_feedback

* add tests for legacy label support

* add document_id test for pinecone

* reduce unnecessary contents

* add comment to pinecone test
2023-02-08 08:37:22 +01:00
Daniel Bichuetti
621e1af74c
refactor: improve support for dataclasses (#3142)
* refactor: improve support for dataclasses

* refactor: refactor class init

* refactor: remove unused import

* refactor: testing 3.7 diffs

* refactor: checking meta where is Optional

* refactor: reverting some changes on 3.7

* refactor: remove unused imports

* build: manual pre-commit run

* doc: run doc pre-commit manually

* refactor: post initialization hack for 3.7-3.10 compat.

TODO: investigate another method to improve 3.7 compatibility.

* doc: force pre-commit

* refactor: refactored for both Python 3.7 and 3.9

* docs: manually run pre-commit hooks

* docs: run api docs manually

* docs: fix wrong comment

* refactor: change no type-checked test code

* docs: update primitives

* docs: api documentation

* docs: api documentation

* refactor: minor test refactoring

* refactor: remova unused enumeration on test

* refactor: remove unneeded dir in gitignore

* refactor: exclude all private fields and change meta def

* refactor: add pydantic comment

* refactor : fix for mypy on Python 3.7

* refactor: revert custom init

* docs: update docs to new pydoc-markdown style

* Update test/nodes/test_generator.py

Co-authored-by: Sara Zan <sarazanzo94@gmail.com>
2022-09-09 11:31:37 +02:00
Sara Zan
d8e7aaeacc
API key check in OpenAIAnswerGenerator (#2791)
* api key check in node and tests

* Clarify skip message

* Update Documentation & Code Style

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-07-12 14:05:47 +02:00
Malte Pietsch
ba08fc86f5
Add node to use OpenAI's GPT-3 for QA (#2605)
* first draft of openai node for QA

* Update Documentation & Code Style

* fix mypy. add node to inits

* Update Documentation & Code Style

* fix linter

* Adapt OpenAIGenerator to completions endpoint

* Update Documentation & Code Style

* Fix pylint

* Fix doc strings

* Make use of temperature

* Make use of api key in tests

* Adapt doc strings

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: ZanSara <sarazanzo94@gmail.com>
Co-authored-by: bogdankostic <bogdankostic@web.de>
2022-07-08 13:59:27 +02:00
Sara Zan
54518ac790
[CI Refactoring] Refactor Document fixtures in tests (#2577)
* Refactor document fixtures

* Add embedding files

* Update Documentation & Code Style

* Indentation issue

* Update Documentation & Code Style

* Fix type conversion in conftest.py

* Update Documentation & Code Style

* mypy on sql.py

* mypy on crawler.py

* mypy on pinecone.py

* Adapt retriever tests

* Update Documentation & Code Style

* mypy on crawler.py

* Update Documentation & Code Style

* mypy on crawler.py again

* Update Documentation & Code Style

* mypy fix was too rough

* Fix some more tests

* Update Documentation & Code Style

* Skip meaningless test on FilterRetriever

* Make embedding values less specific

* Update Documentation & Code Style

* Use stable IDs in retriever tests that depend on it

* Remove needless fixtures

* docs_with_ids

* Update Documentation & Code Style

* Typo

* Fix retriever tests

* Fix reader tests

* Update Documentation & Code Style

* Workaround #2626

* Update Documentation & Code Style

* Fix label generator tests

* Reorder vectors

* remove print

* Update Documentation & Code Style

* Update Documentation & Code Style

* git tags leftover

* Update Documentation & Code Style

* fix last failing test

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-10 18:22:48 +02:00
Sara Zan
59608ca474
[CI Refactoring] Workflow refactoring (#2576)
* Unify CI tests (from #2466)

* Update Documentation & Code Style

* Change folder names

* Fix markers list

* Remove marker 'slow', replaced with 'integration'

* Soften children check

* Start ES first so it has time to boot while Python is setup

* Run the full workflow

* Try to make pip upgrade on Windows

* Set KG tests as integration

* Update Documentation & Code Style

* typo

* faster pylint

* Make Pylint use the cache

* filter diff files for pylint

* debug pylint statement

* revert pylint changes

* Remove path from asserted log (fails on Windows)

* Skip preprocessor test on Windows

* Tackling Windows specific failures

* Fix pytest command for windows suites

* Remove \ from command

* Move poppler test into integration

* Skip opensearch test on windows

* Add tolerance in reader sas score for Windows

* Another pytorch approx

* Raise time limit for unit tests :(

* Skip poppler test on Windows CI

* Specify to pull with FF only in docs check

* temporarily run the docs check immediately

* Allow merge commit for now

* Try without fetch depth

* Accelerating test

* Accelerating test

* Add repository and ref alongside fetch-depth

* Separate out code&docs check from tests

* Use setup-python cache

* Delete custom action

* Remove the pull step in the docs check, will find a way to run on bot commits

* Add requirements.txt in .github for caching

* Actually install dependencies

* Change deps group for pylint

* Unclear why the requirements.txt is still required :/

* Fix the code check python setup

* Install all deps for pylint

* Make the autoformat check depend on tests and doc updates workflows

* Try installing dependencies in another order

* Try again to install the deps

* quoting the paths

* Ad back the requirements

* Try again to install rest_api and ui

* Change deps group

* Duplicate haystack install line

* See if the cache is the problem

* Disable also in mypy, who knows

* split the install step

* Split install step everywhere

* Revert "Separate out code&docs check from tests"

This reverts commit 1cd59b15ffc5b984e1d642dcbf4c8ccc2bb6c9bd.

* Add back the action

* Proactive support for audio (see text2speech branch)

* Fix label generator tests

* Remove install of libsndfile1 on win temporarily

* exclude audio tests on win

* install ffmpeg for integration tests

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-07 09:23:03 +02:00
Sara Zan
ff4303c51b
[CI refactoring] Categorize tests into folders (#2554)
* Categorize tests into folders

* Fix linux_ci.yml and an import

* Wrong path
2022-05-17 09:55:53 +01:00