* Fix debug on PromptNode
Allow the ability to control debug output on PromptNode
* added tests, simplified code
---------
Co-authored-by: Mayank Jobanputra <mayankjobanputra@gmail.com>
* fix: issue evaluation check for content type
Evaluation currently breaks when the content type is not a str.
* add black
* add test table eval
* add black formatting
* Expand integration test
---------
Co-authored-by: Sebastian Lee <sebastian.lee@deepset.ai>
* Update docstrings + add api docs
* Update with reviewer's changes
* Fix category id and blackify
* make max iterations test more robust
---------
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
* add protection: when using IVF* indexing, the index needs to be trained first
Signed-off-by: Liu,Kaixuan <kaixuan.liu@intel.com>
* fix formatting issue
Signed-off-by: Liu,Kaixuan <kaixuan.liu@intel.com>
* just raising error, instead of silently training the index
* fixed mypy issue
* fixed error msg
---------
Signed-off-by: Liu,Kaixuan <kaixuan.liu@intel.com>
Co-authored-by: Mayank Jobanputra <mayankjobanputra@gmail.com>
* add import for canals
* add stores support to canals
* pyproject.toml
* move tests
* add v2 to the extras in ci
* install v2 in action
* pylint
* save and load
* save and load
* codename "Alfalfa"
* workflows
* add language classifier node
* Fix a few bugs and general code style
* whitespace
* first draft and refactoring
* draft of classes separation
* improve base class
* fix invisible character; add some tests
* fix and more tests
* more docs and tests
* move __init__ to base
* add transformers node; improve tests
* incorporate feedback; little fix to other node
* labels_to_languages mapping
* better docstrings
* use logger instead of logging
---------
Co-authored-by: Stanislav Zamecnik <stanislav.zamecnik@telekom.com>
Co-authored-by: anakin87 <44616784+anakin87@users.noreply.github.com>
Co-authored-by: stazam <zamecnik.stanislav@gmail.com>
* fix: Fix `print_answers` for output of query `run_batch` (#4255)
* fix: print "Answers" label even with no query list
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
* test: add unit tests for `print_answers` on `run`, `run_batch` output (#4255)
---------
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
* Added changes from table-qa-pipeline
* Moved classes around to make diff to main look nicer.
* Cleaned things up. Removed option to return_no_answer (not needed), added docs and added integration marks.
* Remove unneeded code
* Added fix for test
* Add check for document_ids in answer
* Prevent passing of empty list to np.mean
* Batching doesn't work with TableQAPipeline b/c of HF issue
* Cleanup of table reader tests, added check for document ids.
* Fixing pylint
* More pylint
* PR comments
---------
Co-authored-by: bogdankostic <bogdankostic@web.de>
* Adding execution time to the debug output of pipeline components
* Linting issue fix
* [EMPTY] Re-trigger CI
* fixed test
---------
Co-authored-by: Mayank Jobanputra <mayankjobanputra@gmail.com>
* Refactoring to remove duplicate code when using OpenAI API
* Adding docstrings
* Fix mypy issue
* Moved retry mechanism to openai_request function in openai_utils
* Migrate OpenAI embedding encoder to use the openai_request util function.
* Adding docstrings.
* pylint import errors
* More pylint import errors
* Move construction of headers into openai_request and api_key as input variable.
* Made _openai_text_completion_tokenization_details so it can be reused in PromptNode and OpenAIAnswerGenerator
* Add prompt truncation to the PromptNode.
* Removed commented out test.
* Bump version of tiktoken to 0.2.0 so we can use MODEL_TO_ENCODING to automatically determine correct tokenizer for the requested model
* Change one method back to public
* Fixed bug in token length truncation. Included answer length in truncation amount. Moved truncation higher up to PromptNode level.
* Pylint error
* Improved warning message
* Added _ensure_token_limit for HFLocalInvocationLayer. Had to remove max_length from base PromptModelInvocationLayer to ensure that max_length has a default value.
* Adding tests
* Expanded on doc strings
* Updated tests
* Update docstrings
* Update tests, and go back to how USE_TIKTOKEN was used before.
* Update haystack/nodes/prompt/prompt_node.py
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
* Update haystack/nodes/prompt/prompt_node.py
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
* Update haystack/nodes/prompt/prompt_node.py
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
* Update haystack/nodes/retriever/_openai_encoder.py
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
* Update haystack/utils/openai_utils.py
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
* Update haystack/utils/openai_utils.py
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
* Updated docstrings, and added integration marks
* Remove comment
* Update test
* Fix test
* Update test
* Updated openai_request function to work with the azure api
* Fixed error in _openai_encoder.py
---------
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
* mock all translator tests and move one to e2e
* typo
* extract pipeline tests using translator
* remove duplicate test
* move generator test in e2e
* Update e2e/pipelines/test_extractive_qa.py
* pytest.mark.unit
* black
* remove model name as well
* remove unused fixture
* rename original and improve pipeline tests
* fixes
* pylint
* Added new test for using EntityExtractor in query node and made some fixtures to reduce code duplication.
* Reuse ner_node fixture
* Added pytest unit markings and swapped over to in memory doc store.
* Change to integration tests