Daria Fokina
23a22be03c
docs: update CLI readme ( #5129 )
* docs: update CLI readme
Update CLI Readme for easier understanding and more details.
* update cache path
* version as separate command
* resolve comments
* prompthub_cache_path env var
* wording
* Update fetch.py
2023-06-15 16:32:36 +02:00
Vladimir Blagojevic
8d8de65492
Add AgentToolLogger, unit test, and example usage ( #5087 )
2023-06-15 08:43:20 +02:00
bogdankostic
7731713a1e
test: Add benchmark config files ( #5093 )
* Add config files
* Add top-k and batch size to configs
* Add batch size to configs
* Add batch size to configs
* Remove configs using 1m docs
2023-06-14 18:15:50 +02:00
Ben Heckmann
60e5d73424
fix: changing document scores ( #5090 )
* #4653 fix changing scores by returning new document objects from document store queries
* added integration test for InMemoryDocumentStore demonstrating the desired behavior
* Update test/document_stores/test_memory.py
2023-06-14 17:35:46 +02:00
darionreyes
58c022ef86
fix: increase max token length for openai 16k models ( #5145 )
2023-06-14 16:24:04 +02:00
ZanSara
20c1f23fff
feat: optional transformers ( #5101 )
* generalimport -> lazy-imports
* remove generalimport
* fix pdftotextconverter import check
* customize error messages
* pylint
* fix sql.py
* pylint
* Update haystack/document_stores/sql.py
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
* make contextmanager less verbose
* do not catch syntax errors
* review feedback
* make all torch and transformers import lazy
* fix environment.py
* mypy
* merge leftovers
* fix schema
* pylint
* review feedback
---------
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-06-14 12:00:20 +02:00
Julian Risch
ce1c9c9ddb
fix: Relax ChatGPT model name check to support gpt-3.5-turbo-0613 ( #5142 )
* relax model name checking for chatgpt
* add unit tests
2023-06-14 09:53:00 +02:00
Julian Risch
4c8e0b9d4a
fix: PromptNode falls back to empty list of documents if none are provided but expected ( #5132 )
* add warning, default to empty docs list, tests
* pylint
2023-06-13 16:35:19 +02:00
Silvano Cerza
3b8992968d
test: Skip flaky PromptNode test ( #5039 )
* Skip flaky PromptNode test
* Add skip reason
* Update test/prompt/test_prompt_node.py
Co-authored-by: bogdankostic <bogdankostic@web.de>
---------
Co-authored-by: bogdankostic <bogdankostic@web.de>
2023-06-13 16:24:29 +02:00
ZanSara
65cdf36d72
chore: block all HTTP requests in CI ( #5088 )
2023-06-13 14:52:24 +02:00
bogdankostic
29a6bfe621
fix: Don't log info message in DataSilo with SquadProcessor about clipping ( #5127 )
2023-06-13 10:31:39 +02:00
Julian Risch
b2b4ccdb87
build: Upgrade transformers to v4.30.1 ( #5120 )
2023-06-13 10:13:39 +02:00
ZanSara
3c71f0ae3d
chore: mark some unit tests under test/pipeline ( #5124 )
* mark some unit tests as such
* remove marker
2023-06-12 17:58:31 +02:00
ZanSara
49e037a055
fix: rename requests.py into requests_utils.py ( #5099 )
* requests.py -> requests_utils.py
* fix tests
* reimport requests
* fix more tests
* review feedback
2023-06-12 12:40:21 +02:00
Julian Risch
72fe43a7cc
build: Move Azure's Form Recognizer dependency to extras ( #5096 )
* build: Move Azure's Form Recognizer dependency to extras
* try catch imports for AzureConverter
* assign None to failed imports
* use lazy import
* use forward reference in type hints
2023-06-12 12:23:32 +02:00
Vladimir Blagojevic
0cc9ce7522
fix: WebRetriever top_k is ignored in a pipeline ( #5106 )
* Initial changes
* Add WebSearch, WebRetriever top_k unit tests
* Add the exact integration test that failed for Tuana
* PR review
2023-06-09 10:42:37 +02:00
Julian Risch
d8a4f20379
feat: Consider prompt_node's default_prompt_template in agent ( #5095 )
* consider prompt_node's default_prompt_template in agent
* make test a unit test via mocking
* updated docstring
2023-06-08 13:42:28 +02:00
ZanSara
52e7a77595
feat: introduce lazy_import ( #5084 )
* generalimport -> lazy-imports
* remove generalimport
* fix pdftotextconverter import check
* customize error messages
* pylint
* fix sql.py
* pylint
* Update haystack/document_stores/sql.py
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
* make contextmanager less verbose
* do not catch syntax errors
* review feedback
* Update haystack/nodes/file_converter/pdf.py
---------
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-06-08 12:11:38 +02:00
ZanSara
5022abb546
chore: Remove stray import ( #5097 )
2023-06-07 18:07:27 +02:00
Vladimir Blagojevic
e3b069620b
feat: pass model parameters to HFLocalInvocationLayer via model_kwargs, enabling direct model usage ( #4956 )
* Simplify HFLocalInvocationLayer, move/add unit tests
* PR feedback
* Better pipeline invocation, add mocked tests
* Minor improvements
* Mock pipeline directly, unit test updates
* PR feedback, change pytest type to integration
* Mock supports unit test
* add full stop
* PR feedback, improve unit tests
* Add mock_get_task fixture
* Further improve unit tests
* Minor unit test improvement
* Add unit tests, increase coverage
* Add unit tests, increase test coverage
* Small optimization, improve _ensure_token_limit unit test
---------
Co-authored-by: Darja Fokina <daria.f93@gmail.com>
2023-06-07 13:34:45 +02:00
bogdankostic
eca8f66ffa
build: Pin mlflow ( #5094 )
2023-06-07 11:24:01 +02:00
Silvano Cerza
a2156ee8fb
fix: Fix handling of streaming response in AnthropicClaudeInvocationLayer ( #4993 )
* Fix handling of streaming response in AnthropicClaudeInvocationLayer
---------
Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
Co-authored-by: Darja Fokina <daria.f93@gmail.com>
2023-06-07 10:57:36 +02:00
bogdankostic
da1f245a84
feat: Add batch_size parameter and cast timeout_config value to tuple for WeaviateDocumentStore ( #5079 )
* Add batch_size parameter and cast timeout_config to tuple
* Add unit test
* Remove debug tqdm
* Remove debug tqdm introduced in #5063
2023-06-06 17:06:10 +02:00
Sebastian
1777b22fcb
fix: Ensure eval mode for farm and transformer models for predictions ( #3791 )
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-06-06 13:06:30 +02:00
ZanSara
97d5db3b9c
revert fix: change the Docker workflow runner ( #5078 )
2023-06-05 19:11:38 +02:00
Silvano Cerza
ffe7b2af9a
Update prompthub-py ( #5061 )
Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>
2023-06-05 17:03:51 +02:00
ZanSara
be3eb3cdb5
fix: change Docker workflow runner ( #5077 )
2023-06-05 15:59:58 +02:00
Michael Feil
6ea8ae01a2
feat: Allow setting custom api_base for OpenAI nodes ( #5033 )
* add changes for api_base
* format retriever
* Update haystack/nodes/retriever/dense.py
Co-authored-by: bogdankostic <bogdankostic@web.de>
* Update haystack/nodes/audio/whisper_transcriber.py
Co-authored-by: bogdankostic <bogdankostic@web.de>
* Update haystack/preview/components/audio/whisper_remote.py
Co-authored-by: bogdankostic <bogdankostic@web.de>
* Update haystack/nodes/answer_generator/openai.py
Co-authored-by: bogdankostic <bogdankostic@web.de>
* Update test_retriever.py
* Update test_whisper_remote.py
* Update test_generator.py
* Update test_retriever.py
* reformat with black
* Update haystack/nodes/prompt/invocation_layer/chatgpt.py
Co-authored-by: Daria Fokina <daria.f93@gmail.com>
* Add unit tests
* apply docstring suggestions
---------
Co-authored-by: bogdankostic <bogdankostic@web.de>
Co-authored-by: michaelfeil <me@michaelfeil.eu>
Co-authored-by: Daria Fokina <daria.f93@gmail.com>
2023-06-05 11:32:06 +02:00
ZanSara
5f6d161cfe
pin generalimport ( #5074 )
2023-06-05 10:29:51 +02:00
bogdankostic
9cb83402c4
refactor: Use globally defined request timeout in ElasticsearchDocumentStore and OpenSearchDocumentStore ( #5064 )
* Include benchmark config in output
* Use queries from aggregated labels
* Introduce batching for querying in ElasticsearchDocStore and OpenSearchDocStore
* Use globally defined timeout
* Fix mypy
* Use self.batch_size in write_documents
* Use 10_000 as default batch size
* Add unit tests for write documents
2023-06-05 09:47:31 +02:00
bogdankostic
a9a49e2c0a
feat: Add batching for querying in ElasticsearchDocumentStore and OpenSearchDocumentStore ( #5063 )
* Include benchmark config in output
* Use queries from aggregated labels
* Introduce batching for querying in ElasticsearchDocStore and OpenSearchDocStore
* Fix mypy
* Use self.batch_size in write_documents
* Use 10_000 as default batch size
* Add unit tests for write documents
2023-06-01 18:47:24 +02:00
bogdankostic
c3e59914da
refactor: Delete outdated benchmark files ( #5008 )
2023-06-01 13:59:12 +02:00
ZanSara
8487cddc69
add cli to the jobs list ( #5060 )
2023-06-01 13:22:17 +02:00
bogdankostic
6774e0ae58
fix: Use queries from aggregated labels in benchmarks ( #5054 )
* Include benchmark config in output
* Use queries from aggregated labels
2023-06-01 10:49:54 +02:00
ZanSara
89de76d5fe
feat: move cli out from preview ( #5055 )
* move cli from preview
* readme
* review feedback
* test mocks & import paths
* import path
2023-05-31 18:34:14 +02:00
Philippe Creux
e209abd48e
Fix doc FARMReader.predict ( #5049 )
2023-05-31 10:01:43 +02:00
Silvano Cerza
3fd9e0fd89
feat: Add CLI prompt cache command ( #5050 )
* Add CLI prompt cache command
* Rename prompt cache to prompt fetch
2023-05-30 18:04:52 +02:00
ZanSara
6249e65bc8
feat: prompts caching from PromptHub ( #5048 )
* split up prompttemplate init
* caching
* docstring
* add platformdirs
* use user_data_dir
* fix tests
* add tests
* pylint
* mypy
2023-05-30 16:55:48 +02:00
ZanSara
76a6eefe5e
pin prompthub ( #5045 )
2023-05-30 15:36:13 +02:00
Silvano Cerza
ba06bc4805
Unpin typing_extensions and remove all its uses ( #5040 )
2023-05-29 15:31:34 +02:00
ZanSara
9612aa90bb
fix examples ( #5041 )
2023-05-29 15:15:38 +02:00
Silvano Cerza
6c9e062052
chore: Change checklist to simple list in PR template ( #4872 )
* Change checklist to simple list in PR template
* Update .github/pull_request_template.md
Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>
---------
Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>
2023-05-29 12:40:18 +02:00
Silvano Cerza
37518c8b8c
chore: Simplify DefaultPromptHandler logic and add tests ( #4979 )
* Simplify DefaultPromptHandler logic and add tests
Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
* Remove commented code
* Split single unit test into multiple tests
---------
Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
2023-05-29 12:13:32 +02:00
Fanli Lin
7001aee3fe
fix: prompt_template_resolved.output_variable is NoneType issue ( #4976 )
* try except instead of or
* fix black formatting
* bug fix
* revert back the formatting
2023-05-29 10:48:10 +02:00
Massimiliano Pippi
4aaf4fcc31
ci: fix Datadog event body ( #5024 )
* fix Datadog event body
* Update .github/workflows/license_compliance.yml
Co-authored-by: bogdankostic <bogdankostic@web.de>
---------
Co-authored-by: bogdankostic <bogdankostic@web.de>
2023-05-27 18:12:53 +02:00
ZanSara
7e5fa0dd94
fix: Move check for default PromptTemplates in PromptTemplate itself ( #5018 )
* make prompttemplate load the defaults instead of promptnode
* add test
* fix tenacity decorator
* fix tests
* fix error handling
* mypy
---------
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-05-27 18:05:05 +02:00
bogdankostic
b8ff1052d4
refactor: Adapt running benchmarks ( #5007 )
* Generate eval result in separate method
* Adapt benchmarking utils
* Adapt running retriever benchmarks
* Adapt error message
* Adapt running reader benchmarks
* Adapt retriever reader benchmark script
* Adapt running benchmarks script
* Adapt README.md
* Raise error if file doesn't exist
* Raise error if path doesn't exist or is a directory
* minor readme update
* Create separate methods for checking if pipeline contains reader or retriever
* Fix reader pipeline case
---------
Co-authored-by: Darja Fokina <daria.f93@gmail.com>
2023-05-26 18:48:11 +02:00
Julian Risch
2ede4d1d1d
build: Remove dill dependency ( #4985 )
* remove dill dependency
* remove dill from .toml
2023-05-26 17:50:55 +02:00
bogdankostic
5633446173
refactor: Add reader-retriever benchmark script ( #5006 )
* Generate eval result in separate method
* Adapt benchmarking utils
* Adapt running retriever benchmarks
* Adapt error message
* Adapt running reader benchmarks
* Adapt retriever reader benchmark script
* Raise error if file doesn't exist
* Raise error if path doesn't exist or is a directory
* Remove unused line
* Create separate method for getting reader config
* Make use of get_reader_config
* Create separate method for retriever config
2023-05-26 13:54:52 +02:00
bogdankostic
796340e788
refactor: Adapt reader benchmarks ( #5005 )
2023-05-26 11:40:35 +02:00