haystack

mirror of https://github.com/deepset-ai/haystack.git synced 2025-10-26 23:38:58 +00:00

Author	SHA1	Message	Date
Daria Fokina	23a22be03c	docs: update CLI readme (#5129 ) * docs: update CLI readme Update CLI Readme for easier understanding and more details. * update cache path * version as separate command * resolve comments * prompthub_cache_path env var * wording * Update fetch.py	2023-06-15 16:32:36 +02:00
Vladimir Blagojevic	8d8de65492	Add AgentToolLogger, unit test, and example usage (#5087 )	2023-06-15 08:43:20 +02:00
bogdankostic	7731713a1e	test: Add benchmark config files (#5093 ) * Add config files * Add top-k and batch size to configs * Add batch size to configs * Add batch size to configs * Remove configs using 1m docs	2023-06-14 18:15:50 +02:00
Ben Heckmann	60e5d73424	fix: changing document scores (#5090 ) * #4653 fix changing scores by returning new document objects from document store queries * added integration test for InMemoryDocumentStore demonstrating the desired behavior * Update test/document_stores/test_memory.py	2023-06-14 17:35:46 +02:00
darionreyes	58c022ef86	fix: increase max token length for openai 16k models (#5145 )	2023-06-14 16:24:04 +02:00
ZanSara	20c1f23fff	feat: optional `transformers` (#5101 ) * generalimport -> lazy-imports * remove generalimport * fix pdftotextconverter import check * customize error messages * pylint * fix sql.py * pylint * Update haystack/document_stores/sql.py Co-authored-by: Massimiliano Pippi <mpippi@gmail.com> * make contextmanager less verbose * do not catch syntax errors * review feedback * make all torch and transformers import lazy * fix environment.py * mypy * merge leftovers * fix schema * pylint * review feedback --------- Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>	2023-06-14 12:00:20 +02:00
Julian Risch	ce1c9c9ddb	fix: Relax ChatGPT model name check to support gpt-3.5-turbo-0613 (#5142 ) * relax model name checking for chatgpt * add unit tests	2023-06-14 09:53:00 +02:00
Julian Risch	4c8e0b9d4a	fix: PromptNode falls back to empty list of documents if none are provided but expected (#5132 ) * add warning, default to empty docs list, tests * pylint	2023-06-13 16:35:19 +02:00
Silvano Cerza	3b8992968d	test: Skip flaky PromptNode test (#5039 ) * Skip flaky PromptNode test * Add skip reason * Update test/prompt/test_prompt_node.py Co-authored-by: bogdankostic <bogdankostic@web.de> --------- Co-authored-by: bogdankostic <bogdankostic@web.de>	2023-06-13 16:24:29 +02:00
ZanSara	65cdf36d72	chore: block all HTTP requests in CI (#5088 )	2023-06-13 14:52:24 +02:00
bogdankostic	29a6bfe621	fix: Don't log info message in DataSilo with SquadProcessor about clipping (#5127 )	2023-06-13 10:31:39 +02:00
Julian Risch	b2b4ccdb87	build: Upgrade transformers to v4.30.1 (#5120 )	2023-06-13 10:13:39 +02:00
ZanSara	3c71f0ae3d	chore: mark some unit tests under `test/pipeline` (#5124 ) * mark some unit tests as such * remove marker	2023-06-12 17:58:31 +02:00
ZanSara	49e037a055	fix: rename `requests.py` into `requests_utils.py` (#5099 ) * requests.py -> requests_utils.py * fix tests * reimport requests * fix more tests * review feedback	2023-06-12 12:40:21 +02:00
Julian Risch	72fe43a7cc	build: Move Azure's Form Recognizer dependency to extras (#5096 ) * build: Move Azure's Form Recognizer dependency to extras * try catch imports for AzureConverter * assign None to failed imports * use lazy import * use forward reference in type hints	2023-06-12 12:23:32 +02:00
Vladimir Blagojevic	0cc9ce7522	fix: WebRetriever top_k is ignored in a pipeline (#5106 ) * Initial changes * Add WebSearch, WebRetriever top_k unit tests * Add exact integration test that failed Tuana * PR review	2023-06-09 10:42:37 +02:00
Julian Risch	d8a4f20379	feat: Consider prompt_node's default_prompt_template in agent (#5095 ) * consider prompt_node's default_prompt_template in agent * make test a unit test via mocking * updated docstring	2023-06-08 13:42:28 +02:00
ZanSara	52e7a77595	feat: introduce `lazy_import` (#5084 ) * generalimport -> lazy-imports * remove generalimport * fix pdftotextconverter import check * customize error messages * pylint * fix sql.py * pylint * Update haystack/document_stores/sql.py Co-authored-by: Massimiliano Pippi <mpippi@gmail.com> * make contextmanager less verbose * do not catch syntax errors * review feedback * Update haystack/nodes/file_converter/pdf.py --------- Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>	2023-06-08 12:11:38 +02:00
ZanSara	5022abb546	chore: Remove stray import (#5097 )	2023-06-07 18:07:27 +02:00
Vladimir Blagojevic	e3b069620b	feat: pass model parameters to HFLocalInvocationLayer via `model_kwargs`, enabling direct model usage (#4956 ) * Simplify HFLocalInvocationLayer, move/add unit tests * PR feedback * Better pipeline invocation, add mocked tests * Minor improvements * Mock pipeline directly, unit test updates * PR feedback, change pytest type to integration * Mock supports unit test * add full stop * PR feedback, improve unit tests * Add mock_get_task fixture * Further improve unit tests * Minor unit test improvement * Add unit tests, increase coverage * Add unit tests, increase test coverage * Small optimization, improve _ensure_token_limit unit test --------- Co-authored-by: Darja Fokina <daria.f93@gmail.com>	2023-06-07 13:34:45 +02:00
bogdankostic	eca8f66ffa	build: Pin mlflow (#5094 )	2023-06-07 11:24:01 +02:00
Silvano Cerza	a2156ee8fb	fix: Fix handling of streaming response in AnthropicClaudeInvocationLayer (#4993 ) * Fix handling of streaming response in AnthropicClaudeInvocationLayer --------- Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com> Co-authored-by: Darja Fokina <daria.f93@gmail.com>	2023-06-07 10:57:36 +02:00
bogdankostic	da1f245a84	feat: Add batch_size parameter and cast timeout_config value to tuple for `WeaviateDocumentStore` (#5079 ) * Add batch_size parameter and cast timeout_config to tuple * Add unit test * Remove debug tqdm * Remove debug tqdm introduced in #5063	2023-06-06 17:06:10 +02:00
Sebastian	1777b22fcb	fix: Ensure eval mode for farm and transformer models for predictions (#3791 ) Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>	2023-06-06 13:06:30 +02:00
ZanSara	97d5db3b9c	revert fix: change the Docker workflow runner (#5078 )	2023-06-05 19:11:38 +02:00
Silvano Cerza	ffe7b2af9a	Update prompthub-py (#5061 ) Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>	2023-06-05 17:03:51 +02:00
ZanSara	be3eb3cdb5	fix: change Docker workflow runner (#5077 )	2023-06-05 15:59:58 +02:00
Michael Feil	6ea8ae01a2	feat: Allow setting custom api_base for OpenAI nodes (#5033 ) * add changes for api_base * format retriever * Update haystack/nodes/retriever/dense.py Co-authored-by: bogdankostic <bogdankostic@web.de> * Update haystack/nodes/audio/whisper_transcriber.py Co-authored-by: bogdankostic <bogdankostic@web.de> * Update haystack/preview/components/audio/whisper_remote.py Co-authored-by: bogdankostic <bogdankostic@web.de> * Update haystack/nodes/answer_generator/openai.py Co-authored-by: bogdankostic <bogdankostic@web.de> * Update test_retriever.py * Update test_whisper_remote.py * Update test_generator.py * Update test_retriever.py * reformat with black * Update haystack/nodes/prompt/invocation_layer/chatgpt.py Co-authored-by: Daria Fokina <daria.f93@gmail.com> * Add unit tests * apply docstring suggestions --------- Co-authored-by: bogdankostic <bogdankostic@web.de> Co-authored-by: michaelfeil <me@michaelfeil.eu> Co-authored-by: Daria Fokina <daria.f93@gmail.com>	2023-06-05 11:32:06 +02:00
ZanSara	5f6d161cfe	pin generalimport (#5074 )	2023-06-05 10:29:51 +02:00
bogdankostic	9cb83402c4	refactor: Use globally defined request timeout in `ElasticsearchDocumentStore` and `OpenSearchDocumentStore` (#5064 ) * Include benchmark config in output * Use queries from aggregated labels * Introduce batching for querying in ElasticsearchDocStore and OpenSearchDocStore * Use globally defined timeout * Fix mypy * Use self.batch_size in write_documents * Use 10_000 as default batch size * Add unit tests for write documents	2023-06-05 09:47:31 +02:00
bogdankostic	a9a49e2c0a	feat: Add batching for querying in `ElasticsearchDocumentStore` and `OpenSearchDocumentStore` (#5063 ) * Include benchmark config in output * Use queries from aggregated labels * Introduce batching for querying in ElasticsearchDocStore and OpenSearchDocStore * Fix mypy * Use self.batch_size in write_documents * Use 10_000 as default batch size * Add unit tests for write documents	2023-06-01 18:47:24 +02:00
bogdankostic	c3e59914da	refactor: Delete outdated benchmark files (#5008 )	2023-06-01 13:59:12 +02:00
ZanSara	8487cddc69	add `cli` to the jobs list (#5060 )	2023-06-01 13:22:17 +02:00
bogdankostic	6774e0ae58	fix: Use queries from aggregated labels in benchmarks (#5054 ) * Include benchmark config in output * Use queries from aggregated labels	2023-06-01 10:49:54 +02:00
ZanSara	89de76d5fe	feat: move `cli` out from `preview` (#5055 ) * move cli from preview * readme * review feedback * test mocks & import paths * import path	2023-05-31 18:34:14 +02:00
Philippe Creux	e209abd48e	Fix doc FARMReader.predict (#5049 )	2023-05-31 10:01:43 +02:00
Silvano Cerza	3fd9e0fd89	feat: Add CLI prompt cache command (#5050 ) * Add CLI prompt cache command * Rename prompt cache to prompt fetch	2023-05-30 18:04:52 +02:00
ZanSara	6249e65bc8	feat: prompts caching from PromptHub (#5048 ) * split up prompttemplate init * caching * docstring * add platformdirs * use user_data_dir * fix tests * add tests * pylint * mypy	2023-05-30 16:55:48 +02:00
ZanSara	76a6eefe5e	pin prompthub (#5045 )	2023-05-30 15:36:13 +02:00
Silvano Cerza	ba06bc4805	Unpin typing_extensions and remove all its uses (#5040 )	2023-05-29 15:31:34 +02:00
ZanSara	9612aa90bb	fix examples (#5041 )	2023-05-29 15:15:38 +02:00
Silvano Cerza	6c9e062052	chore: Change checklist to simple list in PR template (#4872 ) * Change checklist to simple list in PR template * Update .github/pull_request_template.md Co-authored-by: ZanSara <sara.zanzottera@deepset.ai> --------- Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>	2023-05-29 12:40:18 +02:00
Silvano Cerza	37518c8b8c	chore: Simplify DefaultPromptHandler logic and add tests (#4979 ) * Simplify DefaultPromptHandler logic and add tests Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com> * Remove commented code * Split single unit test into multiple tests --------- Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>	2023-05-29 12:13:32 +02:00
Fanli Lin	7001aee3fe	fix: prompt_template_resolved.output_variable is NoneType issue (#4976 ) * try except instead or * fix black formatting * bug fix * revert back the formatting	2023-05-29 10:48:10 +02:00
Massimiliano Pippi	4aaf4fcc31	ci: fix Datadog event body (#5024 ) * fix Datadog event body * Update .github/workflows/license_compliance.yml Co-authored-by: bogdankostic <bogdankostic@web.de> --------- Co-authored-by: bogdankostic <bogdankostic@web.de>	2023-05-27 18:12:53 +02:00
ZanSara	7e5fa0dd94	fix: Move check for default `PromptTemplate`s in `PromptTemplate` itself (#5018 ) * make prompttemplate load the defaults instead of promptnode * add test * fix tenacity decorator * fix tests * fix error handling * mypy --------- Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>	2023-05-27 18:05:05 +02:00
bogdankostic	b8ff1052d4	refactor: Adapt running benchmarks (#5007 ) * Generate eval result in separate method * Adapt benchmarking utils * Adapt running retriever benchmarks * Adapt error message * Adapt running reader benchmarks * Adapt retriever reader benchmark script * Adapt running benchmarks script * Adapt README.md * Raise error if file doesn't exist * Raise error if path doesn't exist or is a directory * minor readme update * Create separate methods for checking if pipeline contains reader or retriever * Fix reader pipeline case --------- Co-authored-by: Darja Fokina <daria.f93@gmail.com>	2023-05-26 18:48:11 +02:00
Julian Risch	2ede4d1d1d	build: Remove dill dependency (#4985 ) * remove dill dependency * remove dill from .toml	2023-05-26 17:50:55 +02:00
bogdankostic	5633446173	refactor: Add reader-retriever benchmark script (#5006 ) * Generate eval result in separate method * Adapt benchmarking utils * Adapt running retriever benchmarks * Adapt error message * Adapt running reader benchmarks * Adapt retriever reader benchmark script * Raise error if file doesn't exist * Raise error if path doesn't exist or is a directory * Remove unused line * Create separate method for getting reader config * Make use of get_reader_config * Create separate method for retriever config	2023-05-26 13:54:52 +02:00
bogdankostic	796340e788	refactor: Adapt reader benchmarks (#5005 )	2023-05-26 11:40:35 +02:00

3115 Commits