haystack

mirror of https://github.com/deepset-ai/haystack.git synced 2025-08-09 00:59:08 +00:00

Author	SHA1	Message	Date
bogdankostic	8c63e295f4	fix: Allow filtering on list fields in `InMemoryDocumentStore` with all operators (#5208 ) * Add support for list fields * Unskip tests	2023-06-29 12:10:39 +02:00
Massimiliano Pippi	6373e2ea66	refactor: prepare support to Elasticsearch 8 (#5226 ) * make a package * Update haystack/document_stores/elasticsearch/es7.py Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com> * do not expose ES types from the package --------- Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>	2023-06-29 11:06:20 +02:00
bogdankostic	ed1bad1155	fix: Use `add_isolated_node_eval` of `eval_batch` in `run_batch` (#5223 ) * Fix isolated node eval in eval_batch * Add unit test	2023-06-28 16:51:23 +02:00
Vladimir Blagojevic	bc86f57715	feat: BM25 retrieval for `MemoryDocumentStore` (#5151 )	2023-06-27 17:42:23 +02:00
Massimiliano Pippi	c068e34954	Remove deprecated param `return_table_cell` (#5218 ) * remove deprecated param * Update haystack/nodes/reader/table.py Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com> * try * remove unused functions and ignore mypy error --------- Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>	2023-06-27 16:14:29 +02:00
ZanSara	462f3a5c99	feat: globally disable progress bars (#5207 ) * add SilenceableTqdm and update usage * pylint * rename module * add tests	2023-06-27 11:45:17 +02:00
Vladimir Blagojevic	5ee393226d	fix: Support all SageMaker HF text generation models (other than Falcon) (#5205 ) * Create SageMaker base class and two implementation subclasses --------- Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>	2023-06-26 19:59:16 +02:00
bogdankostic	82291b56ad	fix: Send batches of query-doc pairs to inference_from_objects (#5125 ) * Send batches of query-doc pairs to inference_from_objects * Use absolute import path * Add separate preprocessing_batch_size parameter	2023-06-26 14:26:26 +02:00
Vladimir Blagojevic	eb2255c0dd	Rename SageMakerInvocationLayer -> SageMakerHFTextGenerationInvocationLayer (#5204 )	2023-06-26 11:03:30 +02:00
Stefano Fiorucci	25d5dedb46	Fix: `FARMReader` - Consider the max number of labels/answers during training (#5197 ) * first draft * improve it a bit * unit tests * PR review, improved tests * PR review, improved tests 2	2023-06-26 10:14:21 +02:00
Sebastian	f1932492f1	feat: Add CohereRanker node using Cohere reranking endpoint (#5152 ) * Started to add CohereRanker node * Small refactoring of SentenceTransformersRanker node * Started to add predict_batch method * Simplified predict_batch code * Added missing imports * Undoing a change * Fix mypy * Adding unit tests using mocking * Updated truncation warning message. * Update doc strings * Update to docs * Update haystack/nodes/ranker/cohere.py Co-authored-by: bogdankostic <bogdankostic@web.de> * Update haystack/nodes/ranker/cohere.py Co-authored-by: bogdankostic <bogdankostic@web.de> * Update haystack/nodes/ranker/cohere.py Co-authored-by: bogdankostic <bogdankostic@web.de> * Update haystack/nodes/ranker/cohere.py Co-authored-by: bogdankostic <bogdankostic@web.de> * Update haystack/nodes/ranker/cohere.py Co-authored-by: bogdankostic <bogdankostic@web.de> * Update haystack/nodes/ranker/cohere.py Co-authored-by: bogdankostic <bogdankostic@web.de> * Updating docs to reflect PR discussion * Update haystack/nodes/ranker/cohere.py Co-authored-by: Daria Fokina <daria.f93@gmail.com> --------- Co-authored-by: bogdankostic <bogdankostic@web.de> Co-authored-by: Daria Fokina <daria.f93@gmail.com>	2023-06-23 16:46:46 +02:00
Malte Pietsch	c9179ed0eb	feat: enable LLMs hosted via AWS SageMaker in PromptNode (#5155 ) * Add SageMakerInvocationLayer --------- Co-authored-by: oryx1729 <78848855+oryx1729@users.noreply.github.com> Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>	2023-06-23 15:33:20 +02:00
ZanSara	31664627eb	feat: hard document length limit at `max_chars_check` (#5191 ) * implement hard cut at max_chars_check * regenerate ids * black * docstring * black	2023-06-23 12:34:19 +02:00
ZanSara	36192eca72	feat: `current_datetime` shaper function (#5195 ) * current_datetime shaper * explicitly add current_datetime to the functions allowed in a prompt template	2023-06-23 10:33:34 +02:00
bogdankostic	612c5cd005	chore: Remove `add_tool` from `ToolsManager` (#5192 ) * Remove add_tool from ToolsManager * Fix tests	2023-06-23 09:26:06 +02:00
Sebastian	1602f3abdd	test: Adding unit tests to Ranker (#5167 ) * adding unit tests for sentence transformers ranker * Adding more unit tests * Remove empty line * Undo static method * Revert change * Updated indentation and added match message * Remove unneeded paranthesis	2023-06-22 15:23:23 +02:00
Michael Feil	cfd703fa3e	fix: model_tokenizer in openai text completion tokenization details (#5104 ) * fix: model_tokenizer * Update test --------- Co-authored-by: Sebastian Husch Lee <sjrl423@gmail.com>	2023-06-22 14:23:19 +02:00
Stefano Fiorucci	637433841e	chore: remove deprecated `Seq2SeqGenerator` and `RAGenerator` (#5180 ) * first draft of removal * more removals * don't download unused models	2023-06-21 16:38:45 +02:00
Sebastian	7a140c1524	feat: add ensure token limit for direct prompting of ChatGPT (#5166 ) * Add support for prompt truncation when using chatgpt if direct prompting is used * Update tests for test token limit for prompt node * Update warning message to be correct * Minor cleanup * Mark back to integration * Update count_openai_tokens_messages to reflect changes shown in tiktoken * Use mocking to avoid request call * Fix test to make it comply with unit test requirements * Move tests to respective invocation layers * Moved fixture to one spot	2023-06-21 15:41:28 +02:00
Vladimir Blagojevic	089187ac8b	fix: Check Agent's prompt template variables and prompt resolver parameters are aligned (#5163 ) * Check Agent's prompt template parameters and prompt resolver parameters are aligned * Lower the logger warning * Automatically append transcript if needed * Amend flaky test	2023-06-21 14:34:41 +02:00
Bilge Yücel	6a1b6b1ae3	feat: Update ConversationalAgent (#5065 ) * feat: Update ConversationalAgent * Add Tools * Add test * Change default params * fix tests * Fix circular import error * Update conversational-agent prompt * Add conversational-agent-without-tools to legacy list * Add warning to add tools to conversational agent * Add callable tools * Add example script * Fix linter errors * Update ConversationalAgent depending on the existance of tools * Initialize the base Agent with different arguments when there's tool * Inject memory to the prompt in both cases, update prompts accordingly * Override the add_tools method to prevent adding tools to ConversationalAgent without tools * Update test * Fix linter error * Remove unused import * Update docstrings and api reference * Fix imports and doc string code snippet * docstrings update * Update conversational.py * Mock PromptNode * Prevent circular import error * Add max_steps to the ConversationalAgent * Update resolver description * Add prompt_template as parameter * Change docstring --------- Co-authored-by: Darja Fokina <daria.f93@gmail.com>	2023-06-20 13:09:21 +03:00
Shukri	916e8452f5	feat!: simplify weaviate auth (#5115 ) * feat!: simplify weaviate auth * docs: explain param precedence * refactor: simplify _get_embedded_options	2023-06-19 15:46:58 +02:00
Ben Heckmann	1318ac5074	feat: Optional Content Moderation for OpenAI PromptNode & OpenAIAnswerGenerator (#5017 ) * #4071 implemented optional content moderation for OpenAI PromptNode * added two simple integration tests * improved documentation & renamed _invoke method to _execute_openai_request * added a flag to check_openai_policy_violation that will return a full dict of all text violations and their categories * re-implemented the tests as unit tests & without use of the OpenAI APIs * removed unused patch * changed check_openai_policy_violation back to only return a bool * fixed pylint and test error --------- Co-authored-by: Julian Risch <julian.risch@deepset.ai>	2023-06-19 13:27:11 +02:00
ZanSara	f52477d31b	fix: small improvement to pipeline v2 tests (#5153 ) * add missing return * improve test * docstring	2023-06-16 12:07:00 +02:00
Vladimir Blagojevic	8d8de65492	Add AgentToolLogger, unit test, and example usage (#5087 )	2023-06-15 08:43:20 +02:00
bogdankostic	7731713a1e	test: Add benchmark config files (#5093 ) * Add config files * Add top-k and batch size to configs * Add batch size to configs * Add batch size to configs * Remove configs using 1m docs	2023-06-14 18:15:50 +02:00
Ben Heckmann	60e5d73424	fix: changing document scores (#5090 ) * #4653 fix changing scores by returning new document objects from document store queries * added integration test for InMemoryDocumentStore demonstrating the desired behavior * Update test/document_stores/test_memory.py	2023-06-14 17:35:46 +02:00
Julian Risch	ce1c9c9ddb	fix: Relax ChatGPT model name check to support gpt-3.5-turbo-0613 (#5142 ) * relax model name checking for chatgpt * add unit tests	2023-06-14 09:53:00 +02:00
Julian Risch	4c8e0b9d4a	fix: PromptNode falls back to empty list of documents if none are provided but expected (#5132 ) * add warning, default to empty docs list, tests * pylint	2023-06-13 16:35:19 +02:00
Silvano Cerza	3b8992968d	test: Skip flaky PromptNode test (#5039 ) * Skip flaky PromptNode test * Add skip reason * Update test/prompt/test_prompt_node.py Co-authored-by: bogdankostic <bogdankostic@web.de> --------- Co-authored-by: bogdankostic <bogdankostic@web.de>	2023-06-13 16:24:29 +02:00
ZanSara	65cdf36d72	chore: block all HTTP requests in CI (#5088 )	2023-06-13 14:52:24 +02:00
ZanSara	3c71f0ae3d	chore: mark some unit tests under `test/pipeline` (#5124 ) * mark some unit tests as such * remove marker	2023-06-12 17:58:31 +02:00
ZanSara	49e037a055	fix: rename `requests.py` into `requests_utils.py` (#5099 ) * requests.py -> requests_utils.py * fix tests * reimport requrests * fix more tests * review feedback	2023-06-12 12:40:21 +02:00
Vladimir Blagojevic	0cc9ce7522	fix: WebRetriever top_k is ignored in a pipeline (#5106 ) * Initial changes * Add WebSearch, WebRetriever top_k unit tests * Add exact integration test that failed Tuana * PR review	2023-06-09 10:42:37 +02:00
Julian Risch	d8a4f20379	feat: Consider prompt_node's default_prompt_template in agent (#5095 ) * consider prompt_node's default_prompt_template in agent * make test a unit test via mocking * updated docstring	2023-06-08 13:42:28 +02:00
Vladimir Blagojevic	e3b069620b	feat: pass model parameters to HFLocalInvocationLayer via `model_kwargs`, enabling direct model usage (#4956 ) * Simplify HFLocalInvocationLayer, move/add unit tests * PR feedback * Better pipeline invocation, add mocked tests * Minor improvements * Mock pipeline directly, unit test updates * PR feedback, change pytest type to integration * Mock supports unit test * add full stop * PR feedback, improve unit tests * Add mock_get_task fixture * Further improve unit tests * Minor unit test improvement * Add unit tests, increase coverage * Add unit tests, increase test coverage * Small optimization, improve _ensure_token_limit unit test --------- Co-authored-by: Darja Fokina <daria.f93@gmail.com>	2023-06-07 13:34:45 +02:00
Silvano Cerza	a2156ee8fb	fix: Fix handling of streaming response in AnthropicClaudeInvocationLayer (#4993 ) * Fix handling of streaming response in AnthropicClaudeInvocationLayer --------- Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com> Co-authored-by: Darja Fokina <daria.f93@gmail.com>	2023-06-07 10:57:36 +02:00
bogdankostic	da1f245a84	feat: Add batch_size parameter and cast timeout_config value to tuple for `WeaviateDocumentStore` (#5079 ) * Add batch_size parameter and cast timeout_config to tuple * Add unit test * Remove debug tqdm * Remove debug tqdm introduced in #5063	2023-06-06 17:06:10 +02:00
Sebastian	1777b22fcb	fix: Ensure eval mode for farm and transformer models for predictions (#3791 ) Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>	2023-06-06 13:06:30 +02:00
Michael Feil	6ea8ae01a2	feat: Allow setting custom api_base for OpenAI nodes (#5033 ) * add changes for api_base * format retriever * Update haystack/nodes/retriever/dense.py Co-authored-by: bogdankostic <bogdankostic@web.de> * Update haystack/nodes/audio/whisper_transcriber.py Co-authored-by: bogdankostic <bogdankostic@web.de> * Update haystack/preview/components/audio/whisper_remote.py Co-authored-by: bogdankostic <bogdankostic@web.de> * Update haystack/nodes/answer_generator/openai.py Co-authored-by: bogdankostic <bogdankostic@web.de> * Update test_retriever.py * Update test_whisper_remote.py * Update test_generator.py * Update test_retriever.py * reformat with black * Update haystack/nodes/prompt/invocation_layer/chatgpt.py Co-authored-by: Daria Fokina <daria.f93@gmail.com> * Add unit tests * apply docstring suggestions --------- Co-authored-by: bogdankostic <bogdankostic@web.de> Co-authored-by: michaelfeil <me@michaelfeil.eu> Co-authored-by: Daria Fokina <daria.f93@gmail.com>	2023-06-05 11:32:06 +02:00
bogdankostic	a9a49e2c0a	feat: Add batching for querying in `ElasticsearchDocumentStore` and `OpenSearchDocumentStore` (#5063 ) * Include benchmark config in output * Use queries from aggregated labels * Introduce batching for querying in ElasticsearchDocStore and OpenSearchDocStore * Fix mypy * Use self.batch_size in write_documents * Use 10_000 as default batch size * Add unit tests for write documents	2023-06-01 18:47:24 +02:00
bogdankostic	c3e59914da	refactor: Delete outdated benchmark files (#5008 )	2023-06-01 13:59:12 +02:00
bogdankostic	6774e0ae58	fix: Use queries from aggregated labels in benchmarks (#5054 ) * Include benchmark config in output * Use queries from aggregated labels	2023-06-01 10:49:54 +02:00
ZanSara	89de76d5fe	feat: move `cli` out from `preview` (#5055 ) * move cli from preview * readme * review feedback * test mocks & import paths * import path	2023-05-31 18:34:14 +02:00
Silvano Cerza	3fd9e0fd89	feat: Add CLI prompt cache command (#5050 ) * Add CLI prompt cache command * Rename prompt cache to prompt fetch	2023-05-30 18:04:52 +02:00
ZanSara	6249e65bc8	feat: prompts caching from PromptHub (#5048 ) * split up prompttemplate init * caching * docstring * add platformdirs * use user_data_dir * fix tests * add tests * pylint * mypy	2023-05-30 16:55:48 +02:00
Silvano Cerza	37518c8b8c	chore: Simplify DefaultPromptHandler logic and add tests (#4979 ) * Simplify DefaultPromptHandler logic and add tests Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com> * Remove commented code * Split single unit test into multiple tests --------- Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>	2023-05-29 12:13:32 +02:00
ZanSara	7e5fa0dd94	fix: Move check for default `PromptTemplate`s in `PromptTemplate` itself (#5018 ) * make prompttemplate load the defaults instead of promptnode * add test * fix tenacity decorator * fix tests * fix error handling * mypy --------- Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>	2023-05-27 18:05:05 +02:00
bogdankostic	b8ff1052d4	refactor: Adapt running benchmarks (#5007 ) * Generate eval result in separate method * Adapt benchmarking utils * Adapt running retriever benchmarks * Adapt error message * Adapt running reader benchmarks * Adapt retriever reader benchmark script * Adapt running benchmarks script * Adapt README.md * Raise error if file doesn't exist * Raise error if path doesn't exist or is a directory * minor readme update * Create separate methods for checking if pipeline contains reader or retriever * Fix reader pipeline case --------- Co-authored-by: Darja Fokina <daria.f93@gmail.com>	2023-05-26 18:48:11 +02:00
bogdankostic	5633446173	refactor: Add reader-retriever benchmark script (#5006 ) * Generate eval result in separate method * Adapt benchmarking utils * Adapt running retriever benchmarks * Adapt error message * Adapt running reader benchmarks * Adapt retriever reader benchmark script * Raise error if file doesn't exist * Raise error if path doesn't exist or is a directory * Remove unused line * Create separate method for getting reader config * Make use of get_reader_config * Create separate method for retriever config	2023-05-26 13:54:52 +02:00

1204 Commits