haystack

mirror of https://github.com/deepset-ai/haystack.git synced 2025-11-08 22:03:54 +00:00

Author	SHA1	Message	Date
Agnieszka Marzec	1d4883f178	update docstrings (#8117 )	2024-07-30 11:10:36 +02:00
Agnieszka Marzec	42f59fc022	update docstrings (#8115 )	2024-07-30 11:08:45 +02:00
Daria Fokina	21de1f87d4	docs: clean up docstrings of AnswerBuilder (#8094 ) * answerbuilder docstrings * update the `replies` * Apply suggestions from code review Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com> * Update answer_builder.py --------- Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>	2024-07-30 11:06:39 +02:00
Agnieszka Marzec	e8598befb6	Docs: Update OpenAIGen docstrings and add missing headers (#8105 ) * update docstrings * Update haystack/components/generators/openai.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> --------- Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>	2024-07-30 11:06:17 +02:00
Daria Fokina	92e2377eff	docs: clean up docstrings of FileTypeRouter (#8098 ) * upd filetyperouter docstrings * Suggestions from code review Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com> * aga's suggestions --------- Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>	2024-07-30 08:39:08 +02:00
Agnieszka Marzec	8ce7bedf25	Docs: Update DocSplitter docstrings (#8081 ) * update docstrings * Update haystack/components/preprocessors/document_splitter.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * Update haystack/components/preprocessors/document_splitter.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * fix article --------- Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>	2024-07-29 15:11:12 +02:00
Agnieszka Marzec	abb24c61c2	Docs: Update DocumentEmbedder docstrings (#8112 ) * update docstrings * Update haystack/components/embedders/sentence_transformers_document_embedder.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * Update haystack/components/embedders/sentence_transformers_document_embedder.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * Update haystack/components/embedders/sentence_transformers_document_embedder.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * fix casing --------- Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>	2024-07-29 15:10:49 +02:00
Agnieszka Marzec	950c632009	Docs: Update DocumentCleaner docstrings (#8106 ) * update docstrings * Update haystack/components/preprocessors/document_cleaner.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * fix article --------- Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>	2024-07-29 14:45:15 +02:00
Agnieszka Marzec	da81d10060	Docs: Update DocumentJoiner docstrings (#8109 ) * update docstrings * Update haystack/components/joiners/document_joiner.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * Update haystack/components/joiners/document_joiner.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * Update haystack/components/joiners/document_joiner.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * Update haystack/components/joiners/document_joiner.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * Update haystack/components/joiners/document_joiner.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * fix typo * fix typo --------- Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>	2024-07-29 14:39:44 +02:00
Corentin Meyer	1c53aae8f0	fix: Tika converter not yielding page break tags (`\f`) (#8082 ) * Fix TikaConverter not having \f page tag by using HTML mode of parsing and then parsing the HTML to text using the old Haystack 1.X integration as template. * Add Reno * Fix test by making Mock Tika return XML (before parsing) * refinements and test --------- Co-authored-by: anakin87 <stefanofiorucci@gmail.com>	2024-07-26 20:13:47 +02:00
Amna Mubashar	e0de423ee0	Rename SentenceWindowRetrieval to SentenceWindowRetriever	2024-07-26 17:46:44 +02:00
Silvano Cerza	3fed1366c4	fix: Fix issue that could lead to RCE if using unsecure Jinja templates (#8095 ) * Fix issue that could lead to RCE if using unsecure Jinja templates * Add comment explaining exception suppression * Update release note * Update release note	2024-07-26 14:02:09 +00:00
Nicola Procopio	47f4db8698	added truncate_dim to sentence transformers embedder (#8077 ) * added truncate_dim to sentence transformers embedder * Update haystack/components/embedders/sentence_transformers_document_embedder.py Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * Update releasenotes/notes/release-note-2b603a123cd36214.yaml Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * fixed parameter description * added test for truncation to text embedder * fix format --------- Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>	2024-07-26 10:39:48 +02:00
Madeesh Kannan	b2aef217da	chore: Remove deprecated `DynamicPromptBuilder` and `DynamicChatPromptBuilder` components (#8085 )	2024-07-26 10:00:59 +02:00
Daria Fokina	f372ca443c	bm25 retriever docstrings (#8087 )	2024-07-25 17:28:21 +02:00
Agnieszka Marzec	1f58ec20a8	Docs: Standardize and improve SentenceTransformersTextEmbedder docstrings (#8060 ) * Update docstrings * format * add Daria's comments * Update haystack/components/embedders/sentence_transformers_text_embedder.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * Update haystack/components/embedders/sentence_transformers_text_embedder.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> --------- Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>	2024-07-25 13:56:51 +02:00
Agnieszka Marzec	de728b4877	Docs: Simplify lg + standardize docstrings (#8057 ) * Simplify lg + standardize * Format * Update formatting * Fix formatting again * Fix empty line * Change formatting * Format with black --------- Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>	2024-07-25 13:24:42 +02:00
Agnieszka Marzec	855f8e61f3	Docs: Update InMemoryEmbeddingRetriever docstrings (#8068 ) * update docstrings * Update documents to lowercase	2024-07-25 13:24:00 +02:00
Madeesh Kannan	f9e4d5dc58	chore: Deprecate the `debug` parameter in `Pipeline.run` (#8075 )	2024-07-25 09:58:57 +00:00
Tobias Wochinger	4dde6fbaec	build: unpin structlog (#8071 )	2024-07-24 20:58:34 +02:00
Amna Mubashar	b374c528b2	Assign streaming_callback to OpenAIGenerator and OpenAIChatGenerator in run() method (#8054 ) * Add optional parameter for streaming_callback in run() method	2024-07-24 15:49:19 +02:00
Sebastian Husch Lee	baed478f23	fix: Fix `split_start_idx` and `_split_overlap` information in `DocumentSplitter` (#8046 ) * Fix bug in DocumentSplitter and expand tests to catch said bug * Fix split overlap information calc and actually test it * Add release notes * Remove comments * Same fix in SentenceWindowRetrieval --------- Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>	2024-07-24 15:15:36 +02:00
Stefano Fiorucci	b36ec0a38c	fix release note (#8070 )	2024-07-24 15:03:01 +02:00
Tobias Wochinger	38d38678c7	fix: fix PPTX import (#8069 ) * fix: fix PPTX import * docs: add release notes	2024-07-24 14:50:47 +02:00
Agnieszka Marzec	a022af02bc	Update docstrings (#8066 )	2024-07-24 13:54:39 +02:00
Madeesh Kannan	4650263bc3	chore: Remove deprecated init paramters from `HTMLToDocument` (#8056 ) * chore: Remove deprecated init paramters from `HTMLToDocument` * Fix reno	2024-07-24 13:16:47 +02:00
David S. Batista	0c9dc008f0	fix: improve context relevancy metric (#7964 ) * fixing tests * fixing tests * updating tests * updating tests * updating docstring * adding release notes * making the insufficient information more robust * updating docstring and release notes * empty list instead of informative string * Update haystack/components/evaluators/context_relevance.py Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com> * Update haystack/components/evaluators/context_relevance.py Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com> * fixing tests * Update haystack/components/evaluators/context_relevance.py Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * reverting commit * reverting again commit * fixing docstrings * removing deprecation warning * removing warning import --------- Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com> Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>	2024-07-22 15:13:46 +02:00
Vladimir Blagojevic	a59de1d7b3	chore: Combined main unblock (#8045 ) * Pin structlog to 24.2.0 due to unit test failures * Remove object init parameter in huggingface_hub unit tests * Use less restrictive structlog pin * Add release note	2024-07-19 10:39:10 +02:00
Daria Fokina	913078dfaa	docs: add sentence window retrieval to api reference (#8032 ) * docs: add sentence window retrieval to api reference * deprecating multiplexer	2024-07-17 11:16:58 +02:00
Amna Mubashar	3fa6c253c3	fix: Prevent `Pipeline.from_dict` from modifying the dictionary parameter passed to it (#8030 ) * Updated the pipeline deserialization	2024-07-17 10:28:29 +02:00
David S. Batista	431aa4a406	updating sentence window retriever tests (#8034 ) * updating sentence window retriever tests * fix	2024-07-16 22:10:55 +02:00
David S. Batista	3ed69c4aab	docs: adding example to docstring to SentenceWindowRetrieval (#8031 ) * adding example to docstring * small fix * Update haystack/components/retrievers/sentence_window_retrieval.py Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * Update haystack/components/retrievers/sentence_window_retrieval.py Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * PR comments * Update haystack/components/retrievers/sentence_window_retrieval.py Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * PR comments * PR comments --------- Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>	2024-07-16 16:22:26 +02:00
Amna Mubashar	499fbcc59f	Remove Multiplexer and related tests (#8020 )	2024-07-16 15:39:40 +02:00
Silvano Cerza	0411cd938a	Fix bug in Pipeline.run() executing Components in a wrong and unexpected order (#8021 ) * Fix bug in Pipeline.run() executing Components in a wrong and unexpected order * Update haystack/core/pipeline/base.py Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com> --------- Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>	2024-07-12 15:30:10 +00:00
Madeesh Kannan	94b806815c	refactor: Improve error messages shown during pipeline deserialization (#8016 ) * refactor: Improve error messages shown during pipeline deserialization * Add link to release notes * Update release notes link	2024-07-12 14:47:00 +00:00
Anushree Bannadabhavi	1f05e633a9	refactor: refactor DocumentJoiner to follow enum pattern for join_mode parameter (#8010 ) * refactor document joiner to follow enum pattern for join mode * Added to_dict and from_dict	2024-07-12 11:29:44 +02:00
Silvano Cerza	0cec82e55e	refactor: Pipeline.run() (#8019 ) * Move utility functions from _enqueue_next_runnable_component (#7895) * Isolate logic to check if we're stuck in a loop * Simplify for else * Add missing return in docstring * Emit warning when stuck in a loop * Fix docstring Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com> * Add utility function to move Components in queues * Add function to find next Component to run * Comment update * Add missing break in loop * Make _add_missing_input_defaults less error prone and add tests * Fix tests * Update docstring * Simplify enqueue logic * Remove unused _enqueue_next_runnable_component function * Add method to find Component with lazy variadic input or all inputs with defaults * Simplify _find_next_runnable_lazy_variadic_or_default_component * Remove unnecessary type ignore * Split _dequeue_components_that_received_no_input into separate functions * Fix linting * Simplify variadic check when running Component * Simplify code * Reorganize functions used by Pipeline.run * Rename variables used in Pipeline.run() for clarity * Add comment clarifying last_waiting_queue and before_last_waiting_queue * Add functions to easily update waiting_queue --------- Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>	2024-07-12 08:35:23 +00:00
David S. Batista	d02356fe7a	chore: normalise the use of `importlib` in getting an object from a qualified name string across the codebase (#8012 ) * initial import * cleaning up * removing unused imports	2024-07-11 16:14:00 +02:00
Madeesh Kannan	8faa3fa465	Revert "fix: make PyPDF backward compatible (#7996 )" (#8014 ) This reverts commit 58b48e36eb56a896365133ab4a9d8e327989948c.	2024-07-11 13:06:08 +00:00
Ulises M	6f8834d036	feat: add and expose api_params for OpenAIGenerator in LLMEvaluator based classes (#7987 ) * initial support for api_params * add tests and reno * resolve suggestions and add integration test * fix mypy --------- Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>	2024-07-11 13:14:03 +02:00
David S. Batista	ebfeb571d7	feat: add sentence window retrieval (#7997 ) * initial import * adding tests * adding license and release notes * adding missing release notes * working with any type of doc store * nit * adding get_class_object to serialization package * nit * refactoring get_class_object() * refactoring get_class_object() * chaning type and var names * more refactoring * Update haystack/core/serialization.py Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com> * Update haystack/core/serialization.py Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com> * Update test/core/test_serialization.py Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com> * more refactoring * more refactoring * Pydoc syntax --------- Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>	2024-07-10 13:13:46 +00:00
Sebastian Husch Lee	c121c86c4c	fix: Fix from_dict methods of components using HF models to work with default values (#8003 ) * Fix from_dict to work if device isn't provided in init params * Minor refactoring of from_dict for components that load HF models * Add tests * Update tests to test loading with all default parameters * Add more tests * Add release notes * Add unit test for whisper local * Update reno * Add fix for ExtractiveReader * Fix NamedEntityExtractor	2024-07-10 12:18:05 +02:00
Madeesh Kannan	f19131f13a	chore: Deprecate legacy document/metadata filters (#8004 )	2024-07-09 16:18:38 +02:00
tstadel	7e35280d4f	fix: LinkContentFetcher html text encoding (#7975 ) * fix: content encoding of LinkContentFetcher * fix tests * add reno * only touch html	2024-07-09 15:28:49 +02:00
Sebastian Husch Lee	583eb8a293	fix: `TransformersZeroShotTextRouter` and `TransformersTextRouter` from_dict to work with default value for huggingface_pipeline_kwargs (#8002 ) * Fix default value for huggingface_pipeline_kwargs * Add reno note * Update HuggingFaceLocalGenerator.from_dict to use the same logic as HuggingFaceLocalChatGenerator.from_dict * Update tests slightly * Update release note	2024-07-09 13:32:44 +02:00
Tobias Wochinger	58b48e36eb	fix: make PyPDF backward compatible (#7996 ) * fix: make PyPDF backward compatible * Add release note --------- Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>	2024-07-09 10:08:37 +02:00
Nitanshu Vashistha	cd8a5b98fe	feat: Configure max_retries & timeout for AzureOpenAITextEmbedder (#7993 ) max_retries: if not set is read from the OPENAI_MAX_RETRIES env variable or set to 5. timeout: if not set is read from the OPENAI_TIMEOUT env variable or set to 30. Signed-off-by: Nitanshu Vashistha <nitanshu.vzard@gmail.com>	2024-07-09 09:56:46 +02:00
Nitanshu Vashistha	f9d53c5ca8	feat: Configure max_retries and timeout for AzureOpenAIDocumentEmbedder (#7994 ) * feat: Configure max_retries & timeout for AzureOpenAIDocumentEmbedder max_retries: if not set is read from the OPENAI_MAX_RETRIES env variable or set to 5. timeout: if not set is read from the OPENAI_TIMEOUT env variable or set to 30. Signed-off-by: Nitanshu Vashistha <nitanshu.vzard@gmail.com> * Update retries-and-timeout-for-AzureOpenAIDocumentEmbedder-006fd84204942e43.yaml * Update haystack/components/embedders/azure_document_embedder.py * Update haystack/components/embedders/azure_document_embedder.py --------- Signed-off-by: Nitanshu Vashistha <nitanshu.vzard@gmail.com> Co-authored-by: David S. Batista <dsbatista@gmail.com>	2024-07-08 22:35:25 +02:00
Nitanshu Vashistha	376336686b	feat: Configure `max_retries` and timeout for `AzureOpenAIChatGenerator` (#7988 ) * feat: Configure max_retries & timeout for AzureOpenAIChatGenerator max_retries: if not set is read from the OPENAI_MAX_RETRIES env variable or set to 5. timeout: if not set is read from the OPENAI_TIMEOUT env variable or set to 30. Signed-off-by: Nitanshu Vashistha <nitanshu.vzard@gmail.com> * Update haystack/components/generators/chat/azure.py * Update haystack/components/generators/chat/azure.py * Update max_retries-for-AzureOpenAIChatGenerator-9e49b4c7bec5c72b.yaml --------- Signed-off-by: Nitanshu Vashistha <nitanshu.vzard@gmail.com> Co-authored-by: David S. Batista <dsbatista@gmail.com>	2024-07-08 22:34:51 +02:00
Haystack Bot	d7a7d9c1fb	Update unstable version to 2.4.0-rc0 (#7992 ) Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2024-07-08 14:32:56 +02:00

3803 Commits