haystack

mirror of https://github.com/deepset-ai/haystack.git synced 2025-06-26 22:00:13 +00:00

Author	SHA1	Message	Date
Julian Risch	1d1c13a8bc	chore: add DocusaurusRenderer and use description, title, id (#9538 )	2025-06-25 09:56:26 +02:00
Stefano Fiorucci	0d0a66b4f5	feat: add `LLMMessagesRouter`, a component to route Chat Messages using LLMs (#9540 ) * llmmessagesrouter - draft * serde methods * refinements, tests and release note * Apply suggestions from code review Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> --------- Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>	2025-06-24 14:54:20 +02:00
Sebastian Husch Lee	db359cff40	Add state to agent pydocs (#9486 )	2025-06-04 14:01:58 +02:00
atopx	3deaa20cb6	feat: Add HuggingFace API (text-embeddings-inference for rerank model) for component.rankers (#9414 ) * feat(component.rankers): Add HuggingFace API (text-embeddings-inference for rerank) ranker component * update test flow & doc loaders * Support run_async for HuggingFaceAPIRanker * Add release note for HuggingFace API support in component.rankers * Add release note for HuggingFace API support in component.rankers * Add release note for HuggingFace API support in component.rankers * Add release note for HuggingFace API support in component.rankers * fix: 1. `hugging_face_api.HuggingFaceAPIRanker` rename to `hugging_face_tei.HuggingFaceAPIRanker` 2. HuggingFaceAPIRanker: use our Secret API for token 3. add the missing modules for `docs/pydoc/config/rankers_api.yml` 4. added function `async_request_with_retry` for `haystack/utils/requests_utils.py` and added unittest on `test/utils/test_requests_utils.py` 4. HuggingFaceAPIRanker: refactor the retry function to support configuration based on attempts and status code. 5. HuggingFaceAPIRanker: refactor the test into unit tests using mocks * fix(HuggingFaceTEIRanker): change the token check logic to use the resolve_value method. * fix(format): run `hatch run format` * fix: - Force keyword-only arguments in __init__ method by adding , - Clarify token docstring that it's not always required - Copy documents to avoid modifying original objects - Remove test file from slow workflow - Add monkeypatch environment variable cleanup in tests - Fix missing module in rankers_api.yml and sort modules alphabetically - Remove unnecessary test info from release notes fix HuggingFaceTEIRanker: - "None" of "Optional[Secret]" has no attribute "resolve_value" - run/run_async: too many parameters * fix(HuggingFaceTEIRanker) :Revise the docstring of the HuggingFaceTEIRanker, improve the parameter descriptions, ensure consistency and clarity. Add error handling information to enhance the readability of the API response. * fix: unit test for HuggingFaceTEIRanker raise message * fix fmt * minor refinements * refine release note --------- Co-authored-by: anakin87 <stefanofiorucci@gmail.com>	2025-05-27 12:44:54 +02:00
Stefano Fiorucci	17432f710d	feat: introduce `SentenceTransformersSimilarityRanker` (#9415 ) * new component + tests * soft deprecation of TransformersSimilarityRanker + reno * add comp files to slow workflow * Apply suggestions from code review Co-authored-by: Sebastian Husch Lee <10526848+sjrl@users.noreply.github.com> * self.model -> self._cross_encoder * recommend installing sentence-transformers>=4.1.0 --------- Co-authored-by: Sebastian Husch Lee <10526848+sjrl@users.noreply.github.com>	2025-05-21 10:52:46 +02:00
Sebastian Husch Lee	19cf220136	feat: integrate two ready-made SuperComponents from haystack-experimental (#9235 ) * Add super component decorator * Add reno * MultiFileConverter * Add DocumentPreprocessor * Add reno * Add tests and change doc preprocessor to split first then clean * Remove code from merge * Add to pydoc and missing test file * PR comments * Lint fix * Fix mypy * Fix mypy * Add comment * PR comments * Update haystack/components/converters/multi_file_converter.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * Update haystack/components/preprocessors/document_preprocessor.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * Update haystack/components/preprocessors/document_preprocessor.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * Update haystack/components/preprocessors/document_preprocessor.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * Update haystack/components/preprocessors/document_preprocessor.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * Update haystack/components/preprocessors/document_preprocessor.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * Update haystack/components/preprocessors/document_preprocessor.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * Update haystack/components/preprocessors/document_preprocessor.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * Update haystack/components/preprocessors/document_preprocessor.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * Update haystack/components/preprocessors/document_preprocessor.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * Update haystack/components/preprocessors/document_preprocessor.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * Update haystack/components/converters/multi_file_converter.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * PR comments * PR comment --------- Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>	2025-04-17 10:02:26 +00:00
Vladimir Blagojevic	c81d68402c	feat: Add Toolset to tooling architecture (#9161 ) * Add Toolset abstraction * Add reno note * More pydoc improvements * Update test * Simplify, Toolset is a dataclass * Wrap toolset instance with list * Add example * Toolset pydoc serde enhancement * Toolset as init param * Fix types * Linting * Minor updates * PR feedback * Add to pydoc config, minor import fixes * Improve pydoc example * Improve coverage for test_toolset.py * Improve test_toolset.py, test custom toolset serde properly * Update haystack/utils/misc.py Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * Rework Toolset pydoc * Another minor pydoc improvement * Prevent single Tool instantiating Toolset * Reduce number of integration tests * Remove some toolset tests from openai * Rework tests --------- Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>	2025-04-04 16:09:46 +02:00
Julian Risch	e483ec6f56	feat: integrate Agent from haystack-experimental (#9112 ) * add Agent * add Agent * update imports * add state tests * reno * remove State, its utils, and tests * add pydoc yml for agents * fix module path in serialization test * fix mypy error and use ChatGenerator protocol * remove unused import * address review feedback * remove unused _load_component	2025-03-28 14:23:39 +01:00
Julian Risch	657d09d7f1	feat: integrate updates of Tool, ToolInvoker, State, create_tool_from_function, ComponentTool from haystack-experimental (#9113 ) * update Tool,ToolInvoker,ComponentTool,create_tool_from_function * add State and its utils * add tests for State and its utils * update tests for Tool etc. * reno * fix circular imports * update experimental imports in tests * fix unit tests * fix ChatGenerator unit tests * mypy * add State to init and pydoc * explain State in more detail in release note * add test from #8913 * re-add _check_duplicate_tool_names and refactor imports * rename inputs and outputs	2025-03-28 10:49:23 +01:00
David S. Batista	be2d1fb303	feat: adding `AutoMergingRetriever` and `HierarchicalDocumentSplitter` (#9067 ) * adding Auto-Merging-Retriever * adding release notes * updating tests * adding renamed file * Update haystack/components/preprocessors/hierarchical_document_splitter.py Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * Update haystack/components/retrievers/auto_merging_retriever.py Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * fixing tests and imports * adding pydoc * adding to type checking --------- Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>	2025-03-19 18:25:23 +00:00
Sebastian Husch Lee	99a998f90b	feat: Add MSGToDocument converter (#8868 ) * Initial commit of MSG converter from Bijay * Updates to the MSG converter * Add license header * Add tests for msg converter * Update converter * Expanding tests * Update docstrings * add license header * Add reno * Add to inits and pydocs * Add test for empty input * Fix types * Fix mypy --------- Co-authored-by: Bijay Gurung <bijay.learning@gmail.com>	2025-02-24 08:12:32 +01:00
Stefano Fiorucci	0409e5da8f	remove base from evaluation pydoc config (#8867 )	2025-02-17 15:19:40 +01:00
David S. Batista	cee52435bf	adding to pydocs (#8846 )	2025-02-12 12:04:50 +01:00
Sebastian Husch Lee	f9e6e481a1	feat: Add new component CSVDocumentSplitter to recursively split CSV documents (#8815 ) * CSV Document Splitter * Add license header * Add newline * Add to docs * Add lineterminator * Updated csv splitter to allow user to specify to split by row, column or both * Adding more tests * Column tests * Some refactoring to remove incorrect dropna call * Fix * More complicated test * Adding more relevant metadata to match whats provided in our other splitters * value error tests * Fix mypy * Docstring updates * Add skip_blank_lines=False * Add to dict test * More from and to dict tests * Fixes * Move dict creation outside of for loop	2025-02-10 18:10:18 +01:00
David S. Batista	f798a9e935	feat: adding `LLMMetadataExtractor` (#8833 ) * fixing linting * adding release notes * updating tests * adding to pydocs * fixing typing due to Optional * fixing docstring	2025-02-10 16:54:25 +00:00
Vladimir Blagojevic	fd5040108a	feat: Add OpenAPIConnector component, improve OpenAPI integration (#8808 ) * Initial OpenAPIConnector * Add reno note * Format * Add headers * Add test dep * Use haystack logger * Fix test * Minor fix, spin CI * Update reno release note format * Add to docs, pydocs improvements	2025-02-10 10:34:37 +01:00
Sebastian Husch Lee	1785ea622e	feat: Add component CSVDocumentCleaner for removing empty rows and columns (#8816 ) * Initial commit for csv cleaner * Add release notes * Update lineterminator * Update releasenotes/notes/csv-document-cleaner-8eca67e884684c56.yaml Co-authored-by: David S. Batista <dsbatista@gmail.com> * alphabetize * Use lazy import * Some refactoring * Some refactoring --------- Co-authored-by: David S. Batista <dsbatista@gmail.com>	2025-02-06 17:56:38 +01:00
Stefano Fiorucci	05300490a6	docs: add `ListJoiner` to pydoc configuration (#8821 ) * docs: add ListJoiner to pydoc configuration * Update docs/pydoc/config/joiners_api.yml Co-authored-by: David S. Batista <dsbatista@gmail.com> --------- Co-authored-by: David S. Batista <dsbatista@gmail.com>	2025-02-06 08:52:24 +00:00
David S. Batista	26b80778f5	chore: removing NLTKDocumentSplitter (#8724 ) * removing NLTKDocumentSplitter * adding release notes * removing pydocs reference	2025-01-15 16:11:51 +00:00
David S. Batista	ec8666545d	docs: adding RecursiveSplitter to pydoc	2025-01-13 11:46:34 +01:00
Vladimir Blagojevic	d147c7658f	feat: Add `ComponentTool` to Haystack tools (#8693 ) * Initial ComponentTool --------- Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> Co-authored-by: Julian Risch <julian.risch@deepset.ai>	2025-01-13 11:15:33 +01:00
Stefano Fiorucci	08cf09f83f	refactor: `create_tool_from_function` + `tool` decorator (#8697 ) * create_tool_from_function + decorator * release note * improve usage example * add imports to @tool usage example * clarify docstrings * small docstring addition	2025-01-10 12:15:15 +01:00
Stefano Fiorucci	3f15f38c51	refactor: move `Tool` to a separate package; refactor serde (#8690 ) * move tool to separate package; refactor serde * release note * rm unused import	2025-01-09 12:30:13 +01:00
Sebastian Husch Lee	28ad78c73d	feat: Add XLSXToDocument converter (#8522 ) * Add draft of the Excel To Document converter * Add license header * Add release note * Use Union instead of pipe * Add openpyxl as additional dep * Fix zip issue * few updates from Bijay * Update deps * Add markdown test * Adding more example excels and expanding tests * Added more tests * Fix windows test by setting lineterminator * Addressing PR comments * PR comments * Fix linting	2025-01-09 09:03:19 +01:00
Stefano Fiorucci	7dcbf25bd7	feat: add Tool Invoker component (#8664 ) * port toolinvoker * release note	2024-12-20 14:02:42 +01:00
Stefano Fiorucci	96b4a1d2fd	feat: `Tool` dataclass - unified abstraction to represent tools (#8652 ) * draft * del HF token in tests * adaptations * progress * fix type * import sorting * more control on deserialization * release note * improvements * support name field * fix chatpromptbuilder test * port Tool from experimental * release note * docs upd * Update tool.py --------- Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>	2024-12-18 11:36:44 +00:00
Sebastian Husch Lee	e45d3329a1	feat: Adding DALLE image generator (#8448 ) * First pass at adding DALLE image generator * Add missing header * Fix tests * Add tests * Fix mypy * Make mypy happy * More unit tests * Adding release notes * Add a test for run * Update haystack/components/generators/openai_dalle.py Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com> * Fix pylint * Update haystack/components/generators/openai_dalle.py Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com> * Update haystack/components/generators/openai_dalle.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * Update haystack/components/generators/openai_dalle.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * Update haystack/components/generators/openai_dalle.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * Update haystack/components/generators/openai_dalle.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * Update haystack/components/generators/openai_dalle.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> --------- Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com> Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com> Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>	2024-11-14 16:19:49 +01:00
David S. Batista	e5a80722c2	feat: adding metadata grouper component (#8512 ) * initial import * making tests more readable; adding docstring * adding release notes * adding LICENSE header * Update test/components/rankers/test_metadata_grouper.py Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * refactoring * fixing docstring * fixing types * test docstrings * renaming test * handling too-many-arguments * liting * Update haystack/components/rankers/metadata_grouper.py Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * changing name * Update haystack/components/rankers/metadata_grouper.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * Update haystack/components/rankers/metadata_grouper.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * assiging value inside function for re-use * improving docstring * updating name to MetaFieldGroupingRanker * adding to pydocs * fixing imports * adding output docstring * Update haystack/components/rankers/meta_field_grouper_ranker.py Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * Update haystack/components/rankers/__init__.py Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * Update releasenotes/notes/add-metadata-grouper-21ec05fd4a307425.yaml Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * Update test/components/rankers/test_metadata_grouper.py Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * update docstring tests * fixing imports * rename modules for consistency * fix pydocs * simplification + more tests --------- Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>	2024-11-12 16:01:53 +01:00
Sebastian Husch Lee	294a67e426	feat: Adding StringJoiner (#8357 ) * Adding StringJoiner * Release notes * Remove typing * Remove unused import * Try to fix header * Fix one test * Add to docs, move test to behavioral pipeline test * Undo changes * Fix test * Update haystack/components/joiners/string_joiner.py Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * Update haystack/components/joiners/string_joiner.py Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * Provide usage example * Apply suggestions from code review Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> --------- Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>	2024-10-30 15:03:41 +00:00
Julian Risch	08686d90af	feat: Add DocumentNDCGEvaluator component (#8419 ) * draft new component and tests * draft new component and tests * fix tests, replace usage of get_attr * improve docstrings, refactor tests * add test for mixed documents w/wo scores * add test with multiple lists and update docstring * validate inputs, add tests, make methods static * change fallback to binary relevance * rename validate_init_parameters to validate_inputs	2024-10-01 16:15:02 +02:00
Silvano Cerza	29672d4b42	feat: Add `JSONConverter` Component (#8397 ) * Add JSONConverter Component * Handle some corner cases * Add JSONConverter to pydoc config * Add a way to extract all non content fields as metadata * Small fix in docstring * Fix tests * docstrings upd * Update json.py --------- Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>	2024-09-25 12:34:51 +02:00
Daria Fokina	caf465b004	docs: add NLTKSplitter and ZeroShotClassifier to pydocs (#8384 ) * Update preprocessors_api.yml * Update classifiers_api.yml	2024-09-18 15:55:40 +02:00
Sriniketh J	e98a6fea04	Convertor: CSVToDocument (#8328 ) * carry forwarded initial commit * fix: doc strings * fix: update docstrings * fix: docstring update * fix: csv encoding in actions * fix: line endings through hooks * fix: converter docs addition	2024-09-06 10:59:12 +02:00
Stefano Fiorucci	842a7b80a8	rm sentence_window_retrieval (#8303 )	2024-08-28 10:51:07 +02:00
Amna Mubashar	373de97426	Deprecate SentenceWindowRetrieval (#8206 )	2024-08-13 13:49:41 +02:00
Vladimir Blagojevic	25d3520f5a	feat: Add `AnswerJoiner` new component (#8122 ) * Initial AnswerJoiner * Initial tests * Add release note * Resove mypy warning * Add custom join function * Serialize custom join function * Handle all Answer types, add integration test, improve pydoc * Make fixes * Add to API docs * Add more tests * Update haystack/components/joiners/answer_joiner.py Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com> * Update docstrings and release notes * update docstrings --------- Co-authored-by: Sebastian Husch Lee <sjrl423@gmail.com> Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com> Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com> Co-authored-by: Darja Fokina <daria.fokina@deepset.ai>	2024-08-01 12:51:17 +02:00
Amna Mubashar	e0de423ee0	Rename SentenceWindowRetrieval to SentenceWindowRetriever	2024-07-26 17:46:44 +02:00
Madeesh Kannan	b2aef217da	chore: Remove deprecated `DynamicPromptBuilder` and `DynamicChatPromptBuilder` components (#8085 )	2024-07-26 10:00:59 +02:00
Daria Fokina	913078dfaa	docs: add sentence window retrieval to api reference (#8032 ) * docs: add sentence window retrieval to api reference * deprecating multiplexer	2024-07-17 11:16:58 +02:00
Stefano Fiorucci	c59ad95f42	chore: remove deprecated TGI generators (#7908 ) * remove deprecated TGI generators * rm unused import	2024-06-21 11:15:13 +02:00
Stefano Fiorucci	75ad76a7ce	chore: remove deprecated TEI embedders (#7907 ) * remove deprecated TEI embedders * rm from the embedders init * rm related tests	2024-06-21 10:36:12 +02:00
Massimiliano Pippi	7c31d5f418	add docstrings for EvaluationRunResult (#7885 )	2024-06-19 11:49:41 +02:00
Carlos Fernández	c1c339923f	feat: add DocxToDocument converter (#7838 ) * first fucntioning DocxFileToDocument * fix lazy import message * add reno * Add license headder Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com> * change DocxFileToDocument to DocxToDocument * Update library install to the maintained version Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com> * clan try-exvept to only take non haystack errors into account * Add wanring on docstring of component ignoring page brakes, mark test as skip * make warnings lazy evaluations Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com> * make warnings lazy evaluations Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com> * Make warnings lazy evaluated Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com> * Solve f bug * Get more metadata from docx files * add 'python-docx' dependency and docs * Change logging import Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com> * Fix typo Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com> * remake metadata extraction for docx * solve bug regarding _get_docx_metadata method * Update haystack/components/converters/docx.py Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com> * Update haystack/components/converters/docx.py Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com> * Delete unused test --------- Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>	2024-06-12 11:58:36 +02:00
Sebastian Husch Lee	2c2c7c9f56	feat: Add PPTXToDocument converter (#7808 ) * Add first pass at PPTXToDocument converter * Add test and update code * Add doc string * Update docstrings * Add release notes * remove unused imports, add to api docs, update pyproject.toml * Add a new test * Add dep so tests can run	2024-06-07 09:43:29 +00:00
Sebastian Husch Lee	d815c78198	feat: Add `TransformersTextRouter` component (#7801 ) * First pass at adding TransformerTextRouter * Fix tests * Add release notes * Add optional labels param * Add verification in the warm_up * Fix tests * Add labels to to_dict * Feedback from review * Add component to docs * Added extra tests	2024-06-05 15:28:53 +02:00
Stefano Fiorucci	55a657ba81	export ChatPromptBuilder and add it to pydoc config (#7796 )	2024-06-04 10:17:23 +02:00
Massimiliano Pippi	8d80ff86d9	Add BranchJoiner and deprecate Multiplexer (#7765 )	2024-05-30 15:34:52 +02:00
Daria Fokina	cc869b10ad	add pdfminer (#7688 )	2024-05-14 13:42:29 +02:00
Bilge Yücel	f14bc5330f	Add "SentenceTransformersDiversityRanker" api reference (#7659 )	2024-05-07 19:16:05 +02:00
Stefano Fiorucci	704293d491	add pydoc config for evaluation (#7602 )	2024-04-26 12:30:21 +02:00

152 Commits