haystack

mirror of https://github.com/deepset-ai/haystack.git synced 2025-11-06 21:05:33 +00:00

Author	SHA1	Message	Date
Stefano Fiorucci	0409e5da8f	remove base from evaluation pydoc config (#8867 )	2025-02-17 15:19:40 +01:00
David S. Batista	cee52435bf	adding to pydocs (#8846 )	2025-02-12 12:04:50 +01:00
Sebastian Husch Lee	f9e6e481a1	feat: Add new component CSVDocumentSplitter to recursively split CSV documents (#8815 ) * CSV Document Splitter * Add license header * Add newline * Add to docs * Add lineterminator * Updated csv splitter to allow user to specify to split by row, column or both * Adding more tests * Column tests * Some refactoring to remove incorrect dropna call * Fix * More complicated test * Adding more relevant metadata to match whats provided in our other splitters * value error tests * Fix mypy * Docstring updates * Add skip_blank_lines=False * Add to dict test * More from and to dict tests * Fixes * Move dict creation outside of for loop	2025-02-10 18:10:18 +01:00
David S. Batista	f798a9e935	feat: adding `LLMMetadataExtractor` (#8833 ) * fixing linting * adding release notes * updating tests * adding to pydocs * fixing typing due to Optional * fixing docstring	2025-02-10 16:54:25 +00:00
Vladimir Blagojevic	fd5040108a	feat: Add OpenAPIConnector component, improve OpenAPI integration (#8808 ) * Initial OpenAPIConnector * Add reno note * Format * Add headers * Add test dep * Use haystack logger * Fix test * Minor fix, spin CI * Update reno release note format * Add to docs, pydocs improvements	2025-02-10 10:34:37 +01:00
Sebastian Husch Lee	1785ea622e	feat: Add component CSVDocumentCleaner for removing empty rows and columns (#8816 ) * Initial commit for csv cleaner * Add release notes * Update lineterminator * Update releasenotes/notes/csv-document-cleaner-8eca67e884684c56.yaml Co-authored-by: David S. Batista <dsbatista@gmail.com> * alphabetize * Use lazy import * Some refactoring * Some refactoring --------- Co-authored-by: David S. Batista <dsbatista@gmail.com>	2025-02-06 17:56:38 +01:00
Stefano Fiorucci	05300490a6	docs: add `ListJoiner` to pydoc configuration (#8821 ) * docs: add ListJoiner to pydoc configuration * Update docs/pydoc/config/joiners_api.yml Co-authored-by: David S. Batista <dsbatista@gmail.com> --------- Co-authored-by: David S. Batista <dsbatista@gmail.com>	2025-02-06 08:52:24 +00:00
David S. Batista	26b80778f5	chore: removing NLTKDocumentSplitter (#8724 ) * removing NLTKDocumentSplitter * adding release notes * removing pydocs reference	2025-01-15 16:11:51 +00:00
David S. Batista	ec8666545d	docs: adding RecursiveSplitter to pydoc	2025-01-13 11:46:34 +01:00
Vladimir Blagojevic	d147c7658f	feat: Add `ComponentTool` to Haystack tools (#8693 ) * Initial ComponentTool --------- Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> Co-authored-by: Julian Risch <julian.risch@deepset.ai>	2025-01-13 11:15:33 +01:00
Stefano Fiorucci	08cf09f83f	refactor: `create_tool_from_function` + `tool` decorator (#8697 ) * create_tool_from_function + decorator * release note * improve usage example * add imports to @tool usage example * clarify docstrings * small docstring addition	2025-01-10 12:15:15 +01:00
Stefano Fiorucci	3f15f38c51	refactor: move `Tool` to a separate package; refactor serde (#8690 ) * move tool to separate package; refactor serde * release note * rm unused import	2025-01-09 12:30:13 +01:00
Sebastian Husch Lee	28ad78c73d	feat: Add XLSXToDocument converter (#8522 ) * Add draft of the Excel To Document converter * Add license header * Add release note * Use Union instead of pipe * Add openpyxl as additional dep * Fix zip issue * few updates from Bijay * Update deps * Add markdown test * Adding more example excels and expanding tests * Added more tests * Fix windows test by setting lineterminator * Addressing PR comments * PR comments * Fix linting	2025-01-09 09:03:19 +01:00
Stefano Fiorucci	7dcbf25bd7	feat: add Tool Invoker component (#8664 ) * port toolinvoker * release note	2024-12-20 14:02:42 +01:00
Stefano Fiorucci	96b4a1d2fd	feat: `Tool` dataclass - unified abstraction to represent tools (#8652 ) * draft * del HF token in tests * adaptations * progress * fix type * import sorting * more control on deserialization * release note * improvements * support name field * fix chatpromptbuilder test * port Tool from experimental * release note * docs upd * Update tool.py --------- Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>	2024-12-18 11:36:44 +00:00
Sebastian Husch Lee	e45d3329a1	feat: Adding DALLE image generator (#8448 ) * First pass at adding DALLE image generator * Add missing header * Fix tests * Add tests * Fix mypy * Make mypy happy * More unit tests * Adding release notes * Add a test for run * Update haystack/components/generators/openai_dalle.py Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com> * Fix pylint * Update haystack/components/generators/openai_dalle.py Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com> * Update haystack/components/generators/openai_dalle.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * Update haystack/components/generators/openai_dalle.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * Update haystack/components/generators/openai_dalle.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * Update haystack/components/generators/openai_dalle.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * Update haystack/components/generators/openai_dalle.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> --------- Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com> Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com> Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>	2024-11-14 16:19:49 +01:00
David S. Batista	e5a80722c2	feat: adding metadata grouper component (#8512 ) * initial import * making tests more readable; adding docstring * adding release notes * adding LICENSE header * Update test/components/rankers/test_metadata_grouper.py Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * refactoring * fixing docstring * fixing types * test docstrings * renaming test * handling too-many-arguments * liting * Update haystack/components/rankers/metadata_grouper.py Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * changing name * Update haystack/components/rankers/metadata_grouper.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * Update haystack/components/rankers/metadata_grouper.py Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * assiging value inside function for re-use * improving docstring * updating name to MetaFieldGroupingRanker * adding to pydocs * fixing imports * adding output docstring * Update haystack/components/rankers/meta_field_grouper_ranker.py Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * Update haystack/components/rankers/__init__.py Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * Update releasenotes/notes/add-metadata-grouper-21ec05fd4a307425.yaml Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * Update test/components/rankers/test_metadata_grouper.py Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * update docstring tests * fixing imports * rename modules for consistency * fix pydocs * simplification + more tests --------- Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>	2024-11-12 16:01:53 +01:00
Sebastian Husch Lee	294a67e426	feat: Adding StringJoiner (#8357 ) * Adding StringJoiner * Release notes * Remove typing * Remove unused import * Try to fix header * Fix one test * Add to docs, move test to behavioral pipeline test * Undo changes * Fix test * Update haystack/components/joiners/string_joiner.py Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * Update haystack/components/joiners/string_joiner.py Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> * Provide usage example * Apply suggestions from code review Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> --------- Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>	2024-10-30 15:03:41 +00:00
Julian Risch	08686d90af	feat: Add DocumentNDCGEvaluator component (#8419 ) * draft new component and tests * draft new component and tests * fix tests, replace usage of get_attr * improve docstrings, refactor tests * add test for mixed documents w/wo scores * add test with multiple lists and update docstring * validate inputs, add tests, make methods static * change fallback to binary relevance * rename validate_init_parameters to validate_inputs	2024-10-01 16:15:02 +02:00
Silvano Cerza	29672d4b42	feat: Add `JSONConverter` Component (#8397 ) * Add JSONConverter Component * Handle some corner cases * Add JSONConverter to pydoc config * Add a way to extract all non content fields as metadata * Small fix in docstring * Fix tests * docstrings upd * Update json.py --------- Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>	2024-09-25 12:34:51 +02:00
Daria Fokina	caf465b004	docs: add NLTKSplitter and ZeroShotClassifier to pydocs (#8384 ) * Update preprocessors_api.yml * Update classifiers_api.yml	2024-09-18 15:55:40 +02:00
Sriniketh J	e98a6fea04	Convertor: CSVToDocument (#8328 ) * carry forwarded initial commit * fix: doc strings * fix: update docstrings * fix: docstring update * fix: csv encoding in actions * fix: line endings through hooks * fix: converter docs addition	2024-09-06 10:59:12 +02:00
Stefano Fiorucci	842a7b80a8	rm sentence_window_retrieval (#8303 )	2024-08-28 10:51:07 +02:00
Amna Mubashar	373de97426	Deprecate SentenceWindowRetrieval (#8206 )	2024-08-13 13:49:41 +02:00
Vladimir Blagojevic	25d3520f5a	feat: Add `AnswerJoiner` new component (#8122 ) * Initial AnswerJoiner * Initial tests * Add release note * Resove mypy warning * Add custom join function * Serialize custom join function * Handle all Answer types, add integration test, improve pydoc * Make fixes * Add to API docs * Add more tests * Update haystack/components/joiners/answer_joiner.py Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com> * Update docstrings and release notes * update docstrings --------- Co-authored-by: Sebastian Husch Lee <sjrl423@gmail.com> Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com> Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com> Co-authored-by: Darja Fokina <daria.fokina@deepset.ai>	2024-08-01 12:51:17 +02:00
Amna Mubashar	e0de423ee0	Rename SentenceWindowRetrieval to SentenceWindowRetriever	2024-07-26 17:46:44 +02:00
Madeesh Kannan	b2aef217da	chore: Remove deprecated `DynamicPromptBuilder` and `DynamicChatPromptBuilder` components (#8085 )	2024-07-26 10:00:59 +02:00
Daria Fokina	913078dfaa	docs: add sentence window retrieval to api reference (#8032 ) * docs: add sentence window retrieval to api reference * deprecating multiplexer	2024-07-17 11:16:58 +02:00
Stefano Fiorucci	c59ad95f42	chore: remove deprecated TGI generators (#7908 ) * remove deprecated TGI generators * rm unused import	2024-06-21 11:15:13 +02:00
Stefano Fiorucci	75ad76a7ce	chore: remove deprecated TEI embedders (#7907 ) * remove deprecated TEI embedders * rm from the embedders init * rm related tests	2024-06-21 10:36:12 +02:00
Massimiliano Pippi	7c31d5f418	add docstrings for EvaluationRunResult (#7885 )	2024-06-19 11:49:41 +02:00
Carlos Fernández	c1c339923f	feat: add DocxToDocument converter (#7838 ) * first fucntioning DocxFileToDocument * fix lazy import message * add reno * Add license headder Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com> * change DocxFileToDocument to DocxToDocument * Update library install to the maintained version Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com> * clan try-exvept to only take non haystack errors into account * Add wanring on docstring of component ignoring page brakes, mark test as skip * make warnings lazy evaluations Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com> * make warnings lazy evaluations Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com> * Make warnings lazy evaluated Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com> * Solve f bug * Get more metadata from docx files * add 'python-docx' dependency and docs * Change logging import Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com> * Fix typo Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com> * remake metadata extraction for docx * solve bug regarding _get_docx_metadata method * Update haystack/components/converters/docx.py Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com> * Update haystack/components/converters/docx.py Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com> * Delete unused test --------- Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>	2024-06-12 11:58:36 +02:00
Sebastian Husch Lee	2c2c7c9f56	feat: Add PPTXToDocument converter (#7808 ) * Add first pass at PPTXToDocument converter * Add test and update code * Add doc string * Update docstrings * Add release notes * remove unused imports, add to api docs, update pyproject.toml * Add a new test * Add dep so tests can run	2024-06-07 09:43:29 +00:00
Sebastian Husch Lee	d815c78198	feat: Add `TransformersTextRouter` component (#7801 ) * First pass at adding TransformerTextRouter * Fix tests * Add release notes * Add optional labels param * Add verification in the warm_up * Fix tests * Add labels to to_dict * Feedback from review * Add component to docs * Added extra tests	2024-06-05 15:28:53 +02:00
Stefano Fiorucci	55a657ba81	export ChatPromptBuilder and add it to pydoc config (#7796 )	2024-06-04 10:17:23 +02:00
Massimiliano Pippi	8d80ff86d9	Add BranchJoiner and deprecate Multiplexer (#7765 )	2024-05-30 15:34:52 +02:00
Daria Fokina	cc869b10ad	add pdfminer (#7688 )	2024-05-14 13:42:29 +02:00
Bilge Yücel	f14bc5330f	Add "SentenceTransformersDiversityRanker" api reference (#7659 )	2024-05-07 19:16:05 +02:00
Stefano Fiorucci	704293d491	add pydoc config for evaluation (#7602 )	2024-04-26 12:30:21 +02:00
Julian Risch	b12e0db134	feat: Add ContextRelevanceEvaluator component (#7519 ) * feat: Add ContextRelevanceEvaluator component * reno * fix expected inputs and example docstring * remove responses parameter from tests * specify inputs explicitly * add new evaluator to api reference docs	2024-04-22 14:10:00 +02:00
Daria Fokina	a5f6571cfb	docs: add evaluators component reference (#7532 )	2024-04-12 12:51:39 +02:00
Stefano Fiorucci	eff53a9131	feat: `HuggingFaceAPIDocumentEmbedder` (#7485 ) * add HuggingFaceAPITextEmbedder * add HuggingFaceAPITextEmbedder * rm unneeded else * wip * small fixes * deprecation; reno * Apply suggestions from code review Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com> * make params mandatory * changes requested * fix test * fix test --------- Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>	2024-04-08 15:06:26 +02:00
Stefano Fiorucci	c91bd49cae	feat: `HuggingFaceAPITextEmbedder` (#7484 ) * add HuggingFaceAPITextEmbedder * add HuggingFaceAPITextEmbedder * rm unneeded else * small fixes * changes requested * fix test	2024-04-08 14:22:54 +02:00
Stefano Fiorucci	0dbb98c0a0	feat: `HuggingFaceAPIChatGenerator` (#7480 ) * draft * docstrings and more tests * deprecation; reno * pydoc config * better error messages * wip * add test * better docstrings * deprecation; reno * pylint * typo * rm unneeded else * rm unneeded else * fixes from feedback * docstring showing the enum * improve docstring * make params mandatory * Apply suggestions from code review Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com> * document enum * Update haystack/utils/hf.py Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com> * mandatory params * fix test * fix test --------- Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>	2024-04-05 18:48:34 +02:00
Stefano Fiorucci	1d083861ff	feat: `HuggingFaceAPIGenerator` (#7464 ) * draft * docstrings and more tests * deprecation; reno * pydoc config * better error messages * rm unneeded else * make params mandatory * Apply suggestions from code review Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com> * document enum * Update haystack/utils/hf.py Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com> * fix test --------- Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>	2024-04-05 18:48:13 +02:00
Vladimir Blagojevic	eb7974e78f	Add TransformersZeroShotTextRouter to docs (#7433 )	2024-03-27 13:20:34 +01:00
Stefano Fiorucci	dbfd351da7	feat: introduce `SparseEmbedding` (#7382 ) * introduce SparseEmbedding * reno * add to pydoc config	2024-03-19 18:04:16 +01:00
Silvano Cerza	2a83eccf99	Update docs renderer (#7349 )	2024-03-13 12:30:13 +01:00
Tobias Wochinger	a3a21947a4	docs: disable class def rendering (#7329 )	2024-03-07 15:54:16 +01:00
Madeesh Kannan	0db95fb7bd	docs: `haystack.utils` docfixes (#7318 )	2024-03-06 16:11:17 +01:00

141 Commits