Vladimir Blagojevic
c81d68402c
feat: Add Toolset to tooling architecture ( #9161 )
...
* Add Toolset abstraction
* Add reno note
* More pydoc improvements
* Update test
* Simplify, Toolset is a dataclass
* Wrap toolset instance with list
* Add example
* Toolset pydoc serde enhancement
* Toolset as init param
* Fix types
* Linting
* Minor updates
* PR feedback
* Add to pydoc config, minor import fixes
* Improve pydoc example
* Improve coverage for test_toolset.py
* Improve test_toolset.py, test custom toolset serde properly
* Update haystack/utils/misc.py
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
* Rework Toolset pydoc
* Another minor pydoc improvement
* Prevent single Tool instantiating Toolset
* Reduce number of integration tests
* Remove some toolset tests from openai
* Rework tests
---------
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2025-04-04 16:09:46 +02:00
Julian Risch
e483ec6f56
feat: integrate Agent from haystack-experimental ( #9112 )
...
* add Agent
* add Agent
* update imports
* add state tests
* reno
* remove State, its utils, and tests
* add pydoc yml for agents
* fix module path in serialization test
* fix mypy error and use ChatGenerator protocol
* remove unused import
* address review feedback
* remove unused _load_component
2025-03-28 14:23:39 +01:00
Julian Risch
657d09d7f1
feat: integrate updates of Tool, ToolInvoker, State, create_tool_from_function, ComponentTool from haystack-experimental ( #9113 )
...
* update Tool,ToolInvoker,ComponentTool,create_tool_from_function
* add State and its utils
* add tests for State and its utils
* update tests for Tool etc.
* reno
* fix circular imports
* update experimental imports in tests
* fix unit tests
* fix ChatGenerator unit tests
* mypy
* add State to init and pydoc
* explain State in more detail in release note
* add test from #8913
* re-add _check_duplicate_tool_names and refactor imports
* rename inputs and outputs
2025-03-28 10:49:23 +01:00
David S. Batista
be2d1fb303
feat: adding AutoMergingRetriever and HierarchicalDocumentSplitter ( #9067 )
...
* adding Auto-Merging-Retriever
* adding release notes
* updating tests
* adding renamed file
* Update haystack/components/preprocessors/hierarchical_document_splitter.py
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
* Update haystack/components/retrievers/auto_merging_retriever.py
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
* fixing tests and imports
* adding pydoc
* adding to type checking
---------
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2025-03-19 18:25:23 +00:00
Sebastian Husch Lee
99a998f90b
feat: Add MSGToDocument converter ( #8868 )
...
* Initial commit of MSG converter from Bijay
* Updates to the MSG converter
* Add license header
* Add tests for msg converter
* Update converter
* Expanding tests
* Update docstrings
* add license header
* Add reno
* Add to inits and pydocs
* Add test for empty input
* Fix types
* Fix mypy
---------
Co-authored-by: Bijay Gurung <bijay.learning@gmail.com>
2025-02-24 08:12:32 +01:00
Stefano Fiorucci
0409e5da8f
remove base from evaluation pydoc config ( #8867 )
2025-02-17 15:19:40 +01:00
David S. Batista
cee52435bf
adding to pydocs ( #8846 )
2025-02-12 12:04:50 +01:00
Sebastian Husch Lee
f9e6e481a1
feat: Add new component CSVDocumentSplitter to recursively split CSV documents ( #8815 )
...
* CSV Document Splitter
* Add license header
* Add newline
* Add to docs
* Add lineterminator
* Updated csv splitter to allow user to specify to split by row, column or both
* Adding more tests
* Column tests
* Some refactoring to remove incorrect dropna call
* Fix
* More complicated test
* Adding more relevant metadata to match what's provided in our other splitters
* value error tests
* Fix mypy
* Docstring updates
* Add skip_blank_lines=False
* Add to dict test
* More from and to dict tests
* Fixes
* Move dict creation outside of for loop
2025-02-10 18:10:18 +01:00
David S. Batista
f798a9e935
feat: adding LLMMetadataExtractor ( #8833 )
...
* fixing linting
* adding release notes
* updating tests
* adding to pydocs
* fixing typing due to Optional
* fixing docstring
2025-02-10 16:54:25 +00:00
Vladimir Blagojevic
fd5040108a
feat: Add OpenAPIConnector component, improve OpenAPI integration ( #8808 )
...
* Initial OpenAPIConnector
* Add reno note
* Format
* Add headers
* Add test dep
* Use haystack logger
* Fix test
* Minor fix, spin CI
* Update reno release note format
* Add to docs, pydocs improvements
2025-02-10 10:34:37 +01:00
Sebastian Husch Lee
1785ea622e
feat: Add component CSVDocumentCleaner for removing empty rows and columns ( #8816 )
...
* Initial commit for csv cleaner
* Add release notes
* Update lineterminator
* Update releasenotes/notes/csv-document-cleaner-8eca67e884684c56.yaml
Co-authored-by: David S. Batista <dsbatista@gmail.com>
* alphabetize
* Use lazy import
* Some refactoring
* Some refactoring
---------
Co-authored-by: David S. Batista <dsbatista@gmail.com>
2025-02-06 17:56:38 +01:00
Stefano Fiorucci
05300490a6
docs: add ListJoiner to pydoc configuration ( #8821 )
...
* docs: add ListJoiner to pydoc configuration
* Update docs/pydoc/config/joiners_api.yml
Co-authored-by: David S. Batista <dsbatista@gmail.com>
---------
Co-authored-by: David S. Batista <dsbatista@gmail.com>
2025-02-06 08:52:24 +00:00
David S. Batista
26b80778f5
chore: removing NLTKDocumentSplitter ( #8724 )
...
* removing NLTKDocumentSplitter
* adding release notes
* removing pydocs reference
2025-01-15 16:11:51 +00:00
David S. Batista
ec8666545d
docs: adding RecursiveSplitter to pydoc
2025-01-13 11:46:34 +01:00
Vladimir Blagojevic
d147c7658f
feat: Add ComponentTool to Haystack tools ( #8693 )
...
* Initial ComponentTool
---------
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2025-01-13 11:15:33 +01:00
Stefano Fiorucci
08cf09f83f
refactor: create_tool_from_function + tool decorator ( #8697 )
...
* create_tool_from_function + decorator
* release note
* improve usage example
* add imports to @tool usage example
* clarify docstrings
* small docstring addition
2025-01-10 12:15:15 +01:00
Stefano Fiorucci
3f15f38c51
refactor: move Tool to a separate package; refactor serde ( #8690 )
...
* move tool to separate package; refactor serde
* release note
* rm unused import
2025-01-09 12:30:13 +01:00
Sebastian Husch Lee
28ad78c73d
feat: Add XLSXToDocument converter ( #8522 )
...
* Add draft of the Excel To Document converter
* Add license header
* Add release note
* Use Union instead of pipe
* Add openpyxl as additional dep
* Fix zip issue
* few updates from Bijay
* Update deps
* Add markdown test
* Adding more example excels and expanding tests
* Added more tests
* Fix windows test by setting lineterminator
* Addressing PR comments
* PR comments
* Fix linting
2025-01-09 09:03:19 +01:00
Stefano Fiorucci
7dcbf25bd7
feat: add Tool Invoker component ( #8664 )
...
* port toolinvoker
* release note
2024-12-20 14:02:42 +01:00
Stefano Fiorucci
96b4a1d2fd
feat: Tool dataclass - unified abstraction to represent tools ( #8652 )
...
* draft
* del HF token in tests
* adaptations
* progress
* fix type
* import sorting
* more control on deserialization
* release note
* improvements
* support name field
* fix chatpromptbuilder test
* port Tool from experimental
* release note
* docs upd
* Update tool.py
---------
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2024-12-18 11:36:44 +00:00
Sebastian Husch Lee
e45d3329a1
feat: Adding DALLE image generator ( #8448 )
...
* First pass at adding DALLE image generator
* Add missing header
* Fix tests
* Add tests
* Fix mypy
* Make mypy happy
* More unit tests
* Adding release notes
* Add a test for run
* Update haystack/components/generators/openai_dalle.py
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* Fix pylint
* Update haystack/components/generators/openai_dalle.py
Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
* Update haystack/components/generators/openai_dalle.py
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
* Update haystack/components/generators/openai_dalle.py
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
* Update haystack/components/generators/openai_dalle.py
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
* Update haystack/components/generators/openai_dalle.py
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
* Update haystack/components/generators/openai_dalle.py
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
---------
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2024-11-14 16:19:49 +01:00
David S. Batista
e5a80722c2
feat: adding metadata grouper component ( #8512 )
...
* initial import
* making tests more readable; adding docstring
* adding release notes
* adding LICENSE header
* Update test/components/rankers/test_metadata_grouper.py
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
* refactoring
* fixing docstring
* fixing types
* test docstrings
* renaming test
* handling too-many-arguments
* linting
* Update haystack/components/rankers/metadata_grouper.py
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
* changing name
* Update haystack/components/rankers/metadata_grouper.py
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
* Update haystack/components/rankers/metadata_grouper.py
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
* assigning value inside function for re-use
* improving docstring
* updating name to MetaFieldGroupingRanker
* adding to pydocs
* fixing imports
* adding output docstring
* Update haystack/components/rankers/meta_field_grouper_ranker.py
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
* Update haystack/components/rankers/__init__.py
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
* Update releasenotes/notes/add-metadata-grouper-21ec05fd4a307425.yaml
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
* Update test/components/rankers/test_metadata_grouper.py
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
* update docstring tests
* fixing imports
* rename modules for consistency
* fix pydocs
* simplification + more tests
---------
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2024-11-12 16:01:53 +01:00
Sebastian Husch Lee
294a67e426
feat: Adding StringJoiner ( #8357 )
...
* Adding StringJoiner
* Release notes
* Remove typing
* Remove unused import
* Try to fix header
* Fix one test
* Add to docs, move test to behavioral pipeline test
* Undo changes
* Fix test
* Update haystack/components/joiners/string_joiner.py
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
* Update haystack/components/joiners/string_joiner.py
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
* Provide usage example
* Apply suggestions from code review
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
---------
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2024-10-30 15:03:41 +00:00
Julian Risch
08686d90af
feat: Add DocumentNDCGEvaluator component ( #8419 )
...
* draft new component and tests
* draft new component and tests
* fix tests, replace usage of get_attr
* improve docstrings, refactor tests
* add test for mixed documents w/wo scores
* add test with multiple lists and update docstring
* validate inputs, add tests, make methods static
* change fallback to binary relevance
* rename validate_init_parameters to validate_inputs
2024-10-01 16:15:02 +02:00
Silvano Cerza
29672d4b42
feat: Add JSONConverter Component ( #8397 )
...
* Add JSONConverter Component
* Handle some corner cases
* Add JSONConverter to pydoc config
* Add a way to extract all non content fields as metadata
* Small fix in docstring
* Fix tests
* docstrings upd
* Update json.py
---------
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2024-09-25 12:34:51 +02:00
Daria Fokina
caf465b004
docs: add NLTKSplitter and ZeroShotClassifier to pydocs ( #8384 )
...
* Update preprocessors_api.yml
* Update classifiers_api.yml
2024-09-18 15:55:40 +02:00
Sriniketh J
e98a6fea04
Converter: CSVToDocument ( #8328 )
...
* carry forwarded initial commit
* fix: doc strings
* fix: update docstrings
* fix: docstring update
* fix: csv encoding in actions
* fix: line endings through hooks
* fix: converter docs addition
2024-09-06 10:59:12 +02:00
Stefano Fiorucci
842a7b80a8
rm sentence_window_retrieval ( #8303 )
2024-08-28 10:51:07 +02:00
Amna Mubashar
373de97426
Deprecate SentenceWindowRetrieval ( #8206 )
2024-08-13 13:49:41 +02:00
Vladimir Blagojevic
25d3520f5a
feat: Add AnswerJoiner new component ( #8122 )
...
* Initial AnswerJoiner
* Initial tests
* Add release note
* Resolve mypy warning
* Add custom join function
* Serialize custom join function
* Handle all Answer types, add integration test, improve pydoc
* Make fixes
* Add to API docs
* Add more tests
* Update haystack/components/joiners/answer_joiner.py
Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
* Update docstrings and release notes
* update docstrings
---------
Co-authored-by: Sebastian Husch Lee <sjrl423@gmail.com>
Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
Co-authored-by: Darja Fokina <daria.fokina@deepset.ai>
2024-08-01 12:51:17 +02:00
Amna Mubashar
e0de423ee0
Rename SentenceWindowRetrieval to SentenceWindowRetriever
2024-07-26 17:46:44 +02:00
Madeesh Kannan
b2aef217da
chore: Remove deprecated DynamicPromptBuilder and DynamicChatPromptBuilder components ( #8085 )
2024-07-26 10:00:59 +02:00
Daria Fokina
913078dfaa
docs: add sentence window retrieval to api reference ( #8032 )
...
* docs: add sentence window retrieval to api reference
* deprecating multiplexer
2024-07-17 11:16:58 +02:00
Stefano Fiorucci
c59ad95f42
chore: remove deprecated TGI generators ( #7908 )
...
* remove deprecated TGI generators
* rm unused import
2024-06-21 11:15:13 +02:00
Stefano Fiorucci
75ad76a7ce
chore: remove deprecated TEI embedders ( #7907 )
...
* remove deprecated TEI embedders
* rm from the embedders init
* rm related tests
2024-06-21 10:36:12 +02:00
Massimiliano Pippi
7c31d5f418
add docstrings for EvaluationRunResult ( #7885 )
2024-06-19 11:49:41 +02:00
Carlos Fernández
c1c339923f
feat: add DocxToDocument converter ( #7838 )
...
* first functioning DocxFileToDocument
* fix lazy import message
* add reno
* Add license header
Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
* change DocxFileToDocument to DocxToDocument
* Update library install to the maintained version
Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
* clean try-except to only take non-Haystack errors into account
* Add warning in component docstring about ignoring page breaks, mark test as skip
* make warnings lazy evaluations
Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
* make warnings lazy evaluations
Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
* Make warnings lazy evaluated
Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
* Solve f bug
* Get more metadata from docx files
* add 'python-docx' dependency and docs
* Change logging import
Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
* Fix typo
Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
* remake metadata extraction for docx
* solve bug regarding _get_docx_metadata method
* Update haystack/components/converters/docx.py
Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
* Update haystack/components/converters/docx.py
Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
* Delete unused test
---------
Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
2024-06-12 11:58:36 +02:00
Sebastian Husch Lee
2c2c7c9f56
feat: Add PPTXToDocument converter ( #7808 )
...
* Add first pass at PPTXToDocument converter
* Add test and update code
* Add doc string
* Update docstrings
* Add release notes
* remove unused imports, add to api docs, update pyproject.toml
* Add a new test
* Add dep so tests can run
2024-06-07 09:43:29 +00:00
Sebastian Husch Lee
d815c78198
feat: Add TransformersTextRouter component ( #7801 )
...
* First pass at adding TransformerTextRouter
* Fix tests
* Add release notes
* Add optional labels param
* Add verification in the warm_up
* Fix tests
* Add labels to to_dict
* Feedback from review
* Add component to docs
* Added extra tests
2024-06-05 15:28:53 +02:00
Stefano Fiorucci
55a657ba81
export ChatPromptBuilder and add it to pydoc config ( #7796 )
2024-06-04 10:17:23 +02:00
Massimiliano Pippi
8d80ff86d9
Add BranchJoiner and deprecate Multiplexer ( #7765 )
2024-05-30 15:34:52 +02:00
Daria Fokina
cc869b10ad
add pdfminer ( #7688 )
2024-05-14 13:42:29 +02:00
Bilge Yücel
f14bc5330f
Add "SentenceTransformersDiversityRanker" api reference ( #7659 )
2024-05-07 19:16:05 +02:00
Stefano Fiorucci
704293d491
add pydoc config for evaluation ( #7602 )
2024-04-26 12:30:21 +02:00
Julian Risch
b12e0db134
feat: Add ContextRelevanceEvaluator component ( #7519 )
...
* feat: Add ContextRelevanceEvaluator component
* reno
* fix expected inputs and example docstring
* remove responses parameter from tests
* specify inputs explicitly
* add new evaluator to api reference docs
2024-04-22 14:10:00 +02:00
Daria Fokina
a5f6571cfb
docs: add evaluators component reference ( #7532 )
2024-04-12 12:51:39 +02:00
Stefano Fiorucci
eff53a9131
feat: HuggingFaceAPIDocumentEmbedder ( #7485 )
...
* add HuggingFaceAPITextEmbedder
* add HuggingFaceAPITextEmbedder
* rm unneeded else
* wip
* small fixes
* deprecation; reno
* Apply suggestions from code review
Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
* make params mandatory
* changes requested
* fix test
* fix test
---------
Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
2024-04-08 15:06:26 +02:00
Stefano Fiorucci
c91bd49cae
feat: HuggingFaceAPITextEmbedder ( #7484 )
...
* add HuggingFaceAPITextEmbedder
* add HuggingFaceAPITextEmbedder
* rm unneeded else
* small fixes
* changes requested
* fix test
2024-04-08 14:22:54 +02:00
Stefano Fiorucci
0dbb98c0a0
feat: HuggingFaceAPIChatGenerator ( #7480 )
...
* draft
* docstrings and more tests
* deprecation; reno
* pydoc config
* better error messages
* wip
* add test
* better docstrings
* deprecation; reno
* pylint
* typo
* rm unneeded else
* rm unneeded else
* fixes from feedback
* docstring showing the enum
* improve docstring
* make params mandatory
* Apply suggestions from code review
Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
* document enum
* Update haystack/utils/hf.py
Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
* mandatory params
* fix test
* fix test
---------
Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
2024-04-05 18:48:34 +02:00
Stefano Fiorucci
1d083861ff
feat: HuggingFaceAPIGenerator ( #7464 )
...
* draft
* docstrings and more tests
* deprecation; reno
* pydoc config
* better error messages
* rm unneeded else
* make params mandatory
* Apply suggestions from code review
Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
* document enum
* Update haystack/utils/hf.py
Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
* fix test
---------
Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
2024-04-05 18:48:13 +02:00