1766 Commits

Author SHA1 Message Date
Jan Trienes
83b087caf4
feat: add local_files_only to sentence-transformers embedders (#9400)
* feat: add local_files_only to sentence-transformers embedders

* add release note

* Fix wording

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2025-05-19 16:11:49 +00:00
Sebastian Husch Lee
707573d967
feat: Streamline using Agent as a ComponentTool (#9388)
* Make agent as a tool more streamlined

* Add reno

* fix mypy
2025-05-16 13:11:43 +02:00
Sebastian Husch Lee
af073852d0
feat: Add usage when using HuggingFaceAPIChatGenerator with streaming (#9371)
* Small fix and update tests

* Add usage support to streaming for HuggingFaceAPIChatGenerator

* Add reno

* try using provider='auto'

* Undo provider

* Fix unit tests

* Update releasenotes/notes/add-usage-hf-api-chat-streaming-91fd04705f45d5b3.yaml

Co-authored-by: Julian Risch <julian.risch@deepset.ai>

---------

Co-authored-by: anakin87 <stefanofiorucci@gmail.com>
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2025-05-15 13:09:36 +02:00
Sebastian Husch Lee
9ae76e1653
Fix component tool parameters (#9342)
* Starting property schema refactor

* Adding more tests

* More tests

* Handle null type explicitly

* More updates of tests to accommodate Optional properly

* Fix more tests

* Remove unnecessary check

* Some cleanup

* Update test

* Add reno

* Fix typing

* Add license header

* Use docstrings of dataclasses in parameter spec generation

* More tests of Haystack dataclass types

* Properly handle Sequence

* Fix license header

* Update OpenAI tests to add more complicated tool parameter signature

* Properly set required for dataclasses

* Add integration test for azure that includes additionalProperties

* Add more complicated integration test for HuggingFaceAPIChatGenerator

* Alternate approach using pydantic like we do in from_function.py

* Cleanup and fix other affected tests

* Fix mypy

* PR comments

* PR comment

* Remove test from HF API

* Update reno

* Update reno
2025-05-15 07:51:06 +00:00
David S. Batista
42b378950f
fix: DocumentRecallEvaluator changing division and adding checks for emptiness of documents (#9380)
* changing division and adding checks for emptiness of documents

* adding release notes

* adding tests

* Update releasenotes/notes/updated-doc-recall-eval-uniqueness-59b09082cf8e7593.yaml

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

* attending PR comments

* Update releasenotes/notes/updated-doc-recall-eval-uniqueness-59b09082cf8e7593.yaml

* Update releasenotes/notes/updated-doc-recall-eval-uniqueness-59b09082cf8e7593.yaml

Co-authored-by: Julian Risch <julian.risch@deepset.ai>

* Update haystack/components/evaluators/document_recall.py

Co-authored-by: Julian Risch <julian.risch@deepset.ai>

* Update haystack/components/evaluators/document_recall.py

Co-authored-by: Julian Risch <julian.risch@deepset.ai>

* Update haystack/components/evaluators/document_recall.py

Co-authored-by: Julian Risch <julian.risch@deepset.ai>

* Update haystack/components/evaluators/document_recall.py

Co-authored-by: Julian Risch <julian.risch@deepset.ai>

* adding tests

* linting

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2025-05-14 11:37:47 +02:00
Sebastian Husch Lee
9f2c0679d4
Small fix and update tests (#9370) 2025-05-12 22:02:26 +02:00
David S. Batista
f233e06f0a
feat: adding a new Protocol for TextEmbedder (#9353)
* initial import

* removing unused imports

* adding an Embedder Protocol

* adding tests

* adding tests

* adding release notes

* renaming dir

* removing dir

* cleaning

* adding clean tests

* dealing with ellipsis and pylint

* wip: extending tests

* cleaning extended tests

* adding an invalid TextEmbedder
2025-05-12 12:35:09 +02:00
Sebastian Husch Lee
6bef2c36bb
perf: Don't deepcopy Components, Tools, or Toolsets (#9356)
* Don't copy components

* Use deepcopy_with_fallback in more places and don't deepcopy Components, Tools or Toolsets

* Slight change

* Slightly update tests

* Refactor function based on PR feedback

* Add reno

* Fix lint

* Simplify tests, rename function, PR comments

* Fix mypy

* Undo typing
2025-05-08 12:48:08 +00:00
Stefano Fiorucci
4b4b0f0041
fix: HuggingFaceAPIChatGenerator - make tool conversion compatible with huggingface_hub>=0.31.0 (#9354)
* fix: HuggingFaceAPIChatGenerator - make tool conversion compatible with huggingface_hub>=0.31.0

* relnote
2025-05-07 18:37:05 +02:00
Sebastian Husch Lee
4ce6934dd9
fix: Update deepcopying in Pipeline to have a fallback in case of error (#9346)
* First pass at fix for deepcopying inputs and outputs

* Add reno

* Add recursion for dict objects

* Bump recursion depth

* More tests and some improvements

* Fix unit tests

* PR comments
2025-05-06 11:49:45 +00:00
Amna Mubashar
64f384b52d
feat: enable streaming ToolCall/Result from Agent (#9290)
* Testing solutions for streaming

* Remove unused methods

* Add fixes

* Update docstrings

* add release notes and test

* PR comments

* add a new util function

* Adjust emit_tool_info

* PR comments

* Remove emit function, add streaming for tool_call


---------

Co-authored-by: Sebastian Husch Lee <sjrl423@gmail.com>
2025-05-05 16:23:44 +02:00
David S. Batista
0f00c1882e
fix: make SentenceSplitter QUOTE_SPANS_RE regex ReDoS-safe (#9338)
* fix: make QUOTE_SPANS_RE regex ReDoS-safe

* Removing the capture of leading non-character on double quotes, allowing quote with new lines, adding tests

* cleaning

* fixing release notes

* changing import

* adding test for Regex Denial of Service (ReDoS)

* reducing the size/time of tests

* Update test/components/preprocessors/test_sentence_tokenizer.py

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

* Update test/components/preprocessors/test_sentence_tokenizer.py

---------

Co-authored-by: Waivey <waivey@proton.me>
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2025-05-02 15:40:17 +00:00
Mo Sriha
e5255d9061
feat: add visualization capabilities to SuperComponent (#9336)
* feat: add visualization methods to SuperComponent for pipeline representation

* refactor: update show and draw methods in SuperComponent to return None

* test: add unit tests

* add release note

* chore: update copyright year

* test: move unit tests to test_super_component

* Update releasenotes/notes/add-pipeline-viz-to-supercomponent-80165756cc777056.yaml

---------

Co-authored-by: David S. Batista <dsbatista@gmail.com>
2025-05-02 08:46:08 -05:00
Stefano Fiorucci
e3f9da13d0
test: fix test incorrectly marked as async (#9327)
* test: fix test incorrectly marked as async

* fix inmemory async tests
2025-04-30 14:07:30 +00:00
David S. Batista
201becd400
fix: RecursiveSplitter bug in the case when the recursive chunking is triggered (#9316)
* initial import

* adding release notes

* Update fixing-bug-recursive-splitter-88d5714529f84e4e.yaml
2025-04-30 13:03:23 +02:00
David S. Batista
04e4701a17
chore: cleaning unused imports from core tests 2025-04-29 18:09:11 +02:00
David S. Batista
07f4bf5522
chore: cleaning imports and unused variables in Component tests 2025-04-29 16:27:34 +02:00
David S. Batista
d61f9f7f68
feat: validation function for run() and run_async() parameters signature for (custom) components (#9322)
* adding tests

* adding release notes

* small improvements
2025-04-29 13:53:24 +00:00
Yassin Nouh
ed6176a8cb
fix: make HuggingFaceAPIChatGenerator convert Tool Call arguments from string (#9303)
* fix: sort imports in hugging_face_api.py

* fix: import logging in hugging_face_api.py

* fix: refactor HuggingFace API tool call handling for improved argument conversion

* Update haystack/components/generators/chat/hugging_face_api.py

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

* refinements + tests + relnote

* simplify

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2025-04-28 15:36:19 +02:00
Mohammed Abdul Razak Wahab
53308a6294
feat: Add sanitization for Meta field during serialization (#9272)
* feat: Add sanitization for Meta field during serialization

* Revert "feat: Add sanitization for Meta field during serialization"

This reverts commit c529f7c25b69aed626bb2072c8bf171815b591cc.

* feat: add nested serialization in openai usage object

* add reno

* add nested serialization in OpenAiChatGenerator

* Update releasenotes/notes/nested-serialization-openai-usage-object-3817b07342999edf.yaml

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>

* merge tests

* Adjust the test

---------

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
2025-04-26 15:04:02 +05:00
Sebastian Husch Lee
0fdb88424b
fix: Fix Azure test on forks (#9312)
* Fix unit test

* Fix test
2025-04-25 11:10:59 +02:00
Stefano Fiorucci
38c39a49de
test: review integration tests (#9306)
* AzureOCR: convert integration test to unit test and simplify

* clean up HuggingFaceAPITextEmbedder

* clean up LinkContentFetcher

* simplify HuggingFaceLocalGenerator

* clean up OpenAIGenerator

* OpenAIChatGenerator

* SentenceTransformersDiversityRanker

* TransformersSimilarityRanker

* ChatMessage: rm outdated tests

* fail fast false

* typo
2025-04-25 09:07:57 +02:00
Mohammed Abdul Razak Wahab
f97472329f
feat: Add support for multiple outputs in ConditionalRouter (#9271)
* feat: Add support for multiple outputs in ConditionalRouter

* Update haystack/components/routers/conditional_router.py

Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>

* add additional route

---------

Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
2025-04-24 16:17:06 +02:00
Michele Pangrazzi
4a908d075e
Fix OpenAIGenerator and OpenAIChatGenerator to allow wrapped streaming objects usage (#9304)
* Fix for handling wrapped ChatCompletion instances in streaming (used by tools like weave)

* Add release note

* Applied same fix to OpenAIGenerator ; Refactoring ; Update release note

* Fix integration test error after refactoring
2025-04-24 16:16:41 +02:00
Stefano Fiorucci
e3d4e21237
test: mark more tests as slow (#9296)
* test: mark tests as slow

* alphabetical order; install xet

* revert pyproject

* Trigger Build

* simplify tests as suggested

* add comment to workflow
2025-04-24 10:25:13 +02:00
Stefano Fiorucci
df662daaef
test: improve some slow tests (#9297)
* test: improve slow tests

* rm leftover and improve test
2025-04-24 08:50:36 +02:00
Stefano Fiorucci
9ae7da8df3
test: workflow for slow/unstable integration tests (#9267)
* workflow for slow integration tests

* try changing skipper

* Trigger Build

* better names

* fix

* mv tika to slow

* try skipping slow workflow

* retry paths-ignore

* remove skipper

* Revert "remove skipper"

This reverts commit 302ed2f07f36b33fa61fde0843b5590d79b98d74.

* better skipper

* retry

* Revert "retry"

This reverts commit fe5dff68f496645cc45292d74fcd8d043e868392.

* try using one workflow

* trigger

* try to see if it fails

* cosmetic changes

* improvements

* try matrix

* retry

* fix

* clean up

* simplify datadog monitoring and trigger

* send event to datadog for nightly failures

* tests should run if: manual trigger, scheduled, PR has label, release branch, or relevant files changed

* clarify slow marker

* improve comments

* labels
2025-04-23 10:36:44 +02:00
Mohammed Abdul Razak Wahab
ddd7318ae8
fix: use coerce_tag_value in LoggingTracer to serialize tag values (#9251)
* fix: use coerce_tag_value in LoggingTracer to serialize tag values

* add rn

* fix tests

---------

Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
2025-04-22 16:18:24 +02:00
Sebastian Husch Lee
d5ae46bc93
feat: Add Toolset to Agent (#9284)
* Add Toolset to Agent

* Add reno
2025-04-22 14:08:34 +02:00
Grig Alex
14669419f2
feat: Allow OpenAI client config in other components (#9270)
* Add http config to generators

* Add http config to RemoteWhisperTranscriber

* Add http config to embedders

* Add notes of http config

* disable linter too-many-positional-arguments

---------

Co-authored-by: Julian Risch <julian.risch@deepset.ai>
Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
2025-04-22 09:44:55 +00:00
Sebastian Husch Lee
114b4568ba
Fix state_schema serialization for agent tracing (#9278) 2025-04-22 09:39:41 +02:00
Sebastian Husch Lee
0f374e0563
Fix from_dict and update test (#9277) 2025-04-22 06:59:03 +00:00
Sebastian Husch Lee
19cf220136
feat: integrate two ready-made SuperComponents from haystack-experimental (#9235)
* Add super component decorator

* Add reno

* MultiFileConverter

* Add DocumentPreprocessor

* Add reno

* Add tests and change doc preprocessor to split first then clean

* Remove code from merge

* Add to pydoc and missing test file

* PR comments

* Lint fix

* Fix mypy

* Fix mypy

* Add comment

* PR comments

* Update haystack/components/converters/multi_file_converter.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/preprocessors/document_preprocessor.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/preprocessors/document_preprocessor.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/preprocessors/document_preprocessor.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/preprocessors/document_preprocessor.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/preprocessors/document_preprocessor.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/preprocessors/document_preprocessor.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/preprocessors/document_preprocessor.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/preprocessors/document_preprocessor.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/preprocessors/document_preprocessor.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/preprocessors/document_preprocessor.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/converters/multi_file_converter.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* PR comments

* PR comment

---------

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2025-04-17 10:02:26 +00:00
Sebastian Husch Lee
c5684b64a6
fix: Fix datadog tracer tests (#9253)
* Fix tests

* Make tests work in old versions of datadog
2025-04-17 09:26:27 +02:00
Sebastian Husch Lee
5154d1c7eb
feat: Add super component decorator (#9233)
* Add super component decorator

* Add reno

* Update tests
2025-04-16 16:47:07 +00:00
Amna Mubashar
498637788a
feat: Allow OpenAI client config in OpenAIChatGenerator and AzureOpenAIChatGenerator (#9215)
* Allow OpenAI client config in chat generator

* Add init_http_client as a util method

* Update azure chat gen

* Fix linting
2025-04-16 18:32:13 +02:00
Mohammed Abdul Razak Wahab
c4689f16c9
feat: allow SuperComponent to include outputs from non-leaf pipeline components (#9242)
* allow non leaf outputs in supercomponents

* add rn

* add output default fallback

* Update releasenotes/notes/allow-non-leaf-outputs-in-supercomponents-outputs-adf29d68636c23ba.yaml

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
2025-04-16 16:59:43 +02:00
Sebastian Husch Lee
cdc53cae78
fix: Add batch_size to to_dict of TransformersSimilarityRanker (#9248)
* Add missing batch_size to to_dict of similarity ranker

* Add reno
2025-04-16 12:16:59 +02:00
MetroCat69
f7ac4b35cb
feat: add run_async for HuggingFaceAPIDocumentEmbedder (#9226)
* added async support for HuggingFaceAPIDocumentEmbedder

* added type annotations, removed unused import

* Trigger mark test completed

* Apply suggestions from code review

* utility function

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2025-04-16 09:54:36 +02:00
Sebastian Husch Lee
f46bf14851
fix: Allow Agent to run with no tools (#9230)
* Fix

* Add reno

* Add test

* Update docstring and warning message

* Update docstring
2025-04-16 07:53:21 +02:00
Sebastian Husch Lee
185e1c79c9
feat: Agent tracing (#9240)
* Agent tracing

* Small changes

* Some changes and refactoring

* Refactoring to reuse code

* Fix

* Add reno

* Fix tests

* Fix tests

* Fix linting

* Refactor and add tracing support to run_async of Agent

* Reduce duplicate code

* Remove finalize_run

* Use break instead of copying code three times

* Adding a test

* Add tracing unit tests

* Make async tracing test actually run async

* Increase test coverage

* Unit test for traces in pipeline

* Add cleanup

* Fix proper indentation

* PR comments

* PR comments and new test

* Update warning message

* Update warning message

---------

Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
2025-04-15 15:58:26 +02:00
Stefano Fiorucci
656fe6dc6e
chore: LLM Evaluators - remove deprecated parameters (#9219) 2025-04-15 09:26:31 +02:00
David S. Batista
d860a73ddb
chore: cleaning duplicated import (#9234) 2025-04-15 09:25:19 +02:00
Julian Risch
13780cfcc4
feat: Add run_async to Agent (#9239)
* add run_async

* refactor with _check_exit_conditions

* add run_async tests

* reno

* fix linting issues
2025-04-14 19:01:59 +00:00
Stefano Fiorucci
c67d1bf0e9
fix: make ChatMessage.from_dict handle cases where optional fields are missing (#9232)
* fix: make ChatMessage.from_dict handle cases where optional fields are missing

* one more test
2025-04-14 14:53:08 +02:00
Mohammed Abdul Razak Wahab
859e90cc61
fix: Document field precedence in to_dict() (#9227)
* fix: Document field precedence in to_dict()

* add test

* add release note
2025-04-14 13:53:12 +02:00
Stefano Fiorucci
dcba774e30
chore: LLMMetadataExtractor - remove deprecated parameters (#9218) 2025-04-11 15:50:52 +02:00
Stefano Fiorucci
8bf41a8510
test: create e2e environment; stop testing spacy in unit tests (#9212)
* ci: create e2e environment; stop testing spacy in unit tests

* try fix

* fix yml

* exclude test python files

* self-referential environment

* do not use self-referential environment
2025-04-11 10:28:53 +00:00
Stefano Fiorucci
81fbe546cb
feat: ChatGenerator protocol - do not require to_dict and from_dict methods (#9213)
* minimize protocol

* progress

* rm unneeded test changes

* reno

* use keywords arguments for clarity
2025-04-11 10:30:48 +02:00
David S. Batista
45aa9608b5
removing async test for non-existent model (#9208) 2025-04-10 12:38:35 +02:00