4166 Commits

Author SHA1 Message Date
Mohammed Abdul Razak Wahab
0d65b4caa7
feat: Enhance error handling in Azure document embedder (#8941)
* feat: Enhance error handling in Azure document embedder

* add release notes

* address review comments

* Update releasenotes/notes/add-azure-embedder-exception-handler-c10ea46fb536de3b.yaml

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

* more alignment with OpenAI impl

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2025-03-04 11:16:08 +01:00
Amna Mubashar
28db039bca
feat: add run_async to HuggingfaceAPIChatGenerator (#8943)
* add run_async

* add release notes

* Add integration test
2025-03-03 16:51:30 +01:00
Amna Mubashar
1b2053b358
Small fix in the docstring example (#8950) 2025-03-03 16:26:42 +01:00
tstadel
13968cc15b
fix: in OpenAIChatGenerator set additionalProperties to False when tools_strict=True (#8913)
* fix: set ComponentTool additionalProperties for OpenAI tools_strict=True

* add reno

* Move the additionalProperties into the OpenAIChatGenerator

* Remove

* Put additionalProperties into the correct place

* Fix test

* Update releasenotes/notes/fix-componenttool-for-openai-tools_strict-998e5cd7ebc6ec19.yaml

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

---------

Co-authored-by: Sebastian Husch Lee <sebastian.lee@deepset.ai>
Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2025-03-03 16:23:24 +01:00
Sebastian Husch Lee
296e31c182
feat: Add Type Validation parameter for Pipeline Connections (#8875)
* Starting to refactor type util tests to be more systematic

* refactoring

* Expand tests

* Update to type utils

* Add missing subclass check

* Expand and refactor tests, introduce type_validation Literal

* More test refactoring

* Test refactoring, adding type validation variable to pipeline base

* Update relaxed version of type checking to pass all newly added tests

* trim whitespace

* Add tests

* cleanup

* Updates docstrings

* Add reno

* docs

* Fix mypy and add docstrings

* Changes based on advice from Tobi

* Remove unused imports

* Doc strings

* Add connection type validation to to_dict and from_dict

* Update tests

* Fix test

* Also save connection_type_validation at global pipeline level

* Fix tests

* Remove connection type validation from the connect level, only keep at pipeline level

* Formatting

* Fix tests

* formatting
2025-03-03 16:00:22 +01:00
Sebastian Husch Lee
00fe4d157d
feat: Add run async for AzureOpenAIChatGenerator (#8948)
* Add tests for run_async

* Add reno

* Add async client

* Add init test

* Add comment

* Fix test

* Update releasenotes/notes/run-async-azure-54450f0c2495f5c8.yaml

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>

---------

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
2025-03-03 14:17:18 +00:00
Sebastian Husch Lee
52a028251c
refactor!: update AzureOCRDocumentConverter to not use the dataframe field for tabular Documents (#8885)
* Save document as a csv table now

* Fix tests

* Fix tests

* Add reno
2025-03-03 12:45:02 +00:00
Michele Pangrazzi
209e6d5ff0
remove duplicate test (#8944) 2025-02-28 13:27:43 +00:00
mathislucka
ee81570f37
fix: only overwrite existing socket inputs when we provide a new value (#8940)
* fix: only overwrite existing socket inputs when we provide a new value

* chore: add release notes

* Apply suggestions from code review

---------

Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2025-02-27 09:13:41 +00:00
Michele Pangrazzi
db4f23771a
Avoid mutating self.routes in ConditionalRouter to_dict method (#8936)
* Avoid mutating self.routes in ConditionalRouter to_dict method

* Add release note

* Update releasenotes/notes/fix-conditional-router-to-dict-5af887da50effe11.yaml

Co-authored-by: David S. Batista <dsbatista@gmail.com>

* Make test_router_to_dict_does_not_mutate_routes more robust (add another roundtrip)

---------

Co-authored-by: David S. Batista <dsbatista@gmail.com>
2025-02-26 12:34:35 +01:00
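The fix in #8936 concerns a serialization method that altered the instance it was serializing. A minimal sketch of the general pattern follows, assuming a hypothetical `Router` class with a hypothetical `output_type` key; it is not Haystack's actual ConditionalRouter code. The idea is to transform a deep copy and leave the instance attribute alone, then serialize twice to confirm the second round trip still works.

```python
from copy import deepcopy
from typing import Any, Dict, List


class Router:
    """Hypothetical stand-in for a component that holds a list of route dicts."""

    def __init__(self, routes: List[Dict[str, Any]]) -> None:
        self.routes = routes

    def to_dict(self) -> Dict[str, Any]:
        # Work on a deep copy so serialization never alters self.routes.
        serialized = deepcopy(self.routes)
        for route in serialized:
            # Replace the type object with its name, but only in the copy.
            route["output_type"] = route["output_type"].__name__
        return {"routes": serialized}


router = Router([{"condition": "{{ n > 0 }}", "output_type": int}])
first = router.to_dict()
second = router.to_dict()  # a second call still sees the real type object
```

Without the copy, the first call would overwrite `output_type` with a string and the second call would fail when reading `__name__` from it.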
Michele Pangrazzi
d1e503e5c7
skip HF API integration test (#8938) 2025-02-26 12:10:54 +01:00
Julian Risch
6652dd7550
Revert "test: skip HF API live integration tests (#8889)" (#8914)
* Revert "test: skip HF API live integration tests (#8889)"

This reverts commit 56a3a9bd61b7391ae91e3d8179b3b33918ef4932.

* Replace zephyr-7b-beta model with SmolLM2-1.7B-Instruct

* Use zephyr-7b-beta model but extend instructions

---------

Co-authored-by: David S. Batista <dsbatista@gmail.com>
2025-02-25 09:03:20 +01:00
mathislucka
76753fd4c6
fix: reduce number of edge cases where lazy variadic components wait for inputs that can't arrive anymore (#8907)
* wip

* fix: running order with lazy variadic components

* fix: tests

* format

* comment

* fix: alternative approach to fixing running order

* unused imports

* revert fix

* remove unneeded return

* remove data based approach to tie breaking

* release note

* trailing spaces

* newline eof

* unused import

* add more explanations to release note
2025-02-24 15:17:17 +00:00
mathislucka
a902af1db2
fix: set tags before context manager closes (#8911) 2025-02-24 12:34:49 +01:00
Sebastian Husch Lee
af3c89a257
feat: In FileTypeRouter add .msg to "application/vnd.ms-outlook" mapping (#8910)
* Add .msg mimetype support in file type router

* Add reno

* Update tests
2025-02-24 09:10:17 +01:00
Sebastian Husch Lee
99a998f90b
feat: Add MSGToDocument converter (#8868)
* Initial commit of MSG converter from Bijay

* Updates to the MSG converter

* Add license header

* Add tests for msg converter

* Update converter

* Expanding tests

* Update docstrings

* add license header

* Add reno

* Add to inits and pydocs

* Add test for empty input

* Fix types

* Fix mypy

---------

Co-authored-by: Bijay Gurung <bijay.learning@gmail.com>
2025-02-24 08:12:32 +01:00
Tobias Wochinger
d7dfc5222c
fix: right tracer import path (#8908) 2025-02-21 20:04:54 +01:00
Sebastian Husch Lee
a516672cfb
fix: Fix data dog tracing (#8900)
* Fix data dog tracing

* Add reno

* Update imports

* Fix
2025-02-21 14:35:04 +01:00
Stefano Fiorucci
3339097e99
ci: refactor job to check imports (#8892)
* refactor

* Trigger CI

* run tests if this file changes

* show failure

* revert

* rm duplicate subdir and explain in comment
2025-02-21 11:37:41 +01:00
Stefano Fiorucci
04c6136cc4
relax posthog pin (#8898) 2025-02-21 10:49:29 +01:00
Stefano Fiorucci
9546e69374
perf: Optimize import times (#8878)
* initial experiments

* progress

* draft

* fix header

* fix linting

* lot more lazy inits

* fixes to main init

* linting

* small refinements

* header fix

* release note

* improve consistency

* test: make sure no extra modules are being imported due to `__init__` definitions

* extend release note with an example

* refactoring import test

* updating release notes

* casting .keys() to list

* reverting to list

* Update haystack/__init__.py

Co-authored-by: Julian Risch <julian.risch@deepset.ai>

* fixing indent problem

* better comments

---------

Co-authored-by: David S. Batista <dsbatista@gmail.com>
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2025-02-21 09:55:20 +01:00
Stefano Fiorucci
fcca7104d3
pin ddtrace<3.0.0 (#8897) 2025-02-21 08:14:41 +00:00
David S. Batista
7d51793727
chore: cleaning up unused imports in tests (#8887) 2025-02-20 16:56:16 +00:00
Michele Pangrazzi
44fb20c2d5
Add run_async to OpenAIChatGenerator (#8880)
* Implementation of run_async (wip)

* Add missing tests; move async tests to test_openai_async.py

* Add release note

* Update docstring

* Alignments with haystack-experimental implementation

* Lint: removed unused imports

* Update haystack/components/generators/chat/openai.py

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2025-02-20 16:51:46 +00:00
Stefano Fiorucci
56a3a9bd61
test: skip HF API live integration tests (#8889)
* skip HF API integration tests

* better wording
2025-02-20 16:38:57 +00:00
Sebastian Husch Lee
62d0d5d3d5
Update default output type of list joiner to be correct (#8881) 2025-02-20 10:54:50 +01:00
matthias plasser
7c17ca0405
fix ResourceWarning: unclosed file when using telemetry (#8884) 2025-02-20 10:41:35 +01:00
Sebastian Husch Lee
8cafcddb00
chore: Remove print statements from tests and mention of old name (#8883)
* Remove print statements from tests

* Remove mention of Canals

* Remove another mention
2025-02-20 10:24:26 +01:00
Julian Risch
92c87805b8
Revert "build(deps): bump docker/bake-action from 5 to 6 (#8685)"
This reverts commit 687f7593c705271dd4225ded9f1cdf6e00efca3f.
2025-02-20 10:20:06 +01:00
dependabot[bot]
687f7593c7
build(deps): bump docker/bake-action from 5 to 6 (#8685)
* build(deps): bump docker/bake-action from 5 to 6

Bumps [docker/bake-action](https://github.com/docker/bake-action) from 5 to 6.
- [Release notes](https://github.com/docker/bake-action/releases)
- [Commits](https://github.com/docker/bake-action/compare/v5...v6)

---
updated-dependencies:
- dependency-name: docker/bake-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

* Remove checkout step

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2025-02-20 09:37:44 +01:00
Sebastian Husch Lee
52909a0c81
fix: Fix OpenAIChatGenerator + tools + streaming (#8879)
* Fix chat generator + tools + streaming

* Add reno

* Update docs

* Remove unused import

* add doc

* Fix test

* small cleanup

* PR comments

* fix test

---------

Co-authored-by: anakin87 <stefanofiorucci@gmail.com>
2025-02-20 08:40:22 +01:00
mathislucka
de3d0a23e8
fix: use dunder all (#8877) 2025-02-19 13:25:36 +00:00
mathislucka
8c54f06a19
fix: component checks failing for components that return dataframes (#8873)
* fix: use is not to compare to sentinel value

* chore: release notes

* Update releasenotes/notes/fix-component-checks-with-ambiguous-truth-values-949c447b3702e427.yaml

Co-authored-by: David S. Batista <dsbatista@gmail.com>

* fix: another sentinel value

* test: also test base class

* add pandas as test dependency

* format

* Trigger CI

* mark test with xfail strict=False

---------

Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
Co-authored-by: David S. Batista <dsbatista@gmail.com>
Co-authored-by: anakin87 <stefanofiorucci@gmail.com>
2025-02-19 09:10:48 +00:00
Sebastian Husch Lee
93f361e1e1
fix: Fix serialization of typing.Any when using serialize_type utility (#8853)
* Fix issue and expand tests

* Add reno
2025-02-18 17:26:56 +01:00
Sebastian Husch Lee
2ca32ff036
refactor: Refactor and expand tests for type utils (#8871)
* Refactored type utils tests

* minor changes
2025-02-18 11:54:37 +01:00
Sebastian Husch Lee
0c62087dd7
Make openai test more robust (#8872) 2025-02-18 11:38:16 +01:00
mathislucka
cd7c68372b
fix: ComponentTool description should not be truncated (#8870)
* fix: description should not be truncated

* chore: add release note
2025-02-18 11:23:26 +01:00
Stefano Fiorucci
0409e5da8f
remove base from evaluation pydoc config (#8867) 2025-02-17 15:19:40 +01:00
Hemanth Taduka
b5fb0d3ff8
fix: make pandas DataFrame optional in EvaluationRunResult (#8838)
* feat: AsyncPipeline that can schedule components to run concurrently (#8812)

* add component checks

* pipeline should run deterministically

* add FIFOQueue

* add agent tests

* add order dependent tests

* run new tests

* remove code that is not needed

* test: intermediate from cycle outputs are available outside cycle

* add tests for component checks (Claude)

* adapt tests for component checks (o1 review)

* chore: format

* remove tests that aren't needed anymore

* add _calculate_priority tests

* revert accidental change in pyproject.toml

* test format conversion

* adapt to naming convention

* chore: proper docstrings and type hints for PQ

* format

* add more unit tests

* rm unneeded comments

* test input consumption

* lint

* fix: docstrings

* lint

* format

* format

* fix license header

* fix license header

* add component run tests

* fix: pass correct input format to tracing

* fix types

* format

* format

* types

* add defaults from Socket instead of signature

- otherwise components with dynamic inputs would fail

* fix test names

* still wait for optional inputs on greedy variadic sockets

- mirrors previous behavior

* fix format

* wip: warn for ambiguous running order

* wip: alternative warning

* fix license header

* make code more readable

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>

* Introduce content tracing to a behavioral test

* Fixing linting

* Remove debug print statements

* Fix tracer tests

* remove print

* test: test for component inputs

* test: remove testing for run order

* chore: update component checks from experimental

* chore: update pipeline and base from experimental

* refactor: remove unused method

* refactor: remove unused method

* refactor: outdated comment

* refactor: inputs state is updated as side effect

- to prepare for AsyncPipeline implementation

* format

* test: add file conversion test

* format

* fix: original implementation deepcopies outputs

* lint

* fix: from_dict was updated

* fix: format

* fix: test

* test: add test for thread safety

* remove unused imports

* format

* test: FIFOPriorityQueue

* chore: add release note

* feat: add AsyncPipeline

* chore: Add release notes

* fix: format

* debug: switch run order to debug ubuntu and windows tests

* fix: consider priorities of other components while waiting for DEFER

* refactor: simplify code

* fix: resolve merge conflict with mermaid changes

* fix: format

* fix: remove unused import

* refactor: rename to avoid accidental conflicts

* fix: track pipeline type

* fix: and extend test

* fix: format

* style: sort alphabetically

* Update test/core/pipeline/features/conftest.py

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>

* Update test/core/pipeline/features/conftest.py

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>

* Update releasenotes/notes/feat-async-pipeline-338856a142e1318c.yaml

* fix: indentation, do not close loop

* fix: use asyncio.run

* fix: format

---------

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
Co-authored-by: David S. Batista <dsbatista@gmail.com>

* updated changes for refactoring evaluations without pandas package

* added release notes for eval_run_result.py for refactoring EvaluationRunResult to work without pandas

* wip: cleaning and refactoring

* removing BaseEvaluationRunResult

* wip: fixing tests

* fixing tests and docstrings

* updating release notes

* fixing typing

* pylint fix

* adding deprecation warning

* fixing tests

* fixing types consistency

* adding stacklevel=2 to warning messages

* fixing docstrings

* fixing docstrings

* updating release notes

---------

Co-authored-by: mathislucka <mathis.lucka@gmail.com>
Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
Co-authored-by: David S. Batista <dsbatista@gmail.com>
2025-02-17 14:43:54 +01:00
Sebastian Husch Lee
2f383bce25
feat: Update list joiner (#8851)
* Update ListJoiner to have default type List

* Add reno

* Add more tests

* Remove unused import

* Fix mypy

* Update docstrings

* Update haystack/components/joiners/list_joiner.py

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>

---------

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
2025-02-14 09:47:19 +01:00
Sebastian Husch Lee
e6c503dbb9
Add __version__ to init.py (#8857) 2025-02-14 09:47:10 +01:00
Julian Risch
3e151620f5
chore: Add Hayhooks to README (#8854) 2025-02-13 15:05:43 +01:00
Ulises M
bfdad40a80
feat: Add ONNX & OpenVINO backend support, and torch dtype kwargs in Sentence Transformers Components (#8813)
* initial rough draft

* expose backend instead of extracting from model_kwargs

* explicitly set backend model path

* add reno

* expose backend for ST diversity backend

* add dtype tests and expose kwargs to ST ranker for backend parameters

* skip dtype tests as torch isn't compiled with cuda

* add new openvino dependency release, unskip tests

* resolve suggestion

* mock calls, turn integrations into unit tests

* remove unnecessary test dependencies
2025-02-13 12:04:14 +01:00
Sebastian Husch Lee
71416c81bc
feat: Add store_full_path to converter (#8849)
* Add missing store_full_path to converter

* Add release note

* Fix pylint
2025-02-12 17:11:59 +01:00
Bilge Yücel
043b88f181
Remove run_inner code snippet from sync run (#8850) 2025-02-12 15:06:52 +01:00
Stefano Fiorucci
7b5b84d377
stop collecting transformers and torch in telemetry (#8847) 2025-02-12 11:36:29 +00:00
mathislucka
88ed301712
fix: AsyncPipeline logging name can't be overwritten (#8845)
* fix: component_name not name in logging statement

* fix: component_name not name in logging statement

* fix: don't use fstrings in logging

---------

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
2025-02-12 11:21:07 +00:00
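The last bullet of #8845 ("don't use fstrings in logging") reflects a common logging guideline. A minimal sketch of the difference follows, using Python's standard `logging` module rather than Haystack's own logging wrapper; the logger name and the `Expensive` class are hypothetical.

```python
import logging

logger = logging.getLogger("pipeline_example")  # hypothetical logger name


class Expensive:
    """Counts how often it is rendered to a string."""

    def __init__(self) -> None:
        self.renders = 0

    def __str__(self) -> str:
        self.renders += 1
        return "expensive"


logger.setLevel(logging.WARNING)
payload = Expensive()

# Eager: the f-string is rendered even though DEBUG is disabled.
logger.debug(f"Running component {payload}")

# Lazy: arguments are only formatted if the record is actually emitted.
logger.debug("Running component %s", payload)
```

Only the f-string call renders `payload`; the lazy call is discarded before any formatting happens, because the logger's level filters out DEBUG records.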
David S. Batista
cee52435bf
adding to pydocs (#8846) 2025-02-12 12:04:50 +01:00
Amna Mubashar
dcefb48fba
fix: Improve docstrings of AsyncPipeline (#8843)
* Update the docstrings

* Add an example for run_async_generator in docstrings

* fixing docstring code

* fixing docstring code

---------

Co-authored-by: David S. Batista <dsbatista@gmail.com>
2025-02-12 10:23:00 +00:00
Haystack Bot
0d4720ef79
Update unstable version to 2.11.0-rc0 (#8842)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-02-11 13:54:52 +01:00