1585 Commits

Author SHA1 Message Date
Sebastian Husch Lee
99a998f90b
feat: Add MSGToDocument converter (#8868)
* Initial commit of MSG converter from Bijay

* Updates to the MSG converter

* Add license header

* Add tests for msg converter

* Update converter

* Expanding tests

* Update docstrings

* add license header

* Add reno

* Add to inits and pydocs

* Add test for empty input

* Fix types

* Fix mypy

---------

Co-authored-by: Bijay Gurung <bijay.learning@gmail.com>
2025-02-24 08:12:32 +01:00
Sebastian Husch Lee
a516672cfb
fix: Fix data dog tracing (#8900)
* Fix data dog tracing

* Add reno

* Update imports

* Fix
2025-02-21 14:35:04 +01:00
Stefano Fiorucci
9546e69374
perf: Optimize import times (#8878)
* initial experiments

* progress

* draft

* fix header

* fix linting

* lot more lazy inits

* fixes to main init

* linting

* small refinements

* header fix

* release note

* improve consistency

* test: make sure no extra modules are being imported due to `__init__` definitions

* extend release note with an example

* refactoring import test

* updating release notes

* casting .keys() to list

* reverting to list

* Update haystack/__init__.py

Co-authored-by: Julian Risch <julian.risch@deepset.ai>

* fixing ident problem

* better comments

---------

Co-authored-by: David S. Batista <dsbatista@gmail.com>
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2025-02-21 09:55:20 +01:00
David S. Batista
7d51793727
chore: cleaning up unused imports in tests (#8887) 2025-02-20 16:56:16 +00:00
Michele Pangrazzi
44fb20c2d5
Add run_async to OpenAIChatGenerator (#8880)
* Implememntation of run_async (wip)

* Add missing tests ; Move async tests to test_openai_async.py

* Add release note

* Update docstring

* Alignments with haystack-experimental implementation

* Lint: removed unused imports

* Update haystack/components/generators/chat/openai.py

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2025-02-20 16:51:46 +00:00
Stefano Fiorucci
56a3a9bd61
test: skip HF API live integration tests (#8889)
* skip HF API integration tests

* better wording
2025-02-20 16:38:57 +00:00
Sebastian Husch Lee
62d0d5d3d5
Update default output type of list joiner to be correct (#8881) 2025-02-20 10:54:50 +01:00
Sebastian Husch Lee
8cafcddb00
chore: Remove print statements from tests and mention of old name (#8883)
* Remove print statements from tests

* Remove mention of Canals

* Remove another mention
2025-02-20 10:24:26 +01:00
Sebastian Husch Lee
52909a0c81
fix: Fix OpenAIChatGenerator + tools + streaming (#8879)
* Fix chat generator + tools + streaming

* Add reno

* Update docs

* Remove unused import

* add doc

* Fix test

* small cleanup

* PR comments

* fix test

---------

Co-authored-by: anakin87 <stefanofiorucci@gmail.com>
2025-02-20 08:40:22 +01:00
mathislucka
8c54f06a19
fix: component checks failing for components that return dataframes (#8873)
* fix: use is not to compare to sentinel value

* chore: release notes

* Update releasenotes/notes/fix-component-checks-with-ambiguous-truth-values-949c447b3702e427.yaml

Co-authored-by: David S. Batista <dsbatista@gmail.com>

* fix: another sentinel value

* test: also test base class

* add pandas as test dependency

* format

* Trigger CI

* mark test with xfail strict=False

---------

Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
Co-authored-by: David S. Batista <dsbatista@gmail.com>
Co-authored-by: anakin87 <stefanofiorucci@gmail.com>
2025-02-19 09:10:48 +00:00
Sebastian Husch Lee
93f361e1e1
fix: Fix serialization of typing.Any when using serialize_type utility (#8853)
* Fix issue and expand tests

* Add reno
2025-02-18 17:26:56 +01:00
Sebastian Husch Lee
2ca32ff036
refactor: Refactor and expand tests for type utils (#8871)
* Refactored type utils tests

* minor changes
2025-02-18 11:54:37 +01:00
Sebastian Husch Lee
0c62087dd7
Make openai test more robust (#8872) 2025-02-18 11:38:16 +01:00
mathislucka
cd7c68372b
fix: ComponentTool description should not be truncated (#8870)
* fix: description should not be truncated

* chore: add release note
2025-02-18 11:23:26 +01:00
Hemanth Taduka
b5fb0d3ff8
fix: make pandas DataFrame optional in EvaluationRunResult (#8838)
* feat: AsyncPipeline that can schedule components to run concurrently (#8812)

* add component checks

* pipeline should run deterministically

* add FIFOQueue

* add agent tests

* add order dependent tests

* run new tests

* remove code that is not needed

* test: intermediate from cycle outputs are available outside cycle

* add tests for component checks (Claude)

* adapt tests for component checks (o1 review)

* chore: format

* remove tests that aren't needed anymore

* add _calculate_priority tests

* revert accidental change in pyproject.toml

* test format conversion

* adapt to naming convention

* chore: proper docstrings and type hints for PQ

* format

* add more unit tests

* rm unneeded comments

* test input consumption

* lint

* fix: docstrings

* lint

* format

* format

* fix license header

* fix license header

* add component run tests

* fix: pass correct input format to tracing

* fix types

* format

* format

* types

* add defaults from Socket instead of signature

- otherwise components with dynamic inputs would fail

* fix test names

* still wait for optional inputs on greedy variadic sockets

- mirrors previous behavior

* fix format

* wip: warn for ambiguous running order

* wip: alternative warning

* fix license header

* make code more readable

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>

* Introduce content tracing to a behavioral test

* Fixing linting

* Remove debug print statements

* Fix tracer tests

* remove print

* test: test for component inputs

* test: remove testing for run order

* chore: update component checks from experimental

* chore: update pipeline and base from experimental

* refactor: remove unused method

* refactor: remove unused method

* refactor: outdated comment

* refactor: inputs state is updated as side effect

- to prepare for AsyncPipeline implementation

* format

* test: add file conversion test

* format

* fix: original implementation deepcopies outputs

* lint

* fix: from_dict was updated

* fix: format

* fix: test

* test: add test for thread safety

* remove unused imports

* format

* test: FIFOPriorityQueue

* chore: add release note

* feat: add AsyncPipeline

* chore: Add release notes

* fix: format

* debug: switch run order to debug ubuntu and windows tests

* fix: consider priorities of other components while waiting for DEFER

* refactor: simplify code

* fix: resolve merge conflict with mermaid changes

* fix: format

* fix: remove unused import

* refactor: rename to avoid accidental conflicts

* fix: track pipeline type

* fix: and extend test

* fix: format

* style: sort alphabetically

* Update test/core/pipeline/features/conftest.py

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>

* Update test/core/pipeline/features/conftest.py

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>

* Update releasenotes/notes/feat-async-pipeline-338856a142e1318c.yaml

* fix: indentation, do not close loop

* fix: use asyncio.run

* fix: format

---------

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
Co-authored-by: David S. Batista <dsbatista@gmail.com>

* feat: AsyncPipeline that can schedule components to run concurrently (#8812)

* add component checks

* pipeline should run deterministically

* add FIFOQueue

* add agent tests

* add order dependent tests

* run new tests

* remove code that is not needed

* test: intermediate from cycle outputs are available outside cycle

* add tests for component checks (Claude)

* adapt tests for component checks (o1 review)

* chore: format

* remove tests that aren't needed anymore

* add _calculate_priority tests

* revert accidental change in pyproject.toml

* test format conversion

* adapt to naming convention

* chore: proper docstrings and type hints for PQ

* format

* add more unit tests

* rm unneeded comments

* test input consumption

* lint

* fix: docstrings

* lint

* format

* format

* fix license header

* fix license header

* add component run tests

* fix: pass correct input format to tracing

* fix types

* format

* format

* types

* add defaults from Socket instead of signature

- otherwise components with dynamic inputs would fail

* fix test names

* still wait for optional inputs on greedy variadic sockets

- mirrors previous behavior

* fix format

* wip: warn for ambiguous running order

* wip: alternative warning

* fix license header

* make code more readable

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>

* Introduce content tracing to a behavioral test

* Fixing linting

* Remove debug print statements

* Fix tracer tests

* remove print

* test: test for component inputs

* test: remove testing for run order

* chore: update component checks from experimental

* chore: update pipeline and base from experimental

* refactor: remove unused method

* refactor: remove unused method

* refactor: outdated comment

* refactor: inputs state is updated as side effect

- to prepare for AsyncPipeline implementation

* format

* test: add file conversion test

* format

* fix: original implementation deepcopies outputs

* lint

* fix: from_dict was updated

* fix: format

* fix: test

* test: add test for thread safety

* remove unused imports

* format

* test: FIFOPriorityQueue

* chore: add release note

* feat: add AsyncPipeline

* chore: Add release notes

* fix: format

* debug: switch run order to debug ubuntu and windows tests

* fix: consider priorities of other components while waiting for DEFER

* refactor: simplify code

* fix: resolve merge conflict with mermaid changes

* fix: format

* fix: remove unused import

* refactor: rename to avoid accidental conflicts

* fix: track pipeline type

* fix: and extend test

* fix: format

* style: sort alphabetically

* Update test/core/pipeline/features/conftest.py

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>

* Update test/core/pipeline/features/conftest.py

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>

* Update releasenotes/notes/feat-async-pipeline-338856a142e1318c.yaml

* fix: indentation, do not close loop

* fix: use asyncio.run

* fix: format

---------

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
Co-authored-by: David S. Batista <dsbatista@gmail.com>

* updated changes for refactoring evaluations without pandas package

* added release notes for eval_run_result.py for refactoring  EvaluationRunResult to work without pandas

* wip: cleaning and refactoring

* removing BaseEvaluationRunResult

* wip: fixing tests

* fixing tests and docstrings

* updating release notes

* fixing typing

* pylint fix

* adding deprecation warning

* fixing tests

* fixin types consistency

* adding stacklevel=2 to warning messages

* fixing docstrings

* fixing docstrings

* updating release notes

---------

Co-authored-by: mathislucka <mathis.lucka@gmail.com>
Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
Co-authored-by: David S. Batista <dsbatista@gmail.com>
2025-02-17 14:43:54 +01:00
Sebastian Husch Lee
2f383bce25
feat: Update list joiner (#8851)
* Update ListJoiner to have default type List

* Add reno

* Add more tests

* Remove unused import

* Fix mypy

* Update docstrings

* Update haystack/components/joiners/list_joiner.py

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>

---------

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
2025-02-14 09:47:19 +01:00
Ulises M
bfdad40a80
feat: Add ONNX & OpenVINO backend support, and torch dtype kwargs in Sentence Transformers Components (#8813)
* initial rough draft

* expose backend instead of extracting from model_kwargs

* explictly set backend model path

* add reno

* expose backend for ST diversity backend

* add dtype tests and expose kwargs to ST ranker for backend parameters

* skip dtype tests as torch isnt compiled with cuda

* add new openvino dependency release, unskip tests

* resolve suggestion

* mock calls, turn integrations into unit tests

* remove unnecessary test dependencies
2025-02-13 12:04:14 +01:00
Sebastian Husch Lee
71416c81bc
feat: Add store_full_path to converter (#8849)
* Add missing store_full_path to converter

* Add release note

* Fix pylint
2025-02-12 17:11:59 +01:00
Vladimir Blagojevic
a7c1661f13
fix: Look through all streaming chunks for tools calls (#8829)
* Look through all streaming chunks for tools calls

* Add reno note

* mypy fixes

* Improve robustness

* Don't concatenate, use the last value

* typing

* Update releasenotes/notes/improve-tool-call-chunk-search-986474e814af17a7.yaml

Co-authored-by: David S. Batista <dsbatista@gmail.com>

* Small refactoring

* isort

---------

Co-authored-by: David S. Batista <dsbatista@gmail.com>
2025-02-11 13:25:39 +01:00
Sebastian Husch Lee
2c0a72844f
Fix splitter when table is only one row wide (#8839) 2025-02-11 09:55:35 +00:00
David S. Batista
f189a1c349
fix: LLMMetadataExtractor removing from_dict/to_dict AWS tests (#8840)
* removint from_dict/to_dict AWS tests

* removing boto3 import from tests
2025-02-11 09:40:58 +00:00
Sebastian Husch Lee
f9e6e481a1
feat: Add new component CSVDocumentSplitter to recursively split CSV documents (#8815)
* CSV Document Splitter

* Add license header

* Add newline

* Add to docs

* Add lineterminator

* Updated csv splitter to allow user to specify to split by row, column or both

* Adding more tests

* Column tests

* Some refactoring to remove incorrect dropna call

* Fix

* More complicated test

* Adding more relevant metadata to match whats provided in our other splitters

* value error tests

* Fix mypy

* Docstring updates

* Add skip_blank_lines=False

* Add to dict test

* More from and to dict tests

* Fixes

* Move dict creation outside of for loop
2025-02-10 18:10:18 +01:00
David S. Batista
f798a9e935
feat: adding LLMMetadataExtractor (#8833)
* fixing linting

* adding release notes

* updating tests

* adding to pydocs

* fixing typing due to Optional

* fixing docstring
2025-02-10 16:54:25 +00:00
Vladimir Blagojevic
b6ebd3cd77
fix: Update OpenAPIServiceConnector to new ChatMessage (#8817)
* Update OpenAPIServiceConnector to new ChatMessage, bypass model response validation

* Add reno

* Lint fixes

* Add serde pipeline test

* Update haystack/components/connectors/openapi_service.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/connectors/openapi_service.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/connectors/openapi_service.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/connectors/openapi_service.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/connectors/openapi_service.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Add edge case unit test

---------

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2025-02-10 17:36:53 +01:00
Vladimir Blagojevic
fd5040108a
feat: Add OpenAPIConnector component, improve OpenAPI integration (#8808)
* Initial OpenAPIConnector

* Add reno note

* Format

* Add headers

* Add test dep

* Use haystack logger

* Fix test

* Minor fix, spin CI

* Update reno release note format

* Add to docs, pydocs improvements
2025-02-10 10:34:37 +01:00
Vladimir Blagojevic
73bfc08b71
feat: HuggingFaceLocalChatGenerator unified support for tools (#8827)
* Add tools to HuggingFaceLocalChatGenerator

* Add reno

* Fix types

* Small post merge fix

* Add unit tests

* Add tools serde and tests

* PR feedback

* PR feedback
2025-02-10 09:44:51 +01:00
mathislucka
e5b9bdeb66
feat: AsyncPipeline that can schedule components to run concurrently (#8812)
* add component checks

* pipeline should run deterministically

* add FIFOQueue

* add agent tests

* add order dependent tests

* run new tests

* remove code that is not needed

* test: intermediate from cycle outputs are available outside cycle

* add tests for component checks (Claude)

* adapt tests for component checks (o1 review)

* chore: format

* remove tests that aren't needed anymore

* add _calculate_priority tests

* revert accidental change in pyproject.toml

* test format conversion

* adapt to naming convention

* chore: proper docstrings and type hints for PQ

* format

* add more unit tests

* rm unneeded comments

* test input consumption

* lint

* fix: docstrings

* lint

* format

* format

* fix license header

* fix license header

* add component run tests

* fix: pass correct input format to tracing

* fix types

* format

* format

* types

* add defaults from Socket instead of signature

- otherwise components with dynamic inputs would fail

* fix test names

* still wait for optional inputs on greedy variadic sockets

- mirrors previous behavior

* fix format

* wip: warn for ambiguous running order

* wip: alternative warning

* fix license header

* make code more readable

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>

* Introduce content tracing to a behavioral test

* Fixing linting

* Remove debug print statements

* Fix tracer tests

* remove print

* test: test for component inputs

* test: remove testing for run order

* chore: update component checks from experimental

* chore: update pipeline and base from experimental

* refactor: remove unused method

* refactor: remove unused method

* refactor: outdated comment

* refactor: inputs state is updated as side effect

- to prepare for AsyncPipeline implementation

* format

* test: add file conversion test

* format

* fix: original implementation deepcopies outputs

* lint

* fix: from_dict was updated

* fix: format

* fix: test

* test: add test for thread safety

* remove unused imports

* format

* test: FIFOPriorityQueue

* chore: add release note

* feat: add AsyncPipeline

* chore: Add release notes

* fix: format

* debug: switch run order to debug ubuntu and windows tests

* fix: consider priorities of other components while waiting for DEFER

* refactor: simplify code

* fix: resolve merge conflict with mermaid changes

* fix: format

* fix: remove unused import

* refactor: rename to avoid accidental conflicts

* fix: track pipeline type

* fix: and extend test

* fix: format

* style: sort alphabetically

* Update test/core/pipeline/features/conftest.py

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>

* Update test/core/pipeline/features/conftest.py

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>

* Update releasenotes/notes/feat-async-pipeline-338856a142e1318c.yaml

* fix: indentation, do not close loop

* fix: use asyncio.run

* fix: format

---------

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
Co-authored-by: David S. Batista <dsbatista@gmail.com>
2025-02-07 16:37:29 +01:00
Sebastian Husch Lee
35788a2d06
feat: Update csv cleaner (#8828)
* More refactoring

* Add more new options and more tests

* Improve docstrings

* Add release notes

* Fix pylint
2025-02-07 14:29:53 +01:00
Sebastian Husch Lee
1785ea622e
feat: Add component CSVDocumentCleaner for removing empty rows and columns (#8816)
* Initial commit for csv cleaner

* Add release notes

* Update lineterminator

* Update releasenotes/notes/csv-document-cleaner-8eca67e884684c56.yaml

Co-authored-by: David S. Batista <dsbatista@gmail.com>

* alphabetize

* Use lazy import

* Some refactoring

* Some refactoring

---------

Co-authored-by: David S. Batista <dsbatista@gmail.com>
2025-02-06 17:56:38 +01:00
Stefano Fiorucci
1f257944a6
chore: fix Hugging Face components for mypy 1.15.0 (#8822)
* chore: fix Hugging Face components for mypy 1.15.0

* small fixes

* fix test

* rm print

* use cast and be more permissive
2025-02-06 16:25:59 +00:00
mathislucka
eec91824bc
fix: pipeline run bugs in cyclic and acyclic pipelines (#8707)
* add component checks

* pipeline should run deterministically

* add FIFOQueue

* add agent tests

* add order dependent tests

* run new tests

* remove code that is not needed

* test: intermediate from cycle outputs are available outside cycle

* add tests for component checks (Claude)

* adapt tests for component checks (o1 review)

* chore: format

* remove tests that aren't needed anymore

* add _calculate_priority tests

* revert accidental change in pyproject.toml

* test format conversion

* adapt to naming convention

* chore: proper docstrings and type hints for PQ

* format

* add more unit tests

* rm unneeded comments

* test input consumption

* lint

* fix: docstrings

* lint

* format

* format

* fix license header

* fix license header

* add component run tests

* fix: pass correct input format to tracing

* fix types

* format

* format

* types

* add defaults from Socket instead of signature

- otherwise components with dynamic inputs would fail

* fix test names

* still wait for optional inputs on greedy variadic sockets

- mirrors previous behavior

* fix format

* wip: warn for ambiguous running order

* wip: alternative warning

* fix license header

* make code more readable

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>

* Introduce content tracing to a behavioral test

* Fixing linting

* Remove debug print statements

* Fix tracer tests

* remove print

* test: test for component inputs

* test: remove testing for run order

* chore: update component checks from experimental

* chore: update pipeline and base from experimental

* refactor: remove unused method

* refactor: remove unused method

* refactor: outdated comment

* refactor: inputs state is updated as side effect

- to prepare for AsyncPipeline implementation

* format

* test: add file conversion test

* format

* fix: original implementation deepcopies outputs

* lint

* fix: from_dict was updated

* fix: format

* fix: test

* test: add test for thread safety

* remove unused imports

* format

* test: FIFOPriorityQueue

* chore: add release note

* fix: resolve merge conflict with mermaid changes

* fix: format

* fix: remove unused import

* refactor: rename to avoid accidental conflicts

* chore: remove unused inputs, add missing license header

* chore: extend release notes

* Update releasenotes/notes/fix-pipeline-run-2fefeafc705a6d91.yaml

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>

* fix: format

* fix: format

* Update release note

---------

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
Co-authored-by: David S. Batista <dsbatista@gmail.com>
2025-02-06 14:19:47 +00:00
Amna Mubashar
b0809b75f5
feat: Add a ListJoiner component (#8810)
* Add a ListJoiner

* Add tests and release notes
2025-02-05 23:19:14 +01:00
György Orosz
d2348ad462
feat: SentenceTransformersDocumentEmbedder and SentenceTransformersTextEmbedder can accept and pass any arguments to SentenceTransformer.encode (#8806)
* feat: SentenceTransformersDocumentEmbedder and SentenceTransformersTextEmbedder can accept and pass any arguments to SentenceTransformer.encode

* refactor: encode_kwargs parameter of SentenceTransformersDocumentEmbedder and SentenceTransformersTextEmbedder mae to be the last positional parameter for backward compatibility reasons

* docs: added explanation for encode_kwargs in SentenceTransformersTextEmbedder and SentenceTransformersDocumentEmbedder

* test: added tests for encode_kwargs in SentenceTransformersTextEmbedder and SentenceTransformersDocumentEmbedder

* doc: removed empty lines from docstrings of SentenceTransformersTextEmbedder and SentenceTransformersDocumentEmbedder

* refactor: encode_kwargs parameter of SentenceTransformersDocumentEmbedder and SentenceTransformersTextEmbedder mae to be the last positional parameter for backward compatibility (part II.)
2025-02-05 16:09:35 +00:00
Stefano Fiorucci
2828d9e4ae
refactor!: DOCXToDocument converter - store DOCX metadata as a dict (#8804)
* DOCXToDocument - store DOCX metadata as a dict

* do not export DOCXMetadata to converters package
2025-02-05 14:43:19 +01:00
Stefano Fiorucci
5ae94886b2
fix: fix test failures with Transformers models in PRs from forks (#8809)
* trigger

* try pinning sentence transformers

* make integr tests run right away

* pin transformers instead

* older transformers version

* rm transformers pin

* try ignoring cache

* change ubuntu version

* try removing token

* try again

* more HF_API_TOKEN local deletions

* restore test priority

* rm leftover

* more deletions

* moreee

* more

* deletions

* restore jobs order
2025-02-04 19:08:37 +01:00
Sebastian Husch Lee
1ee86b5041
fix: Fix filters to handle date times with timezones (loading and comparison) (#8800)
* Fix on date time parsing with timezones. And comparing naive and aware date times.

* Add release note

* Add more filter tests
2025-02-04 14:51:06 +01:00
Stefano Fiorucci
877f826da0
refactor: HF API Embedders - use InferenceClient.feature_extraction instead of InferenceClient.post (#8794)
* HF API Embedders: refactoring

* rename variables

* rm leftovers

* rm pin

* rm unused import

* relnote

* warning with truncate/normalize and serverless inference API

* test that warnings are raised
2025-02-03 15:11:16 +00:00
David S. Batista
f1652121ac
feat: Add support for custom (or offline) Mermaid.ink server and support all parameters (#8799)
* compress graph data to support pako endpoint

* support mermaid.ink parameters and custom servers

* dont try to resolve conflicts with the github web ui...

* avoid double graph copy

* fixing typing, improving docstrings and release notes

* reverting type

* nit - force type checker no cache

* nit - force type checker no cache

---------

Co-authored-by: Ulises M <ulises@lbux.org>
Co-authored-by: Ulises M <30765968+lbux@users.noreply.github.com>
2025-02-03 15:55:29 +01:00
mathislucka
1a91365cc8
fix: callables can be deserialized from fully qualified import path (#8788)
* fix: callables can be deserialized from fully qualified import path

* fix: license header

* fix: format

* fix: types

* fix? types

* test: extend test case

* format

* add release notes
2025-02-03 12:35:37 +01:00
Sebastian Husch Lee
bba84e5517
fix: Fix JSONConverter to properly skip files that are not utf-8 encoded (#8775)
* Small fix

* Add reno

* Trying out license header fix here
2025-01-28 10:29:55 +01:00
tstadel
3119ae1ec9
refactor: raise PipelineError when Pipeline.from_dict receives an invalid type (#8711)
* fix: error on invalid type

* add reno

* Update releasenotes/notes/fix-invalid-component-type-error-83ee00d820b63cc5.yaml

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

* Update test/core/pipeline/test_pipeline.py

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

* fix reno

* fix reno

* last reno fix

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2025-01-23 11:40:19 +00:00
tstadel
bf79f04932
feat: support streaming_callback as run param for HF Chat generators (#8763)
* feat: support streaming_callback as run param for HF Chat generators

* add tests
2025-01-23 12:14:32 +01:00
Stefano Fiorucci
c3d0643511
feat: AzureOpenAIChatGenerator - support for tools (#8757)
* feat: AzureOpenAIChatGenerator - support for tools

* release note

* feedback
2025-01-23 09:24:04 +00:00
Stefano Fiorucci
f96839e139
chore: update transformers test dependency (#8752)
* update transformers test dependency

* add pad_token_id to the mock tokenizer

* fix HFLocal test + new test
2025-01-21 14:43:27 +01:00
Nicola Procopio
542a7f7ef5
fix: update meta data before initializing new Document in DocumentSplitter (#8745)
* updated DocumentSplitter

issue #8741

* release note

* updated DocumentSplitter

in _create_docs_from_splits function initialize a new variable copied_mete instead to overwrite meta

* added test

test_duplicate_pages_get_different_doc_id

* fix fmt

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2025-01-20 09:51:47 +01:00
David S. Batista
5af2888e23
fix: PDFMinerToDocument convert function - adding double new lines between each container_text so that passages can be detected. (#8729)
* initial import

* adding double new lines between container_texts so that passages can be detected

* reducing type specification to avoid import error

* adding release notes

* renaming variable
2025-01-17 13:01:16 +00:00
Stefano Fiorucci
424bce2783
test: fix HF API flaky live test with tools (#8744)
* test: fix HF API flaky live test with tools

* rm print
2025-01-17 12:36:07 +00:00
David S. Batista
2c84266d8f
test: adding test for PyPDF to extract passages so that they are detect by DocumentSplitter (#8739) 2025-01-17 10:56:16 +01:00
Vladimir Blagojevic
21dd03d3e7
feat: Add completion start time timestamp to relevant generators (#8728)
* OpenAIChatGenerator - add completion_start_time

* HuggingFaceAPIChatGenerator - add completion_start_time

* Add tests

* Add reno note

* Relax condition for cached responses

* Add completion_start_time timestamping to non-chat generators

* Update haystack/components/generators/chat/hugging_face_api.py

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

* PR feedback

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2025-01-17 09:58:45 +01:00
Stefano Fiorucci
62ac27c947
chore: remove deprecated function ChatRole and from_function class method in ChatMessage (#8725)
* rm deprecated function role and from_function class method in chatmessage

* release note
2025-01-15 18:55:22 +01:00