4147 Commits

Author SHA1 Message Date
Stefano Fiorucci
c5cde40d3a
unpin ruff and update code (#9040) 2025-03-14 14:53:25 +00:00
Sebastian Husch Lee
6366f6577e
chore: Use thread safe import in import_class_by_name utility function (#9028)
* Use thread safe import

* fix debug string
2025-03-14 12:31:06 +01:00
Sebastian Husch Lee
3d7d65a260
Pin ruff (#9038) 2025-03-14 12:00:21 +01:00
Sebastian Husch Lee
363ac504dc
feat: Add warning to ChatPromptBuilder and PromptBuilder if they have variables, but required_variables is not set (#9027)
* Add warning to ChatPromptBuilder and PromptBuilder if they have variables, but required variables is not set.

* Add reno
2025-03-12 15:35:19 +01:00
Sebastian Husch Lee
4edefe3e56
Feat: Support Azure Workload Identity Credential (#9012)
* Start adding support for passing callable to Azure components

* Add to chat version

* Fix test

* Add reno

* Add support to azure doc and text embedder

* Rename

* update llm metadata extractor

* Add tests for text embedder

* Update tests

* Remove unused fixture and import

* Update reno
2025-03-12 13:45:40 +01:00
Stefano Fiorucci
4c1facdfab
fix: add dataframe to legacy fields for Document (#9026)
* fix: add dataframe to legacy fields for Document

* fmt

* small fixes
2025-03-12 13:01:03 +01:00
Sebastian Husch Lee
9905e9fa17
fix: Fix logging test (#9024)
* Pin structlog

* Fix test
2025-03-12 10:54:52 +01:00
Julian Risch
195d4031b9
chore: remove mention of 2.0 from banner (#9010)
* chore: remove mention of 2.0 from banner

* Update img alt text in readme

* Change the banner

---------

Co-authored-by: bilgeyucel <bilgeyucel96@gmail.com>
2025-03-11 09:53:51 +01:00
Mohammed Abdul Razak Wahab
7291134680
feat: Improve type validation for bare types (#8997)
* feat: Improve type validation for bare types

* added release notes

* refactor

* resolve review comments

* address review comments

* Update haystack/core/type_utils.py

---------

Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
2025-03-11 08:48:26 +01:00
Sebastian Husch Lee
3d41c04130
fix: Fix type serialization and deserialization (#8993)
* Expand tests

* New version of type serialization

* Adding more tests

* More tests

* Fix type serialization when using python 3.9

* Deserialization works with Optional now and we don't require 'typing.' to be present anymore

* Don't worry about Literal

* Add reno

* Fix mypy

* Pylint

* Add additional test

* Simplify

* Add back comment

* Fix types

* Fix
2025-03-07 11:10:16 +01:00
David S. Batista
672ab09477
fix: cleaning up InMemoryDocumentStore executor when created inside the class (#8994)
* cleaning up executor when created inside the class

* adding missed tests
2025-03-07 11:01:29 +01:00
David S. Batista
c037052581
feat: adding function to detect unmapped CID characters in PDFMinerToDocument (#8992)
* adding function to detect unmapped CID characters

* adding release notes

* adding test for logs
2025-03-06 15:44:06 +00:00
David S. Batista
4c9d08add5
feat: async support for the HuggingFaceLocalChatGenerator (#8981)
* adding async run method

* passing an optional ThreadExecutor

* adding tests

* adding release notes

* nit: license

* fixing linting

* Update releasenotes/notes/adding-async-huggingface-local-chat-generator-962512f52282d12d.yaml

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>

* Use Phi instead (#8982)

* build: drop Python 3.8 support (#8978)

* draft

* readd typing_extensions

* small fix + release note

* remove ruff target-version

* Update releasenotes/notes/drop-python-3.8-868710963e794c83.yaml

Co-authored-by: David S. Batista <dsbatista@gmail.com>

---------

Co-authored-by: David S. Batista <dsbatista@gmail.com>

* Update unstable version to 2.12.0-rc0 (#8983)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* fix: allow support for `include_usage` in streaming using OpenAIChatGenerator (#8968)

* fix error in handling usage completion chunk

* ci: improve release notes format checking (#8984)

* chore: fix invalid release note

* try improving relnote linting

* add relnotes path

* fix bad release note

* improve reno config

* fix: handle async tests in `HuggingFaceAPIChatGenerator` to prevent error (#8986)

* add missing asyncio

* explicitly close connection in the test

* Fix tests (#8990)

* docs: Update docstrings of `BranchJoiner` (#8988)

* Update docstrings

* Add a bit more explanatory text

* Add reno

* Update haystack/components/joiners/branch.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/joiners/branch.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/joiners/branch.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/joiners/branch.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Fix formatting

---------

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* PR comments

* destroying ThreadPoolExecutor when the generator instance is being destroyed, only if it was not passed externally

* fixing bug in streaming_callback

* PR comments

---------

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
Co-authored-by: Haystack Bot <73523382+HaystackBot@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2025-03-06 15:57:11 +01:00
Sebastian Husch Lee
c4fafd9b04
docs: Update docstrings of BranchJoiner (#8988)
* Update docstrings

* Add a bit more explanatory text

* Add reno

* Update haystack/components/joiners/branch.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/joiners/branch.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/joiners/branch.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/joiners/branch.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Fix formatting

---------

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2025-03-06 13:08:07 +01:00
Sebastian Husch Lee
24084e6431
Fix tests (#8990) 2025-03-06 11:17:57 +01:00
Amna Mubashar
ae26e7580b
fix: handle async tests in HuggingFaceAPIChatGenerator to prevent error (#8986)
* add missing asyncio

* explicitly close connection in the test
2025-03-06 10:55:01 +01:00
Stefano Fiorucci
40798bc4f2
ci: improve release notes format checking (#8984)
* chore: fix invalid release note

* try improving relnote linting

* add relnotes path

* fix bad release note

* improve reno config
2025-03-05 19:07:46 +01:00
Amna Mubashar
13c3768d49
fix: allow support for include_usage in streaming using OpenAIChatGenerator (#8968)
* fix error in handling usage completion chunk
2025-03-05 18:30:26 +01:00
Haystack Bot
ab67a76ecd
Update unstable version to 2.12.0-rc0 (#8983)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-03-05 16:15:09 +01:00
Stefano Fiorucci
c04c900f26
build: drop Python 3.8 support (#8978)
* draft

* readd typing_extensions

* small fix + release note

* remove ruff target-version

* Update releasenotes/notes/drop-python-3.8-868710963e794c83.yaml

Co-authored-by: David S. Batista <dsbatista@gmail.com>

---------

Co-authored-by: David S. Batista <dsbatista@gmail.com>
v2.12.0-rc0
2025-03-05 14:59:56 +00:00
Sebastian Husch Lee
4a87ceb0ed
Use Phi instead (#8982) 2025-03-05 15:53:26 +01:00
Sebastian Husch Lee
f741df88df
fix: Update flaky HuggingFace Generator tests to use more reliable model and add instruction tokens (#8980)
* Fix test

* Make other HF tests more reliable

* Add back test
2025-03-05 15:26:17 +01:00
Stefano Fiorucci
ec97f4d991
update transformers test dependency to 4.48.3 (#8979) 2025-03-05 14:49:34 +01:00
Julian Risch
b77f2bad79
feat: Add async run to DocumentWriter (#8962)
* add async run to DocumentWriter

* reno
2025-03-05 11:53:35 +01:00
Stefano Fiorucci
bb0e36f712
feat: increase Mermaid timeout and make it configurable (#8973)
* increase Mermaid timeout and make it configurable

* rm e2e trigger

* simplify test
2025-03-05 10:49:34 +00:00
David S. Batista
9581fea3bc
feat: adding async version of InMemoryDocumentStore and associated retrievers (#8963)
* adding classes from experimental

* adding release notes

* adding tests

* merging all into a single class

* adding async retriever methods

* Update haystack/document_stores/in_memory/document_store.py

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

* adding missed tests

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2025-03-05 11:36:24 +01:00
Stefano Fiorucci
9da6696a45
chore: make openapi-llm an optional dependency (#8958)
* openapi-llm should be an optional dependency

* rm empty line
2025-03-05 11:15:19 +01:00
Stefano Fiorucci
10f11d40d4
build: support python 3.13 (#8965)
* support python 3.13

* release note

* add python version info to contributing guide

* better explanation
2025-03-05 09:49:10 +00:00
Mohammed Abdul Razak Wahab
e33a9e46ed
fix: add chat message name field (#8969)
* add-chat-message-name-field

* add release notes

* Update add-chat-message-name-field-a8ae96fb9ff13f7b.yaml

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2025-03-05 09:29:36 +00:00
Julian Risch
830e7497c3
chore: update EvalRunResult deprecation warning to 2.12 (#8957)
* chore: update EvalRunResult deprecation warning to 2.12

* Update warning message

* format
2025-03-04 11:38:18 +00:00
Stefano Fiorucci
f3c44be904
refactor!: remove dataframe field from Document and ExtractedTableAnswer; make pandas optional (#8906)
* remove dataframe

* release note

* small fix

* group imports

* Update pyproject.toml

Co-authored-by: Julian Risch <julian.risch@deepset.ai>

* Update pyproject.toml

Co-authored-by: Julian Risch <julian.risch@deepset.ai>

* address feedback

---------

Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2025-03-04 11:06:07 +00:00
Mohammed Abdul Razak Wahab
0d65b4caa7
feat: Enhance error handling in Azure document embedder (#8941)
* feat: Enhance error handling in Azure document embedder

* add release notes

* address review comments

* Update releasenotes/notes/add-azure-embedder-exception-handler-c10ea46fb536de3b.yaml

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

* more alignment with OpenAI impl

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2025-03-04 11:16:08 +01:00
Amna Mubashar
28db039bca
feat: add run_async to HuggingFaceAPIChatGenerator (#8943)
* add run_async

* add release notes

* Add integration test
2025-03-03 16:51:30 +01:00
Amna Mubashar
1b2053b358
Small fix in the docstring example (#8950) 2025-03-03 16:26:42 +01:00
tstadel
13968cc15b
fix: in OpenAIChatGenerator set additionalProperties to False when tools_strict=True (#8913)
* fix: set ComponentTool additionalProperties for OpenAI tools_strict=True

* add reno

* Move the additionalProperties into the OpenAIChatGenerator

* Remove

* Put additionalProperties into the correct place

* Fix test

* Update releasenotes/notes/fix-componenttool-for-openai-tools_strict-998e5cd7ebc6ec19.yaml

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

---------

Co-authored-by: Sebastian Husch Lee <sebastian.lee@deepset.ai>
Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2025-03-03 16:23:24 +01:00
Sebastian Husch Lee
296e31c182
feat: Add Type Validation parameter for Pipeline Connections (#8875)
* Starting to refactor type util tests to be more systematic

* refactoring

* Expand tests

* Update to type utils

* Add missing subclass check

* Expand and refactor tests, introduce type_validation Literal

* More test refactoring

* Test refactoring, adding type validation variable to pipeline base

* Update relaxed version of type checking to pass all newly added tests

* trim whitespace

* Add tests

* cleanup

* Updates docstrings

* Add reno

* docs

* Fix mypy and add docstrings

* Changes based on advice from Tobi

* Remove unused imports

* Doc strings

* Add connection type validation to to_dict and from_dict

* Update tests

* Fix test

* Also save connection_type_validation at global pipeline level

* Fix tests

* Remove connection type validation from the connect level, only keep at pipeline level

* Formatting

* Fix tests

* formatting
2025-03-03 16:00:22 +01:00
Sebastian Husch Lee
00fe4d157d
feat: Add run async for AzureOpenAIChatGenerator (#8948)
* Add tests for run_async

* Add reno

* Add async client

* Add init test

* Add comment

* Fix test

* Update releasenotes/notes/run-async-azure-54450f0c2495f5c8.yaml

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>

---------

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
2025-03-03 14:17:18 +00:00
Sebastian Husch Lee
52a028251c
refactor!: update AzureOCRDocumentConverter to not use the dataframe field for tabular Documents (#8885)
* Save document as a csv table now

* Fix tests

* Fix tests

* Add reno
2025-03-03 12:45:02 +00:00
Michele Pangrazzi
209e6d5ff0
remove duplicate test (#8944) 2025-02-28 13:27:43 +00:00
mathislucka
ee81570f37
fix: only overwrite existing socket inputs when we provide a new value (#8940)
* fix: only overwrite existing socket inputs when we provide a new value

* chore: add release notes

* Apply suggestions from code review

---------

Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2025-02-27 09:13:41 +00:00
Michele Pangrazzi
db4f23771a
Avoid mutating self.routes in ConditionalRouter to_dict method (#8936)
* Avoid mutating self.routes in ConditionalRouter to_dict method

* Add release note

* Update releasenotes/notes/fix-conditional-router-to-dict-5af887da50effe11.yaml

Co-authored-by: David S. Batista <dsbatista@gmail.com>

* Make test_router_to_dict_does_not_mutate_routes more robust (add another roundtrip)

---------

Co-authored-by: David S. Batista <dsbatista@gmail.com>
2025-02-26 12:34:35 +01:00
Michele Pangrazzi
d1e503e5c7
skip HF API integration test (#8938) 2025-02-26 12:10:54 +01:00
Julian Risch
6652dd7550
Revert "test: skip HF API live integration tests (#8889)" (#8914)
* Revert "test: skip HF API live integration tests (#8889)"

This reverts commit 56a3a9bd61b7391ae91e3d8179b3b33918ef4932.

* Replace zephyr-7b-beta model with SmolLM2-1.7B-Instruct

* Use zephyr-7b-beta model but extend instructions

---------

Co-authored-by: David S. Batista <dsbatista@gmail.com>
2025-02-25 09:03:20 +01:00
mathislucka
76753fd4c6
fix: reduce number of edge cases where lazy variadic components wait for inputs that can't arrive anymore (#8907)
* wip

* fix: running order with lazy variadic components

* fix: tests

* format

* comment

* fix: alternative approach to fixing running order

* unused imports

* revert fix

* remove unneeded return

* remove data based approach to tie breaking

* release note

* trailing spaces

* newline eof

* unused import

* add more explanations to release note
2025-02-24 15:17:17 +00:00
mathislucka
a902af1db2
fix: set tags before context manager closes (#8911) 2025-02-24 12:34:49 +01:00
Sebastian Husch Lee
af3c89a257
feat: In FileTypeRouter add .msg to "application/vnd.ms-outlook" mapping (#8910)
* Add .msg mimetype support in file type router

* Add reno

* Update tests
2025-02-24 09:10:17 +01:00
Sebastian Husch Lee
99a998f90b
feat: Add MSGToDocument converter (#8868)
* Initial commit of MSG converter from Bijay

* Updates to the MSG converter

* Add license header

* Add tests for msg converter

* Update converter

* Expanding tests

* Update docstrings

* add license header

* Add reno

* Add to inits and pydocs

* Add test for empty input

* Fix types

* Fix mypy

---------

Co-authored-by: Bijay Gurung <bijay.learning@gmail.com>
2025-02-24 08:12:32 +01:00
Tobias Wochinger
d7dfc5222c
fix: right tracer import path (#8908) 2025-02-21 20:04:54 +01:00
Sebastian Husch Lee
a516672cfb
fix: Fix data dog tracing (#8900)
* Fix data dog tracing

* Add reno

* Update imports

* Fix
2025-02-21 14:35:04 +01:00
Stefano Fiorucci
3339097e99
ci: refactor job to check imports (#8892)
* refactor

* Trigger CI

* run tests if this file changes

* show failure

* revert

* rm duplicate subdir and explain in comment
2025-02-21 11:37:41 +01:00