* Fixes to setting StreamingChunk.index properly and refactoring tests for conversion
* Make _convert_chat_completion_chunk_to_streaming_chunk a member of OpenAIChatGenerator so we can overwrite it in integrations that inherit from it
* Fixes
* Modify streaming chunk to accept a list of tool call deltas.
* Fix tests
* Fix mypy and update original reno
* Undo change
* Update conversion to return a single streaming chunk
* update to print streaming chunk
* Fix types
* PR comments
* Fixes and tests
* Add reno
* Change variable name
* Add test and fix for passing streaming_callback to a component tool
* Add unit test
* Remove unused import
* Fix reno
* Start expanding StreamingChunk
* First pass at expanding Streaming Chunk
* Working version!
* Some tweaks and also make ToolInvoker stream a chunk with a finish reason
* Properly update test
* Change to tool_name, remove kw_only since it's Python 3.10+ only, and update HuggingFaceAPIChatGenerator to start following the new StreamingChunk
* Add reno
* Some cleanup
* Fix unit tests
* Fix mypy and integration test
* Fix pylint
* Start refactoring huggingface local api
* Refactor openai generator and chat generator to reuse util methods
* Did some reorg
* Reuse utility method in HuggingFaceAPI
* Get rid of unneeded default values in tests
* Update conversion of streaming chunks to chat message to not rely on openai dataclass anymore
* Fix tests and loosen check in StreamingChunk post_init
* Fixes
* Fix license header
* Add start and index to HFAPIGenerator
* Fix mypy
* Clean up
* Update haystack/components/generators/utils.py
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
* Update haystack/components/generators/utils.py
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
* Change StreamingChunk.start to only a bool
* PR comments
* Fix unit test
* PR comment
* Fix test
---------
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
* Refactor HFAPI Chat Generator
* Add component info to generators
* Fix type hint
* Add reno
* Fix unit tests
* Remove incorrect dev comment
* Move _convert_streaming_chunks_to_chat_message to utils file
* feat(component.rankers): Add HuggingFace API (text-embeddings-inference for rerank) ranker component
* update test flow & doc loaders
* Support run_async for HuggingFaceAPIRanker
* Add release note for HuggingFace API support in component.rankers
* fix:
1. `hugging_face_api.HuggingFaceAPIRanker` rename to `hugging_face_tei.HuggingFaceAPIRanker`
2. HuggingFaceAPIRanker: use our Secret API for token
3. add the missing modules for `docs/pydoc/config/rankers_api.yml`
4. added function `async_request_with_retry` for `haystack/utils/requests_utils.py` and added unittest on `test/utils/test_requests_utils.py`
5. HuggingFaceAPIRanker: refactor the retry function to support configuration based on attempts and status code.
6. HuggingFaceAPIRanker: refactor the test into unit tests using mocks
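
A minimal sketch of retrying an async request based on a maximum number of attempts and a set of retryable status codes, as items 4 and 5 describe. This is not the signature of `async_request_with_retry`; the names `retry_on_status`, `send`, `attempts`, `status_codes_to_retry`, and `make_flaky_sender` are assumptions made for illustration, and the HTTP call is replaced by a fake sender so the example runs offline.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List, Tuple

Response = Tuple[int, str]  # (status code, body); hypothetical stand-in for a real response


async def retry_on_status(
    send: Callable[[], Awaitable[Response]],
    *,
    attempts: int = 3,
    status_codes_to_retry: Iterable[int] = (408, 429, 503),
    delay: float = 0.0,
) -> Response:
    """Call `send` until the status is not retryable or the attempts run out."""
    retryable = set(status_codes_to_retry)
    response = await send()
    for _ in range(attempts - 1):
        if response[0] not in retryable:
            break
        await asyncio.sleep(delay)  # fixed delay; a real helper would likely back off
        response = await send()
    return response


def make_flaky_sender(statuses: List[int]):
    """Build a fake sender that returns the given statuses in order and records each call."""
    remaining = list(statuses)
    calls: List[int] = []

    async def send() -> Response:
        status = remaining.pop(0)
        calls.append(status)
        return status, ("ok" if status == 200 else "error")

    return send, calls


if __name__ == "__main__":
    sender, seen = make_flaky_sender([503, 503, 200])
    print(asyncio.run(retry_on_status(sender, attempts=3)), seen)
```

Passing the fake sender in as a callable mirrors the mock-based unit tests mentioned in item 6: the retry policy can be checked without any network access.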
* fix(HuggingFaceTEIRanker): change the token check logic to use the resolve_value method.
* fix(format): run `hatch run format`
* fix:
- Force keyword-only arguments in __init__ method by adding *,
- Clarify token docstring that it's not always required
- Copy documents to avoid modifying original objects
- Remove test file from slow workflow
- Add monkeypatch environment variable cleanup in tests
- Fix missing module in rankers_api.yml and sort modules alphabetically
- Remove unnecessary test info from release notes
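
Two of the fixes above (keyword-only `__init__` arguments via `*`, and copying documents instead of mutating the caller's objects) can be sketched as follows. `Doc` and `SketchRanker` are hypothetical stand-ins written for this example, not the actual Haystack classes, and the scores are passed in directly rather than fetched from an API.

```python
import copy
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Doc:
    """Hypothetical stand-in for a document with a score."""
    content: str
    score: Optional[float] = None
    meta: Dict[str, Any] = field(default_factory=dict)


class SketchRanker:
    # The bare `*` forces every argument after it to be passed by keyword.
    def __init__(self, *, url: str, top_k: int = 10) -> None:
        self.url = url
        self.top_k = top_k

    def rank(self, documents: List[Doc], scores: List[float]) -> List[Doc]:
        """Return scored copies sorted by descending score; inputs stay untouched."""
        ranked = []
        for doc, score in zip(documents, scores):
            doc_copy = copy.deepcopy(doc)  # avoid modifying the caller's object
            doc_copy.score = score
            ranked.append(doc_copy)
        ranked.sort(key=lambda d: d.score, reverse=True)
        return ranked[: self.top_k]


if __name__ == "__main__":
    docs = [Doc("a"), Doc("b")]
    ranker = SketchRanker(url="http://localhost:8080", top_k=1)
    print(ranker.rank(docs, [0.1, 0.9]), docs)
```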
* fix HuggingFaceTEIRanker:
- "None" of "Optional[Secret]" has no attribute "resolve_value"
- run/run_async: too many parameters
* fix(HuggingFaceTEIRanker): Revise the docstring of the HuggingFaceTEIRanker, improve the parameter descriptions, and ensure consistency and clarity. Add error handling information to enhance the readability of the API response.
* fix: unit test for HuggingFaceTEIRanker raise message
* fix fmt
* minor refinements
* refine release note
---------
Co-authored-by: anakin87 <stefanofiorucci@gmail.com>
* test(extractors): Add unit test for LLMMetadataExtractor with no content
Adds a new unit test `test_run_with_document_content_none` to `TestLLMMetadataExtractor`.
This test verifies that `LLMMetadataExtractor` correctly handles documents where `document.content` is None or an empty string.
It ensures that:
- Such documents are added to the `failed_documents` list.
- The correct error message ("Document has no content, skipping LLM call.") is present in their metadata.
- No actual LLM call is attempted for these documents.
This test provides coverage for the fix that prevents an AttributeError when processing documents with no content.
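
The behavior this test checks can be sketched as below, assuming a simplified extractor. `Doc`, `extract`, `llm_call`, and the `"error"` metadata key are hypothetical names chosen for this example; only the error message string is taken from the description above.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional, Tuple

NO_CONTENT_ERROR = "Document has no content, skipping LLM call."


@dataclass
class Doc:
    """Hypothetical stand-in for a document."""
    content: Optional[str]
    meta: Dict[str, Any] = field(default_factory=dict)


def extract(documents: List[Doc], llm_call: Callable[[str], str]) -> Tuple[List[Doc], List[Doc]]:
    """Split documents into (processed, failed) without calling the LLM on empty content."""
    processed: List[Doc] = []
    failed: List[Doc] = []
    for doc in documents:
        if not doc.content:  # covers both None and the empty string
            failed.append(Doc(doc.content, {**doc.meta, "error": NO_CONTENT_ERROR}))
            continue
        processed.append(Doc(doc.content, {**doc.meta, "extracted": llm_call(doc.content)}))
    return processed, failed


if __name__ == "__main__":
    calls: List[str] = []

    def fake_llm(text: str) -> str:
        calls.append(text)
        return text.upper()

    done, bad = extract([Doc("hello"), Doc(None), Doc("")], fake_llm)
    print(len(done), len(bad), calls)
```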
* chore: update comment to reflect new behavior in _run_on_thread method
* docs: Add release note for LLMMetadataExtractor no content fix
* Update releasenotes/notes/fix-llm-metadata-extractor-no-content-910067ea72094f18.yaml
* Update fix-llm-metadata-extractor-no-content-910067ea72094f18.yaml
---------
Co-authored-by: David S. Batista <dsbatista@gmail.com>
* Add serialization to State
* Add release notes
* Deprecate State in dataclasses
* Fix tests
* Remove state_utils test
* Fix linting
* Fix formatting
* Update tests and remove old state utils
* Update agents test
* Update deserialization per review
* Linting
* Add tests for edge case (custom class types)
* Fix type serialization
* PR comments
* Move State to agents
* Fix tests
* Update utils init
* Improve serialization/deserialization
* Update the release notes
* Minor fix in docstrings
* PR comments
* Add deprecation warning for state utils
* Recreate the serialization methods to use schema
* Update key names
* Make serialization methods private
* Starting property schema refactor
* Adding more tests
* More tests
* Handle null type explicitly
* More updates of tests to accommodate Optional properly
* Fix more tests
* Remove unnecessary check
* Some cleanup
* Update test
* Add reno
* Fix typing
* Add license header
* Use docstrings of dataclasses in parameter spec generation
* More tests of Haystack dataclass types
* Properly handle Sequence
* Fix license header
* Update OpenAI tests to add more complicated tool parameter signature
* Properly set required for dataclasses
* Add integration test for azure that includes additionalProperties
* Add more complicated integration test for HuggingFaceAPIChatGenerator
* Alternate approach using pydantic like we do in from_function.py
* Cleanup and fix other affected tests
* Fix mypy
* PR comments
* PR comment
* Remove test from HF API
* Update reno
* Update reno
* fix: make QUOTE_SPANS_RE regex ReDoS-safe
* Removing the capture of the leading non-character on double quotes, allowing quotes with newlines, adding tests
* cleaning
* fixing release notes
* changing import
* adding test for Regex Denial of Service (ReDoS)
* reducing the size/time of tests
* Update test/components/preprocessors/test_sentence_tokenizer.py
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
* Update test/components/preprocessors/test_sentence_tokenizer.py
---------
Co-authored-by: Waivey <waivey@proton.me>
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
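
For the ReDoS fix to QUOTE_SPANS_RE above, one common way to keep a quote-matching pattern safe is sketched here. This is an illustration only, not the pattern used in the sentence tokenizer; `SKETCH_QUOTE_RE` and `quote_spans` are hypothetical names. A single negated character class with no nested quantifiers matches a double-quoted span, including spans that contain newlines, and an unterminated quote fails after one scan instead of triggering repeated backtracking.

```python
import re
from typing import List, Tuple

# One negated character class, no nested or overlapping quantifiers.
# [^"] also matches newlines, so quotes may span several lines.
SKETCH_QUOTE_RE = re.compile(r'"[^"]*"')


def quote_spans(text: str) -> List[Tuple[int, int]]:
    """Return (start, end) offsets of every double-quoted span in `text`."""
    return [match.span() for match in SKETCH_QUOTE_RE.finditer(text)]


if __name__ == "__main__":
    sample = 'He said "hi" and then "bye\nnow".'
    print([sample[start:end] for start, end in quote_spans(sample)])
    # An unterminated quote followed by a long run yields no spans.
    print(quote_spans('"' + "a" * 50_000))
```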
* feat: Add sanitization for Meta field during serialization
* Revert "feat: Add sanitization for Meta field during serialization"
This reverts commit c529f7c25b69aed626bb2072c8bf171815b591cc.
* feat: add nested serialization in openai usage object
* add reno
* add nested serialization in OpenAIChatGenerator
* Update releasenotes/notes/nested-serialization-openai-usage-object-3817b07342999edf.yaml
Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
* merge tests
* Adjust the test
---------
Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>