8 Commits

Author SHA1 Message Date
David S. Batista
da60156174
chore: removing unused imports from tests (#9446) 2025-05-26 16:22:51 +00:00
Seth Peters
f025501792
fix: LLMMetadataExtractor bug in handling Document objects with no content
* test(extractors): Add unit test for LLMMetadataExtractor with no content

Adds a new unit test `test_run_with_document_content_none` to `TestLLMMetadataExtractor`.

This test verifies that `LLMMetadataExtractor` correctly handles documents where `document.content` is None or an empty string.

It ensures that:

- Such documents are added to the `failed_documents` list.

- The correct error message ("Document has no content, skipping LLM call.") is present in their metadata.

- No actual LLM call is attempted for these documents.

This test provides coverage for the fix that prevents an AttributeError when processing documents with no content.
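
The behaviour this test checks can be sketched roughly as follows. This is a minimal, self-contained illustration and not the Haystack implementation: `Doc`, `extract_metadata`, `fake_llm`, and the `metadata_extraction_error` meta key are assumptions made for the example. Only the error message string and the `failed_documents` name are taken from the commit message above.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Error message quoted in the commit message.
NO_CONTENT_ERROR = "Document has no content, skipping LLM call."


@dataclass
class Doc:
    """Hypothetical stand-in for a document with optional content."""
    content: Optional[str]
    meta: dict = field(default_factory=dict)


def extract_metadata(
    docs: List[Doc], llm_call: Callable[[str], dict]
) -> Dict[str, List[Doc]]:
    """Split documents into processed and failed ones (illustrative only)."""
    processed, failed = [], []
    for doc in docs:
        # None and "" are both falsy, so one check covers both cases
        # and avoids touching attributes of a missing content string.
        if not doc.content:
            # Assumed meta key name, chosen for this sketch.
            doc.meta["metadata_extraction_error"] = NO_CONTENT_ERROR
            failed.append(doc)
            continue  # no LLM call for this document
        doc.meta.update(llm_call(doc.content))
        processed.append(doc)
    return {"documents": processed, "failed_documents": failed}


# Usage: a fake LLM that records every call it receives.
calls: List[str] = []


def fake_llm(text: str) -> dict:
    calls.append(text)
    return {"title": text.upper()}


result = extract_metadata([Doc(None), Doc(""), Doc("hello")], fake_llm)
```

Recording the calls in a list is one simple way to assert that no LLM call was attempted for the empty documents; a mock object would serve the same purpose.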

* chore: update comment to reflect new behavior in _run_on_thread method

* docs: Add release note for LLMMetadataExtractor no content fix

* Update releasenotes/notes/fix-llm-metadata-extractor-no-content-910067ea72094f18.yaml

* Update fix-llm-metadata-extractor-no-content-910067ea72094f18.yaml

---------

Co-authored-by: David S. Batista <dsbatista@gmail.com>
2025-05-23 18:57:39 +02:00
Stefano Fiorucci
dcba774e30
chore: LLMMetadataExtractor - remove deprecated parameters (#9218) 2025-04-11 15:50:52 +02:00
Bilge Yücel
d977b262b6
replace all gpt-3.5-turbo with gpt-4o-mini (#9165) 2025-04-04 12:07:55 +02:00
Stefano Fiorucci
6db8f0a40d
refactor: LLMMetadataExtractor - adopt ChatGenerator protocol: deprecate generator_api, generator_api_params and LLMProvider (#9099)
* draft

* improvements + tests

* release note

* mypy fixes

* improve relnote

* serialize chat_generator only

* small simplification

* clarify that also LLMProvider is deprecated

* revert from_dict

* test_from_dict_openai_using_chat_generator
2025-03-24 17:38:09 +00:00
Sebastian Husch Lee
4edefe3e56
Feat: Support Azure Workload Identity Credential (#9012)
* Start adding support for passing callable to Azure components

* Add to chat version

* Fix test

* Add reno

* Add support to azure doc and text embedder

* Rename

* update llm metadata extractor

* Add tests for text embedder

* Update tests

* Remove unused fixture and import

* Update reno
2025-03-12 13:45:40 +01:00
David S. Batista
f189a1c349
fix: LLMMetadataExtractor removing from_dict/to_dict AWS tests (#8840)
* removing from_dict/to_dict AWS tests

* removing boto3 import from tests
2025-02-11 09:40:58 +00:00
David S. Batista
f798a9e935
feat: adding LLMMetadataExtractor (#8833)
* fixing linting

* adding release notes

* updating tests

* adding to pydocs

* fixing typing due to Optional

* fixing docstring
2025-02-10 16:54:25 +00:00