haystack

mirror of https://github.com/deepset-ai/haystack.git synced 2026-01-08 13:06:29 +00:00

Author	SHA1	Message	Date
Julian Risch	f687d49fec	feat: Add option to split by number of tokens to RecursiveDocumentSplitter (#9143 ) * add token split_unit * fix overlap with fallback * reno * mark as integration tests * use type ignore instead of assert * Update releasenotes/notes/recursive-splitter-token-df56428887ac45bd.yaml Co-authored-by: David S. Batista <dsbatista@gmail.com> --------- Co-authored-by: David S. Batista <dsbatista@gmail.com>	2025-04-01 09:48:59 +02:00
Vladimir Blagojevic	13941d8bd9	feat: LinkContentFetcher - replace requests with httpx, add async and http/2 (#9034 ) * LinkContentFetcher - replace requests with httpx, add async and http/2 * Update haystack/components/fetchers/link_content.py Co-authored-by: Julian Risch <julian.risch@deepset.ai> * Update haystack/components/fetchers/link_content.py Co-authored-by: Julian Risch <julian.risch@deepset.ai> * PR feedback * Merge sync and async --------- Co-authored-by: Julian Risch <julian.risch@deepset.ai>	2025-03-26 14:55:08 +01:00
Stefano Fiorucci	c5cde40d3a	unpin ruff and update code (#9040 )	2025-03-14 14:53:25 +00:00
Sebastian Husch Lee	3d7d65a260	Pin ruff (#9038 )	2025-03-14 12:00:21 +01:00
Sebastian Husch Lee	4edefe3e56	Feat: Support Azure Workload Identity Credential (#9012 ) * Start adding support for passing callable to Azure components * Add to chat version * Fix test * Add reno * Add support to azure doc and text embedder * Rename * update llm metadata extractor * Add tests for text embedder * Update tests * Remove unused fixture and import * Update reno	2025-03-12 13:45:40 +01:00
Stefano Fiorucci	c04c900f26	build: drop Python 3.8 support (#8978 ) * draft * readd typing_extensions * small fix + release note * remove ruff target-version * Update releasenotes/notes/drop-python-3.8-868710963e794c83.yaml Co-authored-by: David S. Batista <dsbatista@gmail.com> --------- Co-authored-by: David S. Batista <dsbatista@gmail.com>	2025-03-05 14:59:56 +00:00
Stefano Fiorucci	ec97f4d991	update transformers test dependency to 4.48.3 (#8979 )	2025-03-05 14:49:34 +01:00
Stefano Fiorucci	9da6696a45	chore: make `openapi-llm` an optional dependency (#8958 ) * openapi-llm should be and optional dependency * rm empty line	2025-03-05 11:15:19 +01:00
Stefano Fiorucci	10f11d40d4	build: support python 3.13 (#8965 ) * support python 3.13 * release note * add python version info to contributing guide * better explanation	2025-03-05 09:49:10 +00:00
Stefano Fiorucci	f3c44be904	refactor!: remove `dataframe` field from `Document` and `ExtractedTableAnswer`; make `pandas` optional (#8906 ) * remove dataframe * release note * small fix * group imports * Update pyproject.toml Co-authored-by: Julian Risch <julian.risch@deepset.ai> * Update pyproject.toml Co-authored-by: Julian Risch <julian.risch@deepset.ai> * address feedback --------- Co-authored-by: Julian Risch <julian.risch@deepset.ai>	2025-03-04 11:06:07 +00:00
Amna Mubashar	28db039bca	feat: add run_async to `HuggingfaceAPIChatGenerator` (#8943 ) * add run_async * add release notes * Add integration test	2025-03-03 16:51:30 +01:00
Sebastian Husch Lee	99a998f90b	feat: Add MSGToDocument converter (#8868 ) * Initial commit of MSG converter from Bijay * Updates to the MSG converter * Add license header * Add tests for msg converter * Update converter * Expanding tests * Update docstrings * add license header * Add reno * Add to inits and pydocs * Add test for empty input * Fix types * Fix mypy --------- Co-authored-by: Bijay Gurung <bijay.learning@gmail.com>	2025-02-24 08:12:32 +01:00
Sebastian Husch Lee	a516672cfb	fix: Fix data dog tracing (#8900 ) * Fix data dog tracing * Add reno * Update imports * Fix	2025-02-21 14:35:04 +01:00
Stefano Fiorucci	04c6136cc4	relax posthog pin (#8898 )	2025-02-21 10:49:29 +01:00
Stefano Fiorucci	fcca7104d3	pin ddtrace<3.0.0 (#8897 )	2025-02-21 08:14:41 +00:00
Michele Pangrazzi	44fb20c2d5	Add `run_async` to `OpenAIChatGenerator` (#8880 ) * Implememntation of run_async (wip) * Add missing tests ; Move async tests to test_openai_async.py * Add release note * Update docstring * Alignments with haystack-experimental implementation * Lint: removed unused imports * Update haystack/components/generators/chat/openai.py Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com> --------- Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>	2025-02-20 16:51:46 +00:00
mathislucka	8c54f06a19	fix: component checks failing for components that return dataframes (#8873 ) * fix: use is not to compare to sentinel value * chore: release notes * Update releasenotes/notes/fix-component-checks-with-ambiguous-truth-values-949c447b3702e427.yaml Co-authored-by: David S. Batista <dsbatista@gmail.com> * fix: another sentinel value * test: also test base class * add pandas as test dependency * format * Trigger CI * mark test with xfail strict=False --------- Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com> Co-authored-by: David S. Batista <dsbatista@gmail.com> Co-authored-by: anakin87 <stefanofiorucci@gmail.com>	2025-02-19 09:10:48 +00:00
Stefano Fiorucci	b5d2854b93	pin posthog<3.12.0 (#8841 )	2025-02-11 10:30:57 +00:00
David S. Batista	f189a1c349	fix: LLMMetadataExtractor removing from_dict/to_dict AWS tests (#8840 ) * removint from_dict/to_dict AWS tests * removing boto3 import from tests	2025-02-11 09:40:58 +00:00
David S. Batista	f798a9e935	feat: adding `LLMMetadataExtractor` (#8833 ) * fixing linting * adding release notes * updating tests * adding to pydocs * fixing typing due to Optional * fixing docstring	2025-02-10 16:54:25 +00:00
Vladimir Blagojevic	fd5040108a	feat: Add OpenAPIConnector component, improve OpenAPI integration (#8808 ) * Initial OpenAPIConnector * Add reno note * Format * Add headers * Add test dep * Use haystack logger * Fix test * Minor fix, spin CI * Update reno release note format * Add to docs, pydocs improvements	2025-02-10 10:34:37 +01:00
mathislucka	eec91824bc	fix: pipeline run bugs in cyclic and acyclic pipelines (#8707 ) * add component checks * pipeline should run deterministically * add FIFOQueue * add agent tests * add order dependent tests * run new tests * remove code that is not needed * test: intermediate from cycle outputs are available outside cycle * add tests for component checks (Claude) * adapt tests for component checks (o1 review) * chore: format * remove tests that aren't needed anymore * add _calculate_priority tests * revert accidental change in pyproject.toml * test format conversion * adapt to naming convention * chore: proper docstrings and type hints for PQ * format * add more unit tests * rm unneeded comments * test input consumption * lint * fix: docstrings * lint * format * format * fix license header * fix license header * add component run tests * fix: pass correct input format to tracing * fix types * format * format * types * add defaults from Socket instead of signature - otherwise components with dynamic inputs would fail * fix test names * still wait for optional inputs on greedy variadic sockets - mirrors previous behavior * fix format * wip: warn for ambiguous running order * wip: alternative warning * fix license header * make code more readable Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com> * Introduce content tracing to a behavioral test * Fixing linting * Remove debug print statements * Fix tracer tests * remove print * test: test for component inputs * test: remove testing for run order * chore: update component checks from experimental * chore: update pipeline and base from experimental * refactor: remove unused method * refactor: remove unused method * refactor: outdated comment * refactor: inputs state is updated as side effect - to prepare for AsyncPipeline implementation * format * test: add file conversion test * format * fix: original implementation deepcopies outputs * lint * fix: from_dict was updated * fix: format * fix: test * test: add test for thread safety * remove unused imports * format * test: FIFOPriorityQueue * chore: add release note * fix: resolve merge conflict with mermaid changes * fix: format * fix: remove unused import * refactor: rename to avoid accidental conflicts * chore: remove unused inputs, add missing license header * chore: extend release notes * Update releasenotes/notes/fix-pipeline-run-2fefeafc705a6d91.yaml Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com> * fix: format * fix: format * Update release note --------- Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com> Co-authored-by: David S. Batista <dsbatista@gmail.com>	2025-02-06 14:19:47 +00:00
Stefano Fiorucci	877f826da0	refactor: HF API Embedders - use `InferenceClient.feature_extraction` instead of `InferenceClient.post` (#8794 ) * HF API Embedders: refactoring * rename variables * rm leftovers * rm pin * rm unused import * relnote * warning with truncate/normalize and serverless inference API * test that warnings are raised	2025-02-03 15:11:16 +00:00
Amna Mubashar	379711f63e	fix: Pin nltk version for sentence tokenizer (#8786 ) * Pin nltk version for sentence tokenizer * Update pyproject.toml * Update haystack/components/preprocessors/sentence_tokenizer.py --------- Co-authored-by: David S. Batista <dsbatista@gmail.com>	2025-01-31 17:01:00 +01:00
Stefano Fiorucci	3ef609a3e8	temporarily pin huggingface_hub<0.28.0 (#8790 )	2025-01-31 10:35:15 +01:00
Stefano Fiorucci	0ac47b0064	pin numba>=0.54.0 (#8773 )	2025-01-27 11:55:18 +01:00
Stefano Fiorucci	f96839e139	chore: update `transformers` test dependency (#8752 ) * update transformers test dependency * add pad_token_id to the mock tokenizer * fix HFLocal test + new test	2025-01-21 14:43:27 +01:00
Stefano Fiorucci	2bf6bf6a45	build: add `jsonschema` library to core dependencies (#8753 ) * add jsonschema to core dependencies * release note	2025-01-21 10:07:56 +01:00
Vladimir Blagojevic	d147c7658f	feat: Add `ComponentTool` to Haystack tools (#8693 ) * Initial ComponentTool --------- Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> Co-authored-by: Julian Risch <julian.risch@deepset.ai>	2025-01-13 11:15:33 +01:00
Sebastian Husch Lee	28ad78c73d	feat: Add XLSXToDocument converter (#8522 ) * Add draft of the Excel To Document converter * Add license header * Add release note * Use Union instead of pipe * Add openpyxl as additional dep * Fix zip issue * few updates from Bijay * Update deps * Add markdown test * Adding more example excels and expanding tests * Added more tests * Fix windows test by setting lineterminator * Addressing PR comments * PR comments * Fix linting	2025-01-09 09:03:19 +01:00
Stefano Fiorucci	2bc58d2987	feat: support for tools in `HuggingFaceAPIChatGenerator` (#8661 ) * message conversion function * hfapi w tools * right test file + hf_hub version * release note * feedback	2024-12-19 15:04:37 +01:00
Stefano Fiorucci	96b4a1d2fd	feat: `Tool` dataclass - unified abstraction to represent tools (#8652 ) * draft * del HF token in tests * adaptations * progress * fix type * import sorting * more control on deserialization * release note * improvements * support name field * fix chatpromptbuilder test * port Tool from experimental * release note * docs upd * Update tool.py --------- Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>	2024-12-18 11:36:44 +00:00
Stefano Fiorucci	2a9a6401d2	chore: pin `openai>=1.56.1` (#8632 ) * pin openai>=1.56.1 * release note	2024-12-12 16:26:38 +01:00
David S. Batista	248dccbdd3	chore: fixing `pylint` issues (#8610 ) * initial import * fixing internal methods * fixing some internal methods * modify _preprocess * fixed internal methods --------- Co-authored-by: anakin87 <stefanofiorucci@gmail.com>	2024-12-09 16:53:37 +00:00
Stefano Fiorucci	de7099e560	ci: add job to check imports (#8594 ) * try checking imports * clarify error message * better fmt * do not show complete list of successfully imported packages * refinements * relnote * add missing forward references * better function name * linting * fix linting * Update .github/utils/check_imports.py Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com> --------- Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>	2024-11-29 14:00:59 +00:00
Stefano Fiorucci	f085959067	chore: declare `requires-python<3.13` in pyproject (#8547 ) * restrict to python<3.13 * try unpinning dulwich * reintroduce dulwich pin	2024-11-15 09:28:39 +00:00
Silvano Cerza	ebb45d3d1e	Remove ddtrace version pin (#8529 )	2024-11-11 11:21:10 +01:00
Stefano Fiorucci	c7b898994e	build: unpin `numpy` + use Python 3.9 in CI (#8492 ) * try unpinning numpy * try python 3.9 * release note	2024-10-28 12:15:17 +01:00
Silvano Cerza	0157459a7b	Pin ddtrace test dependency to fix tests (#8478 )	2024-10-22 10:19:25 +00:00
Stefano Fiorucci	f6935d1456	ci: add `pip` to `test` dependencies (#8475 ) * add pip to test dependencies * trigger * release note * rm trigger	2024-10-22 08:35:30 +00:00
Stefano Fiorucci	7788bfe558	ci: upgrade Hatch to 1.13.0 and adopt uv as installer (#8313 ) * try uv * upgrade hatch * rm unnecessary specification * release note	2024-10-17 10:32:14 +02:00
Silvano Cerza	29672d4b42	feat: Add `JSONConverter` Component (#8397 ) * Add JSONConverter Component * Handle some corner cases * Add JSONConverter to pydoc config * Add a way to extract all non content fields as metadata * Small fix in docstring * Fix tests * docstrings upd * Update json.py --------- Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>	2024-09-25 12:34:51 +02:00
Silvano Cerza	4b77ec1b6f	Fix codespell config (#8392 )	2024-09-24 12:00:45 +02:00
Vladimir Blagojevic	badd0594cc	feat: Port NLTKDocumentSplitter from dC to Haystack (#8350 ) * Port NLTKDocumentSplitter from dC to Haystack * Improve pydocs * Use haystack logging * Add NLTKDocumentSplitter to __init__.py * Use haystack logging, rename test classes * Fixing _needs_join return * Linting * PR feedback * More static methods * Increase test coverage * Compile pattern --------- Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>	2024-09-17 13:59:19 +02:00
Silvano Cerza	da49e782e2	chore: Make `arrow` an optional dependency (#8345 ) * Make arrow an optional dependency * Fix imports	2024-09-09 16:09:51 +02:00
Mo Sriha	75955922b9	feat: Add current date in UTC to PromptBuilder (#8233 ) * initial commit * add unit tests * add release notes * update function name	2024-09-09 09:47:03 +02:00
Stefano Fiorucci	25d333bed3	update transformers (#8296 )	2024-08-27 16:04:11 +00:00
Stefano Fiorucci	6b0ee4c193	chore: update test dependency and `LazyImport` block to make compatibility with `sentence-transformers>=3.0.0` explicit (#8295 ) * sentence-transformers-3 update test dep and lazyimport block * clearer release note	2024-08-27 15:51:03 +00:00
Tobias Wochinger	5a3ea75196	docs: document Python 3.11 and 3.12 support (#8159 ) * docs: add Python 3.11 and 3.12 to supported versions * docs: add release notes	2024-08-02 14:46:20 +02:00
Tobias Wochinger	4dde6fbaec	build: unpin structlog (#8071 )	2024-07-24 20:58:34 +02:00

267 Commits