3803 Commits

Author SHA1 Message Date
Vladimir Blagojevic
b3b3f89302
feat: Add haystack-experimental dependency (#7921)
* Add haystack-experimental dependency

* Add reno note
v2.4.0-rc0
2024-07-08 14:07:15 +02:00
David S. Batista
ff75444f53
fix: adding deprecation warning ContextEvaluator (#7991)
* adding deprecation warning

* adding to release notes

* Update releasenotes/notes/add-deprecation-warning-context-relevance-937df7e807ac1a8d.yaml

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

---------

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
2024-07-08 10:18:39 +00:00
Nitanshu Vashistha
167e886f2c
feat: Configure max_retries & timeout for AzureOpenAIGenerator (#7983)
max_retries: if not set is read from the OPENAI_MAX_RETRIES
env variable or set to 5.

timeout: if not set is read from the OPENAI_TIMEOUT
env variable or set to 30.

Signed-off-by: Nitanshu Vashistha <nitanshu.vzard@gmail.com>
2024-07-08 11:16:26 +02:00
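The fallback order described in #7983 (explicit argument, then environment variable, then default) can be sketched as follows. This is a minimal illustration, not the component's actual code: the helper name `resolve_setting` is hypothetical, and only the variable names and the defaults 5 and 30 come from the commit message.

```python
import os
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def resolve_setting(explicit: Optional[T], env_var: str, default: T, cast: Callable[[str], T]) -> T:
    """Return the explicit value if given, else the env variable, else the default.

    Hypothetical helper for illustration only.
    """
    if explicit is not None:
        return explicit
    raw = os.environ.get(env_var)
    if raw is not None:
        # Environment variables are strings, so convert to the target type.
        return cast(raw)
    return default


# Nothing passed explicitly: fall back to the environment, then to the defaults.
max_retries = resolve_setting(None, "OPENAI_MAX_RETRIES", 5, int)
timeout = resolve_setting(None, "OPENAI_TIMEOUT", 30.0, float)
```

An explicitly passed value always wins, so a caller can override the environment for a single component.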
Ulises M
e92a0e4beb
feat: Allow Connection of ChatGenerator to AnswerBuilder (#7897)
* initial implementation

* add support for meta and add ChatMessage tests

* explicitly cast types for mypy and update reno

* leave inputs unchanged avoiding side effects

---------

Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2024-07-05 08:21:53 +02:00
Vladimir Blagojevic
61de1dcc61
Fix project image and pypi url in slack configuration (#7980) 2024-07-04 17:27:28 +02:00
Silvano Cerza
06aafa18fc
Update pre-commit hooks versions (#7979) 2024-07-04 17:11:28 +02:00
Vladimir Blagojevic
0255422eb3
chore: Mark AzureOCRDocumentConverter test_run_with_pdf_file flaky (#7978)
* Disable AzureOCRDocumentConverter test_run_with_pdf_file on osx

* Mark test flaky instead

* Remove import
2024-07-04 16:36:32 +02:00
tstadel
aa46466894
fix: meta from ByteStream input for AzureOCRDocumentConverter (#7955)
* fix: meta from ByteStream input for AzureOCRDocumentConverter

* add test

* add reno

* fix test
2024-07-04 14:42:30 +02:00
Chris Pappalardo
7178aa0253
feat: add custom jinja filter handling to ConditionalRouter (#7957)
* add custom jinja filter handling to ConditionalRouter

* add release notes for custom filters

* align sede to existing patterns and update docstring example

* update sede unit test route condition to be more explicit

---------

Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
2024-07-04 10:08:12 +02:00
Nicola Procopio
cafcf51cb0
Fixed ZeroDivisionError in JoinDocuments (#7972)
* added new strategy DBRF

* fix hook

* fix typos

* added test for DBRF

* fix format

* new release note

* reformatted with black

* Update haystack/components/joiners/document_joiner.py

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* updated comments

* added type-hint and return type

* fix

* revert for lint problems

* fix

* fix

* fix

* fix

* another tentative

* dict out file

* only output

* fix output

* revert

* removed unused imports

* fix typing

* fixed ZeroDivisionError

* added test

* add release note

* removed try - except

* renamed test

* Update test/components/joiners/test_document_joiner.py

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* Update haystack/components/joiners/document_joiner.py

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* fix format error

* removed releasenotes/notes/release-note-9b2bc03a8a398078.yaml

* added comment

---------

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
Co-authored-by: anakin87 <stefanofiorucci@gmail.com>
2024-07-04 10:07:26 +02:00
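The kind of division by zero fixed in #7972 can arise when scores are rescaled by their spread and all scores are identical. The sketch below assumes that distribution-based rank fusion rescales each score into the range mean ± 3 standard deviations. The function name `normalize_scores` and the fallback value 0.0 are illustrative assumptions, not taken from the commit.

```python
from math import sqrt
from typing import List


def normalize_scores(scores: List[float]) -> List[float]:
    """Rescale scores into [mean - 3*std, mean + 3*std], guarding a zero spread.

    Illustrative sketch only; not the DocumentJoiner implementation.
    """
    if not scores:
        return []
    mean = sum(scores) / len(scores)
    std = sqrt(sum((s - mean) ** 2 for s in scores) / len(scores))
    low = mean - 3 * std
    high = mean + 3 * std
    spread = high - low
    if spread == 0.0:
        # All scores are equal: dividing by the spread would raise ZeroDivisionError.
        return [0.0 for _ in scores]
    return [(s - low) / spread for s in scores]
```

Without the `spread == 0.0` check, a result list containing a single document, or several documents with the same score, would trigger the error.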
Nicola Procopio
03d9057e64
Add Distribution based rank fusion mode (#7915)
* added new strategy DBRF

* fix hook

* fix typos

* added test for DBRF

* fix format

* new release note

* reformatted with black

* Update haystack/components/joiners/document_joiner.py

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* updated comments

* added type-hint and return type

* fix

* revert for lint problems

* fix

* fix

* fix

* fix

* another tentative

* dict out file

* only output

* fix output

* revert

* removed unused imports

* fix typing

---------

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
Co-authored-by: anakin87 <stefanofiorucci@gmail.com>
2024-07-03 13:55:17 +02:00
David S. Batista
a67872e7a8
docs: correcting CONTRIBUTING hatch commands (#7963)
* fixing CONTRIBUTING hatch commands

* updating CONTRIBUTING.MD

* Update CONTRIBUTING.md

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

---------

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2024-07-03 10:59:21 +02:00
David S. Batista
186512459d
feat: LLM-based evaluators return meta info from OpenAI (#7947)
* LLM-Evaluator returns metadata from OpenAI

* adding tests

* adding release notes

* updating test

* updating release notes

* fixing live tests

* attending PR comments

* fixing tests

* Update releasenotes/notes/adding-metadata-info-from-OpenAI-f5309af5f59bb6a7.yaml

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

* Update llm_evaluator.py

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2024-07-02 11:31:51 +02:00
Vladimir Blagojevic
3068ea258b
Fix whisper test (#7959) 2024-07-01 10:10:19 +02:00
Silvano Cerza
a86bf963a0
ci: Fix workflows_linting.yml never running (#7941)
* Fix workflows_linting.yml never running

* Add setup-go step

* Ignore SC2102 rule
2024-06-28 11:01:52 +02:00
David S. Batista
5b9e989f9a
fix: adjusting code due to new ruff version enforcing more strict linting (#7948)
* initial import

* fixing if clause
2024-06-28 10:51:57 +02:00
David S. Batista
91f57015c0
feat: adding split_id and split_overlap to DocumentSplitter (#7933)
* wip: adding _split_overlap

* fixing join issue for _split_overlap

* adding tests

* adding release notes

* cleaning and fixing tests

* making mypy happy

* Update haystack/components/preprocessors/document_splitter.py

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* adding docstrings

---------

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2024-06-27 15:07:43 +02:00
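To illustrate what a split identifier and an overlap between consecutive splits mean in #7933, here is a minimal sliding-window sketch over a list of words. The function `split_with_overlap` and the dictionary layout are hypothetical. The metadata that DocumentSplitter actually stores may look different.

```python
from typing import Dict, List, Union


def split_with_overlap(
    words: List[str], split_length: int, split_overlap: int
) -> List[Dict[str, Union[int, List[str]]]]:
    """Cut words into windows of split_length that share split_overlap words.

    Hypothetical helper for illustration only.
    """
    if split_length <= 0 or not 0 <= split_overlap < split_length:
        raise ValueError("require split_length > 0 and 0 <= split_overlap < split_length")
    step = split_length - split_overlap
    chunks: List[Dict[str, Union[int, List[str]]]] = []
    for split_id, start in enumerate(range(0, len(words), step)):
        chunks.append({"split_id": split_id, "words": words[start:start + split_length]})
        if start + split_length >= len(words):
            # The last window already reaches the end of the input.
            break
    return chunks


chunks = split_with_overlap(["a", "b", "c", "d", "e"], split_length=3, split_overlap=1)
```

With a length of 3 and an overlap of 1, each split starts two words after the previous one, so the last word of one split is repeated as the first word of the next.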
Vladimir Blagojevic
569b2a87cb
feat: Update LocalWhisperTranscriber, add tests (#7935)
* Update LocalWhisperTranscriber, add tests

* Final touches

* Update haystack/components/audio/whisper_local.py

Co-authored-by: David S. Batista <dsbatista@gmail.com>

* Fix prev commit

* Relax test for tiny model to work

---------

Co-authored-by: David S. Batista <dsbatista@gmail.com>
2024-06-27 12:53:41 +02:00
Stefano Fiorucci
1f7786d6dd
replace expired Discord invitation link (#7944) 2024-06-27 11:47:50 +02:00
Vladimir Blagojevic
c2ed275a2d
feat: Improve LinkContentFetcher content type handling (#7920)
* LinkContentFetcher: add more default content type handlers

* Update/add unit test

* Add reno note

* Add image content handler

* Update unit test
2024-06-27 11:45:20 +02:00
Vladimir Blagojevic
535a281eec
feat: Add option to use HF_TOKEN as env var for authentication across all HF components (#7942)
* Read both HF_API_TOKEN and HF_TOKEN env vars in all HF related components

* Add reno note

* Test fixes

* More test updates

* More test updates
2024-06-27 10:31:58 +02:00
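Reading a token from either of the two environment variables named in #7942 can be sketched as below. The helper `first_env_value` is hypothetical, and the lookup order (HF_API_TOKEN first, then HF_TOKEN) is an assumption. The commit message only states that both variables are read.

```python
import os
from typing import Optional, Sequence


def first_env_value(names: Sequence[str]) -> Optional[str]:
    """Return the value of the first environment variable that is set and non-empty.

    Hypothetical helper for illustration only.
    """
    for name in names:
        value = os.environ.get(name)
        if value:
            return value
    return None


# Assumed precedence: HF_API_TOKEN is checked before HF_TOKEN.
token = first_env_value(["HF_API_TOKEN", "HF_TOKEN"])
```

Returning `None` when neither variable is set lets the caller decide whether a missing token is an error or simply means unauthenticated access.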
Sebastian Husch Lee
6836079686
chore: Capitalize DOCX in DOCXToDocument converter (#7931)
* Capitalize DOCX in DOCXToDocument converter

* Update docstrings

* Update test class name

* add release notes
2024-06-27 08:19:01 +02:00
Silvano Cerza
fd1a06d171
Disable tracing when running tests (#7934) 2024-06-26 12:32:05 +02:00
Amna Mubashar
866e6c8fc2
Add the missing parameter for serialization (#7929)
* Add the missing parameter for serialization

* Updated test

---------

Co-authored-by: Amna Mubashar <amna.mubashar@Amnas-MBP.fritz.box>
2024-06-26 11:07:00 +02:00
David S. Batista
8b9eddcd94
fix: explicitly tell ContextRelevanceEvaluator that each statement should be scored (#7904)
* initial import

* adding release notes

* adding pytest decorator for live test

* make examples more readable

* updating tests

* reverting progress_bar = False
2024-06-25 16:59:37 +02:00
dependabot[bot]
8583d8c6a6
chore(deps): bump actions/add-to-project from 1.0.1 to 1.0.2 (#7926)
Bumps [actions/add-to-project](https://github.com/actions/add-to-project) from 1.0.1 to 1.0.2.
- [Release notes](https://github.com/actions/add-to-project/releases)
- [Commits](https://github.com/actions/add-to-project/compare/v1.0.1...v1.0.2)

---
updated-dependencies:
- dependency-name: actions/add-to-project
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-25 16:27:43 +02:00
Amna Mubashar
fc011d7b04
bug: fix MRR and MAP calculations (#7841)
* bug: fix MRR and MAP calculations
2024-06-25 12:07:11 +02:00
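For reference, the two metrics named in #7841 are sketched below using their textbook definitions. This is not the evaluator code from the repository, and it does not show what the fix changed. The function names are illustrative, and average precision is normalized here by the number of relevant documents found in the ranking, which is an assumption.

```python
from typing import List


def reciprocal_rank(relevant: List[bool]) -> float:
    """Return 1 / rank of the first relevant item, or 0.0 if there is none."""
    for rank, hit in enumerate(relevant, start=1):
        if hit:
            return 1.0 / rank
    return 0.0


def average_precision(relevant: List[bool]) -> float:
    """Average the precision values measured at each relevant position."""
    hits = 0
    total = 0.0
    for rank, hit in enumerate(relevant, start=1):
        if hit:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean(values: List[float]) -> float:
    """Average per-query values into MRR or MAP; an empty list gives 0.0."""
    return sum(values) / len(values) if values else 0.0


# One relevance list per query, ordered by retrieval rank.
rankings = [[False, True, False], [True, False, True]]
mrr = mean([reciprocal_rank(r) for r in rankings])
map_score = mean([average_precision(r) for r in rankings])
```

Both metrics are computed per query first and averaged afterwards, so a query with no relevant result contributes 0.0 instead of being skipped.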
Stefano Fiorucci
c51f8ffb86
PyPDFToDocument: remove deprecated converter_name and CONVERTERS_REGISTRY (#7910) 2024-06-21 16:52:03 +02:00
David Berenstein
08104e0042
feat: InMemoryDocumentStore serialization (#7888)
* Add: InMemoryDocumentStore serialization

* Add: additional check to test if path exists

* Fix: failing test
2024-06-21 16:45:25 +02:00
Ulises M
9c45203a76
fix: check for None in SAS eval input (#7909)
* check for None in SAS input

* Update releasenotes/notes/check-for-None-SAS-eval-0b982ccc1491ee83.yaml

---------

Co-authored-by: David S. Batista <dsbatista@gmail.com>
2024-06-21 14:22:33 +02:00
Vedant Naik
f5a34d4d5c
fix: CacheChecker filters syntax (#7898)
* fix: cache_checker filters

* add release notes

* test: add test for cache checker filters syntax
2024-06-21 12:31:29 +02:00
Stefano Fiorucci
c59ad95f42
chore: remove deprecated TGI generators (#7908)
* remove deprecated TGI generators

* rm unused import
2024-06-21 11:15:13 +02:00
Corentin
d158be40b3
fix(JsonSchemaValidator): fix recursive loop and general LLM (claude, mistral...) compatibility (#7556)
* Feat: Fix recursive conversion in JsonSchemaValidator (autofix generated by ClaudeOpus). Modify the behaviour to build the error template in a single user_message instead of two separate. Modify the behaviour to only include latest message instead of full history (very costly if long looping pipeline)

* Feat: Fix recursive conversion in JsonSchemaValidator (autofix generated by ClaudeOpus). Modify the behaviour to build the error template in a single user_message instead of two separate. Modify the behaviour to only include latest message instead of full history (very costly if long looping pipeline)

* reno

* fix test

* Verify provided message contains JSON object to begin with

* Minor detail

---------

Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
2024-06-21 11:10:34 +02:00
Stefano Fiorucci
75ad76a7ce
chore: remove deprecated TEI embedders (#7907)
* remove deprecated TEI embedders

* rm from the embedders init

* rm related tests
2024-06-21 10:36:12 +02:00
Madeesh Kannan
d1f8c0dcd6
fix: Prevent component pre-init hook from being called recursively (#7894) 2024-06-21 10:29:37 +02:00
Stefano Fiorucci
d80e01492b
update sentence transformers import error message (#7906) 2024-06-20 18:15:01 +02:00
Vladimir Blagojevic
4c59000c21
feat: Add apply_filter_policy function (#7902)
* Add apply_filter_policy

* Add release note
2024-06-20 13:44:23 +02:00
Carlos Fernández
57c1d47c7d
fix: ChatPromptBuilder Fails to JSON Serialize (#7849)
* implement serialization for chat messages and add tests

* implement serialization for ChatPromptBuilder and test it

* add reno

* solve mypy type error

* solve mypy type error

* remove flattening parameter in to_dict

* simplify to just non-flat metadata

* try to fix linting issue

* solve format issues

* update test for ChatPromptBuilder

* remove unused import

---------

Co-authored-by: David S. Batista <dsbatista@gmail.com>
2024-06-20 13:20:52 +02:00
Sebastian Husch Lee
3db56d9066
refactor: DocxToDocument update (#7857)
* Some changes

Use tests file path

* Update tests

* Add another unit test

* Shorten _get_docx_metadata

* Update tests

* Remove try block

* Add a dataclass

* Add a to dict unit test

* Remove unused import

* Add release notes

* Update docstrings

* Use optional instead of pipe

* Update docstring

* Remove file
2024-06-19 15:48:31 +02:00
Madeesh Kannan
fe60eedee9
fix: Fix deserialization of pipelines that contain LLMEvaluator subclasses (#7891) 2024-06-19 13:47:38 +02:00
Massimiliano Pippi
7c31d5f418
add docstrings for EvaluationRunResult (#7885) 2024-06-19 11:49:41 +02:00
Silvano Cerza
28902c4c65
Fix mypy 2024-06-18 18:09:17 +02:00
Massimiliano Pippi
3a03fce71c
ci: Add code formatting checks (#7882)
* ruff settings

enable ruff format and re-format outdated files

feat: `EvaluationRunResult` add parameter to specify columns to keep in the comparative `Dataframe`  (#7879)

* adding param to explicitly state which cols to keep

* adding param to explicitly state which cols to keep

* adding param to explicitly state which cols to keep

* updating tests

* adding release notes

* Update haystack/evaluation/eval_run_result.py

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* Update releasenotes/notes/add-keep-columns-to-EvalRunResult-comparative-be3e15ce45de3e0b.yaml

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* updating docstring

---------

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

add format-check

fail on format and linting failures

fix string formatting

reformat long lines

fix tests

fix typing

linter

pull from main

* reformat

* lint -> check

* lint -> check
2024-06-18 15:52:46 +00:00
Silvano Cerza
15ee622b3c
refactor: Isolate logic that finds next runnable component waiting for input (#7880)
* Fix formatting

* Isolate logic that finds next runnable component waiting for input

* Explain more lazy variadics

* Enhance logic following review suggestions

* Simplify code to use a single for

* Fix test
2024-06-18 16:43:19 +02:00
Massimiliano Pippi
ff79da5f55
chore: remove unused import (#7886) 2024-06-18 15:16:31 +02:00
Tobias Wochinger
96cda5d3b6
fix: enable tracing upon import / improve logging setup (#7859)
* fix: fix auto-tracing

* feat: add context var logging to structlog

* docs: add release notes
2024-06-18 12:37:16 +02:00
Massimiliano Pippi
5ae14cde6b
Update README.md (#7883) 2024-06-18 09:56:11 +02:00
David S. Batista
55513f7521
feat: EvaluationRunResult add parameter to specify columns to keep in the comparative Dataframe (#7879)
* adding param to explicitly state which cols to keep

* adding param to explicitly state which cols to keep

* adding param to explicitly state which cols to keep

* updating tests

* adding release notes

* Update haystack/evaluation/eval_run_result.py

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* Update releasenotes/notes/add-keep-columns-to-EvalRunResult-comparative-be3e15ce45de3e0b.yaml

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* updating docstring

---------

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
2024-06-17 18:08:52 +02:00
Vedant Naik
2725ffffd1
docs: update AzureOpenAIDocumentEmbedder docstring (#7875) 2024-06-17 17:17:53 +02:00
dependabot[bot]
a277af45a5
chore(deps): bump docker/bake-action from 4 to 5 (#7881)
Bumps [docker/bake-action](https://github.com/docker/bake-action) from 4 to 5.
- [Release notes](https://github.com/docker/bake-action/releases)
- [Commits](https://github.com/docker/bake-action/compare/v4...v5)

---
updated-dependencies:
- dependency-name: docker/bake-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-17 17:05:00 +02:00