3677 Commits

Author SHA1 Message Date
Silvano Cerza
fd838fc573
Update indexing and rag default templates to use InMemoryDocumentStore (#7782) 2024-06-04 12:57:33 +02:00
Stefano Fiorucci
55a657ba81
export ChatPromptBuilder and add it to pydoc config (#7796) 2024-06-04 10:17:23 +02:00
Silvano Cerza
26b263e349
Fix InMemoryDocumentStore not sharing some document stats with other instances (#7792) 2024-06-04 10:15:50 +02:00
Silvano Cerza
74df8ed937
test: Rework Pipeline.run() tests to ease declaration with dataclasses (#7790)
* Rework boilerplate function that runs Pipeline in scenario tests

* Update tests to use new dataclasses

* Update README.md to reflect dataclass changes

* Use absolute import from conftest
2024-06-03 15:59:42 +02:00
Daria Fokina
67abe5576b
add examples to preprocessors (#7780) 2024-06-03 15:42:21 +02:00
Silvano Cerza
07ae45e0c2
test: Migrate Pipeline.run() tests with run arguments (#7777)
* Support Pipeline.run() arguments in tests

* Move intermediate outputs
2024-06-03 12:36:04 +02:00
Silvano Cerza
854c4173f2
feat: Add memory sharing between different instances of InMemoryDocumentStore (#7781)
* Add memory sharing between different instances of InMemoryDocumentStore

* Fix FilterRetriever tests

* Fix InMemoryBM25Retriever tests
2024-05-31 16:44:14 +02:00
Silvano Cerza
d81af81fbb
test: Migrate pipeline run tests (#7775)
* Move complex pipeline

* Move pipeline with default

* Move pipeline with distinct loops

* Move pipeline with double loop

* Move pipeline with dynamic inputs

* Move fixed decision pipeline

* Move fixed merging pipeline

* Move fixed decision and merge pipeline

* Remove test_joiners.py

* Move looping and merge pipeline

* Remove test_looping.py

* Move mutable input pipeline

* Move parallel branches pipeline

* Move same input different components pipeline

* Move test_run_with_greedy_variadic_after_component_with_default_input_simple

* Remove test_run_raises_if_max_visits_reached

* Move test_run_with_component_that_does_not_return_dict

* Move test_correct_execution_order_of_components_with_only_defaults

* Move test_pipeline_is_not_stuck_with_components_with_only_defaults

* Move test_pipeline_is_not_stuck_with_components_with_only_defaults_as_first_components

* Move self loop pipeline

* Move variable decision and merge pipeline

* Remove test_variable_decision_pipeline

* Move variable merging pipeline

* Add FakeComponent removed by mistake
2024-05-31 13:00:29 +02:00
Massimiliano Pippi
aa767ae142
ignore rc0 (#7776) 2024-05-31 12:32:08 +02:00
Silvano Cerza
a9f989d756
test: Support multiple runs for Pipeline run tests (#7762)
* Support multiple runs for Pipeline run tests

* Apply suggestions from code review

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2024-05-31 11:58:49 +02:00
Julian Risch
6723dc3801
check for RuntimeError instead of ComponentError in test (#7769) 2024-05-31 08:42:40 +02:00
Massimiliano Pippi
8e3a8999de
fix release workflow 2024-05-30 18:47:42 +02:00
Haystack Bot
6425b05e50
Update unstable version to 2.3.0-rc0 (#7774)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
v2.3.0-rc0
2024-05-30 18:09:29 +02:00
Massimiliano Pippi
131e3498cd
fix release workflow 2024-05-30 18:08:06 +02:00
Massimiliano Pippi
c96741796a
fix release workflow 2024-05-30 18:06:15 +02:00
Daria Fokina
f8646e1186
update version when the components are to be removed (#7773) 2024-05-30 15:35:20 +00:00
Massimiliano Pippi
8d80ff86d9
Add BranchJoiner and deprecate Multiplexer (#7765) 2024-05-30 15:34:52 +02:00
Silvano Cerza
5c468feecf
test: Update Pipeline.run() tests README.md (#7757)
* Update Pipeline.run() tests README.md

* Add suggestion from review

* Fix typos

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2024-05-29 14:28:42 +02:00
Massimiliano Pippi
0ceeb733ba
chore: make warm_up() usage consistent (#7752)
* make warm_up() usage consistent

* fix error type

* release notes

* pylint fix

* change of plan

* revert

* fix test

* revert

* fix HF tests

* Apply suggestions from code review

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* fix formatting

* reformat

* fix regex match with the new error message

* fix integration test

---------

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
2024-05-29 10:54:21 +02:00
Silvano Cerza
15aa4217bd
Install hatch in testing jobs (#7755) 2024-05-28 17:04:21 +02:00
Massimiliano Pippi
cc521f42ef
ci: remove dependency cache job (#7754)
* remove dependency cache job

* leftover
2024-05-28 16:03:59 +02:00
Silvano Cerza
3dcc21fd73
test: Pipeline run tests rework (#7748)
* Rework Pipeline.run() tests

* Remove test_linear_pipeline.py

* Add test for components execution order

* Add new pytest-bdd tests dependency

* Update README.md

* Add function to dynamically add integration marker

* Fix marking tests as integration
2024-05-28 15:42:47 +02:00
Luke Bentley-Fox
9fe7eff42c
fix: use correct output annotation for pdfminer converter (#7750) 2024-05-27 21:04:40 +02:00
Alessio Cesaretti
d0da31a047
feat: Add split_threshold to DocumentSplitter to avoid excessively short splits (#7721)
* feat: add split_threshold to document splitter to avoid excessively small splits

* Update haystack/components/preprocessors/document_splitter.py

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

* Update haystack/components/preprocessors/document_splitter.py

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

* extend release note

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2024-05-27 14:48:38 +02:00
Silvano Cerza
22289f590f
Move tests from test_connect.py into test_pipeline.py and test_utils.py (#7742) 2024-05-24 16:41:38 +02:00
Silvano Cerza
f5becf2ac0
Fix NamedEntityExtractor crashing in Python 3.12 if constructed using a string backend argument. (#7743) 2024-05-24 16:41:29 +02:00
tstadel
98fd270428
feat: add ChatPromptBuilder, deprecate DynamicChatPromptBuilder (#7663) 2024-05-23 19:04:55 +02:00
Silvano Cerza
4bc62854a9
test: Fix telemetry tests so they don't fail (#7708)
* Fix telemetry tests so they don't fail

* Remove test

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2024-05-23 18:02:25 +02:00
David S. Batista
38747ff7a3
fix: failsafe for non-valid json and failed LLM calls (#7723)
* wip

* initial import

* adding tests

* adding params

* adding safeguards for NaN in evaluators

* adding docstrings

* fixing tests

* removing unused imports

* adding tests to context and faithfulness evaluators

* fixing docstrings

* nit

* removing unused imports

* adding release notes

* addressing PR comments

* fixing tests

* fixing tests

* adding types

* removing unused imports

* Update haystack/components/evaluators/context_relevance.py

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* Update haystack/components/evaluators/faithfulness.py

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* addressing PR comments

---------

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
2024-05-23 15:41:29 +00:00
Massimiliano Pippi
e3dccf4406
add timeout to AzureOpenAIGenerator (#7724)
* add timeout to AzureOpenAIGenerator

* add to chat also

* Update azure-openai-generator-timeout-c39ecd6d4b0cdb4b.yaml
2024-05-23 16:28:24 +02:00
tstadel
83d3970405
feat: extend PromptBuilder and deprecate DynamicPromptBuilder (#7655)
* feat: add default template to DynamicPromptBuilder

* fix mypy

* fix mypy

* extend PromptBuilder and deprecate DynamicPromptBuilder

* make backward-compatible: optional -> required

* make backward-compatible: _template_string

* make backward-compatible: missing_required_vars error

* add test for no template case

* better docstrings

* some chores

* some chores

* add reno

* revert test_dynamic_prompt_builder.py

* better docstring

* make backward-compatible: reorder init args

* fix tests

* add raises docstring

* make default template required and rework docstrings

* docs chores

* keep to_dict in place for easier review

* remove unnecessary logger

* update docstring
2024-05-23 16:03:39 +02:00
Varun Krishnan
badb05b3ab
feat: allow DocumentJoiner to accept top_k parameter in run method (#7709)
* feat: allow DocumentJoiner to accept top_k parameter in run method

* Added release note for DocumentJoiner top_k fix
2024-05-23 16:03:26 +02:00
Massimiliano Pippi
482f60ec99
fix: exit early if the component receives no documents (#7732)
* exit early if the component receives no documents

* relnote
2024-05-23 09:35:10 +02:00
David S. Batista
a4fc2b66e6
style: adding progress bar to llm-based evaluators (#7726)
* adding progress bar

* fixing typo

* fixing tests

* Update test_llm_evaluator.py

* fixing missing colon

* passing directly to parent

* adding docstrings
2024-05-23 09:22:14 +02:00
Massimiliano Pippi
76224fc781
make SerperDevWebSearch more robust (#7725) 2024-05-22 13:14:39 +02:00
Silvano Cerza
da088140ab
Group up Pipeline unit tests in a single class (#7706) 2024-05-21 16:12:28 +02:00
David S. Batista
e6db1502e6
initial import (#7720) 2024-05-21 15:08:03 +02:00
Stefano Fiorucci
6d27de0b40
fix release note (#7711) 2024-05-17 16:06:03 +02:00
Stefano Fiorucci
7181f6b7e9
feat: change HTML conversion backend from boilerpy3 to Trafilatura (#7705)
* change HTML conversion backend to Trafilatura

* rm unused var
2024-05-17 10:38:47 +02:00
Carlos Fernández
57af95d7ea
add keep-id to DocumentCleaner (#7703) 2024-05-16 19:18:48 +02:00
Carlos Fernández
686a4999cf
feat: widen support of env vars in OpenAI components (#7653)
* add environment variables to the _enviroment.py file

* add support for two of the three variables

* Add support for 'OPENAI_TIMEOUT' and 'OPENAI_MAX_RETRIES' on OpenAIDocumentEmbedder.

* Replicate support for env vars in OpenAITextEmbedder.

* Add support for env vars in OpenAIGenerator.

* Add support for env vars in OpenAIChatGenerator.

* add docstrings and reno

* add params to __init__ in OpenAIDocumentEmbedder

* add params to __init__ in OpenAITextEmbedder

* make fully functional implementation of env vars and unit tests

* update reno

* Update haystack/components/embedders/openai_text_embedder.py

* reverse changes to telemetry/_enviroment.py

* Update haystack/components/embedders/openai_text_embedder.py

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2024-05-15 21:58:41 +00:00
Sebastian Husch Lee
af53e8430d
feat: Add inference mode to ExtractiveReader (#7699)
* Add inference mode to ExtractiveReader

* Add release notes
2024-05-15 19:33:57 +00:00
Vladimir Blagojevic
c8d53b3ebf
fix: Adjust serialization to handle PEP-585 generic types (#7690)
* Adjust serialization to handle PEP-585 generic types

* Add reno note

* Simplify

* PEP 585 serialization handling in sys.version_info < (3, 9)
2024-05-15 14:25:19 +02:00
David S. Batista
96b9d3e32a
fix: Adding missing component decorator to AzureOpenAIGenerator (#7698)
* initial import

* adding release notes

* tests avoiding I/O operations

* Update fix-azure-generators-serialization-18fcdc9cbcb3732e.yaml
2024-05-15 10:00:38 +02:00
Massimiliano Pippi
cc1d4b1c80
chore: Simplify Pipeline.run method by moving code to the base class (#7680)
* move graph initialization to the base class

* simplify data normalization

* deepcopy data in base class

* initialize inputs state

* move to_run preparation to the base class

* Test Pipeline._init_to_run()

* Test Pipeline._init_inputs_state()

* Test Pipeline._prepare_component_input_data()

---------

Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>
2024-05-14 23:25:46 +02:00
David S. Batista
798dc4a4a5
fix: avoid FaithfulnessEvaluator and ContextRelevanceEvaluator returning NaN (#7685)
* initial import

* fixing tests

* relaxing condition

* adding safeguard for ContextRelevanceEvaluator as well

* adding release notes
2024-05-14 17:08:51 +02:00
Daria Fokina
cc869b10ad
add pdfminer (#7688) 2024-05-14 13:42:29 +02:00
Madeesh Kannan
2428bc2a92
fix: Pipeline.run correctly returns all outputs when the include_outputs_from parameter is used (#7697)
* fix: `Pipeline.run` correctly returns all outputs when the `include_outputs_from` parameter is used

* Add release note
2024-05-14 12:29:41 +02:00
Vladimir Blagojevic
4352b1688e
fix: Fix NamedEntityExtractor serde (#7684)
* Fix NamedEntityExtractor serde

* Add release note

* Linting, remove unit markers
2024-05-14 12:24:55 +02:00
David S. Batista
75cf35c743
fix: forcing response format to be JSON valid (#7692)
* forcing response format to be JSON valid

* adding release notes

* cleaning up

* Update haystack/components/evaluators/llm_evaluator.py

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

---------

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
2024-05-14 10:22:38 +00:00