49 Commits

Author SHA1 Message Date
mathislucka
eec91824bc
fix: pipeline run bugs in cyclic and acyclic pipelines (#8707)
* add component checks

* pipeline should run deterministically

* add FIFOQueue

* add agent tests

* add order dependent tests

* run new tests

* remove code that is not needed

* test: intermediate from cycle outputs are available outside cycle

* add tests for component checks (Claude)

* adapt tests for component checks (o1 review)

* chore: format

* remove tests that aren't needed anymore

* add _calculate_priority tests

* revert accidental change in pyproject.toml

* test format conversion

* adapt to naming convention

* chore: proper docstrings and type hints for PQ

* format

* add more unit tests

* rm unneeded comments

* test input consumption

* lint

* fix: docstrings

* lint

* format

* format

* fix license header

* fix license header

* add component run tests

* fix: pass correct input format to tracing

* fix types

* format

* format

* types

* add defaults from Socket instead of signature

- otherwise components with dynamic inputs would fail

* fix test names

* still wait for optional inputs on greedy variadic sockets

- mirrors previous behavior

* fix format

* wip: warn for ambiguous running order

* wip: alternative warning

* fix license header

* make code more readable

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>

* Introduce content tracing to a behavioral test

* Fixing linting

* Remove debug print statements

* Fix tracer tests

* remove print

* test: test for component inputs

* test: remove testing for run order

* chore: update component checks from experimental

* chore: update pipeline and base from experimental

* refactor: remove unused method

* refactor: remove unused method

* refactor: outdated comment

* refactor: inputs state is updated as side effect

- to prepare for AsyncPipeline implementation

* format

* test: add file conversion test

* format

* fix: original implementation deepcopies outputs

* lint

* fix: from_dict was updated

* fix: format

* fix: test

* test: add test for thread safety

* remove unused imports

* format

* test: FIFOPriorityQueue

* chore: add release note

* fix: resolve merge conflict with mermaid changes

* fix: format

* fix: remove unused import

* refactor: rename to avoid accidental conflicts

* chore: remove unused inputs, add missing license header

* chore: extend release notes

* Update releasenotes/notes/fix-pipeline-run-2fefeafc705a6d91.yaml

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>

* fix: format

* fix: format

* Update release note

---------

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
Co-authored-by: David S. Batista <dsbatista@gmail.com>
2025-02-06 14:19:47 +00:00
tstadel
3119ae1ec9
refactor: raise PipelineError when Pipeline.from_dict receives an invalid type (#8711)
* fix: error on invalid type

* add reno

* Update releasenotes/notes/fix-invalid-component-type-error-83ee00d820b63cc5.yaml

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

* Update test/core/pipeline/test_pipeline.py

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

* fix reno

* fix reno

* last reno fix

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2025-01-23 11:40:19 +00:00
Silvano Cerza
8205724395
feat: Rework Pipeline.run() to better handle cycles (#8431)
* draft

* Enhance

* Almost works

* Simplify some parts and handle intermediate outputs

* Handle connections with default

* Handle cycles with multiple connections from two components

* Update distributed outputs at the correct time

* Remove Component inputs after it runs

* Add agent pipeline test case

* Fix infinite loop test

* Handle some corner cases with loops checking and inputs deletion

* Fix tests

* Add new behavioral test

* Remove unused code in behavioural test

* Fix behavioural test

* Fix max run check

* Simplify outputs distribution

* Simplify subgraph run check

* Remove unused _init_run_queue function

* Remove commented code

* Add some missing type hints

* Simplify cycles breaking

* Fix _distribute_output test

* Fix _find_components_that_will_receive_no_input test

* Fix validation test

* Fix tracer losing Component inputs

* Fix some linting issues

* Remove ignore pylint rule

* Rename method that break cycles and make it raise

* Add docstring to _run_subgraph

* Update Pipeline.run() docstring

* Update comment to clarify cycles execution

* Remove SelfLoop sample Component

* Add behavioural test for unsupported cycles

* Rename behavioural test to be more specific

* Add new behavioural test

* Add release notes

* Remove commented out code and random pass

* Use more efficient function to find cycles

* Simplify _break_supported_cycles_in_graph by using defaultdict

* Stop breaking edges as soon as we make the graph acyclic

* Fix docstring and add some more comments

* Fix _distribute_output docstring

* Fix _find_receivers_from docstring

* More detailed release notes

* Minimize calls to networkx.is_directed_acyclic_graph

* Add some more info on edges keys

* Adjust components_in_cycles comment

* Add new Pipeline behavioural test

* Enhance _find_components_that_will_receive_no_input to cover more cases

* Explain why run_queue is reset after running a subgraph cycle

* Rename _init_inputs_state to _normalize_input_data

* Better explain the subgraph output distribution

* Remove for else

* Fix some comments and docstrings

* Fix linting

* Add missing return type

* Fix typo

* Rename _normalize_input_data to _normalize_varidiac_input_data and add more documentation

* Remove unused import

---------

Co-authored-by: Sebastian Husch Lee <sjrl423@gmail.com>
2024-10-29 15:43:16 +01:00
Ajit Singh
2dd8089409
chore: Removed deprecated max_loops_allowed argument from Pipeline init (#8409)
* Added equality check for sender and receiver in connection function of pipeline

* Update base.py

irrelevant changes reverted

* added release note

* removed deprecated param max_loops_allowed from pipeline init

* added release note

* revert non relevant test

* Delete releasenotes/notes/remove-support-to-connect-component-to-self-6eedfb287f2a2a02.yaml

* revert non relevant change

* Remove unused test_pipeline_deprecated.yaml

* Remove PipelineMaxLoops error

* Update release notes

---------

Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>
2024-09-30 15:58:05 +02:00
Ajit Singh
7ba30d5691
feat: Pipeline.connect() will now raise a PipelineConnectError if sender and receiver are the same Component (#8403)
* Added equality check for sender and receiver in connection function of pipeline

* Update base.py

irrelevant changes reverted

* added release note

* altered a walk with cycle test

* added a test to verify that pipeline raises PipelineConnectError when adding a component to itself

* Update release notes

* Remove self connection feature tests

* Tidy up connect unit test

---------

Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>
2024-09-30 15:52:36 +02:00
Silvano Cerza
0df379e6a2
feat: Deprecate @component decorator is_greedy argument (#8400)
* Deprecate @component decorator is_greedy argument

* Fix some typos and docstrings

* Add _is_lazy_variadic test
2024-09-25 11:28:30 +02:00
Silvano Cerza
5514676b5e
feat: Deprecate max_loops_allowed in favour of new argument max_runs_per_component (#8354)
* Deprecate max_loops_allowed in favour of new argument max_runs_per_component

* Add missing test file

* Some enhancements

* Add version that will remove deprecated stuff
2024-09-12 11:00:12 +02:00
Silvano Cerza
4d67b552e1
Fix Pipeline skipping a Component with Variadic input (#8347)
* Fix Pipeline skipping a Component with Variadic input

* Simplify _find_components_that_will_receive_no_input
2024-09-10 14:59:53 +02:00
Silvano Cerza
c7e29a83c1
fix: Fix infinite loop when running Pipeline (#8123)
* Fix infinite loop when running Pipeline

* Simplify if
2024-07-30 15:00:12 +02:00
Amna Mubashar
499fbcc59f
Remove Multiplexer and related tests (#8020)
2024-07-16 15:39:40 +02:00
Silvano Cerza
0411cd938a
Fix bug in Pipeline.run() executing Components in a wrong and unexpected order (#8021)
* Fix bug in Pipeline.run() executing Components in a wrong and unexpected order

* Update haystack/core/pipeline/base.py

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

---------

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
2024-07-12 15:30:10 +00:00
Madeesh Kannan
94b806815c
refactor: Improve error messages shown during pipeline deserialization (#8016)
* refactor: Improve error messages shown during pipeline deserialization

* Add link to release notes

* Update release notes link
2024-07-12 14:47:00 +00:00
Silvano Cerza
0cec82e55e
refactor: Pipeline.run() (#8019)
* Move utility functions from _enqueue_next_runnable_component (#7895)

* Isolate logic to check if we're stuck in a loop

* Simplify for else

* Add missing return in docstring

* Emit warning when stuck in a loop

* Fix docstring

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* Add utility function to move Components in queues

* Add function to find next Component to run

* Comment update

* Add missing break in loop

* Make _add_missing_input_defaults less error prone and add tests

* Fix tests

* Update docstring

* Simplify enqueue logic

* Remove unused _enqueue_next_runnable_component function

* Add method to find Component with lazy variadic input or all inputs with defaults

* Simplify _find_next_runnable_lazy_variadic_or_default_component

* Remove unnecessary type ignore

* Split _dequeue_components_that_received_no_input into separate functions

* Fix linting

* Simplify variadic check when running Component

* Simplify code

* Reorganize functions used by Pipeline.run

* Rename variables used in Pipeline.run() for clarity

* Add comment clarifying last_waiting_queue and before_last_waiting_queue

* Add functions to easily update waiting_queue

---------

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
2024-07-12 08:35:23 +00:00
Madeesh Kannan
d1f8c0dcd6
fix: Prevent component pre-init hook from being called recursively (#7894)
2024-06-21 10:29:37 +02:00
Massimiliano Pippi
3a03fce71c
ci: Add code formatting checks (#7882)
* ruff settings

enable ruff format and re-format outdated files

feat: `EvaluationRunResult` add parameter to specify columns to keep in the comparative `Dataframe` (#7879)

* adding param to explicitly state which cols to keep

* adding param to explicitly state which cols to keep

* adding param to explicitly state which cols to keep

* updating tests

* adding release notes

* Update haystack/evaluation/eval_run_result.py

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* Update releasenotes/notes/add-keep-columns-to-EvalRunResult-comparative-be3e15ce45de3e0b.yaml

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* updating docstring

---------

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

add format-check

fail on format and linting failures

fix string formatting

reformat long lines

fix tests

fix typing

linter

pull from main

* reformat

* lint -> check

* lint -> check
2024-06-18 15:52:46 +00:00
Silvano Cerza
15ee622b3c
refactor: Isolate logic that finds next runnable component waiting for input (#7880)
* Fix formatting

* Isolate logic that finds next runnable component waiting for input

* Explain more lazy variadics

* Enhance logic following review suggestions

* Simplify code to use a single for

* Fix test
2024-06-18 16:43:19 +02:00
Silvano Cerza
1b4bd173b8
refactor: Isolate logic that distributes Components output after run (#7845)
* Isolate logic that distributes Component outputs

* Handle variadic reset in correct place

* Move methods to PipelineBase

* Enhance variables and method names

* Add missing return type

* Update comment with correct variable name

* Add comment explaining conditional outputs

* Add variadic list assertion and enhance comment explaining the need of a list

* Rename to_remove_from_res to to_remove_from_component_result and enhance comment

* Split elif

* Enhance code to enqueue greedy variadic components

* Revert "Enhance code to enqueue greedy variadic components"

This reverts commit 052ceb889ec8ea100be6eab810cb06d5febea6fe.

* Enhance variadic greedy enqueue comment
2024-06-14 15:53:28 +02:00
Silvano Cerza
14c7b02a4c
refactor: Isolate logic to check if a Component can run (#7840)
* Isolate run check

* Update docstrings and remove unnecessary set

* Rename argument
2024-06-11 16:14:04 +02:00
Silvano Cerza
58dd972d1a
refactor: Isolate code that runs single Pipeline Component (#7837)
* Isolate code that runs single Pipeline Component

* Fix mypy
2024-06-10 16:03:14 +00:00
Carlos Fernández
7fe0244258
feat: add methods to remove and replace components in a pipeline (#7820)
* add remove_component method plus unit tests

* add docstrings

* add reno

* add type annotation to remove_component method

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* solve bug not allowing a component to be reattached to a pipeline after being removed

* Properly remove Component from Pipeline

* Ignore mypy

---------

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>
2024-06-10 14:54:07 +02:00
Silvano Cerza
3c8569e12c
fix: Fix running Pipeline with conditional branch and Component with default inputs (#7799)
* Fix running Pipeline with conditional branch and Component with default inputs

* Add release notes

* Change arg name of _init_to_run so it's clearer

* Enhance release note
2024-06-06 13:19:07 +00:00
Silvano Cerza
fd838fc573
Update indexing and rag default templates to use InMemoryDocumentStore (#7782)
2024-06-04 12:57:33 +02:00
Silvano Cerza
d81af81fbb
test: Migrate pipeline run tests (#7775)
* Move complex pipeline

* Move pipeline with default

* Move pipeline with distinct loops

* Move pipeline with double loop

* Move pipeline with dynamic inputs

* Move fixed decision pipeline

* Move fixed merging pipeline

* Move fixed decision and merge pipeline

* Remove test_joiners.py

* Move looping and merge pipeline

* Remove test_looping.py

* Move mutable input pipeline

* Move parallel branches pipeline

* Move same input different components pipeline

* Move test_run_with_greedy_variadic_after_component_with_default_input_simple

* Remove test_run_raises_if_max_visits_reached

* Move test_run_with_component_that_does_not_return_dict

* Move test_correct_execution_order_of_components_with_only_defaults

* Move test_pipeline_is_not_stuck_with_components_with_only_defaults

* Move test_pipeline_is_not_stuck_with_components_with_only_defaults_as_first_components

* Move self loop pipeline

* Move variable decision and merge pipeline

* Remove test_variable_decision_pipeline

* Move variable merging pipeline

* Add FakeComponent removed by mistake
2024-05-31 13:00:29 +02:00
Silvano Cerza
22289f590f
Move tests from test_connect.py in test_pipeline.py and test_utils.py (#7742)
2024-05-24 16:41:38 +02:00
Silvano Cerza
da088140ab
Group up Pipeline unit tests in a single class (#7706)
2024-05-21 16:12:28 +02:00
Massimiliano Pippi
cc1d4b1c80
chore: Simplify Pipeline.run method by moving code to the base class (#7680)
* move graph initialization to the base class

* simplify data normalization

* deepcopy data in base class

* initialize inputs state

* move to_run preparation to the base class

* Test Pipeline._init_to_run()

* Test Pipeline._init_inputs_state()

* Test Pipeline._prepare_component_input_data()

---------

Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>
2024-05-14 23:25:46 +02:00
Massimiliano Pippi
1d20ac3c5e
chore: extract BasePipeline (#7673)
* extract BasePipeline

* release note

* add missing headers

* move __eq__ to the base class

* properly check type equality, bless the tests
2024-05-10 11:35:15 +02:00
Madeesh Kannan
ec0e22265a
feat: Expand Pipeline.inputs and Pipeline.outputs to include connected sockets (#7586)
2024-04-24 12:27:18 +02:00
Silvano Cerza
6a8834e43e
fix: Fix corner case when running Pipeline that causes it to get stuck in a loop (#7531)
* Fix corner case when running Pipeline that causes it to get stuck in a loop

* Update haystack/core/pipeline/pipeline.py

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2024-04-11 16:39:38 +02:00
Madeesh Kannan
b1760add56
feat: Add support for pipeline deserialization callbacks (#7518)
* feat: Add support for deserialization callbacks

* Lint

* Fix type hint for older Python versions

* Apply suggestions from code review

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* Lint

---------

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2024-04-10 17:47:14 +02:00
Silvano Cerza
6e289698e9
fix: Fix Pipeline.run() getting stuck in a loop even though there are components that can run (#7434)
2024-03-28 12:31:36 +01:00
Silvano Cerza
58d91b64dc
fix: Fix Pipeline.run() running components with only defaults in the wrong order (#7426)
* Fix Pipeline.run() running components with only defaults in the wrong order

* Add release notes
2024-03-26 16:55:31 +01:00
Stefano Fiorucci
6e69d4f188
fix: Pipeline - disable autoshow on Jupyter (#7397)
* try

* fix docstring

* simplify tests

* add release note
2024-03-21 12:55:06 +01:00
Silvano Cerza
de4fca4526
ci: Skip collection of test_json_schema.py to fix CI failures (#7353)
* Skip collection of test_json_schema.py to fix CI failures

* mock chroma instance

* revert

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2024-03-13 16:59:26 +01:00
Stefano Fiorucci
3dbde84a28
test: monkeypatch some env vars in Predefined Pipelines tests (#7321)
* ci: skip some tests if the OPENAI API key is not set

* better idea: monkeypatch the env var
2024-03-07 08:52:25 +01:00
Julian Risch
50ad1fa2c4
fix: Remove pipeline serialization from telemetry code (#7289)
* remove pipeline serialization from telemetry

* simplify getting component instance from pipeline

* reno

* add unit test with non-serializable component

* generate qualified class names

* added pipeline.walk()

* fix imports

* sort Iterator import

* remove bfs

* add test for pipeline.walk() with cycles

* Apply suggestions from code review

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* raise TypeError if telemetry_data is no dict

* Update haystack/telemetry/_telemetry.py

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

---------

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
2024-03-05 12:45:53 +01:00
Silvano Cerza
72d776c390
fix: Fix run order of variadic greedy components in Pipeline.run() (#7258)
* Fix run order of variadic greedy components in Pipeline.run()

* Add release notes
2024-03-01 17:39:13 +01:00
Massimiliano Pippi
34dac5f86f
Update test_pipeline.py (#7284)
2024-03-01 14:20:15 +01:00
Massimiliano Pippi
e7809b6fea
feat: Add from_template class method to Pipeline (#7240)
* move templating code under the core package

* make from_predefined part of the Pipeline API

* add tests

* amend release notes

* import under haystack package

* Apply suggestions from code review

Co-authored-by: David S. Batista <dsbatista@gmail.com>

* from_predefined -> from_template

* remove template inheritance for more readability

---------

Co-authored-by: David S. Batista <dsbatista@gmail.com>
2024-02-29 12:23:32 +01:00
Silvano Cerza
5f97e08feb
feat: Reintroduce max_loops_allowed check in Pipeline.run() (#7010)
* Reintroduce max_loops_allowed check in Pipeline.run()

* Add release notes
2024-02-19 10:05:35 +01:00
Silvano Cerza
f96eb3847f
refactor: Merge Pipelines definition in core package (#6973)
* Move marshalling functions in core Pipeline

* Move telemetry gathering in core Pipeline

* Move run logic in core Pipeline

* Update root Pipeline import

* Add release notes

* Update Pipeline docs path

* Update releasenotes/notes/merge-pipeline-definitions-1da80e9803e2a8bb.yaml

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2024-02-12 18:25:28 +01:00
Silvano Cerza
d2d01f9fe1
feat: Enhance Pipeline.__repr__() (#6963)
* Enhance Pipeline.draw() to show image directly in Jupyter notebook

* Add util method to check if we're in a Jupyter notebook

* Split Pipeline.draw() in two methods

* Update tests

* Update releasenotes

* Enhance Pipeline.__repr__

* Simplify Pipeline.__repr__

* Update release notes
2024-02-09 14:44:34 +01:00
Silvano Cerza
a7f36fdd32
feat: Enhance Pipeline.draw() to show image directly in Jupyter notebook (#6961)
* Enhance Pipeline.draw() to show image directly in Jupyter notebook

* Add util method to check if we're in a Jupyter notebook

* Split Pipeline.draw() in two methods

* Update tests

* Update releasenotes
2024-02-09 14:44:24 +01:00
Silvano Cerza
0191b1e6e4
feat: Change Component's I/O dunder type (#6916)
* Add Pipeline.get_component_name() method

* Add utility class to ease discoverability of Component I/O

* Move InputOutput in component package

* Rename InputOutput to _InputOutput

* Raise if inputs or outputs field already exist

* Fix tests

* Add release notes

* Move InputSocket and OutputSocket in types package

* Move _InputOutput in socket package

* Rename _InputOutput class to Sockets

* Simplify Sockets class

* Ditch I/O dunder fields in favour of inputs and outputs fields

* Update Sockets docstrings

* Update release notes

* Fix mypy

* Remove unnecessary assignment

* Remove unused logging

* Change SocketsType to SocketsIOType to avoid confusion

* Change sockets type and name

* Change Sockets.__repr__ to return component instance

* Fix linting

* Fix sockets tests

* Revert to dunder fields for Component IO

* Use singular in IO dunder fields

* Delete release notes

* Update haystack/core/component/types.py

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2024-02-05 17:46:45 +01:00
Silvano Cerza
76d324a149
feat: Change Pipeline.add_component to fail when reusing Component instances (#6847)
* Change Pipeline.add_component to fail when reusing Component instances

* Change variable name and store Pipeline instance in it

* Fix tests
2024-01-30 11:15:26 +01:00
Silvano Cerza
d4f6531c52
feat: Refactor Pipeline.run() (#6729)
* First rough implementation of refactored run

* Further improve run logic

* Properly handle variadic input in run

* Further work

* Enhance names and add more documentation

* Fix issue with output distribution

* This works

* Enhance run comments

* Mark Multiplexer as greedy

* Remove MergeLoop in favour of Multiplexer in tests

* Remove FirstIntSelector in favour of Multiplexer

* Handle corner case when waiting for input is stuck

* Remove unused import

* Handle mutable input data in run and misbehaving components

* Handle run input validation

* Test validation

* Fix pylint

* Fix mypy

* Call warm_up in run to fix tests
2024-01-18 17:53:47 +01:00
Stefano Fiorucci
8eba053dbc
fix pipeline test (#6741)
2024-01-15 13:59:11 +01:00
Massimiliano Pippi
9ace6bf63d
feat: store input's default value in InputSocket (#6651)
* track default value in sockets

* remove dead code

* include default value in socket description

* add unit test

* add relnote

* unused import

* clarify
2024-01-09 12:17:46 +01:00
Massimiliano Pippi
84da80c1f3
chore: make core tests layout consistent (#6449)
* move unit tests up

* move tests up one dir, make them unit
2023-11-29 18:58:44 +01:00