# haystack/test/core/pipeline/features/pipeline_run.feature

Feature: Pipeline running
Scenario Outline: Running a correct Pipeline
Given a pipeline <kind>
When I run the Pipeline
Then components are called with the expected inputs
And it should return the expected result
Examples:
| kind |
| that has no components |
| that is linear |
| that is really complex with lots of components, forks, and loops |
| that has a single component with a default input |
| that has two loops of identical lengths |
| that has two loops of different lengths |
| that has a single loop with two conditional branches |
| that has a component with dynamic inputs defined in init |
| that has two branches that don't merge |
| that has three branches that don't merge |
| that has two branches that merge |
| that has different combinations of branches that merge and do not merge |
| that has two branches, one of which loops back |
| that has a component with mutable input |
| that has a component with mutable output sent to multiple inputs |
| that has a greedy and variadic component after a component with default input |
| that has a component with only default inputs |
| that has a component with only default inputs as first to run and receives inputs from a loop |
| that has multiple branches that merge into a component with a single variadic input |
| that has multiple branches of different lengths that merge into a component with a single variadic input |
| that is linear and returns intermediate outputs |
| that has a loop and returns intermediate outputs from it |
| that is linear and returns intermediate outputs from multiple sockets |
| that has a component with default inputs that doesn't receive anything from its sender |
| that has a component with default inputs that doesn't receive anything from its sender but receives input from user |
| that has a loop and a component with default inputs that doesn't receive anything from its sender but receives input from user |
| that has multiple components with only default inputs and are added in a different order from the order of execution |
| that is linear with conditional branching and multiple joins |
| that is a simple agent |
| that has a variadic component that receives partial inputs |
| that has a variadic component that receives partial inputs in a different order |
| that has an answer joiner variadic component |
| that is linear and a component in the middle receives optional input from other components and input from the user |
| that has a loop in the middle |
| that has variadic component that receives a conditional input |
| that has a string variadic component |
| that is an agent that can use RAG |
| that has a feedback loop |
| created in a non-standard order that has a loop |
| that has an agent with a feedback cycle |
| that passes outputs that are consumed in cycle to outside the cycle |
| with a component that has dynamic default inputs |
| with a component that has variadic dynamic default inputs |
| that is a file conversion pipeline with two joiners |
| that is a file conversion pipeline with three joiners |
| that is a file conversion pipeline with three joiners and a loop |
| that has components returning dataframes |
| where a single component connects multiple sockets to the same receiver socket |
| where a component in a cycle provides inputs for a component outside the cycle in one iteration and no input in another iteration |
| that is blocked because not enough component inputs |
Scenario Outline: Running a bad Pipeline
Given a pipeline <kind>
When I run the Pipeline
Then it must have raised <exception>
Examples:
| kind | exception |
| that has an infinite loop | PipelineMaxComponentRuns |
| that has a component that doesn't return a dictionary | PipelineRuntimeError |
| that has a cycle that would get it stuck | PipelineComponentsBlockedError |