haystack/test/core/pipeline/test_pipeline.py


chore: merge canals into Haystack codebase (#6422) * Ignore some mypy errors * Fix I/O comparator * Avoid calling asdict multiple times when comparing dataclasses * Enhance component tests * Fix I/O dataclasses comparison * Use Any instead of type when expecting I/O dataclasses * Fix mypy * Change InputSocket taken_by field to sender * Remove variadics implementation * Adapt tests * Enhance docs and simplify run * Remove useless check on drawing * Add __canals_optional_inputs__ field in components * Rework a bit Pipeline._ready_to_run() * Simplify some logic * Add __canals_mandatory_inputs__ field in components * Handle pipeline loops * Fix tests * Document component state run logic * Add double loop pipeline test * Make component decorator a class * PR feedback * Add error logging when registering Component with identical names * Add 'remove' action that removes current component from Pipeline run input queue * Simplify run checks and logging * Better logging * Apply suggestions from code review Co-authored-by: ZanSara <sara.zanzottera@deepset.ai> * Trim whitespace * Add support for Union in Component's I/O * Remove dependencies section in marshaled pipelines * Create Component Protocol * simpler optional deps * Simplify component init wrapping and fix issue with save_init_params * Update canals/pipeline/save_load.py Co-authored-by: ZanSara <sara.zanzottera@deepset.ai> * Simplify functions to find I/O sockets * Fix import * change import * testing ci * testing ci * Simplify _save_init_params * testing ci * testing ci * use direct pytest call * trying to force old version for macos * list macos versions * list macos versions * disable on macos * remove extra * refactor imports * re-enable some logs * some more tests * small correction * Remove unused leftover methods * docs * update docstring * mention optionals * example for dataclass initialization * missed part * fix api docs * improve error reporting and testing * add tests for Any * parametrized tests * fix 
test for py<3.10 * test type printing * remove typing. prefix from Any (compat with Py3.11) * test helpers * test names * add type_is_compatible() * tests pass * more tests * add small comment * handle Unions as anything else * use sender/receiver for socket pairs * more sender/receiver renames * even more renames * split if statement * Update __about__.py * fix logic operator and add tests * Update __about__.py * Simplify imports * Move draw in pipeline module and clearly define public interface * Format pyproject.toml * Include only required files in built wheel * Move sample components out of tests * stub component class decorator * update static sample components to new API * stub * dynamic output examples * sum * add components fixed * re-add inputsocket and outputsocket creation * fix component tests * fixing tests * Add methods to set I/O dinamically * fix drawing * fix some integration tests * tests green * pylint * remove stray files * Remove default in InputSocket and add is_optional field * Fix drawing * Rework sockets string representation * Add back Component Protocol * Simplify method to get string representation of types * Remove sockets __str__ * Remove Component's I/O type checks at run time * Remove IO check in init wrapper * Update canals/utils.py Co-authored-by: Massimiliano Pippi <mpippi@gmail.com> * Split __canals_io__ field in __canals_input__ and __canals_output__ * Order input and output fields * Add test to verify __canals_component__ is set * Remove empty line * Add component class factory * Fix API docs workflow failure * fix api docs * Update __about__.py * Add component from_dict and to_dict methods * Add Pipeline to_dict and from_dict * Fix components tests * Add some more tests * Change error messages * Simplify test_to_dict * Add max_loops_allowed in test_to_dict * Test non default max_loops_allowed in test_to_dict * Rework marshal_pipelines * Rework unmarshal_pipelines * Rename some stuff * allow falsy outputs * apply falsy fix to 
validation * add test for falsy inputs * Split _cleanup_marshalled_data into two functions * Use from_dict to deserialise component * Remove commented out code and update variable name * Add test to verify difference when unmarshaling Pipeline with duplicate names * Update marshal_pipelines docstring * update workflow * exclude tests from mypy in pre-commit hooks * add additional falsy tests * remove unnecessary import * split test into two Co-authored-by: ZanSara <sara.zanzottera@deepset.ai> * remove init_parameters decorator and fix assumptions * fix accumulate * stray if * Bump version to 0.5.0 * Implement generic default_to_dict and default_from_dict * Update default_to_dict docstring Co-authored-by: Massimiliano Pippi <mpippi@gmail.com> * Remove all mentions of Component.defaults * Add Remainder to_dict and from_dict (#91) * Add Repeat to_dict and from_dict (#92) * Add Sum to_dict and from_dict (#93) * Add Greet to_dict and from_dict (#89) Co-authored-by: Massimiliano Pippi <mpippi@gmail.com> * Rework Accumulate to_dict and from_dict (#86) Co-authored-by: Massimiliano Pippi <mpippi@gmail.com> * Add to_dict and from_dict for Parity, Subtract, Double, Concatenate (#87) * Add Concatenate to_dict and from_dict * Add Double to_dict and from_dict * Add Subtract to_dict and from_dict * Add Parity to_dict and from_dict --------- Co-authored-by: Massimiliano Pippi <mpippi@gmail.com> * Change _to_mermaid_text to use component serialization data (#94) * Add MergeLoop to_dict and from_dict (#90) Co-authored-by: Massimiliano Pippi <mpippi@gmail.com> * Add Threshold to_dict and from_dict (#97) * Add AddFixedValue to_dict and from_dict (#88) Co-authored-by: Massimiliano Pippi <mpippi@gmail.com> * Remove BaseTestComponent (#99) * Change @component decorator so it doesn't add default to_dict and from_dict (#98) * Rename some classes in tests to suppress Pytest warnings (#101) * Check Component I/O socket names are valid (#100) * Remove handling of shared component instances on 
Pipeline serialization (#102) * Fix docs * Bump version to 0.6.0 * Revert "Check Component I/O socket names are valid (#100)" (#103) This reverts commit 4529874b562d12331ee2f4fde926ef5b5e3d24d7. * Bump canals to 0.7.0 * Downgrade log from ERROR to DEBUG (#104) * Make to/from_dict optional (#107) * remove from/to dict from Protocol * use a default marshaller * example component with no serializers * fix linting * make it smarter * fix linting * thank you mypy protector of the dumb programmers * feat: check returned dictionary (#106) * better error message if components don't return dictionaries * add test * use factory * needless import * Update __about__.py * fix default serialization and adjust sample components accordingly (#109) * fix default serialization and adjust sample components accordingly * typo * fix pylint errors * fix: `draw` function vs init parameters (#115) * fix draw * stray print * Update version (#118) * remove extras * Revert "remove extras" This reverts commit a096ff8f07bdcb6e54ec8457bcfad5db44d8bf03. 
* fix package name, change _parse_connection_name function name, add tests (#126) * move sockets into components package (#127) * chore: remove extras (#125) * remove extras * workflow * typo * fix: Sockets named "text/plain" or containing a "/" fail during pipeline.to_dict (#131) * don't split sockets by / * revert hashing edge keys * docs: remove missing module from docs (#132) * remove stray print (#123) * addo sockets docs (#133) * tidy up utils about types (#129) * Update canals.md (#134) * rename module in API docs * make `__canals_output__` and `__canals_input__` management consistent (#128) * make __canals_output__ and __canals_input__ management consistent and assign them to the component instance * make pylint happy * return the original type instead of the metaclass * use type checking instead of instance field * declare the actual returned type * fix after conflict resolution * remove check * Do not use a dict as intermediate format and use `Socket`s directly (#135) * do not use a dict as intermediate format and use sockets directly to simplify code and remove side effects * fix leftover from cherry-pick * move is_optional evaluation for InputSocket to post_init (#136) * re-introduce variadics to support Joiner node (#122) * move sockets into components package make __canals_output__ and __canals_input__ management consistent and assign them to the component instance do not use a dict as intermediate format and use sockets directly to simplify code and remove side effects move is_optional evaluation for InputSocket to post_init re-introduce variadics to support Joiner node restore connection-time check use custom type annotation, fix tests * fix leftovers from rebase * rename fan-in to joiner * clean up and fix typing * let inputs arrive later * address review comments * address review comments * fix docstrings * try * try * fix run input * linting * remove comments * fix pylint * bumb version to 0.9.0 (#140) * properly annotate classmethods (#139) * 
feat: add `Pipeline.inputs()` (#120) * add Pipeline.describe_input() * add tests * split dict and str outputs and add to error messages * tests * accepts/expects * move methods * fix tests * fix module name * tests * review feedback * Add missing typing_extensions dependency (#152) * feat: use full connection data to route I/O (#148) * fix sample components * make sum variadic * separate queue and buffer * all works but loops & variadics together * fix some tests * fix some tests * all tests green * clean up code a bit * refactor code * fix tests * fix self loops * fix reused sockets bug * add distinct loops * add distinct loops test * break out some code from run() * docstring * improve variadics drawing * black * document the deepcopy * re-arrange connection dataclass and add tests * consumer -> receiver * fix typing * move Connection-related code under component package * clean up connect() * cosmetics and typing * fix linter, make Connection a dataclass again * fix typing * add test case for #105 --------- Co-authored-by: Massimiliano Pippi <mpippi@gmail.com> * feat: Add Component inputs/outputs functions (#158) * Add component inputs/outputs methods * Different impl approach * Black fixes * Rename functions to match naming in pipeline inputs/ouputs * Fix find_component_inputs, update unit tests (#162) * Fix API docs (#164) * make Variadic wrap an iterable (#163) * Add pipeline outputs method (#150) Co-authored-by: ZanSara <sara.zanzottera@deepset.ai> * Update __about__.py (#165) Update version to 0.10.0 * add CODEOWNERS * feat: read defaults from `run()` signature (#166) * Read defaults from run signature * simplify setting of sockets * fix test * Update sample_components/fstring.py Co-authored-by: Massimiliano Pippi <mpippi@gmail.com> * Update canals/component/component.py Co-authored-by: Massimiliano Pippi <mpippi@gmail.com> * dostring --------- Co-authored-by: Massimiliano Pippi <mpippi@gmail.com> * Use full import path as 'type' in serialization. 
(#167) * Use full import path as 'type' in serialization. Try to import the path when deserializing * fix test data * add from_dict test * remove leftover * Update canals/pipeline/pipeline.py Co-authored-by: ZanSara <sara.zanzottera@deepset.ai> * add error message to PipelineError --------- Co-authored-by: ZanSara <sara.zanzottera@deepset.ai> * bump version * fix: copy input values before passing them down pipeline.run (#168) * copy input values before passing them down pipeline.run * Update test_mutable_inputs.py * fix mypy and pyright (#169) * bump version * remove data we won't keep * reformat * try * skip tests on transient code --------- Co-authored-by: Silvano Cerza <silvanocerza@gmail.com> Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com> Co-authored-by: ZanSara <sara.zanzottera@deepset.ai> Co-authored-by: Michel Bartels <login@michelbartels.com> Co-authored-by: ZanSara <sarazanzo94@gmail.com> Co-authored-by: Julian Risch <julianrisch@gmx.de> Co-authored-by: Julian Risch <julian.risch@deepset.ai> Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
2023-11-27 15:16:35 +01:00
# SPDX-FileCopyrightText: 2022-present deepset GmbH <info@deepset.ai>
#
# SPDX-License-Identifier: Apache-2.0
from typing import Optional
import logging
import pytest
from haystack.core.pipeline import Pipeline
from haystack.core.component.sockets import InputSocket, OutputSocket
from haystack.core.errors import PipelineMaxLoops, PipelineError, PipelineRuntimeError
from haystack.testing.sample_components import AddFixedValue, Threshold, Double, Sum
from haystack.testing.factory import component_class

logging.basicConfig(level=logging.DEBUG)


def test_max_loops():
    # Build a cycle: threshold.below -> add -> sum -> threshold.
    pipe = Pipeline(max_loops_allowed=10)
    pipe.add_component("add", AddFixedValue())
    pipe.add_component("threshold", Threshold(threshold=100))
    pipe.add_component("sum", Sum())
    pipe.connect("threshold.below", "add.value")
    pipe.connect("add.result", "sum.values")
    pipe.connect("sum.total", "threshold.value")

    # Starting from 1, the value cannot reach the threshold of 100
    # within the 10 loops allowed, so the run must abort.
    with pytest.raises(PipelineMaxLoops):
        pipe.run({"sum": {"values": 1}})


def test_run_with_component_that_does_not_return_dict():
    # output=1 makes the generated component return an int rather than a dict.
    BrokenComponent = component_class(
        "BrokenComponent", input_types={"a": int}, output_types={"b": int}, output=1  # type:ignore
    )

    pipe = Pipeline(max_loops_allowed=10)
    pipe.add_component("comp", BrokenComponent())
    with pytest.raises(
        PipelineRuntimeError, match="Component 'comp' returned a value of type 'int' instead of a dict."
    ):
        pipe.run({"comp": {"a": 1}})


def test_to_dict():
    add_two = AddFixedValue(add=2)
    add_default = AddFixedValue()
    double = Double()
    pipe = Pipeline(metadata={"test": "test"}, max_loops_allowed=42)
    pipe.add_component("add_two", add_two)
    pipe.add_component("add_default", add_default)
    pipe.add_component("double", double)
    pipe.connect("add_two", "double")
    pipe.connect("double", "add_default")

    res = pipe.to_dict()
    expected = {
        "metadata": {"test": "test"},
        "max_loops_allowed": 42,
        "components": {
            "add_two": {
                "type": "haystack.testing.sample_components.add_value.AddFixedValue",
                "init_parameters": {"add": 2},
            },
            "add_default": {
                "type": "haystack.testing.sample_components.add_value.AddFixedValue",
                "init_parameters": {"add": 1},
            },
            "double": {"type": "haystack.testing.sample_components.double.Double", "init_parameters": {}},
feat: add `Pipeline.inputs()` (#120) * add Pipeline.describe_input() * add tests * split dict and str outputs and add to error messages * tests * accepts/expects * move methods * fix tests * fix module name * tests * review feedback * Add missing typing_extensions dependency (#152) * feat: use full connection data to route I/O (#148) * fix sample components * make sum variadic * separate queue and buffer * all works but loops & variadics together * fix some tests * fix some tests * all tests green * clean up code a bit * refactor code * fix tests * fix self loops * fix reused sockets bug * add distinct loops * add distinct loops test * break out some code from run() * docstring * improve variadics drawing * black * document the deepcopy * re-arrange connection dataclass and add tests * consumer -> receiver * fix typing * move Connection-related code under component package * clean up connect() * cosmetics and typing * fix linter, make Connection a dataclass again * fix typing * add test case for #105 --------- Co-authored-by: Massimiliano Pippi <mpippi@gmail.com> * feat: Add Component inputs/outputs functions (#158) * Add component inputs/outputs methods * Different impl approach * Black fixes * Rename functions to match naming in pipeline inputs/ouputs * Fix find_component_inputs, update unit tests (#162) * Fix API docs (#164) * make Variadic wrap an iterable (#163) * Add pipeline outputs method (#150) Co-authored-by: ZanSara <sara.zanzottera@deepset.ai> * Update __about__.py (#165) Update version to 0.10.0 * add CODEOWNERS * feat: read defaults from `run()` signature (#166) * Read defaults from run signature * simplify setting of sockets * fix test * Update sample_components/fstring.py Co-authored-by: Massimiliano Pippi <mpippi@gmail.com> * Update canals/component/component.py Co-authored-by: Massimiliano Pippi <mpippi@gmail.com> * dostring --------- Co-authored-by: Massimiliano Pippi <mpippi@gmail.com> * Use full import path as 'type' in serialization. 
(#167) * Use full import path as 'type' in serialization. Try to import the path when deserializing * fix test data * add from_dict test * remove leftover * Update canals/pipeline/pipeline.py Co-authored-by: ZanSara <sara.zanzottera@deepset.ai> * add error message to PipelineError --------- Co-authored-by: ZanSara <sara.zanzottera@deepset.ai> * bump version * fix: copy input values before passing them down pipeline.run (#168) * copy input values before passing them down pipeline.run * Update test_mutable_inputs.py * fix mypy and pyright (#169) * bump version * remove data we won't keep * reformat * try * skip tests on transient code --------- Co-authored-by: Silvano Cerza <silvanocerza@gmail.com> Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com> Co-authored-by: ZanSara <sara.zanzottera@deepset.ai> Co-authored-by: Michel Bartels <login@michelbartels.com> Co-authored-by: ZanSara <sarazanzo94@gmail.com> Co-authored-by: Julian Risch <julianrisch@gmx.de> Co-authored-by: Julian Risch <julian.risch@deepset.ai> Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
2023-11-27 15:16:35 +01:00
        },
        "connections": [
            {"sender": "add_two.result", "receiver": "double.value"},
            {"sender": "double.value", "receiver": "add_default.value"},
        ],
    }
    assert res == expected


def test_from_dict():
    data = {
        "metadata": {"test": "test"},
        "max_loops_allowed": 101,
        "components": {
            "add_two": {
                "type": "haystack.testing.sample_components.add_value.AddFixedValue",
                "init_parameters": {"add": 2},
            },
            "add_default": {
                "type": "haystack.testing.sample_components.add_value.AddFixedValue",
                "init_parameters": {"add": 1},
            },
            "double": {"type": "haystack.testing.sample_components.double.Double", "init_parameters": {}},
        },
        "connections": [
            {"sender": "add_two.result", "receiver": "double.value"},
            {"sender": "double.value", "receiver": "add_default.value"},
        ],
    }
    pipe = Pipeline.from_dict(data)
    assert pipe.metadata == {"test": "test"}
    assert pipe.max_loops_allowed == 101

    # Components
    assert len(pipe.graph.nodes) == 3
    ## add_two
    add_two = pipe.graph.nodes["add_two"]
    assert add_two["instance"].add == 2
    assert add_two["input_sockets"] == {
        "value": InputSocket(name="value", type=int),
        "add": InputSocket(name="add", type=Optional[int], is_mandatory=False),
    }
    assert add_two["output_sockets"] == {"result": OutputSocket(name="result", type=int, receivers=["double"])}
    assert add_two["visits"] == 0
    ## add_default
    add_default = pipe.graph.nodes["add_default"]
    assert add_default["instance"].add == 1
    assert add_default["input_sockets"] == {
        "value": InputSocket(name="value", type=int, senders=["double"]),
        "add": InputSocket(name="add", type=Optional[int], is_mandatory=False),
    }
    assert add_default["output_sockets"] == {"result": OutputSocket(name="result", type=int)}
    assert add_default["visits"] == 0
    ## double
    double = pipe.graph.nodes["double"]
    assert double["instance"]
    assert double["input_sockets"] == {"value": InputSocket(name="value", type=int, senders=["add_two"])}
    assert double["output_sockets"] == {"value": OutputSocket(name="value", type=int, receivers=["add_default"])}
    assert double["visits"] == 0

    # Connections
    connections = list(pipe.graph.edges(data=True))
    assert len(connections) == 2
    assert connections[0] == (
        "add_two",
        "double",
        {
            "conn_type": "int",
            "from_socket": OutputSocket(name="result", type=int, receivers=["double"]),
            "to_socket": InputSocket(name="value", type=int, senders=["add_two"]),
        },
    )
    assert connections[1] == (
        "double",
        "add_default",
        {
            "conn_type": "int",
            "from_socket": OutputSocket(name="value", type=int, receivers=["add_default"]),
            "to_socket": InputSocket(name="value", type=int, senders=["double"]),
        },
    )


def test_from_dict_with_empty_dict():
    assert Pipeline() == Pipeline.from_dict({})


def test_from_dict_with_components_instances():
    add_two = AddFixedValue(add=2)
    add_default = AddFixedValue()
    components = {"add_two": add_two, "add_default": add_default}
    data = {
        "metadata": {"test": "test"},
        "max_loops_allowed": 100,
        "components": {
            "add_two": {},
            "add_default": {},
            "double": {"type": "haystack.testing.sample_components.double.Double", "init_parameters": {}},
        },
        "connections": [
            {"sender": "add_two.result", "receiver": "double.value"},
            {"sender": "double.value", "receiver": "add_default.value"},
        ],
    }
    pipe = Pipeline.from_dict(data, components=components)
    assert pipe.metadata == {"test": "test"}
    assert pipe.max_loops_allowed == 100

    # Components
    assert len(pipe.graph.nodes) == 3

    ## add_two
    add_two_data = pipe.graph.nodes["add_two"]
    assert add_two_data["instance"] is add_two
    assert add_two_data["instance"].add == 2
    assert add_two_data["input_sockets"] == {
        "value": InputSocket(name="value", type=int),
        "add": InputSocket(name="add", type=Optional[int], is_mandatory=False),
    }
    assert add_two_data["output_sockets"] == {"result": OutputSocket(name="result", type=int, receivers=["double"])}
    assert add_two_data["visits"] == 0

    ## add_default
    add_default_data = pipe.graph.nodes["add_default"]
    assert add_default_data["instance"] is add_default
    assert add_default_data["instance"].add == 1
    assert add_default_data["input_sockets"] == {
        "value": InputSocket(name="value", type=int, senders=["double"]),
        "add": InputSocket(name="add", type=Optional[int], is_mandatory=False),
    }
    assert add_default_data["output_sockets"] == {"result": OutputSocket(name="result", type=int, receivers=[])}
    assert add_default_data["visits"] == 0

    ## double
    double = pipe.graph.nodes["double"]
    assert double["instance"]
    assert double["input_sockets"] == {"value": InputSocket(name="value", type=int, senders=["add_two"])}
    assert double["output_sockets"] == {"value": OutputSocket(name="value", type=int, receivers=["add_default"])}
    assert double["visits"] == 0

    # Connections
    connections = list(pipe.graph.edges(data=True))
    assert len(connections) == 2
    assert connections[0] == (
        "add_two",
        "double",
        {
            "conn_type": "int",
            "from_socket": OutputSocket(name="result", type=int, receivers=["double"]),
            "to_socket": InputSocket(name="value", type=int, senders=["add_two"]),
        },
    )
    assert connections[1] == (
        "double",
        "add_default",
        {
            "conn_type": "int",
            "from_socket": OutputSocket(name="value", type=int, receivers=["add_default"]),
            "to_socket": InputSocket(name="value", type=int, senders=["double"]),
        },
    )


def test_from_dict_without_component_type():
    data = {
        "metadata": {"test": "test"},
        "max_loops_allowed": 100,
        "components": {"add_two": {"init_parameters": {"add": 2}}},
        "connections": [],
    }
    with pytest.raises(PipelineError) as err:
        Pipeline.from_dict(data)

    err.match("Missing 'type' in component 'add_two'")


def test_from_dict_without_registered_component_type(request):
    data = {
        "metadata": {"test": "test"},
        "max_loops_allowed": 100,
        "components": {"add_two": {"type": "foo.bar.baz", "init_parameters": {"add": 2}}},
        "connections": [],
    }
    with pytest.raises(PipelineError) as err:
        Pipeline.from_dict(data)

    err.match(r"Component .+ not imported.")


def test_from_dict_without_connection_sender():
    data = {
        "metadata": {"test": "test"},
        "max_loops_allowed": 100,
        "components": {},
        "connections": [{"receiver": "some.receiver"}],
    }
    with pytest.raises(PipelineError) as err:
        Pipeline.from_dict(data)

    err.match("Missing sender in connection: {'receiver': 'some.receiver'}")


def test_from_dict_without_connection_receiver():
    data = {
        "metadata": {"test": "test"},
        "max_loops_allowed": 100,
        "components": {},
        "connections": [{"sender": "some.sender"}],
    }
    with pytest.raises(PipelineError) as err:
        Pipeline.from_dict(data)

    err.match("Missing receiver in connection: {'sender': 'some.sender'}")


def test_falsy_connection():
    # Both components output the falsy value 0; the pipeline must still pass it along the connection.
    A = component_class("A", input_types={"x": int}, output={"y": 0})
    B = component_class("B", input_types={"x": int}, output={"y": 0})
    p = Pipeline()
    p.add_component("a", A())
    p.add_component("b", B())
    p.connect("a.y", "b.x")
    assert p.run({"a": {"x": 10}})["b"]["y"] == 0


def test_describe_input_only_no_inputs_components():
    A = component_class("A", input_types={}, output={"x": 0})
    B = component_class("B", input_types={}, output={"y": 0})
    C = component_class("C", input_types={"x": int, "y": int}, output={"z": 0})
    p = Pipeline()
    p.add_component("a", A())
    p.add_component("b", B())
    p.add_component("c", C())
    p.connect("a.x", "c.x")
    p.connect("b.y", "c.y")
    assert p.inputs() == {}


def test_describe_input_some_components_with_no_inputs():
    A = component_class("A", input_types={}, output={"x": 0})
    B = component_class("B", input_types={"y": int}, output={"y": 0})
    C = component_class("C", input_types={"x": int, "y": int}, output={"z": 0})
    p = Pipeline()
    p.add_component("a", A())
    p.add_component("b", B())
    p.add_component("c", C())
    p.connect("a.x", "c.x")
    p.connect("b.y", "c.y")
    assert p.inputs() == {"b": {"y": {"type": int, "is_mandatory": True}}}


def test_describe_input_all_components_have_inputs():
    A = component_class("A", input_types={"x": Optional[int]}, output={"x": 0})
    B = component_class("B", input_types={"y": int}, output={"y": 0})
    C = component_class("C", input_types={"x": int, "y": int}, output={"z": 0})
    p = Pipeline()
    p.add_component("a", A())
    p.add_component("b", B())
    p.add_component("c", C())
    p.connect("a.x", "c.x")
    p.connect("b.y", "c.y")
    assert p.inputs() == {
        "a": {"x": {"type": Optional[int], "is_mandatory": True}},
        "b": {"y": {"type": int, "is_mandatory": True}},
    }


def test_describe_output_multiple_possible():
    """
    This pipeline has two outputs:
    {"b": {"output_b": {"type": str}}, "a": {"output_a": {"type": str}}}
    """
    A = component_class("A", input_types={"input_a": str}, output={"output_a": "str", "output_b": "str"})
    B = component_class("B", input_types={"input_b": str}, output={"output_b": "str"})

    pipe = Pipeline()
    pipe.add_component("a", A())
    pipe.add_component("b", B())
    pipe.connect("a.output_b", "b.input_b")

    assert pipe.outputs() == {"b": {"output_b": {"type": str}}, "a": {"output_a": {"type": str}}}


def test_describe_output_single():
    """
    This pipeline has one output:
    {"c": {"z": {"type": int}}}
    """
    A = component_class("A", input_types={"x": Optional[int]}, output={"x": 0})
    B = component_class("B", input_types={"y": int}, output={"y": 0})
    C = component_class("C", input_types={"x": int, "y": int}, output={"z": 0})

    p = Pipeline()
    p.add_component("a", A())
    p.add_component("b", B())
    p.add_component("c", C())
    p.connect("a.x", "c.x")
    p.connect("b.y", "c.y")

    assert p.outputs() == {"c": {"z": {"type": int}}}


def test_describe_no_outputs():
    """
    This pipeline connects three components, but every output socket of A and B
    feeds into C and C declares no outputs, so the pipeline itself has none.
    Check that p.outputs() == {}
    """
    A = component_class("A", input_types={"x": Optional[int]}, output={"x": 0})
    B = component_class("B", input_types={"y": int}, output={"y": 0})
    C = component_class("C", input_types={"x": int, "y": int}, output={})

    p = Pipeline()
    p.add_component("a", A())
    p.add_component("b", B())
    p.add_component("c", C())
    p.connect("a.x", "c.x")
    p.connect("b.y", "c.y")

    assert p.outputs() == {}