# If you change this name also do it in tests_skipper.yml and ci_metrics.yml
name: Tests

on:
  workflow_dispatch: # Activate this workflow manually
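  # e.g. with the GitHub CLI: `gh workflow run Tests` (or by file name, assuming this file is tests.yml)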
  push:
    branches:
      - main
      # release branches have the form v1.9.x
      - "v[0-9].*[0-9].x"
  pull_request:
    types:
      - opened
      - reopened
      - synchronize
      - ready_for_review
    paths:
      - "haystack/**/*.py"
      - "test/**/*.py"
      - "!haystack/core/**/*.py"
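# The leading "!" negates the earlier globs above: PRs that only touch
# haystack/core (presumably covered by a separate workflow) won't trigger these jobs.
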
env:
  OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
  COHERE_API_KEY: ${{ secrets.COHERE_API_KEY }}
  CORE_AZURE_CS_ENDPOINT: ${{ secrets.CORE_AZURE_CS_ENDPOINT }}
  CORE_AZURE_CS_API_KEY: ${{ secrets.CORE_AZURE_CS_API_KEY }}
  PYTHON_VERSION: "3.8"

jobs:
  black:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v4
        with:
          python-version: ${{ env.PYTHON_VERSION }}

      - name: Install Black
        run: |
          pip install --upgrade pip
          pip install .[dev]

      - name: Check status
        run: |
          if ! black . --check; then
            git status
            exit 1
          fi
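
      # The two steps below run whether the job passed or failed (success() || failure()),
      # but only on main: they convert the job status into a Datadog event.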
      - name: Calculate alert data
        id: calculator
        shell: bash
        if: (success() || failure()) && github.ref_name == 'main'
        run: |
          if [ "${{ job.status }}" = "success" ]; then
            echo "alert_type=success" >> "$GITHUB_OUTPUT";
          else
            echo "alert_type=error" >> "$GITHUB_OUTPUT";
          fi

      - name: Send event to Datadog
        if: (success() || failure()) && github.ref_name == 'main'
        uses: masci/datadog@v1
        with:
          api-key: ${{ secrets.CORE_DATADOG_API_KEY }}
          api-url: https://api.datadoghq.eu
          events: |
            - title: "${{ github.workflow }} workflow"
              text: "Job ${{ github.job }} in branch ${{ github.ref_name }}"
              alert_type: "${{ steps.calculator.outputs.alert_type }}"
              source_type_name: "Github"
              host: ${{ github.repository_owner }}
              tags:
                - "project:${{ github.repository }}"
                - "job:${{ github.job }}"
                - "run_id:${{ github.run_id }}"
                - "workflow:${{ github.workflow }}"
                - "branch:${{ github.ref_name }}"
                - "url:https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}"

  unit-tests:
    name: Unit / ${{ matrix.os }}
    needs: black
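    # fan out: one job per OS; fail-fast is disabled so one failing leg
    # doesn't cancel the others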
    strategy:
      fail-fast: false
      matrix:
        os:
          - ubuntu-latest
          - windows-latest
          - macos-latest
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v4
        with:
          python-version: ${{ env.PYTHON_VERSION }}

      - name: Install Haystack
        run: pip install .[dev,audio] langdetect transformers[torch,sentencepiece]==4.35.2 'sentence-transformers>=2.2.0' pypdf markdown-it-py mdit_plain tika 'azure-ai-formrecognizer>=3.2.0b2' cohere boilerpy3

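      # run only the tests *not* marked `integration` (markers come from the pytest config)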
      - name: Run
        run: pytest -m "not integration" test

      - name: Calculate alert data
        id: calculator
        shell: bash
        if: (success() || failure()) && github.ref_name == 'main'
        run: |
          if [ "${{ job.status }}" = "success" ]; then
            echo "alert_type=success" >> "$GITHUB_OUTPUT";
          else
            echo "alert_type=error" >> "$GITHUB_OUTPUT";
          fi

      - name: Send event to Datadog
        if: (success() || failure()) && github.ref_name == 'main'
        uses: masci/datadog@v1
        with:
          api-key: ${{ secrets.CORE_DATADOG_API_KEY }}
          api-url: https://api.datadoghq.eu
          events: |
            - title: "${{ github.workflow }} workflow"
              text: "Job ${{ github.job }} in branch ${{ github.ref_name }}"
              alert_type: "${{ steps.calculator.outputs.alert_type }}"
              source_type_name: "Github"
              host: ${{ github.repository_owner }}
              tags:
                - "project:${{ github.repository }}"
                - "job:${{ github.job }}"
                - "run_id:${{ github.run_id }}"
                - "workflow:${{ github.workflow }}"
                - "branch:${{ github.ref_name }}"
                - "url:https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}"

  integration-tests-linux:
    name: Integration / ubuntu-latest
    needs: unit-tests
    runs-on: ubuntu-latest
    services:
      tika:
        image: apache/tika:2.9.0.0
        ports:
          - 9998:9998
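    # the Tika service container is reachable at localhost:9998 from the steps below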
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v4
        with:
          python-version: ${{ env.PYTHON_VERSION }}

      - name: Install dependencies
        run: |
          sudo apt update
          sudo apt install ffmpeg # for local Whisper tests

      - name: Install Haystack
        run: pip install .[dev,audio] langdetect transformers[torch,sentencepiece]==4.35.2 'sentence-transformers>=2.2.0' pypdf markdown-it-py mdit_plain tika 'azure-ai-formrecognizer>=3.2.0b2' cohere boilerpy3

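      # stop after 5 failures so a broken environment doesn't burn CI time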
      - name: Run
        run: pytest --maxfail=5 -m "integration" test

      - name: Calculate alert data
        id: calculator
        shell: bash
        if: (success() || failure()) && github.ref_name == 'main'
        run: |
          if [ "${{ job.status }}" = "success" ]; then
            echo "alert_type=success" >> "$GITHUB_OUTPUT";
          else
            echo "alert_type=error" >> "$GITHUB_OUTPUT";
          fi

      - name: Send event to Datadog
        if: (success() || failure()) && github.ref_name == 'main'
        uses: masci/datadog@v1
        with:
          api-key: ${{ secrets.CORE_DATADOG_API_KEY }}
          api-url: https://api.datadoghq.eu
          events: |
            - title: "${{ github.workflow }} workflow"
              text: "Job ${{ github.job }} in branch ${{ github.ref_name }}"
              alert_type: "${{ steps.calculator.outputs.alert_type }}"
              source_type_name: "Github"
              host: ${{ github.repository_owner }}
              tags:
                - "project:${{ github.repository }}"
                - "job:${{ github.job }}"
                - "run_id:${{ github.run_id }}"
                - "workflow:${{ github.workflow }}"
                - "branch:${{ github.ref_name }}"
                - "url:https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}"

  integration-tests-macos:
    name: Integration / macos-latest
    needs: unit-tests
    runs-on: macos-latest-xl
    env:
      HAYSTACK_MPS_ENABLED: false
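      # disables Apple's MPS backend, presumably to keep tests on CPU like the other runners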
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v4
        with:
          python-version: ${{ env.PYTHON_VERSION }}

      - name: Install dependencies
        run: |
          brew install ffmpeg # for local Whisper tests
          brew install docker
          colima start

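      # macOS runners have no Docker daemon (and no `services:` support);
      # colima provides one so the Tika container can run below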
      - name: Install Haystack
        run: pip install .[dev,audio] langdetect transformers[torch,sentencepiece]==4.35.2 'sentence-transformers>=2.2.0' pypdf markdown-it-py mdit_plain tika 'azure-ai-formrecognizer>=3.2.0b2' cohere boilerpy3

      - name: Run Tika
        run: docker run -d -p 9998:9998 apache/tika:2.9.0.0

      - name: Run
        run: pytest --maxfail=5 -m "integration" test

      - name: Calculate alert data
        id: calculator
        shell: bash
        if: (success() || failure()) && github.ref_name == 'main'
        run: |
          if [ "${{ job.status }}" = "success" ]; then
            echo "alert_type=success" >> "$GITHUB_OUTPUT";
          else
            echo "alert_type=error" >> "$GITHUB_OUTPUT";
          fi

      - name: Send event to Datadog
        if: (success() || failure()) && github.ref_name == 'main'
        uses: masci/datadog@v1
        with:
          api-key: ${{ secrets.CORE_DATADOG_API_KEY }}
          api-url: https://api.datadoghq.eu
          events: |
            - title: "${{ github.workflow }} workflow"
              text: "Job ${{ github.job }} in branch ${{ github.ref_name }}"
              alert_type: "${{ steps.calculator.outputs.alert_type }}"
              source_type_name: "Github"
              host: ${{ github.repository_owner }}
              tags:
                - "project:${{ github.repository }}"
                - "job:${{ github.job }}"
                - "run_id:${{ github.run_id }}"
                - "workflow:${{ github.workflow }}"
                - "branch:${{ github.ref_name }}"
                - "url:https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}"

  integration-tests-windows:
    name: Integration / windows-latest
    needs: unit-tests
    runs-on: windows-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v4
        with:
          python-version: ${{ env.PYTHON_VERSION }}

      - name: Install Haystack
        run: pip install .[dev,audio] langdetect transformers[torch,sentencepiece]==4.35.2 'sentence-transformers>=2.2.0' pypdf markdown-it-py mdit_plain tika 'azure-ai-formrecognizer>=3.2.0b2' cohere boilerpy3

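      # no Tika container on Windows runners, so deselect Tika tests by keyword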
      - name: Run
        run: pytest --maxfail=5 -m "integration" test -k 'not tika'

      - name: Calculate alert data
        id: calculator
        shell: bash
        if: (success() || failure()) && github.ref_name == 'main'
        run: |
          if [ "${{ job.status }}" = "success" ]; then
            echo "alert_type=success" >> "$GITHUB_OUTPUT";
          else
            echo "alert_type=error" >> "$GITHUB_OUTPUT";
          fi

      - name: Send event to Datadog
        if: (success() || failure()) && github.ref_name == 'main'
        uses: masci/datadog@v1
        with:
          api-key: ${{ secrets.CORE_DATADOG_API_KEY }}
          api-url: https://api.datadoghq.eu
          events: |
            - title: "${{ github.workflow }} workflow"
              text: "Job ${{ github.job }} in branch ${{ github.ref_name }}"
              alert_type: "${{ steps.calculator.outputs.alert_type }}"
              source_type_name: "Github"
              host: ${{ github.repository_owner }}
              tags:
                - "project:${{ github.repository }}"
                - "job:${{ github.job }}"
                - "run_id:${{ github.run_id }}"
                - "workflow:${{ github.workflow }}"
                - "branch:${{ github.ref_name }}"
                - "url:https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}"

  catch-all:
    name: Catch-all check
    runs-on: ubuntu-latest
    # This job will be executed only after all the other tests
    # are successful.
    # This way we'll be able to mark only this test as required
    # and skip it accordingly.
    needs:
      - integration-tests-linux
      - integration-tests-macos
      - integration-tests-windows
    steps:
      - name: Finisher
        run: echo "Finish him!"