Sara Zan d470b9d0bd
Improve dependency management (#1994)
* Fist attempt at using setup.cfg for dependency management

* Trying the new package on the CI and in Docker too

* Add composite extras_require

* Add the safe_import function for document store imports and add some try-catch statements on rest_api and ui imports

* Fix bug on class import and rephrase error message

* Introduce typing for optional modules and add type: ignore in sparse.py

* Include importlib_metadata backport for py3.7

* Add colab group to extra_requires

* Fix pillow version

* Fix grpcio

* Separate out the crawler as another extra

* Make paths relative in rest_api and ui

* Update the test matrix in the CI

* Add try catch statements around the optional imports too to account for direct imports

* Never mix direct deps with self-references and add ES deps to the base install

* Refactor several paths in tests to make them insensitive to the execution path

* Include tstadel review and re-introduce Milvus1 in the tests suite, to fix

* Wrap pdf conversion utils into safe_import

* Update some tutorials and rever Milvus1 as default for now, see #2067

* Fix mypy config


Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-01-26 18:12:55 +01:00

90 lines
3.7 KiB
YAML

name: benchmarks
on:
workflow_dispatch:
pull_request:
types: [labeled]
jobs:
deploy-cloud-runner:
if: ${{ (github.event.action == 'labeled' && github.event.label.name == 'benchmark') || github.event.action == 'workflow_dispatch' }}
runs-on: [ubuntu-latest]
container: docker://dvcorg/cml
steps:
- name: deploy
env:
repo_token: ${{ secrets.HAYSTACK_BOT_TOKEN }}
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_CI_ACCESS_KEY }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_CI_SECRET_ACCESS_KEY }}
VPC: ${{ secrets.AWS_CI_VPC }}
run: |
echo "Deploying..."
RUNNER_LABELS="cml,aws"
RUNNER_REPO="https://github.com/${GITHUB_REPOSITORY}"
MACHINE="cml$(date +%s)"
docker-machine create \
--driver amazonec2 \
--amazonec2-instance-type p3.2xlarge \
--amazonec2-vpc-id $VPC \
--amazonec2-region us-east-1 \
--amazonec2-zone c \
--amazonec2-ssh-user ubuntu \
--amazonec2-ami ami-06a25ee8966373068 \
--amazonec2-root-size 150 \
$MACHINE
eval "$(docker-machine env --shell sh $MACHINE)"
(
docker-machine ssh $MACHINE "sudo mkdir -p \
/docker_machine && \
sudo chmod 777 /docker_machine" && \
docker-machine scp -r -q ~/.docker/machine/ \
$MACHINE:/docker_machine && \
docker run --name elasticsearch -d \
-p 9200:9200 \
-e "discovery.type=single-node" \
elasticsearch:7.9.2 && \
docker run --name postgres -d \
-p 5432:5432 \
--net host \
-e POSTGRES_PASSWORD=password \
-v /docker_machine/machine:/root/.docker/machine \
-e DOCKER_MACHINE=$MACHINE \
postgres && \
sleep 4 && \
docker exec -i postgres psql -U postgres -c "CREATE DATABASE haystack;" && \
docker run --name runner -d \
--gpus all \
-v /docker_machine/machine:/root/.docker/machine \
--net host \
-e DOCKER_MACHINE=$MACHINE \
-e repo_token=$repo_token \
-e RUNNER_LABELS=$RUNNER_LABELS \
-e RUNNER_REPO=$RUNNER_REPO \
-e RUNNER_IDLE_TIMEOUT=120 \
dvcorg/cml-py3:latest && \
sleep 20 && echo "Deployed $MACHINE"
) || (echo "Shut down machine" && docker-machine rm -y -f $MACHINE && exit 1)
run-benchmark:
if: ${{ (github.event.action == 'labeled' && github.event.label.name == 'benchmark') || github.event.action == 'workflow_dispatch' }}
needs: deploy-cloud-runner
runs-on: [self-hosted,cml]
steps:
- uses: actions/checkout@v2
- name: cml_run
env:
repo_token: ${{ secrets.HAYSTACK_BOT_TOKEN }}
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_CI_ACCESS_KEY }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_CI_SECRET_ACCESS_KEY }}
run: |
apt-get update -y
apt-get install python3-dev -y
pip install .[elasticsearch,faiss,milvus,weaviate,graphdb,ray,rest,ui,dev]
cd test/benchmarks && python run.py --retriever_index --retriever_query --reader --ci --save_markdown
echo -en "## Benchmarks: Retriever Indexing\n" >> report.md
cat retriever_index_results.md >> report.md
echo -en "\n\n## Benchmarks: Retriever Querying\n" >> report.md
cat retriever_query_results.md >> report.md
echo -en "\n\n## Benchmarks: Reader\n" >> report.md
cat reader_results.md >> report.md
cml-send-comment report.md