459 Commits

Author SHA1 Message Date
ZanSara
ce06268990
test: fix e2e test failures (#5685)
* fix test errors

* fix pipeline yaml

* disable cache

* fix errors

* remove stray fixture
2023-08-30 12:24:03 +02:00
ZanSara
1709be162c
auto trigger e2e workflow on PRs that affect it (#5684) 2023-08-30 10:25:47 +02:00
ZanSara
5985b6d358
chore: refactor pipeline tests for e2e testing (#5576)
* enable pipeline folder in e2e

* merge standard pipeline tests with standard pipeline batch tests

* merge summarization tests into standard pipelines tests

* Update test_standard_pipelines.py

* black
2023-08-29 11:22:39 +02:00
Silvano Cerza
444edce126
Add workflow to trigger preview package release (#5631) 2023-08-25 17:10:28 +02:00
Silvano Cerza
cb894061f7
Add terminate-runner job in benchmarks.yml (#5611) 2023-08-25 10:14:39 +02:00
Silvano Cerza
b53fad4c4f
Add missing integration tests to catch-all required step in tests.yml (#5598) 2023-08-18 17:58:26 +02:00
bogdankostic
ee2745bad8
ci: Add Github workflow to automate benchmark runs (#5399)
* Add config files

* log benchmarks to stdout

* Add top-k and batch size to configs

* Add batch size to configs

* fix: don't download files if they already exist

* Add batch size to configs

* refine script

* Remove configs using 1m docs

* update run script

* update run script

* update run script

* datadog integration

* remove out folder

* gitignore benchmarks output

* test: send benchmarks to datadog

* remove uncommented lines in script

* feat: take branch/tag argument for benchmark setup script

* fix: run.sh should ignore errors

* Add GH workflow to run benchmarks periodically

* Remove unused script

* Adapt cml.yml

* Adapt cml.yml

* Rename cml.yml to benchmarks.yml

* Revert "Rename cml.yml to benchmarks.yml"

This reverts commit 897299433a71a55827124728adff5de918d46d21.

* remove benchmarks.yml

* Use same file extension for all config files

* Use checkout@v3

* Run benchmarks sequentially

* Add timeout-minutes parameter

* Remove changes unrelated to datadog

* Apply black

* use haystack-oss aws account

* Update test/benchmarks/utils.py

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* PR feedback

* fix aws credentials step

* Fix path

* check docker

* Allow spinning up containers from within container

* Allow spinning up containers from within container

* Separate launching doc stores from benchmarks

* Remove docker related commands

* run only retrievers

* change port

* Revert "change port"

This reverts commit 6e5bcebb1d16e03ba7672be7e8a089084c7fc3a7.

* Run opensearch benchmark only

* Run weaviate benchmark only

* Run bm25 benchmarks only

* Changes host of doc stores

* add step to get docker logs

* Revert "add step to get docker logs"

This reverts commit c10e6faa76bde5df406a027203bd775d18c93c90.

* Install docker

* Launch doc store containers from within runner container

* Remove kill command

* Change host

* dump docker logs

* change port

* Add cloud startup script

* dump docker logs

* add network param

* add network to startup.sh

* check cluster health

* move steps

* change port

* try using services

* check cluster health

* use services

* run only weaviate

* change host

* Upload benchmark results as artifacts

* Update configs

* Delete index after benchmark run

* Use correct index name

* Run only failing config

* Use smaller batch size

* Increase memory for opensearch

* Reduce batch size further

* Provide more storage

* Reduce batch size

* dump docker logs

* add java opts

* Spin up only opensearch container

* Create separate job for each doc store

* Run benchmarks sequentially

* Set working directory

* Account for reader benchmarks not doing indexing

* Change key of reader metrics

* Apply PR feedback

* Remove whitespace

* Adapt workflow to changes in datadog scripts

* Adapt workflow to changes in datadog scripts

* Increase memory for opensearch

* Reduce batch size

* Add preprocessing_batch_size to Readers

* Remove unrelated change

* Move order

* Fix path

* Manually terminate EC2 instance

* Manually terminate EC2 instance

* Manually terminate EC2 instance

* Always terminate runner

* Always terminate runner

* Remove unnecessary terminate-runner job

* Add cron schedule

* Disable telemetry

* Rename cml.yml to benchmarks.yml

---------

Co-authored-by: rjanjua <rohan.janjua@gmail.com>
Co-authored-by: Paul Steppacher <p.steppacher91@gmail.com>
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>
2023-08-17 12:56:45 +02:00
Silvano Cerza
bc152d953c
Skip running tests in CI when editing docs Python files (#5482) 2023-08-01 12:31:24 +02:00
Silvano Cerza
9a359101fd
chore: Rework docs generation (#5481)
* Change docs generation to use id for parent doc instead of slug

* Rename step
2023-08-01 12:18:33 +02:00
Massimiliano Pippi
5f01391827
add workflow to check presence of release notes (#5449) 2023-07-27 10:40:40 +02:00
Julian Risch
eeb29b5686
test: Re-activate end-to-end tests workflow (#5343)
* Install haystack with required extras

* remove whitespaces

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* Add sleep

* Add s for seconds

* Move container initialization in workflow

* Update e2e.yml

add nightly run

* use new folder for initial e2e test

* use file hash for caching and trigger on push to branch

* remove \n from model names read from file

* remove trigger on push to branch

---------

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
Co-authored-by: bogdankostic <bogdankostic@web.de>
2023-07-20 11:48:51 +02:00
bogdankostic
b7f683bfa4
ci: Add unit test for Elasticsearch8 (#5300)
* Add job for ES8 integration tests

* Add unit test for Elasticsearch 8

* Add tests.yml

* Adapt tests.yml

* Remove added white space

* Adapt tests.yml

* Adapt tests.yml

* Add dependencies to unit test name

* Adapt unit test matrix

* Adapt unit test matrix

* Adapt unit test matrix

* Adapt unit test matrix

* Update tests.yml

* Create separate tests where necessary

* Fix skip

* Adapt tests
2023-07-10 16:03:50 +02:00
bogdankostic
048fc7f640
ci: Add job for ES8 integration tests (#5297)
* Add job for ES8 integration tests

* Remove whitespace

* Fix filename

* Add tests.yml

* Revert "Add tests.yml"

This reverts commit ec12654d4e146b5ef6cba04ad82f5973935d8520.
2023-07-10 10:43:05 +02:00
Vladimir Blagojevic
395854d823
Add cpu-remote-inference Docker image (#5225)
* Add cpu-remote-inference Docker image

* Add web lfqa pipeline as an example for cpu-remote-inference Docker image

* WebRetriever must have document_store attribute

* Add cpu-remote-inference-latest

* Add image testing in CI

---------

Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>
2023-07-07 10:23:14 +02:00
ZanSara
4b380d8fb0
fix: install inference in REST API tests (#5252)
* install inference in restapi tests

* add workflow dispatch to test the REST API CI in PR

* trigger ci

* tablecell

* tablecell

* revert ci trigger

* mypy
2023-07-03 15:10:14 +02:00
Stefano Fiorucci
637433841e
chore: remove deprecated Seq2SeqGenerator and RAGenerator (#5180)
* first draft of removal

* more removals

* don't download unused models
2023-06-21 16:38:45 +02:00
Julian Risch
30fdf2b5df
feat!: Add extra for inference dependencies such as torch (#5147)
* feat!: add extra for inference dependencies such as torch

* add inference extra to 'all' and 'all-gpu' extra

* install inference extra in selected integration tests

* import LazyImport

* review feedback

* add import error messages and update readme

* remove extra dot
2023-06-20 09:54:10 +02:00
ZanSara
97d5db3b9c
revert fix: change the Docker workflow runner (#5078) 2023-06-05 19:11:38 +02:00
ZanSara
be3eb3cdb5
fix: change Docker workflow runner (#5077) 2023-06-05 15:59:58 +02:00
ZanSara
8487cddc69
add cli to the jobs list (#5060) 2023-06-01 13:22:17 +02:00
Massimiliano Pippi
4aaf4fcc31
ci: fix Datadog event body (#5024)
* fix Datadog event body

* Update .github/workflows/license_compliance.yml

Co-authored-by: bogdankostic <bogdankostic@web.de>

---------

Co-authored-by: bogdankostic <bogdankostic@web.de>
2023-05-27 18:12:53 +02:00
Massimiliano Pippi
8392e813a8
migrate to Datadog all the jobs (#5022) 2023-05-25 14:28:26 +02:00
Massimiliano Pippi
b69f0b3dd5
track failures on Datadog (#5020) 2023-05-25 11:26:09 +02:00
Massimiliano Pippi
929b8d1fb0
ci: run Elasticsearch 8.6 in compatibility mode (#3853)
* bump ES version in CI

disable ssl

wait for service to start

set env vars

do not use choco to install ES

re-enable jobs deps

skip test on windows CI because of OOM

allocate more memory for ES

uniform ES installation and use default heap size

skip tests causing OOM

increase job timeout

restore memory limit for ES8

* Use latest elasticsearch version
2023-05-24 18:53:54 +02:00
Massimiliano Pippi
8228081e7a
chore: leftovers from removing knowledge graph support (#4974)
* leftovers from removing knowledge graph support

* more leftovers
2023-05-22 10:03:51 +02:00
Silvano Cerza
f235d30af8
Add workflow name to Datadog event (#4968) 2023-05-19 17:42:33 +02:00
Silvano Cerza
ce4cf3bc55
Add workflow id to Datadog event tags (#4965) 2023-05-19 16:52:39 +02:00
Silvano Cerza
58bb5f09e4
Standardize workflows file names (#4964) 2023-05-19 16:41:56 +02:00
Massimiliano Pippi
4974bf7ab3
chore: remove deprecated MilvusDocumentStore (#4951)
* remove deprecated MilvusDocumentStore

* remove leftovers

* fix pylint
2023-05-19 16:37:38 +02:00
Silvano Cerza
d5cc6ff9a9
ci: Remove legacy tests (#4961)
* Remove legacy tests

* Remove unnecessary env vars
2023-05-19 15:49:07 +02:00
Silvano Cerza
69bae2a3d6
Set calculator shell explicitly to handle Windows runs (#4960) 2023-05-19 15:15:18 +02:00
Silvano Cerza
dd9245531a
Add Datadog event send in examples tests workflow (#4959) 2023-05-19 15:15:10 +02:00
Silvano Cerza
2d76237508
Fix step failing to calculate Datadog event type (#4958) 2023-05-19 15:03:09 +02:00
Silvano Cerza
21ca24f70b
Send tests outcomes to Datadog instead of sending message to Slack (#4957) 2023-05-19 14:45:36 +02:00
Massimiliano Pippi
428096733d
ci: add a job to vet license of direct dependencies only (#4885)
* add conversion script

* run job in CI

* typo

* invoke python

* install toml

* fix pylint error

* more exclusions

* add toml to dev dependencies

* fix exclusions list

* fix mypy and remove test clause
2023-05-12 11:20:48 +02:00
Massimiliano Pippi
d322beed6c
build: do not install 'dev' extras with 'all' (#4888)
* do not install 'dev' with 'all'

* some fixes around
2023-05-11 19:24:47 +02:00
Silvano Cerza
6c84a05d98
Upload coverage only if all unit tests pass (#4874) 2023-05-11 14:29:44 +02:00
Massimiliano Pippi
c619aa29ec
ci: add new license checker (#4779)
* try

* add exclusions

* fix vanilla distribution

* use different requirements files

* fix comments and file name

* try with a recent version of pip

* use cpu version of torch

* try

* again

* exclude nvidia libraries

* revert old change

* send report to FOSSA

* add gpu section

* display job names

* remove FOSSA check

* send complete report to FOSSA

* removed FIXME
2023-05-10 16:33:08 +02:00
Silvano Cerza
06193e08b1
Add missing unit tests topics to coverage upload step (#4873) 2023-05-10 12:51:52 +02:00
ZanSara
28463e38e5
multi-os dep checker (#4845) 2023-05-09 11:46:53 +02:00
Sebastian
707f1c3546
Add modeling to unit tests so we can get coverage for that (#4809)
* Add modeling to unit tests so we can get coverage for that

* fix unit tests

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-05-08 19:05:21 +02:00
ZanSara
28260c5c3f
feat: introduce generalimport (#4662)
* introduce generalimport

* pylint

* fix optional deps typing for schema

* leftover

* typo

* typing with faiss

* make Base generation optional too

* handle sqlalchemy

* (almost) all import are optional

* TO REMOVE hijacking CI for tests

* some deps are actually needed

* get feature branch in CI

* get feature branch in CI

* fix array_equal

* pylint

* pandas also required

* improve imports.yml

* fix SquadData

* fix SquadData again

* generalimport imports list

* Update haystack/utils/openai_utils.py

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* Update haystack/utils/openai_utils.py

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* review feedback

* remove todos

* reference main release

* pylint

* circular import

* review feedback

* move is_imported in init

* pylint

---------

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-05-08 15:20:10 +02:00
Massimiliano Pippi
d8dc0d7403
chore: move custom linter to a separate package (#4790)
* move custom linter to its own package

* install the custom linter

* fix formatting

* drop python 3.7
2023-05-04 15:49:26 +02:00
Silvano Cerza
9b67611169
Add others folder to unit test job (#4800) 2023-05-03 10:47:21 +02:00
Silvano Cerza
645a5fe5ba
ci: Add coverage tracking with Coveralls (#4772)
* Format tests.yml properly

* Add pytest-cov dependency

* Add coverage in unit tests

* Ignore cov.info

* Change report format

* Unignore cov.info
2023-04-28 11:59:09 +02:00
ZanSara
1b57b96210
refactor!: extract elasticsearch (#4668)
* extract elasticsearch

* update pyproject.toml

* make more import optional

* move MockBaseRetriever in conftest

* install es in the es integration tests
2023-04-26 10:14:20 +02:00
bogdankostic
91b775bf43
Execute pipelines and utils unit tests in CI (#4749) 2023-04-26 10:00:52 +02:00
Massimiliano Pippi
0c081f19e2
fix: remove warnings from the more recent Elasticsearch client (#4602)
* clean up the ES instance in a more robust way

* do not sleep, refresh the index instead

* remove client warnings

* fix unit tests

* fix opensearch compatibility

* fix unit tests

* update ES version

* bump elasticsearch-py

* adjust docs

* use recreate_index param

* use same fixture strategy for Opensearch

* Update lg

---------

Co-authored-by: agnieszka-m <amarzec13@gmail.com>
2023-04-18 15:40:17 +02:00
ZanSara
809ca73649
fix: make langdetect truly optional (#4686)
* make all langdetect imports optional

* add workflow

* fix workflow triggers

* change extra name
2023-04-17 11:35:53 +02:00
ZanSara
d8ac30fa47
refactor!: extract preprocessing and file conversion deps (#4605)
* isolate file-conversion deps

* pylint

* add to all extra

* chain was missing

* move langdetect into preprocessing and fix tika

* add file-conversion extra
2023-04-14 11:34:16 +02:00