322 Commits

Author SHA1 Message Date
Massimiliano Pippi
428096733d
ci: add a job to vet license of direct dependencies only (#4885)
* add conversion script

* run job in CI

* typo

* invoke python

* install toml

* fix pylint error

* more exclusions

* add toml to dev dependencies

* fix exclusions list

* fix mypy and remove test clause
2023-05-12 11:20:48 +02:00
Massimiliano Pippi
d322beed6c
build: do not install 'dev' extras with 'all' (#4888)
* do not install 'dev' with 'all'

* some fixes around
2023-05-11 19:24:47 +02:00
Silvano Cerza
6c84a05d98
Upload coverage only if all unit tests pass (#4874) 2023-05-11 14:29:44 +02:00
Massimiliano Pippi
c619aa29ec
ci: add new license checker (#4779)
* try

* add exclusions

* fix vanilla distribution

* use different requirements files

* fix comments and file name

* try with a recent version of pip

* use cpu version of torch

* try

* again

* exclude nvidia libraries

* revert old change

* send report to FOSSA

* add gpu section

* display job names

* remove FOSSA check

* send complete report to FOSSA

* removed FIXME
2023-05-10 16:33:08 +02:00
Silvano Cerza
06193e08b1
Add missing unit tests topics to coverage upload step (#4873) 2023-05-10 12:51:52 +02:00
ZanSara
28463e38e5
multi-os dep checker (#4845) 2023-05-09 11:46:53 +02:00
Sebastian
707f1c3546
Add modeling to unit tests so we can get coverage for that (#4809)
* Add modeling to unit tests so we can get coverage for that

* fix unit tests

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-05-08 19:05:21 +02:00
ZanSara
28260c5c3f
feat: introduce generalimport (#4662)
* introduce generalimport

* pylint

* fix optional deps typing for schema

* leftover

* typo

* typing with faiss

* make Base generation optional too

* handle sqlalchemy

* (almost) all imports are optional

* TO REMOVE hijacking CI for tests

* some deps are actually needed

* get feature branch in CI

* get feature branch in CI

* fix array_equal

* pylint

* pandas also required

* improve imports.yml

* fix SquadData

* fix SquadData again

* generalimport imports list

* Update haystack/utils/openai_utils.py

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* Update haystack/utils/openai_utils.py

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* review feedback

* remove todos

* reference main release

* pylint

* circular import

* review feedback

* move is_imported in init

* pylint

---------

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-05-08 15:20:10 +02:00
Massimiliano Pippi
d8dc0d7403
chore: move custom linter to a separate package (#4790)
* move custom linter to its own package

* install the custom linter

* fix formatting

* drop python 3.7
2023-05-04 15:49:26 +02:00
Silvano Cerza
4cc69eeb29
Fix release_docs.py to create docs with correct version (#4803) 2023-05-03 15:04:50 +02:00
Silvano Cerza
9b67611169
Add others folder to unit test job (#4800) 2023-05-03 10:47:21 +02:00
Silvano Cerza
645a5fe5ba
ci: Add coverage tracking with Coveralls (#4772)
* Format tests.yml properly

* Add pytest-cov dependency

* Add coverage in unit tests

* Ignore cov.info

* Change report format

* Unignore cov.info
2023-04-28 11:59:09 +02:00
ZanSara
1b57b96210
refactor!: extract elasticsearch (#4668)
* extract elasticsearch

* update pyproject.toml

* make more imports optional

* move MockBaseRetriever in conftest

* install es in the es integration tests
2023-04-26 10:14:20 +02:00
bogdankostic
91b775bf43
Execute pipelines and utils unit tests in CI (#4749) 2023-04-26 10:00:52 +02:00
Massimiliano Pippi
0c081f19e2
fix: remove warnings from the more recent Elasticsearch client (#4602)
* clean up the ES instance in a more robust way

* do not sleep, refresh the index instead

* remove client warnings

* fix unit tests

* fix opensearch compatibility

* fix unit tests

* update ES version

* bump elasticsearch-py

* adjust docs

* use recreate_index param

* use same fixture strategy for Opensearch

* Update lg

---------

Co-authored-by: agnieszka-m <amarzec13@gmail.com>
2023-04-18 15:40:17 +02:00
ZanSara
809ca73649
fix: make langdetect truly optional (#4686)
* make all langdetect imports optional

* add workflow

* fix workflow triggers

* change extra name
2023-04-17 11:35:53 +02:00
ZanSara
d8ac30fa47
refactor!: extract preprocessing and file conversion deps (#4605)
* isolate file-conversion deps

* pylint

* add to all extra

* chain was missing

* move langdetect into preprocessing and fix tika

* add file-conversion extra
2023-04-14 11:34:16 +02:00
ZanSara
174d80ab41
skip tests (#4654) 2023-04-13 17:56:51 +02:00
Silvano Cerza
db69141642
Fix docstring-labeler.yml not working in PR from forks (#4648) 2023-04-12 21:16:06 +02:00
Massimiliano Pippi
322652c306
fix: provide a fallback for PyMuPDF (#4564)
* add a fallback xpdf alternative to PyMuPDF

* add xpdf to the base images

* to be reverted

* silence mypy on conditional error

* do not install pdf extras in base images

* bring back the xpdf build strategy

* remove leftovers from old build

* fix indentation

* Apply suggestions from code review

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* revert test workflow

---------

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-03-31 14:37:05 +02:00
Silvano Cerza
458e9f1897
Checkout correct ref in docstring-labeler.yml (#4563) 2023-03-30 18:11:43 +02:00
Silvano Cerza
3782ebc835
ci: Fix slack messages formatting (#4556)
* Fix slack messages formatting

* Remove unneeded file
2023-03-30 10:56:20 +02:00
Silvano Cerza
e00f1461bc
Use bigger runner for Docker release (#4538) 2023-03-29 13:14:46 +02:00
Silvano Cerza
78216196d1
Fix docker images testing (#4536) 2023-03-29 12:20:06 +02:00
Silvano Cerza
85ade5c878
Fix Slack messages formatting on job failure (#4520) 2023-03-29 09:24:41 +02:00
Massimiliano Pippi
0dfa5d6ad7
fix: do not override bake's platform definitions (#4518)
* do not override bake's platform definitions

* test

* fix job name and remove override from minor version job

* test

* bump docker login action

* fix plurals

* Remove platform from matrix and test both platforms in a single job

* Remove branch trigger used for testing

---------

Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>
2023-03-28 17:57:29 +02:00
Silvano Cerza
f4fb8dd946
Revert "ci: Change docker_release.yml workflow to run after successful PyPi release (#4293)" (#4513)
This reverts commit 6e241262ada9e59359d653a779246d2ad03c1223.
2023-03-28 15:28:15 +02:00
Silvano Cerza
098342da32
Use new Slack action to send failure messages (#4464) 2023-03-28 10:49:32 +02:00
Silvano Cerza
dbdb682225
Enhance release_docs.py (#4459) 2023-03-28 09:56:42 +02:00
ZanSara
9518bcb7a8
remove env var (#4497) 2023-03-27 10:33:58 +02:00
Silvano Cerza
0f605118d9
ci: remove python_cache internal action (#4429) 2023-03-17 13:55:07 +01:00
Silvano Cerza
22c50207c1
Run readme_sync.yml in PRs (#4442) 2023-03-16 15:18:13 +01:00
Massimiliano Pippi
8d4c56720c
do not run tests on osx (#4443) 2023-03-16 15:00:29 +01:00
Vladimir Blagojevic
2538b4cbc9
Make promptnode test unit (#4420) 2023-03-15 22:17:23 +01:00
Silvano Cerza
b59cf76093
refactor: Remove AnswerToSpeech and DocumentToSpeech nodes (#4391)
* Remove AnswerToSpeech and DocumentToSpeech nodes

* Remove unused dataclasses

* Remove unnecessary dependencies

* Remove unused error class and imports
2023-03-15 19:31:13 +01:00
ZanSara
3ecce5cbeb
refactor: rename v2 package to preview (#4409)
* v2->preview

* fossa -> py3.8

* test matrix

* test matrix

* tests

* test imports

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-03-15 18:02:18 +01:00
Silvano Cerza
2c7c4aa04e
Use bigger runner for integration-tests-linux (#4422) 2023-03-15 11:22:16 +01:00
ZanSara
677fc8badf
feat: new Pipeline (#4368)
* add import for canals

* add stores support to canals

* pyproject.toml

* move tests

* add v2 to the extras in ci

* install v2 in action

* pylint

* save and load

* save and load

* codename "Alfalfa"

* workflows
2023-03-14 17:01:19 +01:00
Massimiliano Pippi
1498aacc77
chore: make the docs generator runnable without an API key (#4405)
* spit a warning instead of exiting

* print which file is being converted (useful to debug CI)

* pin docspec for the time being
2023-03-14 16:15:19 +01:00
Daniel Bichuetti
28724e2e25
feat: add automatic OCR detection mechanism and improve performance (#4329)
* feat: add automatic OCR detection mechanism and improve performance

* refactor: add error message

* refactor: ignore pdftoppm bad typing

* refactor: add Tesseract install. docstrings

* fix: check if OCR var. assigned on mp

* tests: add path to windows/linux tests

* tests: add tessdata path

* tests: include matrix ref.

* tests: custom Tesseract matrix install

* refactor: improve user guide

* tests: fix macos path

* tests: remove brew formulae version

* fix: macos paths

* tests: fix macos path

* tests: add Tesseract to Windows Path

* tests: pytesseract path

* tests: macos path

* refactor: fix path message and remove extra path from tests

* refactor: raise exception when path not found

* refactor: expression simplification

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* refactor: check ocr parameter

* tests: mark as integration

* tests: mock deprecation warning

* refactor: simplify code

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* refactor: change deprecation test

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* refactor: add unit patch

* refactor: black formatting

---------

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
Co-authored-by: Mayank Jobanputra <mayankjobanputra@gmail.com>
2023-03-13 20:19:22 +05:30
Bilge Yücel
9198d5ec42
chore: add topic:promptnode label (#4347) 2023-03-07 21:23:40 +01:00
Silvano Cerza
9253990bdf
Add workflow to push CI metrics to Datadog (#4336) 2023-03-06 18:02:24 +01:00
bogdankostic
f33829fabf
Remove xpdf dependencies (#4314) 2023-03-02 11:12:03 +01:00
Silvano Cerza
90da7bf4f8
Fix docstring-labeler.yml workflow (#4307) 2023-03-01 17:49:04 +01:00
Silvano Cerza
ee74421212
ci: Refactor docs config and generation (#4280)
* Change docs yml category config

* Update docs renderers to fetch categories from Readme.io

* Update readme_sync.yml to handle new docs rendering

* Remove unnecessary script and related workflow step

* Fix sys.exits
2023-03-01 09:51:02 +01:00
Silvano Cerza
6e241262ad
ci: Change docker_release.yml workflow to run after successful PyPi release (#4293)
* Change docker_release.yml workflow to run after successful PyPi release

* Add warning on name change in pypi_release.yml
2023-03-01 09:50:47 +01:00
Silvano Cerza
5678bb6375
Parallelize Docker build job (#4268) 2023-02-27 16:03:24 +01:00
Silvano Cerza
2c9e4c5ff9
Remove unnecessary operations in minor_version_release.yml (#4267) 2023-02-24 14:29:42 +01:00
Silvano Cerza
280414e5c6
Fix OpenAPI specs upload (#4266) 2023-02-24 10:50:59 +01:00
Silvano Cerza
d594ab800b
ci: Fix OpenAPI spec sync (#4254)
* Attempt to fix OpenAPI sync

* Dry run

* Add step to get OpenAPI specs id

* Remove dryRun and branch trigger
2023-02-23 19:02:46 +01:00