265 Commits

Author SHA1 Message Date
Silvano Cerza
645a5fe5ba
ci: Add coverage tracking with Coveralls (#4772)
* Format tests.yml properly

* Add pytest-cov dependency

* Add coverage in unit tests

* Ignore cov.info

* Change report format

* Unignore cov.info
2023-04-28 11:59:09 +02:00
ZanSara
1b57b96210
refactor!: extract elasticsearch (#4668)
* extract elasticsearch

* update pyproject.toml

* make more import optional

* move MockBaseRetriever in conftest

* install es in the es integration tests
2023-04-26 10:14:20 +02:00
bogdankostic
91b775bf43
Execute pipelines and utils unit tests in CI (#4749) 2023-04-26 10:00:52 +02:00
Massimiliano Pippi
0c081f19e2
fix: remove warnings from the more recent Elasticsearch client (#4602)
* clean up the ES instance in a more robust way

* do not sleep, refresh the index instead

* remove client warnings

* fix unit tests

* fix opensearch compatibility

* fix unit tests

* update ES version

* bump elasticsearch-py

* adjust docs

* use recreate_index param

* use same fixture strategy for Opensearch

* Update lg

---------

Co-authored-by: agnieszka-m <amarzec13@gmail.com>
2023-04-18 15:40:17 +02:00
ZanSara
809ca73649
fix: make langdetect truly optional (#4686)
* make all langdetect imports optional

* add workflow

* fix workflow triggers

* change extra name
2023-04-17 11:35:53 +02:00
ZanSara
d8ac30fa47
refactor!: extract preprocessing and file conversion deps (#4605)
* isolate file-conversion deps

* pylint

* add to all extra

* chain was missing

* move langdetect into preprocessing and fix tika

* add file-conversion extra
2023-04-14 11:34:16 +02:00
ZanSara
174d80ab41
skip tests (#4654) 2023-04-13 17:56:51 +02:00
Silvano Cerza
db69141642
Fix docstring-labeler.yml not working in PR from forks (#4648) 2023-04-12 21:16:06 +02:00
Massimiliano Pippi
322652c306
fix: provide a fallback for PyMuPDF (#4564)
* add a fallback xpdf alternative to PyMuPDF

* add xpdf to the base images

* to be reverted

* silence mypy on conditional error

* do not install pdf extras in base images

* bring back the xpdf build strategy

* remove leftovers from old build

* fix indentation

* Apply suggestions from code review

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* revert test workflow

---------

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-03-31 14:37:05 +02:00
Silvano Cerza
458e9f1897
Checkout correct ref in docstring-labeler.yml (#4563) 2023-03-30 18:11:43 +02:00
Silvano Cerza
3782ebc835
ci: Fix slack messages formatting (#4556)
* Fix slack messages formatting

* Remove unneeded file
2023-03-30 10:56:20 +02:00
Silvano Cerza
e00f1461bc
Use bigger runner for Docker release (#4538) 2023-03-29 13:14:46 +02:00
Silvano Cerza
78216196d1
Fix docker images testing (#4536) 2023-03-29 12:20:06 +02:00
Silvano Cerza
85ade5c878
Fix Slack messages formatting on job failure (#4520) 2023-03-29 09:24:41 +02:00
Massimiliano Pippi
0dfa5d6ad7
fix: do not override bake's platform definitions (#4518)
* do not override bake's platform definitions

* test

* fix job name and remove override from minor version job

* test

* bump docker login action

* fix plurals

* Remove platform from matrix and test both platforms in a single job

* Remove branch trigger used for testing

---------

Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>
2023-03-28 17:57:29 +02:00
Silvano Cerza
f4fb8dd946
Revert "ci: Change docker_release.yml workflow to run after successful PyPi release (#4293)" (#4513)
This reverts commit 6e241262ada9e59359d653a779246d2ad03c1223.
2023-03-28 15:28:15 +02:00
Silvano Cerza
098342da32
Use new Slack action to send failure messages (#4464) 2023-03-28 10:49:32 +02:00
Silvano Cerza
dbdb682225
Enhance release_docs.py (#4459) 2023-03-28 09:56:42 +02:00
ZanSara
9518bcb7a8
remove env var (#4497) 2023-03-27 10:33:58 +02:00
Silvano Cerza
0f605118d9
ci: remove python_cache internal action (#4429) 2023-03-17 13:55:07 +01:00
Silvano Cerza
22c50207c1
Run readme_sync.yml in PRs (#4442) 2023-03-16 15:18:13 +01:00
Massimiliano Pippi
8d4c56720c
do not run tests on osx (#4443) 2023-03-16 15:00:29 +01:00
Vladimir Blagojevic
2538b4cbc9
Make promptnode test unit (#4420) 2023-03-15 22:17:23 +01:00
Silvano Cerza
b59cf76093
refactor: Remove AnswerToSpeech and DocumentToSpeech nodes (#4391)
* Remove AnswerToSpeech and DocumentToSpeech nodes

* Remove unused dataclasses

* Remove unnecessary dependencies

* Remove unused error class and imports
2023-03-15 19:31:13 +01:00
ZanSara
3ecce5cbeb
refactor: rename v2 package to preview (#4409)
* v2->preview

* fossa -> py3.8

* test matrix

* test matrix

* tests

* test imports

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-03-15 18:02:18 +01:00
Silvano Cerza
2c7c4aa04e
Use bigger runner for integration-tests-linux (#4422) 2023-03-15 11:22:16 +01:00
ZanSara
677fc8badf
feat: new Pipeline (#4368)
* add import for canals

* add stores support to canals

* pyproject.toml

* move tests

* add v2 to the extras in ci

* install v2 in action

* pylint

* save and load

* save and load

* codename "Alfalfa"

* workflows
2023-03-14 17:01:19 +01:00
Daniel Bichuetti
28724e2e25
feat: add automatic OCR detection mechanism and improve performance (#4329)
* feat: add automatic OCR detection mechanism and improve performance

* refactor: add error message

* refactor: ignore pdftoppm bad typing

* refactor: add Tesseract install. docstrings

* fix: check if OCR var. assigned on mp

* tests: add path to windows/linux tests

* tests: add tessdata path

* tests: include matrix ref.

* tests: custom Tesseract matrix install

* refactor: improve user guide

* tests: fix macos path

* tests: remove brew formulae version

* fix: macos paths

* tests: fix macos path

* tests: add Tesseract to Windows Path

* tests: pytesseract path

* tests: macos path

* refactor: fix path message and remove extra path from tests

* refactor: raise exception when path not found

* refactor: expression simplification

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* refactor: check ocr parameter

* tests: mark as integration

* tests: mock deprecation warning

* refactor: simplify code

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* refactor: change deprecation test

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* refactor: add unit patch

* refactor: black formatting

---------

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
Co-authored-by: Mayank Jobanputra <mayankjobanputra@gmail.com>
2023-03-13 20:19:22 +05:30
Silvano Cerza
9253990bdf
Add workflow to push CI metrics to Datadog (#4336) 2023-03-06 18:02:24 +01:00
bogdankostic
f33829fabf
Remove xpdf dependencies (#4314) 2023-03-02 11:12:03 +01:00
Silvano Cerza
90da7bf4f8
Fix docstring-labeler.yml workflow (#4307) 2023-03-01 17:49:04 +01:00
Silvano Cerza
ee74421212
ci: Refactor docs config and generation (#4280)
* Change docs yml category config

* Update docs renderers to fetch categories from Readme.io

* Update readme_sync.yml to handle new docs rendering

* Remove unnecessary script and related workflow step

* Fix sys.exits
2023-03-01 09:51:02 +01:00
Silvano Cerza
6e241262ad
ci: Change docker_release.yml workflow to run after successful PyPi release (#4293)
* Change docker_release.yml workflow to run after successful PyPi release

* Add warning on name change in pypi_release.yml
2023-03-01 09:50:47 +01:00
Silvano Cerza
5678bb6375
Parallelize Docker build job (#4268) 2023-02-27 16:03:24 +01:00
Silvano Cerza
2c9e4c5ff9
Remove unnecessary operations in minor_version_release.yml (#4267) 2023-02-24 14:29:42 +01:00
Silvano Cerza
280414e5c6
Fix OpenAPI specs upload (#4266) 2023-02-24 10:50:59 +01:00
Silvano Cerza
d594ab800b
ci: Fix OpenAPI spec sync (#4254)
* Attempt to fix OpenAPI sync

* Dry run

* Add step to get OpenAPI specs id

* Remove dryRun and branch trigger
2023-02-23 19:02:46 +01:00
Massimiliano Pippi
722dead1b2
fix agents tests (#4237) 2023-02-23 13:03:45 +01:00
ZanSara
b193e08a64
set env var (#4239)
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-02-23 11:59:46 +01:00
Silvano Cerza
c3bf62d4b0
Add a simple way to skip required tests checks (#4245) 2023-02-23 11:00:20 +01:00
Massimiliano Pippi
dd37b4c29f
fix: apply black formatting (#4240)
* fix black formatting

* try
2023-02-23 08:59:40 +01:00
Silvano Cerza
b6371c95a8
Add missing dependencies in openapi upload workflow (#4236) 2023-02-22 19:34:22 +01:00
Silvano Cerza
181e5474e8
ci: Automate OpenAPI specs upload to Readme.io (#4228)
* Remove OpenAPI specs file

* OpenAPI specs are now automatically uploaded when necessary

* Rename openapi workflow
2023-02-22 18:01:18 +01:00
Massimiliano Pippi
40f772a9b0
refact: move the first batch of unit tests into the proper job (#4216)
* move the first batch of unit tests into the proper job

* leftover
2023-02-21 17:00:02 +01:00
Julian Risch
5ce7a404ac
feat: Add Agent (#4148)
* initial Agent implementation

* mypy and pylint fixes

* add missing ABC import

* improved prompt template

* refactor and shorten run method

* refactor and shorten run method

* add tests for extracting

* fix mixed up tool_input/observation & make tests more robust

* fix bug with max_iterations and update prompt template

* allow setting prompt_template in Agent init

* remove example yml for agent

* add final prediction to transcript

* add transcript to errors and accept PromptTemplate in init

* simplify if else to elif

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* add checks for max_iter<2 and empty list returned by prompt node

---------

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-02-21 14:27:40 +01:00
Silvano Cerza
30cdb81f19
ci: Move xpdf build into separate container (#4199)
* Create Dockerfile and hcl config to build Xpdf

* Create workflow to build Xpdf Docker image

* Update Dockerfile.base to not build Xpdf

* Fix CWD removal and arg casing

* Fix ARG setting
2023-02-20 14:58:11 +01:00
Silvano Cerza
a4407f8f98
Use larger runner for Docker release workflow (#4185) 2023-02-16 18:59:13 +01:00
Silvano Cerza
689f2cd250
Update docstring-labeler.yml workflow to safely run in PRs from forks (#4146) 2023-02-16 16:02:41 +01:00
Massimiliano Pippi
ec72dd73fc
refactor: complete the document stores test refactoring (#4125)
* add e2e tests

* move tests to their own module

* add e2e workflow

* pylint

* remove from job

* fix index field name

* skip test on sql

* removed unused code

* fix embedding tests

* adjust test for pinecone

* adjust assertions to the new documents

* bad copypasta

* test

* fix tests

* fix tests

* fix test

* fix tests

* pylint

* update milvus version

* remove debug

* move graphdb tests under e2e
2023-02-16 09:43:25 +01:00
Silvano Cerza
d86a511cc1
Fix Docker images test on release (#4153) 2023-02-14 14:18:49 +01:00