3803 Commits

Author SHA1 Message Date
ZanSara
1ac9ca7fac
merge (#4620) 2023-04-12 09:38:04 +02:00
Silvano Cerza
3d79174eb8
Add 503 as status code that triggers retry in request_with_retry (#4640) 2023-04-11 11:54:53 +02:00
Silvano Cerza
5baf2f5930
refactor: Rework invocation layers (#4615)
* Move invocation layers into separate package

* Fix circular imports

* Fix import
2023-04-11 11:04:29 +02:00
Ben Heckmann
2d65742443
feat: arbitrary crawler_depth for Crawler class (#4623)
* #3674 implemented iterative crawler depth

* #3674 added two tests for increased crawler depth

* removed old comment
2023-04-11 10:39:17 +02:00
Silvano Cerza
5547e85bd5
feat: Add util method to make HTTP requests with configurable retry (#4627)
* Add util method to make HTTP requests with configurable retry

* Fix pylint

* Remove unnecessary optional parameter
2023-04-11 10:35:39 +02:00
Silvano Cerza
5ac3dffbef
test: Rework conftest (#4614)
* Split root conftest into multiple ones and remove unused fixtures

* Remove some constants and make them fixtures

* Remove unnecessary fixture scoping

* Fix failing whisper tests

* Fix image_file_paths fixture
2023-04-11 10:33:43 +02:00
Tuana Çelik
83d33f2aed
Update README.md (#4625) 2023-04-07 09:09:16 +02:00
Malte Pietsch
fabf77388c
Update readme with new companies using haystack (#4621) 2023-04-06 19:42:25 +02:00
Silvano Cerza
e85dc79eaa
test: Add pytest fixture to block requests in unit tests (#4433)
* Add pytest fixture to block requests in unit tests

* Mark test correctly as integration

* Fix crawler unit test failing because it tries to install chromedriver
2023-04-06 18:04:57 +02:00
Silvano Cerza
c3abf73332
refactor: Rework prompt tests (#4600)
* Rework some PromptNode and PromptModel tests

* Remove duplicate code in PromptNode

* Fix mypy

* Fix test cause of missing fixture

* Revert "Fix mypy"

This reverts commit e530295a06cb260d9a8bd89679534958cb3d9776.

* Revert "Remove duplicate code in PromptNode"

This reverts commit 4a678ae81504dcc78a737372c061d12dc8799639.
2023-04-06 14:47:44 +02:00
Agnieszka Marzec
f2c6ce39e6
Docs: Fix QuestionGenerator and Summarizer docstrings (#4594)
* Add missing params and fix the docstrings

* Add reviewer's comments
2023-04-06 13:40:56 +02:00
Silvano Cerza
ee7b25b8cf
Remove unnecessary literal_eval (#4570) 2023-04-06 13:30:45 +02:00
Tuana Çelik
e0895f0ac2
Adding missing emoji (#4613) 2023-04-06 11:20:16 +02:00
Tuana Çelik
1a37caad79
feat: Load documents from remote - helper function (#4545)
* first draft of the load documents from remote function

* resolving comments

* pylint fixes

* pylint fixes

* fixed import

* fixed black

* fixing returned instance

* pythonic list comprehension

* Addressed comments

---------

Co-authored-by: Mayank Jobanputra <mayankjobanputra@gmail.com>
2023-04-06 10:19:35 +02:00
Massimiliano Pippi
52fb935936
build xpdf on bionic (#4606) 2023-04-05 15:52:44 +02:00
Agnieszka Marzec
bc95de5dc1
Update models and docstrings lg (#4595) 2023-04-04 16:48:14 +02:00
Vladimir Blagojevic
a8d283cfac
Fix HF stop words (single stop word) (#4584) 2023-04-04 14:45:10 +02:00
ZanSara
ce61eda970
feat: Haystack CLI (#4568)
* first implementation

* only version

* delete rest api management

* pylint
2023-04-04 14:24:00 +02:00
Stefano Fiorucci
423b135e14
refactor: remove variadic parameters in WebSearch initialization; make new nodes directly importable (#4581)
* make new nodes directly importable

* avoid circular imports

* rm variadic parameters in __init__

* forgotten import

* update docstrings

* don't expose SearchEngine
2023-04-04 14:21:26 +02:00
Agnieszka Marzec
7338e60362
Docs: Hide private modules from API docs (#4555)
* Hide private modules and fix order

* Add underscore
2023-04-04 14:07:18 +02:00
Agnieszka Marzec
7c5f9313ff
Docs: Update Whisper API. (#4539)
* Update lg

* Blackify
2023-04-04 12:32:06 +02:00
Mayank Jobanputra
ce82bfb197
chore: add citation (#4573)
* basic structure

* Added names, modified title
2023-04-04 10:10:44 +02:00
Agnieszka Marzec
c00bb1b732
Docs: Shaper API update (#4542)
* Update Shaper API

* Blackify
2023-04-04 08:21:58 +02:00
Silvano Cerza
1cc4c9c651
refactor: Refactor prompt node (#4580)
* Refactor prompt structure

* Refactor prompt tests structure

* Fix pylint

* Move TestPromptTemplateSyntax to test_prompt_template.py
2023-04-03 11:49:49 +02:00
ZanSara
c202866093
feat!: drop Python3.7 support (#4421)
* drop py3.7

* importlib-metadata
2023-04-03 10:34:58 +02:00
Silvano Cerza
af02803cce
Skip flaky prompt node integration test (#4572) 2023-04-03 09:49:30 +02:00
Massimiliano Pippi
322652c306
fix: provide a fallback for PyMuPDF (#4564)
* add a fallback xpdf alternative to PyMuPDF

* add xpdf to the base images

* to be reverted

* silence mypy on conditional error

* do not install pdf extras in base images

* bring back the xpdf build strategy

* remove leftovers from old build

* fix indentation

* Apply suggestions from code review

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* revert test workflow

---------

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-03-31 14:37:05 +02:00
Julian Risch
57415ef8ab
test: Remove duplicate test and edit docstring (#4567) 2023-03-31 12:39:18 +02:00
Silvano Cerza
458e9f1897
Checkout correct ref in docstring-labeler.yml (#4563) 2023-03-30 18:11:43 +02:00
Eren
b09289241b
docs: fix broken readme links (#4560)
* docs: fix broken links

* fix Azure typo
2023-03-30 15:17:00 +02:00
Eren
5c6b295fb2
fix: update tutorials link (#4559) 2023-03-30 14:58:23 +02:00
Agnieszka Marzec
815dcdebbd
docs: Update PromptNode API docs (#4549)
* Update docstrings

* adapt test to changed logging message

---------

Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2023-03-30 14:27:44 +02:00
Silvano Cerza
3782ebc835
ci: Fix slack messages formatting (#4556)
* Fix slack messages formatting

* Remove unneeded file
2023-03-30 10:56:20 +02:00
Tuana Çelik
0876bc13b1
update to readme (#4533)
* update to readme

* Update README.md

Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>

* Update README.md

Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>

* resolving comments

* Small final fixes

---------

Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
2023-03-30 10:51:01 +02:00
GitIgnoreMaybe
514dea2443
feat: Change default save_dir for FARMReader.train (#4553)
Co-authored-by: Marco <marco.herzog@nuzzera.com>
2023-03-30 13:20:25 +05:30
Stefano Fiorucci
57f87e24a3
refactor: OpenAIAnswerGenerator - avoid tokenizing all documents several times (#4504) 2023-03-29 22:38:27 +02:00
Zoltan Fedor
32091d66cb
Adding filtering support for Weaviate when used for BM25 querying (#4385) 2023-03-29 16:51:22 +02:00
Silvano Cerza
e00f1461bc
Use bigger runner for Docker release (#4538) 2023-03-29 13:14:46 +02:00
oryx1729
5fc84904f1
fix: update envs for the backend image of annotation tool (#4535) 2023-03-29 12:54:21 +02:00
ZanSara
16bd7d0625
add back tutorial_running() (#4534) 2023-03-29 12:20:24 +02:00
Silvano Cerza
78216196d1
Fix docker images testing (#4536) 2023-03-29 12:20:06 +02:00
Silvano Cerza
85ade5c878
Fix Slack messages formatting on job failure (#4520) 2023-03-29 09:24:41 +02:00
Massimiliano Pippi
0dfa5d6ad7
fix: do not override bake's platform definitions (#4518)
* do not override bake's platform definitions

* test

* fix job name and remove override from minor version job

* test

* bump docker login action

* fix plurals

* Remove platform from matrix and test both platforms in a single job

* Remove branch trigger used for testing

---------

Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>
2023-03-28 17:57:29 +02:00
ZanSara
651be37afc
proposal: DocumentStores and Retrievers (#4370)
* add proposal

* add proposal

* pr number

* pr number

* start second draft

* second draft

* node examples

* phrasing

* get_documents -> filter_documents
2023-03-28 16:31:42 +02:00
Agnieszka Marzec
aae2ad8e5c
Add whisper api (#4511) 2023-03-28 15:43:59 +02:00
Silvano Cerza
f4fb8dd946
Revert "ci: Change docker_release.yml workflow to run after successful PyPi release (#4293)" (#4513)
This reverts commit 6e241262ada9e59359d653a779246d2ad03c1223.
2023-03-28 15:28:15 +02:00
Vladimir Blagojevic
7c9f719496
refactor: Adjust WhisperTranscriber to pipeline run methods (#4510)
* Retrofit WhisperTranscriber run methods
* Add pipeline unit test
---------
Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>
2023-03-28 13:52:21 +02:00
Silvano Cerza
098342da32
Use new Slack action to send failure messages (#4464) 2023-03-28 10:49:32 +02:00
Silvano Cerza
dbdb682225
Enhance release_docs.py (#4459) 2023-03-28 09:56:42 +02:00
Silvano Cerza
cfb8dfd470
Fix pipeline config and agent tools hashing for telemetry (#4508) 2023-03-28 09:41:50 +02:00