Silvano Cerza
18e83b3ed4
Pin requests-cache test dependency to <1.0.0 ( #4325 )
2023-03-03 12:47:15 +01:00
bogdankostic
f33829fabf
Remove xpdf dependencies ( #4314 )
2023-03-02 11:12:03 +01:00
Vladimir Blagojevic
79bf25aaea
feat: Add Azure as OpenAI endpoint ( #4170 )
...
* Add Azure as OpenAI endpoint
---------
Co-authored-by: Sebastian Lee <sebastian.lee@deepset.ai>
2023-03-02 09:55:09 +01:00
Daniel Bichuetti
7c49fffc71
feat: Enable PDFToTextConverter multiprocessing, increase general performance and simplify installation ( #4226 )
...
* refactor: isolate PDF converters
* refactor: remove xpdf dependency and fix tests
* refactor: add min. version
* feat: enable multiprocessing and add tests
* fix: remove unused imports
* fix: regression when moved code
* refactor: use itertools
* fix: mypy claims
* refactor: double tool support
* refactor: add fallback to xpdf
* refactor: black formatting
* refactor: make superclass signature compatible
* refactor: complete removal of xPdf
* refactor: regroup Haystack imports and fix regression
* refactor: remove original declaration
* docs: fix docstrings
* tests: add [pdf] to [all]
* refactor: remove redundant checks, avoid extra processes
* refactor: add deprecation warning
* refactor: add pytest mark
* tests: change PDF test file
* fix: correct pytest mark
* refactor: deprecate parameter and add new
* tests: change pdf sample
* Add minor lg changes to docstrings
* Fix default value in doc strings
* Update test/nodes/test_file_converter.py
Co-authored-by: bogdankostic <bogdankostic@web.de>
* tests: fix page count
* refactor: add imported function
* refactor: change default value
* tests: change parameters and fix typo
* Unify sort_by_position parameter names
---------
Co-authored-by: bogdankostic <bogdankostic@web.de>
Co-authored-by: agnieszka-m <amarzec13@gmail.com>
2023-03-01 22:34:38 +01:00
Silvano Cerza
90da7bf4f8
Fix docstring-labeler.yml workflow ( #4307 )
2023-03-01 17:49:04 +01:00
ZanSara
ae04ce3c6a
test: mock all Summarizer tests and move a few into e2e ( #4299 )
...
* stub e2e folders
* simplify pipeline test
* mocking
* unit tests fixed
* clean up e2e
* pipeline tests work
* pylint
* leftover
* small fix from #2994 and additional tests
* review feedback
* change summaries
* black
* revert models and summaries
2023-03-01 17:30:55 +01:00
bogdankostic
583d2d8244
Fix search path for Shaper API docs ( #4306 )
2023-03-01 16:10:39 +01:00
ZanSara
165a0a5faa
test: mock all Translator tests and move one to e2e ( #4290 )
...
* mock all translator tests and move one to e2e
* typo
* extract pipeline tests using translator
* remove duplicate test
* move generator test in e2e
* Update e2e/pipelines/test_extractive_qa.py
* pytest.mark.unit
* black
* remove model name as well
* remove unused fixture
* rename original and improve pipeline tests
* fixes
* pylint
2023-03-01 14:52:05 +01:00
Agnieszka Marzec
7e0f9715ba
Docs: Add shaper API ( #4288 )
...
* Add shaper and update category id
* Fix the category id
* Update category
2023-03-01 14:02:47 +01:00
Stefano Fiorucci
e8f9b1b65d
test: replace ElasticsearchDS with InMemoryDS when it makes sense; support scale_score in InMemoryDS ( #4283 )
...
* replace elasticds with imds - first draft
* fix
* fix tests and implement scale_score in imds bm25
* add docstrings for scale_score
2023-03-01 11:35:10 +01:00
Silvano Cerza
ee74421212
ci: Refactor docs config and generation ( #4280 )
...
* Change docs yml category config
* Update docs renderers to fetch categories from Readme.io
* Update readme_sync.yml to handle new docs rendering
* Remove unnecessary script and related workflow step
* Fix sys.exits
2023-03-01 09:51:02 +01:00
Silvano Cerza
6e241262ad
ci: Change docker_release.yml workflow to run after successful PyPi release ( #4293 )
...
* Change docker_release.yml workflow to run after successful PyPi release
* Add warning on name change in pypi_release.yml
2023-03-01 09:50:47 +01:00
tstadel
d1c9407a25
fix opensearch delete_index ( #4295 )
2023-03-01 08:40:38 +01:00
Malte Pietsch
2a1d73e16d
refactor: Make extraction of "Tool" and "Tool input" for Agent more robust and user-friendly ( #4269 )
...
* adjust [] in prompt template. Add error+docs for Tool name.
* fix test
* update error message
2023-02-28 20:01:34 +01:00
Massimiliano Pippi
c3a38a59c0
Update test_prompt_node.py ( #4281 )
2023-02-28 09:37:40 +01:00
Julian Risch
662441a62b
fix: FARMReader produces Answers with negative start and end position ( #4248 )
2023-02-28 09:27:42 +01:00
Sebastian
040d806b42
test: Added integration test for using EntityExtractor in query pipeline ( #4117 )
...
* Added new test for using EntityExtractor in query node and made some fixtures to reduce code duplication.
* Reuse ner_node fixture
* Added pytest unit markings and swapped over to in memory doc store.
* Change to integration tests
2023-02-28 09:20:44 +01:00
Silvano Cerza
5678bb6375
Parallelize Docker build job ( #4268 )
2023-02-27 16:03:24 +01:00
Massimiliano Pippi
4b8d195288
refact: mark unit tests under the test/nodes/** path ( #4235 )
...
* document merger
* mark unit tests
* revert
2023-02-27 15:00:19 +01:00
Sebastian
efe46b1214
Fix: Allow torch_dtype="auto" in PromptNode ( #4166 )
...
* Fix for allowing torch_dtype="auto"
* Fix to logic of torch_dtype detection
* separate test for dtype
2023-02-27 09:59:27 +01:00
Silvano Cerza
4a93517eb4
test: Fix deprecation fixture ( #4219 )
...
* Fix deprecation fixture
* Update docstring
* Update docstring
---------
Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>
2023-02-27 09:55:03 +01:00
Kshitij Pawar
3d3e9c9b32
Fix: Issue of failure to initialize input_converter in Seq2SeqGenerator when model_file_path is given as folder path on local disk after manual model download ( #4213 )
...
* test
* test documentation commit:
* added original return statement for linting
* removed empty lines
* formatted code using black
* made changes based on suggestions
2023-02-26 18:13:26 +01:00
Silvano Cerza
2c9e4c5ff9
Remove unnecessary operations in minor_version_release.yml ( #4267 )
2023-02-24 14:29:42 +01:00
Silvano Cerza
280414e5c6
Fix OpenAPI specs upload ( #4266 )
2023-02-24 10:50:59 +01:00
ZanSara
13c4ff1b52
refactor: remove direct logging without a logger ( #4253 )
...
* remove direct logging without a logger
* add custom pylint checker
* add test
* pylint
* improve checker message
* mypy
* remove test
* add checker for basicConfig
* more logging missed
* ignore basicConfig
* move out logger
* move out statement
* remove logging configuration
2023-02-23 20:42:42 +01:00
Vladimir Blagojevic
4b189c0b40
proposal: Implement Agent demo ( #4085 )
...
* Agent demo proposal
* Replace on-the-fly module with WebRetriever
* Update proposal with ideas from discussion with Julian
* Replace SerpAPI references with SearchEngine
* Add Agent memory
* Update Agent memory
2023-02-23 19:56:38 +01:00
Silvano Cerza
d594ab800b
ci: Fix OpenAPI spec sync ( #4254 )
...
* Attempt to fix OpenAPI sync
* Dry run
* Add step to get OpenAPI specs id
* Remove dryRun and branch trigger
2023-02-23 19:02:46 +01:00
ZanSara
c0c09f1287
Fix typo in google.colab package detection ( #4238 )
2023-02-23 17:53:23 +01:00
Stefano Fiorucci
5e85f33bd3
refactor: Remove deprecated nodes EvalDocuments and EvalAnswers ( #4194 )
...
* remove deprecated classes and update test
* remove deprecated classes and update test
* remove unused code
* remove unused import
* remove empty evaluator node
* unused import :-)
* move sas to metrics
2023-02-23 15:26:17 +01:00
Massimiliano Pippi
722dead1b2
fix agents tests ( #4237 )
2023-02-23 13:03:45 +01:00
ZanSara
b193e08a64
set env var ( #4239 )
...
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-02-23 11:59:46 +01:00
Silvano Cerza
c3bf62d4b0
Add a simple way to skip required tests checks ( #4245 )
2023-02-23 11:00:20 +01:00
Massimiliano Pippi
764eaa035f
skip summarizer tests to reduce pressure ( #4241 )
2023-02-23 09:50:24 +01:00
Massimiliano Pippi
dd37b4c29f
fix: apply black formatting ( #4240 )
...
* fix black formatting
* try
2023-02-23 08:59:40 +01:00
Agnieszka Marzec
1dc7f6215e
Update top_k description ( #4224 )
2023-02-22 23:05:41 +02:00
Silvano Cerza
b6371c95a8
Add missing dependencies in openapi upload workflow ( #4236 )
2023-02-22 19:34:22 +01:00
ZanSara
f816efa50c
feat: reduce and focus telemetry ( #4087 )
...
* simplified telemetry and docker containers detection
* pylint
* mypy
* mypy
* Add new credentials and metadata
* remove prints
* mypy
* remove comment
* simplify inout len measurement
* black
* removed old telemetry, to revert
* reintroduce env function
* reintroduce old telemetry
* fix telemetry selection
* telemetry for promptnode
* telemetry for some training methods
* telemetry for eval and distillation
* mypy & pylint
* review
* Update lg
* mypy
* improve docstrings
* pylint
* mypy
* fix test
* linting
* remove old tests
---------
Co-authored-by: agnieszka-m <amarzec13@gmail.com>
2023-02-22 19:02:47 +01:00
Silvano Cerza
181e5474e8
ci: Automate OpenAPI specs upload to Readme.io ( #4228 )
...
* Remove OpenAPI specs file
* OpenAPI specs are now automatically uploaded when necessary
* Rename openapi workflow
2023-02-22 18:01:18 +01:00
Daniel Bichuetti
e0b0fe1bc3
feat!: Increase Crawler standardization regarding Pipelines ( #4122 )
...
* feat!(Crawler): Integrate Crawler in the Pipeline.
+Output Documents
+Optional file saving
+Optional Document meta about file path
* refactor: add Optional decl.
* chore: dummy commit
* chore: dummy commit
* refactor: improve overwrite flow
* refactor: change custom file path meta logic + add test
* Update haystack/nodes/connector/crawler.py
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
* Update haystack/nodes/connector/crawler.py
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
* Update haystack/nodes/connector/crawler.py
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
* Update haystack/nodes/connector/crawler.py
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
* Update haystack/nodes/connector/crawler.py
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
---------
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-02-22 17:34:19 +01:00
Balamurugan Periyasamy
49ed21b82d
fix: Better error messages for OCR requirement ( #3767 ) ( #3900 )
...
Add pip install requirement in the error message for missing dependency.
2023-02-22 14:57:28 +01:00
tstadel
32b2abf9d5
fix: add option to not override results by Shaper ( #4231 )
...
* add option to shaper and support answers
* remove publish restrictions on outputs
* support list
2023-02-22 14:36:58 +01:00
Massimiliano Pippi
262c9771f4
relax test assertion ( #4229 )
2023-02-22 12:37:09 +01:00
Daniel Bichuetti
1e4ef24ae9
refactor: isolate PDF converters ( #4193 )
2023-02-22 08:50:18 +01:00
Massimiliano Pippi
40f772a9b0
refact: move the first batch of unit tests into the proper job ( #4216 )
...
* move the first batch of unit tests into the proper job
* leftover
2023-02-21 17:00:02 +01:00
Silvano Cerza
87a02d9372
Fix Dockerfile.base failing cause of missing dependencies ( #4215 )
2023-02-21 16:37:33 +01:00
Julian Risch
5ce7a404ac
feat: Add Agent ( #4148 )
...
* initial Agent implementation
* mypy and pylint fixes
* add missing ABC import
* improved prompt template
* refactor and shorten run method
* refactor and shorten run method
* add tests for extracting
* fix mixed up tool_input/observation & make tests more robust
* fix bug with max_iterations and update prompt template
* allow setting prompt_template in Agent init
* remove example yml for agent
* add final prediction to transcript
* add transcript to errors and accept PromptTemplate in init
* simplify if else to elif
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* add checks for max_iter<2 and empty list returned by prompt node
---------
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-02-21 14:27:40 +01:00
Sebastian
bde01cbf1f
Checking if output keys and output_values are same length and fix bug in storing output keys ( #4223 )
2023-02-21 13:36:15 +01:00
Sebastian
2bedb80ba5
Fix for custom template in OpenAIAnswerGenerator ( #4220 )
2023-02-21 13:35:17 +01:00
Mayank Jobanputra
c4b98fcccc
allowing file-upload api to work with write permission ( #4221 )
2023-02-21 16:48:02 +05:30
Bijay Gurung
d4b822646e
feat: Add JsonConverter node ( #4130 )
...
* Add JsonConverter node
* Update language
* JsonConverter: Remove id_hash_keys overwrite when it's None
Also, changes in docstring based on review
* Update docstring for JsonConverter
---------
Co-authored-by: agnieszka-m <amarzec13@gmail.com>
Co-authored-by: Sebastian Lee <sebastian.lee@deepset.ai>
2023-02-21 09:23:42 +01:00