ZanSara
13c4ff1b52
refactor: remove direct logging without a logger ( #4253 )
...
* remove direct logging without a logger
* add custom pylint checker
* add test
* pylint
* improve checker message
* mypy
* remove test
* add checker for basicConfig
* more logging missed
* ignore basicConfig
* move out logger
* move out statement
* remove logging configuration
2023-02-23 20:42:42 +01:00
Vladimir Blagojevic
4b189c0b40
proposal: Implement Agent demo ( #4085 )
...
* Agent demo proposal
* Replace on-the-fly module with WebRetriever
* Update proposal with ideas from discussion with Julian
* Replace SerpAPI references with SearchEngine
* Add Agent memory
* Update Agent memory
2023-02-23 19:56:38 +01:00
Silvano Cerza
d594ab800b
ci: Fix OpenAPI spec sync ( #4254 )
...
* Attempt to fix OpenAPI sync
* Dry run
* Add step to get OpenAPI specs id
* Remove dryRun and branch trigger
2023-02-23 19:02:46 +01:00
ZanSara
c0c09f1287
Fix typo in google.colab package detection ( #4238 )
2023-02-23 17:53:23 +01:00
Stefano Fiorucci
5e85f33bd3
refactor: Remove deprecated nodes EvalDocuments and EvalAnswers ( #4194 )
...
* remove deprecated classes and update test
* remove deprecated classes and update test
* remove unused code
* remove unused import
* remove empty evaluator node
* unused import :-)
* move sas to metrics
2023-02-23 15:26:17 +01:00
Massimiliano Pippi
722dead1b2
fix agents tests ( #4237 )
2023-02-23 13:03:45 +01:00
ZanSara
b193e08a64
set env var ( #4239 )
...
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-02-23 11:59:46 +01:00
Silvano Cerza
c3bf62d4b0
Add a simple way to skip required tests checks ( #4245 )
2023-02-23 11:00:20 +01:00
Massimiliano Pippi
764eaa035f
skip summarizer tests to reduce pressure ( #4241 )
2023-02-23 09:50:24 +01:00
Massimiliano Pippi
dd37b4c29f
fix: apply black formatting ( #4240 )
...
* fix black formatting
* try
2023-02-23 08:59:40 +01:00
Agnieszka Marzec
1dc7f6215e
Update top_k description ( #4224 )
2023-02-22 23:05:41 +02:00
Silvano Cerza
b6371c95a8
Add missing dependencies in openapi upload workflow ( #4236 )
2023-02-22 19:34:22 +01:00
ZanSara
f816efa50c
feat: reduce and focus telemetry ( #4087 )
...
* simplified telemetry and docker containers detection
* pylint
* mypy
* mypy
* Add new credentials and metadata
* remove prints
* mypy
* remove comment
* simplify inout len measurement
* black
* removed old telemetry, to revert
* reintroduce env function
* reintroduce old telemetry
* fix telemetry selection
* telemetry for promptnode
* telemetry for some training methods
* telemetry for eval and distillation
* mypy & pylint
* review
* Update lg
* mypy
* improve docstrings
* pylint
* mypy
* fix test
* linting
* remove old tests
---------
Co-authored-by: agnieszka-m <amarzec13@gmail.com>
2023-02-22 19:02:47 +01:00
Silvano Cerza
181e5474e8
ci: Automate OpenAPI specs upload to Readme.io ( #4228 )
...
* Remove OpenAPI specs file
* OpenAPI specs are now automatically uploaded when necessary
* Rename openapi workflow
2023-02-22 18:01:18 +01:00
Daniel Bichuetti
e0b0fe1bc3
feat!: Increase Crawler standardization regarding Pipelines ( #4122 )
...
* feat!(Crawler): Integrate Crawler in the Pipeline.
+Output Documents
+Optional file saving
+Optional Document meta about file path
* refactor: add Optional decl.
* chore: dummy commit
* chore: dummy commit
* refactor: improve overwrite flow
* refactor: change custom file path meta logic + add test
* Update haystack/nodes/connector/crawler.py
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
* Update haystack/nodes/connector/crawler.py
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
* Update haystack/nodes/connector/crawler.py
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
* Update haystack/nodes/connector/crawler.py
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
* Update haystack/nodes/connector/crawler.py
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
---------
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-02-22 17:34:19 +01:00
Balamurugan Periyasamy
49ed21b82d
fix: Better error messages for OCR requirement ( #3767 ) ( #3900 )
...
Add pip install requirement in the error message for missing dependency.
2023-02-22 14:57:28 +01:00
tstadel
32b2abf9d5
fix: add option to not override results by Shaper ( #4231 )
...
* add option to shaper and support answers
* remove publish restrictions on outputs
* support list
2023-02-22 14:36:58 +01:00
Massimiliano Pippi
262c9771f4
relax test assertion ( #4229 )
2023-02-22 12:37:09 +01:00
Daniel Bichuetti
1e4ef24ae9
refactor: isolate PDF converters ( #4193 )
2023-02-22 08:50:18 +01:00
Massimiliano Pippi
40f772a9b0
refact: move the first batch of unit tests into the proper job ( #4216 )
...
* move the first batch of unit tests into the proper job
* leftover
2023-02-21 17:00:02 +01:00
Silvano Cerza
87a02d9372
Fix Dockerfile.base failing cause of missing dependencies ( #4215 )
2023-02-21 16:37:33 +01:00
Julian Risch
5ce7a404ac
feat: Add Agent ( #4148 )
...
* initial Agent implementation
* mypy and pylint fixes
* add missing ABC import
* improved prompt template
* refactor and shorten run method
* refactor and shorten run method
* add tests for extracting
* fix mixed up tool_input/observation & make tests more robust
* fix bug with max_iterations and update prompt template
* allow setting prompt_template in Agent init
* remove example yml for agent
* add final prediction to transcript
* add transcript to errors and accept PromptTemplate in init
* simplify if else to elif
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* add checks for max_iter<2 and empty list returned by prompt node
---------
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-02-21 14:27:40 +01:00
Sebastian
bde01cbf1f
Checking if output keys and output_values are same length and fix bug in storing output keys ( #4223 )
2023-02-21 13:36:15 +01:00
Sebastian
2bedb80ba5
Fix for custom template in OpenAIAnswerGenerator ( #4220 )
2023-02-21 13:35:17 +01:00
Mayank Jobanputra
c4b98fcccc
allowing file-upload api to work with write permission ( #4221 )
2023-02-21 16:48:02 +05:30
Bijay Gurung
d4b822646e
feat: Add JsonConverter node ( #4130 )
...
* Add JsonConverter node
* Update language
* JsonConverter: Remove id_hash_keys overwrite when it's None
Also, changes in docstring based on review
* Update docstring for JsonConverter
---------
Co-authored-by: agnieszka-m <amarzec13@gmail.com>
Co-authored-by: Sebastian Lee <sebastian.lee@deepset.ai>
2023-02-21 09:23:42 +01:00
Silvano Cerza
f5b8835e2c
ci: Fix Dockerfile.base failing cause of missing git ( #4210 )
2023-02-20 18:40:30 +01:00
Silvano Cerza
e6af353530
ci: Add ca-certificates installation to xpdf container ( #4206 )
2023-02-20 17:47:10 +01:00
abwiersma
7aae4293d7
Check cuda availability before calling ( #4174 )
...
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-02-20 17:37:56 +01:00
bogdankostic
18e7b8399b
refactor: Remove id_hash_keys parameter in from_dict method ( #4207 )
...
* Remove id_hash_keys parameter in from_dict method
* Remove unused import
* Adapt `from_dict` of `SpeechDocument`
* Revert "Adapt `from_dict` of `SpeechDocument`"
This reverts commit 309cbeb7fbb3094c43be76d9e431db9391913144.
* Adapt `from_dict` of `SpeechDocument`
2023-02-20 17:37:35 +01:00
Silvano Cerza
30cdb81f19
ci: Move xpdf build into separate container ( #4199 )
...
* Create Dockerfile and hcl config to build Xpdf
* Create workflow to build Xpdf Docker image
* Update Dockerfile.base to not build Xpdf
* Fix CWD removal and arg casing
* Fix ARG setting
2023-02-20 14:58:11 +01:00
github-actions[bot]
aaa1522c45
Update unstable version and openapi schema ( #4205 )
...
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2023-02-20 14:57:45 +01:00
tstadel
14578aa54f
feat: add top_k to PromptNode ( #4159 )
...
* add top_k to PromptNode
* fix OpenAI
* fix openai test
2023-02-20 14:51:45 +01:00
Sebastian
d129598203
Prompt node/run batch ( #4072 )
...
* Starting to implement first pass at run_batch
* Started to add _flatten_input function
* First pass at run_batch method.
* Fixed bug
* Adding tests for run_batch
* Update doc strings
* Pylint and mypy
* Pylint
* Fixing mypy
* Restructuring of run_batch tests
* Add minor lg updates
* Adding more tests
* Update dev comments and call static method differently
* Fixed the setting of output variable
* Set output_variable in __init__ of PromptNode
* Make a one-liner
---------
Co-authored-by: agnieszka-m <amarzec13@gmail.com>
2023-02-20 11:58:13 +01:00
Massimiliano Pippi
83d615a32b
feat: include testing facilities into haystack package ( #4182 )
2023-02-17 19:38:03 +01:00
Sebastian
44509cd6a1
feat: Add OpenAIError to retry mechanism ( #4178 )
...
* Add OpenAIError to retry mechanism. Use env variable for timeout for OpenAI request in PromptNode.
* Updated retry in OpenAI embedding encoder as well.
* Empty commit
2023-02-17 13:17:44 +01:00
bogdankostic
7eeb3e07bf
feat: Add IVF and Product Quantization support for OpenSearchDocumentStore ( #3850 )
...
* Add IVF and Product Quantization support for OpenSearchDocumentStore
* Remove unused import statement
* Fix mypy
* Adapt doc strings and error messages to account for PQ
* Adapt validation of indices
* Adapt existing tests
* Fix pylint
* Add tests
* Update lg
* Adapt based on PR review comments
* Fix Pylint
* Adapt based on PR review
* Add request_timeout
* Adapt based on PR review
* Adapt based on PR review
* Adapt tests
* Pin tenacity
* Unpin tenacity
* Adapt based on PR comments
* Add match to tests
---------
Co-authored-by: agnieszka-m <amarzec13@gmail.com>
2023-02-17 10:28:36 +01:00
Tuana Celik
8370715e7c
chore: de-couple the telemetry events for each tutorial from the dataset on AWS that is used ( #4155 )
...
* removing old dataset telemetry events
* changing function name
* adding the datasets back for old tutorials
* fixing mini bug
* resolving comments
* quick bug fix
* re-adding docstrings
* removing unnecessary import
* re-adding the telemetry event call for datasets
---------
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-02-17 00:21:46 +01:00
tstadel
e7bb2487eb
make all OpenAI API params controllable via model_kwargs ( #4183 )
...
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-02-16 19:56:08 +01:00
Daniel Bichuetti
9f5a3344d5
fix: Windows amd64 platform repr ( #4175 )
2023-02-16 19:46:34 +01:00
Tuana Celik
cdb05f0f9a
chore: Fixing PromptNode .prompt() docstring to include the PromptTemplate object as an option ( #4135 )
...
* fix to include the PromptTemplate object as an option
* small fix
2023-02-16 19:05:04 +01:00
Silvano Cerza
a4407f8f98
Use larger runner for Docker release workflow ( #4185 )
2023-02-16 18:59:13 +01:00
bogdankostic
fe650b2a3a
fix: Remove logging statement of setting ID manually in Document ( #4129 )
...
* Remove logging statement
* update lg
---------
Co-authored-by: agnieszka-m <amarzec13@gmail.com>
2023-02-16 18:58:21 +01:00
Daniel Bichuetti
5187cc1801
refactor: Remove the pin from the espnet module and fix the audio node tests. ( #4128 )
...
* fix: fix audio tests + unbound some dependencies
* fix: update for Python 3.8
* refactor: change numpy assertion
* feat: add voice recog. support on audio tests
* fix: fix var assignment
* chore: dummy commit
* fix: fix sndfile error
* refactor: change skip reason
* refactor: hardcode variable
* refactor: unpin numpy
* fix: pin numpy only for audio
2023-02-16 22:12:17 +05:30
Agnieszka Marzec
e7c32da8d7
Fix code block formatting ( #4162 )
2023-02-16 16:55:41 +01:00
Agnieszka Marzec
e16f1c8935
Docs: Add filter to hide entity post processor ( #4160 )
...
* Add filter to hide entity post processor
* Add missing space
2023-02-16 16:40:42 +01:00
Silvano Cerza
689f2cd250
Update docstring-labeler.yml workflow to safely run in PRs from forks ( #4146 )
2023-02-16 16:02:41 +01:00
Mayank Jobanputra
d27f372b67
build: cache nltk models into the docker image ( #4118 )
...
* separated nltk cache
* separated nltk caching
* fixed pylint lazy log error
* using model name as default value
2023-02-16 16:56:16 +05:30
Massimiliano Pippi
ec72dd73fc
refactor: complete the document stores test refactoring ( #4125 )
...
* add e2e tests
* move tests to their own module
* add e2e workflow
* pylint
* remove from job
* fix index field name
* skip test on sql
* removed unused code
* fix embedding tests
* adjust test for pinecone
* adjust assertions to the new documents
* bad copypasta
* test
* fix tests
* fix tests
* fix test
* fix tests
* pylint
* update milvus version
* remove debug
* move graphdb tests under e2e
2023-02-16 09:43:25 +01:00
Sebastian
9a26942952
feat: Add model_kwargs option to PromptNode ( #4151 )
...
* Add input option to PromptNode to allow the passing of default kwargs
* Add yaml test for model_kwargs parameter
2023-02-15 18:46:26 +01:00