2539 Commits

Author SHA1 Message Date
Mayank Jobanputra
93962c09fc
fix: fix torchaudio version (#4102)
* fix torchaudio version

* added comment for keeping torchaudio last

* removed torchaudio from base
2023-02-09 15:14:10 +05:30
oryx1729
8ecadd1cac
fix: query filters in REST API (#4105)
* Remove legacy _format_filters()

* Remove test case
2023-02-09 10:42:31 +01:00
Bijay Gurung
79f57d8460
Proposal: Add a JsonConverter node (#3959)
* Add Proposal: JsonConverter

* Add jsonl support + schema to JsonConverter Proposal

* Remove format option from JsonConverter Proposal

---------

Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>
2023-02-09 09:57:00 +01:00
hsm207
508d9f6b32
feat: add support for custom headers (#4040) 2023-02-09 07:08:40 +01:00
Silvano Cerza
adf4a3ea2f
Fix pylint CI check running with no files (#4097) 2023-02-08 16:33:07 +01:00
Silvano Cerza
274746db07
style: Update black (#4101)
* Update black version

* Format file with new black style

* Update black pre-commit hook version
2023-02-08 15:34:43 +01:00
Sebastian
1bbf10a376
Remove double batching in retrieve_batch (#4014)
* Removed double batching around embed_queries

* Add back tests for retrieve_batch for dpr and embedding retrievers

* Updated table-text-retriever to not double batch

* Fixing pylint

* Update to test

* Remove code breaking test

* Updating dev comment to be clearer
2023-02-08 14:39:20 +01:00
Silvano Cerza
c66f855caf
Add missing env vars in rest_api CI tests (#4098) 2023-02-08 12:48:20 +01:00
Sebastian
01d39df863
feat: Update allowed models to be used with Prompt Node (#4018)
* Update allowed models to be used with Prompt Node

* Added try except block around the config to skip over OpenAI models.

* Fixing tests

* Adding warning message

* Adding test for different HF models that could be used in prompt node
2023-02-08 12:47:52 +01:00
Agnieszka Marzec
8135e75139
Add shaper to api docs (#4083) 2023-02-08 12:15:08 +01:00
Stefano Fiorucci
5c009c2a1a
feat: OpenAI - warn users if max_tokens is too short (#4094)
* warn users if max_tokens is too short

* skip test if not API KEY

* add counters

* correctly run precommit
2023-02-08 10:39:40 +01:00
tstadel
92c58cfda1
feat: Support multiple document_ids in Answer object (for generative QA) (#4062)
* initial version without shapers

* set document_ids for BaseGenerator

* introduce question-answering-with-references template

* better prompt

* make PromptTemplate control output_variable

* update schema

* fix add_doc_meta_data_to_answer

* Revert "fix add_doc_meta_data_to_answer"

This reverts commit b994db423ad8272c140ce2b785cf359d55383ff9.

* fix add_doc_meta_data_to_answer

* fix eval

* fix pylint

* fix pinecone

* fix other tests

* fix test

* fix flaky test

* Revert "fix flaky test"

This reverts commit 7ab04275ffaaaca96b4477325ba05d5f34d38775.

* adjust docstrings

* make Label loading backward-compatible

* fix Label backward compatibility for pinecone

* fix Label backward compatibility for search engines

* fix Label backward compatibility for deepset Cloud

* fix tests

* fix None issue

* fix test_write_feedback

* add tests for legacy label support

* add document_id test for pinecone

* reduce unnecessary contents

* add comment to pinecone test
2023-02-08 08:37:22 +01:00
Silvano Cerza
5689c43e7e
ci: Make tests run conditionally in CI (#4086)
* Make tests run conditionally in CI

* Move rest_api test into separate workflow

* Avoid running tests.yml when rest_api is modified
2023-02-07 21:16:56 +01:00
Zoltan Fedor
a3016f065f
feat: Support multiple RayPipelines (#4078) 2023-02-07 11:01:07 +01:00
Silvano Cerza
3e4a2201df
ci: Change actionlint pre-commit hook to use Dockerized tool (#4060)
* Change actionlint pre-commit hook to use Dockerized tool

* Add ignore rule for actionlint

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-02-07 09:34:25 +01:00
Julian Risch
0e282e5ca4
refactor: replace mutable default arguments (#4070)
* refactor: replace mutable default arguments

* change type annotation in BasePreProcessor to Optional[List]
2023-02-07 09:30:33 +01:00
Vladimir Blagojevic
3273a2714d
fix: Add PromptTemplate __repr__ method (#4058)
Co-authored-by: ZanSara <sarazanzo94@gmail.com>
2023-02-07 08:14:32 +01:00
Sebastian
a9f13d4641
feat: Allow all training options for training a SentenceTransformers EmbeddingRetriever (#4026)
* Add additional options to pass to the SentenceTransformers trainer

* Make options accessible to the EmbeddingRetriever.train

* Update file-converters.yml

* Update transformers-img-to-text.yml

* Update 3550-csv-converter.md

* move type: ignore to correct line

* Moving type ignore again

* Fixing pylint and mypy

* Update haystack/nodes/retriever/_embedding_encoder.py

Co-authored-by: bogdankostic <bogdankostic@web.de>

* Update haystack/nodes/retriever/_embedding_encoder.py

Co-authored-by: bogdankostic <bogdankostic@web.de>

* Update haystack/nodes/retriever/_embedding_encoder.py

Co-authored-by: bogdankostic <bogdankostic@web.de>

* Updated docstring to be less misleading.

---------

Co-authored-by: bogdankostic <bogdankostic@web.de>
2023-02-07 08:05:21 +01:00
Silvano Cerza
bcf3bfdf79
Fix pylint workflow check running on tests files (#4076) 2023-02-06 19:41:36 +01:00
Julian Risch
51f30487e1
fix: add inner query for mysql compatibility (#4068) 2023-02-06 18:18:25 +01:00
Silvano Cerza
9cd94f3dc3
ci: Move formatting and linting checks out of tests.yml (#4046)
* Move formatting and linting checks out of tests.yml

* Revert "Move formatting and linting checks out of tests.yml"

This reverts commit b88b54b7e6404ce10401f308770348465e44b4fc.

* Move pylint and mypy out of tests.yml

* Fix black version

* Handle skipped but required checks
2023-02-06 16:47:48 +01:00
Zoltan Fedor
f4a30a552a
fix: use correct count of outgoing edges in RayPipeline (#4066) 2023-02-06 10:52:32 +01:00
Julian Risch
d819d6badf
proposal: Add Agents for extended LLM support (#3925)
* draft proposal

* add link to colab notebook (api keys required)

* Add alternative name ideas for MRKLAgent

* Breakdown of agent steps

* Added more sections

* Add even more sections

* simplify tool/action mentions, shorten

* agents as new abstraction instead of BaseComponent

* agent tools can be pipelines or nodes

---------

Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
2023-02-06 09:47:10 +01:00
Massimiliano Pippi
5e65905659
fix workflow (#4055) 2023-02-06 08:40:13 +01:00
Stefano Fiorucci
b9ab7b3ca2
fix: make the crawler more robust on Windows (#4049)
* first try

* simplify the code a bit

* fix; better docstrings

* add URL
2023-02-03 16:43:18 +01:00
ZanSara
76db26f228
logging-format-interpolation (#3907) 2023-02-03 13:30:56 +01:00
Massimiliano Pippi
8824f3a10a
re-organize pydoc config files (#4042) 2023-02-03 12:51:10 +01:00
Jack Butler
f006eded7d
fix: allow Biadaptive & Triadaptive to work with EarlyStopping (#4033)
* fix: allow str when saving tri/bi-adaptive models

* fix: make trainer model loading class-agnostic

* test: add test for DPR with EarlyStopping

* refactor: simplify model reloading via classmethod

---------

Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2023-02-03 11:13:18 +01:00
Silvano Cerza
a092eac2c7
Add missing env var in PyPi release slack notification (#4052) 2023-02-03 11:03:01 +01:00
Silvano Cerza
6a9cb8651b
Fix pylint version to prevent crash (#4043) 2023-02-02 17:57:39 +01:00
Massimiliano Pippi
76bb105388
chore: remove unneeded files (#4036)
* remove unneeded files

* readme file should stay
2023-02-02 15:38:56 +01:00
tstadel
9611b64ec5
fix: document retrieval metrics for non-document_id document_relevance_criteria (#3885)
* fix document retrieval metrics for all document_relevance_criteria

* fix tests

* fix eval_batch metrics

* small refactorings

* evaluate metrics on label level

* document retrieval tests added

* fix pylint

* fix test

* support file retrieval

* add comment about threshold

* rename test
2023-02-02 15:00:07 +01:00
Silvano Cerza
e62d24d0eb
ci: Add linting of workflow and related pre-commit hook (#4032)
* Add actionlint pre-commit hook

* Add workflow to lint workflows

* Remove unused input in Python Cache action

* Move from deprecated set-output syntax to new one

* Add actionlint config to specify self-hosted runners labels
2023-02-02 14:33:23 +01:00
Massimiliano Pippi
2878c57645
Update pyproject.toml (#4035) 2023-02-02 11:59:17 +01:00
Silvano Cerza
d79d39b28a
Bump act10ns/slack from v1 to v2 (#4031) 2023-02-02 09:39:36 +01:00
Silvano Cerza
938cb62144
Fix PyPi release workflow (#4029) 2023-02-02 09:36:23 +01:00
Zoltan Fedor
3aa6522564
fix: Event sending for RayPipeline crashing Haystack (#3971)
* Remove the `send_pipeline_event_if_needed()` to confirm fix

* Suspending event sending for RayPipelines as it is not compatible

* Update base.py

* Updating implementation based on feedback from @masci

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-02-02 08:27:20 +01:00
ZanSara
9009a9ae58
feat: add Shaper (#3880)
* Shaper initial version

* Initial pydoc

* Add more unit tests

* Fix pydoc, expand Shaper pydoc with YAML example

* Minor fix

* Improve pydoc

* More unit tests with prompt node

* Describe Shaper functions in pydoc

* More pydoc

* Use pytest.raises instead of catching errors

* Improve test_function_invocation_order unit test

* pylint fixes

* Improve run_batch handling

* simpler version, initial stub

* stubbing tests

* promptnode compatibility

* add tests

* simplify

* fix promptnode tests

* pylint

* mypy

* fix corner case & mypy

* mypy

* review feedback

* tests

* Add lg updates

* add rename

* pylint

* Add complex unit test with two PNs and ICMs in between (#3921)

Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>

* docstring

* fix tests

* add join_lists

* add documents_to_strings

* fix tests

* allow lists of input values

* doc review feedback

* do not use locals()

* Update with minor lg changes

* fix corner case in ICM

* fix merge

* review feedback

* answers conversions

* mypy

* add tests

* generative answers

* forgot to commit

---------

Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
Co-authored-by: agnieszka-m <amarzec13@gmail.com>
2023-02-01 18:36:13 +01:00
Silvano Cerza
e8ff48094b
Automate release on PyPi (#4015) 2023-02-01 17:40:21 +01:00
Julian Risch
3fcfc8eb23
chore: add discord badge to readme (#4027) 2023-02-01 16:59:22 +01:00
Sebastian
7b3d7ee83a
Reuse tokenizer instead of loading new one. (#4016)
Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
2023-02-01 10:44:18 +01:00
Sebastian
96706e9e7b
proposal: TableCell (#3875)
* Initial commit for TableSpan proposal

* Updating the proposal

* More updates to the proposal

* More changes

* Rename of file per Proposal instructions

* Update link

* Adding drawbacks

* Fixing typos

* Changed TableSpan to TableCell and updated proposal based on discussions.

* Adding discussion on identified bug.

* Rename proposal to reflect name change made during discussion. Added point to make it clear that we will be able to return a List of TableCells

* Update proposal with discussion about storing table as a list of lists

* Adding some additional code change descriptions.
2023-02-01 09:08:12 +01:00
tstadel
8002cf92d6
fix: extend schema for prompt node results (#3891)
* extend schema for prompt node results

* extend schema

* update openapi

* fix mypy for test module

* added 1.14 specs

* reverted schema for 1.13

---------

Co-authored-by: bogdankostic <bogdankostic@web.de>
Co-authored-by: Mayank Jobanputra <mayankjobanputra@gmail.com>
Co-authored-by: Sebastian <sjrl@users.noreply.github.com>
Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>
2023-01-31 16:31:33 +01:00
Julian Risch
c855e18d78
fix: prevent posthog from sending errors to stderr (#4008) 2023-01-31 11:02:47 +01:00
Zoltan Fedor
2b1849f525
fix: Add a verbose option to PromptNode to let users understand the prompts being used #2 (#3898)
* fix: Add a verbose option to PromptNode to let users understand the prompts being used #2

* Add comments and refactoring todo note

* Fix logging-fstring-interpolation pylint

* Update haystack/nodes/prompt/prompt_node.py

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>

---------

Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-01-31 09:33:47 +01:00
Massimiliano Pippi
378a3fd2e7
chore: add topic:* labels automatically whenever possible (#3997)
* add topics:* labels automatically whenever possible

* address review comments
2023-01-30 20:13:06 +01:00
Silvano Cerza
5f29c83e62
Delete Docker images after testing to prevent workflow failure (#4004) 2023-01-30 17:57:35 +01:00
Sebastian
249398d806
fix: Update telemetry to not serialize Pipeline if disabled. (#4000)
* Update telemetry to not serialize Pipeline if disabled.

* Also disabled sending the telemetry event in run_async of the RayPipeline, since RayPipeline cannot currently be serialized.
2023-01-30 16:58:43 +01:00
bogdankostic
1a8fe0031d
feat: Add use_prefiltering parameter to DeepsetCloudDocumentStore (#3969)
* Add `use_prefiltering` parameter

* Adapt doc string

* Pass use_prefiltering via API to dC

* Adapt doc string

* Adapt test
2023-01-30 15:12:34 +01:00
Silvano Cerza
b4c5bb7de4
Simplify and fix Docker image tests on release (#3982) 2023-01-30 14:48:47 +01:00