bogdankostic
27aaa92800
docs: Remove some classes regarding PromptNode from API reference docs ( #4132 )
2023-02-10 15:56:38 +01:00
Vladimir Blagojevic
d839b9314f
Update PromptTemplate tests ( #4131 )
2023-02-10 15:24:01 +01:00
bogdankostic
05950719ba
fix: Deduplicate same Documents in isolated evaluation of Reader ( #4114 )
* Deduplicate same Documents in one MultiLabel
* Add tests
* Update label
* Update label
* Update test
* Update test
* Revert change to check CI
* Revert reversion
* Use deepcopy
* Update tests
2023-02-10 13:55:14 +01:00
Agnieszka Marzec
3c793e4edc
Docs: Update docstrings ( #4119 )
* Update docstrings
* Blackify
* Bring back the template wording
* Blackify
2023-02-10 11:51:51 +01:00
Silvano Cerza
2cc938ff90
ci: Add workflow to label PRs that edit docstrings ( #4115 )
* Add workflow to label PRs that edit docstrings
* Add python-version arg in setup-python steps
* Run workflow only in haystack and rest_api python files edit
* Fix labeling job
* Fix labeling conditional
* Fix files globbing in docstrings_checksum.py
* Fix typing
* Rework workflow to use a single job
2023-02-09 18:57:30 +01:00
Silvano Cerza
0b23f84205
Exclude .github folder from triggering tests in CI ( #4120 )
2023-02-09 18:07:27 +01:00
Jack Butler
e6b6f70ae2
fix: Fix TableTextRetriever for input consisting of tables only ( #4048 )
* fix: update kwargs for TriAdaptiveModel
* fix: squeeze batch for TTR inference
* test: add test for ttr + dataframe case
* test: update and reorganise ttr tests
* refactor: make triadaptive model handle shapes
* refactor: remove duplicate reshaping
* refactor: rename test with duplicate name
* fix: add device assignment back to TTR
* fix: remove duplicated vars in test
---------
Co-authored-by: bogdankostic <bogdankostic@web.de>
2023-02-09 11:38:16 +01:00
bogdankostic
986472c26f
feat: Add BM25 support for tables in InMemoryDocumentStore ( #4090 )
* Add BM25 support for tables in InMemoryDocumentStore
* Add table type to query method
* Fix import order
* Adapt tests
2023-02-09 10:47:35 +01:00
Mayank Jobanputra
93962c09fc
fix: fix torchaudio version ( #4102 )
* fix torchaudio version
* added comment for keeping torchaudio last
* removed torchaudio from base
2023-02-09 15:14:10 +05:30
oryx1729
8ecadd1cac
fix: query filters in REST API ( #4105 )
* Remove legacy _format_filters()
* Remove test case
2023-02-09 10:42:31 +01:00
Bijay Gurung
79f57d8460
Proposal: Add a JsonConverter node ( #3959 )
* Add Proposal: JsonConverter
* Add jsonl support + schema to JsonConverter Proposal
* Remove format option from JsonConverter Proposal
---------
Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>
2023-02-09 09:57:00 +01:00
hsm207
508d9f6b32
feat: add support for custom headers ( #4040 )
2023-02-09 07:08:40 +01:00
Silvano Cerza
adf4a3ea2f
Fix pylint CI check running with no files ( #4097 )
2023-02-08 16:33:07 +01:00
Silvano Cerza
274746db07
style: Update black ( #4101 )
* Update black version
* Format file with new black style
* Update black pre-commit hook version
2023-02-08 15:34:43 +01:00
Sebastian
1bbf10a376
Remove double batching in retrieve_batch ( #4014 )
* Removed double batching around embed_queries
* Add back tests for retrieve_batch for dpr and embedding retrievers
* Updated table-text-retriever to not double batch
* Fixing pylint
* Update to test
* Remove code breaking test
* Updating dev comment to be clearer
2023-02-08 14:39:20 +01:00
Silvano Cerza
c66f855caf
Add missing env vars in rest_api CI tests ( #4098 )
2023-02-08 12:48:20 +01:00
Sebastian
01d39df863
feat: Update allowed models to be used with Prompt Node ( #4018 )
* Update allowed models to be used with Prompt Node
* Added try except block around the config to skip over OpenAI models.
* Fixing tests
* Adding warning message
* Adding test for different HF models that could be used in prompt node
2023-02-08 12:47:52 +01:00
Agnieszka Marzec
8135e75139
Add shaper to api docs ( #4083 )
2023-02-08 12:15:08 +01:00
Stefano Fiorucci
5c009c2a1a
feat: OpenAI - warn users if max_tokens is too short ( #4094 )
* warn users if max_tokens is too short
* skip test if not API KEY
* add counters
* correctly run precommit
2023-02-08 10:39:40 +01:00
tstadel
92c58cfda1
feat: Support multiple document_ids in Answer object (for generative QA) ( #4062 )
* initial version without shapers
* set document_ids for BaseGenerator
* introduce question-answering-with-references template
* better prompt
* make PromptTemplate control output_variable
* update schema
* fix add_doc_meta_data_to_answer
* Revert "fix add_doc_meta_data_to_answer"
This reverts commit b994db423ad8272c140ce2b785cf359d55383ff9.
* fix add_doc_meta_data_to_answer
* fix eval
* fix pylint
* fix pinecone
* fix other tests
* fix test
* fix flaky test
* Revert "fix flaky test"
This reverts commit 7ab04275ffaaaca96b4477325ba05d5f34d38775.
* adjust docstrings
* make Label loading backward-compatible
* fix Label backward compatibility for pinecone
* fix Label backward compatibility for search engines
* fix Label backward compatibility for deepset Cloud
* fix tests
* fix None issue
* fix test_write_feedback
* add tests for legacy label support
* add document_id test for pinecone
* reduce unnecessary contents
* add comment to pinecone test
2023-02-08 08:37:22 +01:00
Silvano Cerza
5689c43e7e
ci: Make tests run conditionally in CI ( #4086 )
* Make tests run conditionally in CI
* Move rest_api test into separate workflow
* Avoid running tests.yml when rest_api is modified
2023-02-07 21:16:56 +01:00
Zoltan Fedor
a3016f065f
feat: Support multiple RayPipelines ( #4078 )
2023-02-07 11:01:07 +01:00
Silvano Cerza
3e4a2201df
ci: Change actionlint pre-commit hook to use Dockerized tool ( #4060 )
* Change actionlint pre-commit hook to use Dockerized tool
* Add ignore rule for actionlint
---------
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-02-07 09:34:25 +01:00
Julian Risch
0e282e5ca4
refactor: replace mutable default arguments ( #4070 )
* refactor: replace mutable default arguments
* change type annotation in BasePreProcessor to Optional[List]
2023-02-07 09:30:33 +01:00
Vladimir Blagojevic
3273a2714d
fix: Add PromptTemplate __repr__ method ( #4058 )
Co-authored-by: ZanSara <sarazanzo94@gmail.com>
2023-02-07 08:14:32 +01:00
Sebastian
a9f13d4641
feat: Allow all training options for training a SentenceTransformers EmbeddingRetriever ( #4026 )
* Add additional options to pass to the SentenceTransformers trainer
* Make options accessible to the EmbeddingRetriever.train
* Update file-converters.yml
* Update transformers-img-to-text.yml
* Update 3550-csv-converter.md
* move type: ignore to correct line
* Moving type ignore again
* Fixing pylint and mypy
* Update haystack/nodes/retriever/_embedding_encoder.py
Co-authored-by: bogdankostic <bogdankostic@web.de>
* Update haystack/nodes/retriever/_embedding_encoder.py
Co-authored-by: bogdankostic <bogdankostic@web.de>
* Update haystack/nodes/retriever/_embedding_encoder.py
Co-authored-by: bogdankostic <bogdankostic@web.de>
* Updated docstring to be less misleading.
---------
Co-authored-by: bogdankostic <bogdankostic@web.de>
2023-02-07 08:05:21 +01:00
Silvano Cerza
bcf3bfdf79
Fix pylint workflow check running on tests files ( #4076 )
2023-02-06 19:41:36 +01:00
Julian Risch
51f30487e1
fix: add inner query for mysql compatibility ( #4068 )
2023-02-06 18:18:25 +01:00
Silvano Cerza
9cd94f3dc3
ci: Move formatting and linting checks out of tests.yml ( #4046 )
* Move formatting and linting checks out of tests.yml
* Revert "Move formatting and linting checks out of tests.yml"
This reverts commit b88b54b7e6404ce10401f308770348465e44b4fc.
* Move pylint and mypy out of tests.yml
* Fix black version
* Handle skipped but required checks
2023-02-06 16:47:48 +01:00
Zoltan Fedor
f4a30a552a
fix: use correct count of outgoing edges in RayPipeline ( #4066 )
2023-02-06 10:52:32 +01:00
Julian Risch
d819d6badf
proposal: Add Agents for extended LLM support ( #3925 )
* draft proposal
* add link to colab notebook (api keys required)
* Add alternative name ideas for MRKLAgent
* Breakdown of agent steps
* Added more sections
* Add even more sections
* simplify tool/action mentions, shorten
* agents as new abstraction instead of BaseComponent
* agent tools can be pipelines or nodes
---------
Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
2023-02-06 09:47:10 +01:00
Massimiliano Pippi
5e65905659
fix workflow ( #4055 )
2023-02-06 08:40:13 +01:00
Stefano Fiorucci
b9ab7b3ca2
fix: make the crawler more robust on Windows ( #4049 )
* first try
* simplify the code a bit
* fix; better docstrings
* add URL
2023-02-03 16:43:18 +01:00
ZanSara
76db26f228
logging-format-interpolation ( #3907 )
2023-02-03 13:30:56 +01:00
Massimiliano Pippi
8824f3a10a
re-organize pydoc config files ( #4042 )
2023-02-03 12:51:10 +01:00
Jack Butler
f006eded7d
fix: allow Biadaptive & Triadaptive to work with EarlyStopping ( #4033 )
* fix: allow str when saving tri/bi-adaptive models
* fix: make trainer model loading class-agnostic
* test: add test for DPR with EarlyStopping
* refactor: simplify model reloading via classmethod
---------
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2023-02-03 11:13:18 +01:00
Silvano Cerza
a092eac2c7
Add missing env var in PyPi release slack notification ( #4052 )
2023-02-03 11:03:01 +01:00
Silvano Cerza
6a9cb8651b
Fix pylint version to prevent crash ( #4043 )
2023-02-02 17:57:39 +01:00
Massimiliano Pippi
76bb105388
chore: remove unneeded files ( #4036 )
* remove unneeded files
* readme file should stay
2023-02-02 15:38:56 +01:00
tstadel
9611b64ec5
fix: document retrieval metrics for non-document_id document_relevance_criteria ( #3885 )
* fix document retrieval metrics for all document_relevance_criteria
* fix tests
* fix eval_batch metrics
* small refactorings
* evaluate metrics on label level
* document retrieval tests added
* fix pylint
* fix test
* support file retrieval
* add comment about threshold
* rename test
2023-02-02 15:00:07 +01:00
Silvano Cerza
e62d24d0eb
ci: Add linting of workflow and related pre-commit hook ( #4032 )
* Add actionlint pre-commit hook
* Add workflow to lint workflows
* Remove unused input in Python Cache action
* Move from deprecated set-output syntax to new one
* Add actionlint config to specify self-hosted runners labels
2023-02-02 14:33:23 +01:00
Massimiliano Pippi
2878c57645
Update pyproject.toml ( #4035 )
2023-02-02 11:59:17 +01:00
Silvano Cerza
d79d39b28a
Bump act10ns/slack from v1 to v2 ( #4031 )
2023-02-02 09:39:36 +01:00
Silvano Cerza
938cb62144
Fix PyPi release workflow ( #4029 )
2023-02-02 09:36:23 +01:00
Zoltan Fedor
3aa6522564
fix: Event sending for RayPipeline crashing Haystack ( #3971 )
* Remove the `send_pipeline_event_if_needed()` to confirm fix
* Suspending event sending for RayPipelines as it is not compatible
* Update base.py
* Updating implementation based on feedback from @masci
---------
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-02-02 08:27:20 +01:00
ZanSara
9009a9ae58
feat: add Shaper ( #3880 )
* Shaper initial version
* Initial pydoc
* Add more unit tests
* Fix pydoc, expand Shaper pydoc with YAML example
* Minor fix
* Improve pydoc
* More unit tests with prompt node
* Describe Shaper functions in pydoc
* More pydoc
* Use pytest.raises instead of catching errors
* Improve test_function_invocation_order unit test
* pylint fixes
* Improve run_batch handling
* simpler version, initial stub
* stubbing tests
* promptnode compatibility
* add tests
* simplify
* fix promptnode tests
* pylint
* mypy
* fix corner case & mypy
* mypy
* review feedback
* tests
* Add lg updates
* add rename
* pylint
* Add complex unit test with two PNs and ICMs in between (#3921)
Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
* docstring
* fix tests
* add join_lists
* add documents_to_strings
* fix tests
* allow lists of input values
* doc review feedback
* do not use locals()
* Update with minor lg changes
* fix corner case in ICM
* fix merge
* review feedback
* answers conversions
* mypy
* add tests
* generative answers
* forgot to commit
---------
Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
Co-authored-by: agnieszka-m <amarzec13@gmail.com>
2023-02-01 18:36:13 +01:00
Silvano Cerza
e8ff48094b
Automate release on PyPi ( #4015 )
2023-02-01 17:40:21 +01:00
Julian Risch
3fcfc8eb23
chore: add discord badge to readme ( #4027 )
2023-02-01 16:59:22 +01:00
Sebastian
7b3d7ee83a
Reuse tokenizer instead of loading new one. ( #4016 )
Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
2023-02-01 10:44:18 +01:00
Sebastian
96706e9e7b
proposal: TableCell ( #3875 )
* Initial commit for TableSpan proposal
* Updating the proposal
* More updates to the proposal
* More changes
* Rename of file per Proposal instructions
* Update link
* Adding drawbacks
* Fixing typos
* Changed TableSpan to TableCell and updated proposal based on discussions.
* Adding discussion on identified bug.
* Rename proposal to reflect name change made during discussion. Added point to make it clear that we will be able to return a List of TableCells
* Update proposal with discussion about storing table as a list of lists
* Adding some additional code change descriptions.
2023-02-01 09:08:12 +01:00