3803 Commits

Author SHA1 Message Date
Silvano Cerza
645a5fe5ba
ci: Add coverage tracking with Coveralls (#4772)
* Format tests.yml properly

* Add pytest-cov dependency

* Add coverage in unit tests

* Ignore cov.info

* Change report format

* Unignore cov.info
2023-04-28 11:59:09 +02:00
Vladimir Blagojevic
dcaf3002f1
fix: SentenceTransformersRanker's predict_batch returns wrong number of documents (#4756)
* Fix SentenceTransformersRanker's predict_batch returning wrong number of documents

* Julian's feedback
2023-04-27 15:24:39 +02:00
Vladimir Blagojevic
c9a415ec8d
refactor: Make agent test more robust (#4767)
* Add more exemplars to lower test failure rate

* Easier agent run test, more robust, consistently passing
2023-04-27 14:53:15 +02:00
Vladimir Blagojevic
aebc22d27e
Upgrade transformers to 4.28.1 (#4665)
* Upgrade to transformers 4.28.1

* Commenting out failing piece of test

* trailing-whitespace

* Adjust regex for error match - it changed between releases

* Remove RAG tests failing with transformers update
2023-04-27 12:55:21 +02:00
bogdankostic
c7a20d68d2
fix: Add separate query method for OpenSearchDocumentStore (#4764)
* Add separate query method for OpenSearchDocumentStore

* Convert integration test to unit test + add separate tests for OpenSearch
2023-04-26 21:58:33 +02:00
Vladimir Blagojevic
41b6e33f64
Enhance the error logging in PromptTemplate variable resolution (#4730)
* Enhance the error logging in PromptTemplate variable resolution

* Revert change Daria made

* Silvano PR feedback
2023-04-26 18:09:20 +02:00
Darja Fokina
2f7104704a
fix: README latest and main installation (#4741)
Fixing README to explain how to install from main branch
2023-04-26 15:08:05 +02:00
Vladimir Blagojevic
3fefc475b4
fix: Deprecate Seq2SeqGenerator and RAGenerator (#4745)
* Deprecate Seq2SeqGenerator

* changed the warning to include suggestion

* Added example and msg to API reference docs

* Added RAG deprecation

* renamed name to adapt to naming conven

* update docstrings

---------

Co-authored-by: Mayank Jobanputra <mayankjobanputra@gmail.com>
Co-authored-by: Darja Fokina <daria.f93@gmail.com>
2023-04-26 13:59:35 +02:00
tstadel
9cbe9e0949
fix: recursion of death while loading PromptTemplate from yaml (#4691)
* fix recursion of death when deserializing prompttemplate

* add test

* set api_key

* fix test

* add generic test

* work in feedback on tests

---------

Co-authored-by: bogdankostic <bogdankostic@web.de>
2023-04-26 13:56:51 +02:00
Vladimir Blagojevic
650e1a1a6f
fix: gpt-3.5-turbo is an agent streaming model (#4673)
* gpt-3.5 is also agent streaming model

* Add more streaming capable models

* Add end-of-file-fixer

* List full model names for clarity
2023-04-26 13:56:24 +02:00
s_teja
d033a086d0
fix: loads local HF Models in PromptNode pipeline (#4670)
* bug: fix load local HF Models in PromptNode pipeline

* Update hugging_face.py

remove duplicate validator

* update: black formatted

* update: update doc string, replace pop with get

* test HFLocalInvocationLayer with local model
2023-04-26 13:10:02 +02:00
bogdankostic
0d6fba14fd
Revert "fix: Log 'Observation' on new line (#4704)" (#4751)
This reverts commit 63f24cb1f37455a63dfd4b0c6f6c797fc8077aae.
2023-04-26 12:17:58 +02:00
Massimiliano Pippi
b54fdd3fa0
docs: add deprecation notes to docstrings (#4708)
2023-04-26 12:16:43 +02:00
ZanSara
1b57b96210
refactor!: extract elasticsearch (#4668)
* extract elasticsearch

* update pyproject.toml

* make more import optional

* move MockBaseRetriever in conftest

* install es in the es integration tests
2023-04-26 10:14:20 +02:00
bogdankostic
91b775bf43
Execute pipelines and utils unit tests in CI (#4749)
2023-04-26 10:00:52 +02:00
recrudesce
38768bffdf
fix: Tiktoken does not support Azure gpt-35-turbo (#4739)
* force support for gpt-35-turbo

Cos Tiktoken doesn't support it yet - see https://github.com/openai/tiktoken/pull/72

* Update openai_utils.py

* Appeasing the linting gods

Why hast thou forsaken me ?

* Remove trailing whitespace

* chg: remove redundant elif block
2023-04-25 16:43:24 +02:00
Wang, Yi
2be1a68fce
fix: Allow to set num_beams in HFInvocationLayer (#4731)
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Co-authored-by: bogdankostic <bogdankostic@web.de>
2023-04-25 16:08:06 +02:00
github-actions[bot]
7fa3591f5f
Update unstable version (#4740)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2023-04-24 14:07:50 +02:00
bogdankostic
7db025a97b
Update weaviate-client (#4715)
2023-04-20 17:54:55 +02:00
recrudesce
473152eb05
feat: Add AzureChatGPT Capability using new InvocationLayer style (#4675)
2023-04-20 16:27:07 +02:00
Tuana Çelik
4cc416236d
chore: Updating readme (#4714)
Adding another name in our 'who uses' section
2023-04-20 11:28:38 +02:00
Zoltan Fedor
49d548ef10
fix: Fixing the Weaviate BM25 query builder bug (#4703)
2023-04-20 09:56:49 +02:00
Tuana Çelik
63f24cb1f3
fix: Log 'Observation' on new line (#4704)
2023-04-20 09:53:08 +02:00
bogdankostic
3d3b79986f
docs: Adapt Shaper docstrings regarding dropping metadata (#4655)
2023-04-19 13:40:53 +02:00
Sebastian
8d9136bad4
feat: Implementation of Table Cell Proposal (#4616)
* Starting adding support for TableCell

* Update tests to use row and col

* Added schema test to check to_dict and from_dict works for Table documents. Also updated Doc.__eq__ to work for tables.

* Update eval test to use TableCell

* Added more schema tests for table docs, labels and answers.

* Add boolean to toggle between Span and TableCell

* Add deprecation message

* Test that table answers work as responses in the rest API

---------

Co-authored-by: agnieszka-m <amarzec13@gmail.com>
2023-04-19 13:14:49 +02:00
Darja Fokina
ec7fc4aa0b
docs: add web retriever to api docs (#4699)
2023-04-18 17:19:57 +02:00
Silvano Cerza
f13cc751c3
Block requests_cache in unit tests (#4696)
2023-04-18 16:15:26 +02:00
Massimiliano Pippi
0c081f19e2
fix: remove warnings from the more recent Elasticsearch client (#4602)
* clean up the ES instance in a more robust way

* do not sleep, refresh the index instead

* remove client warnings

* fix unit tests

* fix opensearch compatibility

* fix unit tests

* update ES version

* bump elasticsearch-py

* adjust docs

* use recreate_index param

* use same fixture strategy for Opensearch

* Update lg

---------

Co-authored-by: agnieszka-m <amarzec13@gmail.com>
2023-04-18 15:40:17 +02:00
Sebastian
8c4176bdb2
feat: More flexible routing for RouteDocuments node (#4690)
* Added warning messages for documents that are skipped by RouteDocuments. Begun adding support for new option return_remaining and List of List support for metadata value splitting.

* Simplify _split_by_content_type

* Added new unit test and updated _calculate_outgoing_edges

* Added some TODOs and turned assert into raising an error.

* Update logging messages and make new fixture in tests

* Update _split_by_metadata_values to work with return_remaining

* Remove unneeded code

* Documentation

* Add proper support for list of lists

* Fix mypy errors

* Added assert to make mypy happy

* Update haystack/nodes/other/route_documents.py

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* PR comments

* Remove check for logging level

* make mypy happy

* Update docstring of metadata_values

* Removed duplicate check. Make explicit check for metadata_values

---------

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-04-18 15:18:13 +02:00
ZanSara
b06821b311
refactor: node->component (#4687)
* node->component

* fix tests
2023-04-17 12:20:42 +02:00
ZanSara
809ca73649
fix: make langdetect truly optional (#4686)
* make all langdetect imports optional

* add workflow

* fix workflow triggers

* change extra name
2023-04-17 11:35:53 +02:00
Fernando Pereira
a0d1733098
fix: PineconeDocumentStore error when delete_documents right after initialization (#4609)
2023-04-17 10:51:39 +02:00
Massimiliano Pippi
a03e8335aa
Ignore cross-reference properties when loading documents (#4664)
* drop cross-reference properties

* be more defensive

* fix regression
2023-04-17 10:40:30 +02:00
Julian Risch
dbe3049682
docs: Add docstring for PromptNode debug attribute (#4672)
2023-04-14 18:09:02 +02:00
Silvano Cerza
79727ed31f
Add requests blocker fixture (#4671)
2023-04-14 18:01:30 +02:00
Vladimir Blagojevic
1dcac11133
feat: Add Hugging Face inferencing PromptNode layer (#4641)
2023-04-14 17:59:17 +02:00
Vladimir Blagojevic
6a5acaa1e2
feat: Add chatgpt streaming (#4659)
2023-04-14 16:02:28 +02:00
Vladimir Blagojevic
1dd6158244
fix: Add model_max_length model_kwargs parameter to HF PromptNode (#4651)
2023-04-14 15:40:42 +02:00
ZanSara
d8ac30fa47
refactor!: extract preprocessing and file conversion deps (#4605)
* isolate file-conversion deps

* pylint

* add to all extra

* chain was missing

* move langdetect into preprocessing and fix tika

* add file-conversion extra
2023-04-14 11:34:16 +02:00
Tuana Çelik
16091f6ad2
Update README.md (#4661)
fixing links to sections
2023-04-14 10:23:06 +02:00
bogdankostic
cb13a537a9
Add deprecation information to doc string (#4658)
2023-04-14 09:39:34 +02:00
ZanSara
174d80ab41
skip tests (#4654)
2023-04-13 17:56:51 +02:00
Agnieszka Marzec
4aca24c845
Docs: Add max length unit to PromptNode API docs (#4601)
* Add max length unit

* Update to token

* Update invocation layers

---------

Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>
2023-04-13 16:48:32 +02:00
bogdankostic
db48773268
docs: Add PDFToTextOCRConverter to API Docs (#4656)
2023-04-13 15:31:45 +02:00
Joseph Smith
e09b3364c7
Check for date fields in weaviate meta update (#4371)
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-04-13 15:18:23 +02:00
Vladimir Blagojevic
e30bc8fe5a
feat: Add GenerationConfig option to PromptNode's HuggingFace invocation layer (#4649)
2023-04-13 12:15:00 +02:00
ZanSara
f2106ab37b
feat: initial implementation of MemoryDocumentStore for new Pipelines (#4447)
* add stub implementation

* reimplementation

* test files

* docstore tests

* tests for document

* better testing

* remove mmh3

* readme

* only store, no retrieval yet

* linting

* review feedback

* initial filters implementation

* working on filters

* linters

* filtering works and is isolated by document store

* simplify filters

* comments

* improve filters matching code

* review feedback

* pylint

* move logic into _create_id

* mypy
2023-04-13 09:36:23 +02:00
Silvano Cerza
db69141642
Fix docstring-labeler.yml not working in PR from forks (#4648)
2023-04-12 21:16:06 +02:00
ZanSara
ba11d1c2a8
refactor!: extract evaluation and statistical dependencies (#4457)
* try-catch sklearn and scipy

* haystack imports

* linting

* mypy

* try to import baseretriever

* remove typing

* unused import

* remove more typing

* pylint

* isolate sql imports for postgres, which we don't use anyway

* remove stats

* replace expit

* als inmemory

* mypy

* feedback

* docker

* expit

* re-add njit
2023-04-12 15:38:56 +02:00
Fernando Pereira
5d41e60d89
fix: ParsrConverter list element added (#4562)
* fix: list element and mapping logic around it added to ParsrConverter convert step + unit test covering the specific mapping of list content from Parsr's to Haystack's

* Code review changes

* changed the samples path after conftest changes

* added samples_path to function arg

---------

Co-authored-by: Namoush <fmpereira22@gmail.com>
Co-authored-by: Fernando Pereira <fernando.pereira@criticalsoftware.com>
Co-authored-by: Mayank Jobanputra <mayankjobanputra@gmail.com>
Co-authored-by: bogdankostic <bogdankostic@web.de>
2023-04-12 18:38:21 +05:30