Vladimir Blagojevic
53528c96a0
feat: Add ChatGPT PromptNode layer ( #4357 )
...
* Initial ChatGPTInvocationLayer
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
Co-authored-by: agnieszka-m <amarzec13@gmail.com>
Co-authored-by: Sebastian <sjrl@users.noreply.github.com>
2023-03-17 14:16:41 +01:00
Silvano Cerza
9802fb159a
Remove unnecessary imports in conftest.py ( #4434 )
2023-03-16 10:02:01 +01:00
ZanSara
c802305ccf
test: move tests on standard pipelines in `e2e/` ( #4309 )
...
* move out standard pipelines e2e
* fixing unit tests
* add test data
* feedback
* pylint
* black
2023-03-06 17:26:19 +01:00
Daniel Bichuetti
1548c5ba0f
feat: Add Azure OpenAI embeddings support ( #4332 )
...
* feat: add Azure OpenAI as embedding option
* feat: Add Azure OpenAI embeddings support
* refactor: check api key
* refactor: better type checking for Azure
* refactor: enable parallelism + separate and update tests
* refactor: string reformat
* refactor: explicit typing
* refactor: update refs and remove unused code
2023-03-06 13:37:20 +01:00
Vladimir Blagojevic
79bf25aaea
feat: Add Azure as OpenAI endpoint ( #4170 )
...
* Add Azure as OpenAI endpoint
---------
Co-authored-by: Sebastian Lee <sebastian.lee@deepset.ai>
2023-03-02 09:55:09 +01:00
ZanSara
ae04ce3c6a
test: mock all Summarizer tests and move a few into e2e ( #4299 )
...
* stub e2e folders
* simplify pipeline test
* mocking
* unit tests fixed
* clean up e2e
* pipeline tests work
* pylint
* leftover
* small fix from #2994 and additional tests
* review feedback
* change summaries
* black
* revert models and summaries
2023-03-01 17:30:55 +01:00
ZanSara
165a0a5faa
test: mock all `Translator` tests and move one to `e2e` ( #4290 )
...
* mock all translator tests and move one to e2e
* typo
* extract pipeline tests using translator
* remove duplicate test
* move generator test in e2e
* Update e2e/pipelines/test_extractive_qa.py
* pytest.mark.unit
* black
* remove model name as well
* remove unused fixture
* rename original and improve pipeline tests
* fixes
* pylint
2023-03-01 14:52:05 +01:00
Stefano Fiorucci
e8f9b1b65d
test: replace `ElasticsearchDS` with `InMemoryDS` when it makes sense; support `scale_score` in `InMemoryDS` ( #4283 )
...
* replace elasticds with imds - first draft
* fix
* fix tests and implement scale_score in imds bm25
* add docstrings for scale_score
2023-03-01 11:35:10 +01:00
Silvano Cerza
4a93517eb4
test: Fix deprecation fixture ( #4219 )
...
* Fix deprecation fixture
* Update docstring
* Update docstring
---------
Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>
2023-02-27 09:55:03 +01:00
Julian Risch
5ce7a404ac
feat: Add Agent ( #4148 )
...
* initial Agent implementation
* mypy and pylint fixes
* add missing ABC import
* improved prompt template
* refactor and shorten run method
* refactor and shorten run method
* add tests for extracting
* fix mixed up tool_input/observation & make tests more robust
* fix bug with max_iterations and update prompt template
* allow setting prompt_template in Agent init
* remove example yml for agent
* add final prediction to transcript
* add transcript to errors and accept PromptTemplate in init
* simplify if else to elif
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* add checks for max_iter<2 and empty list returned by prompt node
---------
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-02-21 14:27:40 +01:00
Massimiliano Pippi
ec72dd73fc
refactor: complete the document stores test refactoring ( #4125 )
...
* add e2e tests
* move tests to their own module
* add e2e workflow
* pylint
* remove from job
* fix index field name
* skip test on sql
* removed unused code
* fix embedding tests
* adjust test for pinecone
* adjust assertions to the new documents
* bad copypasta
* test
* fix tests
* fix tests
* fix test
* fix tests
* pylint
* update milvus version
* remove debug
* move graphdb tests under e2e
2023-02-16 09:43:25 +01:00
Silvano Cerza
274746db07
style: Update black ( #4101 )
...
* Update black version
* Format file with new black style
* Update black pre-commit hook version
2023-02-08 15:34:43 +01:00
ZanSara
90c877a559
bug: `mypy` should ignore files in `test/` ( #3894 )
...
* exclude files in test/
* verify that the CI ignores test files
* don't fail in case of no files
2023-01-19 18:12:26 +01:00
ZanSara
9e457db2e9
test: add version deprecation fixture ( #3851 )
...
* add fixture
* Update test/conftest.py
* remove +2 and add tests
* few typos
* more cases
* Update test/conftest.py
2023-01-16 15:36:14 +01:00
Stefano Fiorucci
136928714c
refactor: remove deprecated parameters from `Summarizer` ( #3740 )
...
* remove deprecated parameters
* remove deprecation/removal test
2022-12-29 15:37:47 +05:30
Vladimir Blagojevic
9ebf164cfd
feat: Expand LLM support with PromptModel, PromptNode, and PromptTemplate ( #3667 )
...
Co-authored-by: ZanSara <sarazanzo94@gmail.com>
2022-12-20 11:21:26 +01:00
Sebastian
25bf95d47f
Update table reader tests to include checking the score of answers. ( #3641 )
2022-12-07 07:30:49 -08:00
Stefano Fiorucci
3040e59c63
feat: add support for `BM25Retriever` in `InMemoryDocumentStore` ( #3561 )
...
* very first draft
* implement query and query_batch
* add more bm25 parameters
* add rank_bm25 dependency
* fix mypy
* remove tokenizer callable parameter
* remove unused import
* only json serializable attributes
* try to fix: pylint too-many-public-methods / R0904
* bm25 attribute always present
* convert errors into warnings to make the tutorial 1 work
* add docstrings; tests
* try to make tests run
* better docstrings; revert not running tests
* some suggestions from review
* rename elasticsearch retriever as bm25 in tests; try to test memory_bm25
* exclude tests with filters
* change elasticsearch to bm25 retriever in test_summarizer
* add tests
* try to improve tests
* better type hint
* adapt test_table_text_retriever_embedding
* handle non-textual docs
* query only textual documents
2022-11-22 09:24:52 +01:00
Stefano Fiorucci
dc26e6d43e
fix: Flatten `DocumentClassifier` output in `SQLDocumentStore`; remove `_sql_session_rollback` hack in tests ( #3273 )
...
* first draft
* fix
* fix
* move test to test_sql
2022-11-16 12:20:57 +01:00
Massimiliano Pippi
6a48ace9b9
BREAKING CHANGE: remove Milvus1DocumentStore along with support for Milvus < 2.x ( #3552 )
...
* remove milvus1
* leftover
* revert deprecation process
2022-11-15 09:54:55 +01:00
Massimiliano Pippi
4dfddf0d10
refactor: Refactor Weaviate tests ( #3541 )
...
* refactor tests
* fix job
* revert
* revert
* revert
* use latest weaviate
* fix abstract methods signatures
* pass class_name to all the CRUD methods
* finish moving all the tests
* bump weaviate version
* raise, don't pass
2022-11-14 09:57:30 +01:00
Sara Zan
43b24fd1a7
fix: strip whitespaces safely from `FARMReader`'s answers ( #3526 )
...
* remove .strip()
* check for right-side offset
* return the whitespace-cleaned answer
* lstrip, not rstrip :D
* remove int
* left_offset
* slightly refactor reader fixture
* extend test_output
2022-11-08 09:26:47 +01:00
Massimiliano Pippi
255072d8d5
refactor: move dC tests to their own module and job ( #3529 )
...
* move dC tests to their own module and job
* restore global var
* revert
2022-11-04 17:05:10 +01:00
Massimiliano Pippi
2bb81331b7
feat: add SQLDocumentStore tests ( #3517 )
...
* port SQL tests
* cleanup document_store_tests.py from sql tests
* leftover
* Update .github/workflows/tests.yml
Co-authored-by: Sara Zan <sara.zanzottera@deepset.ai>
* review comments
* Update test/document_stores/test_base.py
Co-authored-by: bogdankostic <bogdankostic@web.de>
Co-authored-by: Sara Zan <sara.zanzottera@deepset.ai>
Co-authored-by: bogdankostic <bogdankostic@web.de>
2022-11-04 09:24:19 +01:00
Sara Zan
f0be78c6a6
bug: remove useless import in conftest.py ( #3362 )
...
* Remove useless milvus import in conftest
* schemas
* schemas
2022-11-02 19:22:24 +05:30
Massimiliano Pippi
b694c7b5cb
Document Store test refactoring ( #3449 )
...
* add new marker
* start using test hierarchies
* move ES tests into their own class
* refactor test workflow
* job steps
* add more tests
* move more tests
* more tests
* test labels
* add more tests
* Update tests.yml
* Update tests.yml
* fix
* typo
* fix es image tag
* map es ports
* try
* fix
* default port
* remove opensearch from the markers sorcery
* revert
* skip new tests in old jobs
* skip opensearch_faiss
2022-10-31 15:30:14 +01:00
Sebastian
8db7dfb884
refactor: TableReader ( #3456 )
...
* Refactoring table reader
2022-10-26 20:57:28 +02:00
Sebastian
59857cb492
feat: Speed up reader tests ( #3476 )
...
* Use a smaller reader where possible
* Change scope to module of reader to get faster load times
2022-10-26 19:04:18 +02:00
Sara Zan
05c68b6624
feat: add `document_store` to all `BaseRetriever.retrieve()` and `BaseRetriever.retrieve_batch()` implementations ( #3379 )
...
* add document_store to retrieve()
* mypy & pylint
* pass docstore to embedding encoders
* schemas
* mypy and pylint
* fix tfidfretriever
* pylint
* mypy
* pylint
* fix tfidf
* mypy
* pylint
* schemas
* another fix for tfidf
* fix question generation tests
* remove docstore from embedding encoder signature
* pylint
* revert accidental test changes
* Apply suggestions from code review
* check for docstore similarity function only if the docstore is present
* check for docstore similarity function only if the docstore is present
2022-10-26 15:47:06 +02:00
Vladimir Blagojevic
5ca96357ff
feat: Add CohereEmbeddingEncoder to EmbeddingRetriever ( #3453 )
2022-10-25 17:52:29 +02:00
Sebastian
93817f63b4
feat: Speed up integration tests (nodes) ( #3408 )
...
* Changed summarizer model to a smaller one (2GB to 500MB) to save on space and speed up the tests.
* Removed google pegasus from cache
2022-10-18 16:23:57 +02:00
Vladimir Blagojevic
159cd5a666
feat: Add OpenAIEmbeddingEncoder to EmbeddingRetriever ( #3356 )
2022-10-14 15:01:03 +02:00
Stefano Fiorucci
7290196c32
fix: allow same `vector_id` in different indexes for SQL-based Document stores ( #3383 )
...
* fix_multiple_indexes
* improve test names
2022-10-14 09:55:56 +02:00
Vladimir Blagojevic
6cb4e93965
refactor: remove Inferencer multiprocessing ( #3283 )
2022-10-04 14:08:23 +02:00
tstadel
05a86b9d3d
feat: FAISS in OpenSearch: Support HNSW for cosine ( #3217 )
...
* support cosine similarity with faiss
* update docs
* update api docs
* fix tests
* Revert "update api docs"
This reverts commit 6138fdfefb3beaee2d55c5729cd4a2745ea6b143.
* fix api docs
* collapse test
* rename similarity to space_type mappings
* only normalize for faiss
* fix merge
* fix docs normalization
* get rid of List[np.array]
* update docs
* fix tests and tutorials
* fix mypy
* fix mypy
* fix mypy again
* again mypy
* blacken
* update tutorial 4 docs
* fix embeddingretriever
* fix faiss
* move dense specific logic to DenseRetriever
* fix mypy
* cosine tests for all documents stores
* fix pinecone
* add docstring
* docstring corrections
* update docs
* add integration test marker
* docstrings update
* update docs
* fix typo
* update docs
* fix MockDenseRetriever
* run integration tests for all documentstores
* fix test_update_embeddings_cosine_similarity
* fix faiss tests not running
* blacken
* make test_cosine_sanity_check integration test
* split PR
* update docs
* manually revert tutorial doc change
* Fix embedding type
* set integration marker correctly
* make BaseDocumentStore.normalize_embedding static
* format
* fix handling of opensearch_faiss param
* fix merge
* add DenseRetriever typing
* organize imports in conftest.py
* organize imports in conftest.py (2)
* fix DenseRetriever import
* add opensearch-tests-linux
2022-09-23 13:26:49 +02:00
tstadel
4fa9d2d8e7
Fix milvus and faiss tests not running ( #3263 )
...
* fix milvus and faiss tests not running
* fix schema manually
* fix test_dpr_embedding test for milvus
* pip freeze on milvus tests
* fix milvus1 tests being executed: fix all_doc_stores order
* Revert "pip freeze on milvus tests"
This reverts commit 75ebb6f7e507bb8477e87d9e63b4a294f7946cab.
* make infer_required_doc_store more robust
* don't skip tests without docstore requirements
* use markers for docstore tests
2022-09-22 17:46:49 +02:00
tstadel
b10e2c392e
chore: add `DenseRetriever` abstraction ( #3252 )
...
* support cosine similarity with faiss
* update docs
* update api docs
* fix tests
* Revert "update api docs"
This reverts commit 6138fdfefb3beaee2d55c5729cd4a2745ea6b143.
* fix api docs
* collapse test
* rename similarity to space_type mappings
* only normalize for faiss
* fix merge
* fix docs normalization
* get rid of List[np.array]
* update docs
* fix tests and tutorials
* fix mypy
* fix mypy
* fix mypy again
* again mypy
* blacken
* update tutorial 4 docs
* fix embeddingretriever
* fix faiss
* move dense specific logic to DenseRetriever
* fix mypy
* cosine tests for all documents stores
* fix pinecone
* add docstring
* docstring corrections
* update docs
* add integration test marker
* docstrings update
* update docs
* fix typo
* update docs
* fix MockDenseRetriever
* run integration tests for all documentstores
* fix test_update_embeddings_cosine_similarity
* fix faiss tests not running
* blacken
* make test_cosine_sanity_check integration test
* update docs
* fix imports
* import DenseRetriever normally
* update docs
* fix deepcopy of documents
* update schema
* Revert "update schema"
This reverts commit 83cf8f323648468e1c322d54852bec084d637e3f.
* fix schema for ci manually
2022-09-21 19:08:54 +02:00
Vladimir Blagojevic
938e6fda5b
Classify pipeline's type based on its components ( #3132 )
...
* Add pipeline get_type method
* Add pipeline uptime
* Add pipeline telemetry event sending
* Send pipeline telemetry once a day (at most)
* Add pipeline invocation counter, change invocation counter logic
* Update allowed telemetry parameters - allow pipeline parameters
* PR review: add unit test
2022-09-21 14:53:42 +02:00
Stefano Fiorucci
89247b804c
refactor: make `TransformersDocumentClassifier` output consistent between different types of classification ( #3224 )
...
* make output consistent
* make output consistent
* added tests for details
* better tests
* Update test_document_classifier.py
* make black happy
* Update test_document_classifier.py
* Update test_document_classifier.py
2022-09-21 13:16:03 +02:00
Sara Zan
dcb132ba59
chore: remove f-strings from logs for performance reasons ( #3212 )
...
* Use the %s syntax on all debug messages
* Use the %s syntax on some more debug messages
* Use the %s syntax on info messages
* Use the %s syntax on warning messages
* Use the %s syntax on error and exception messages
* mypy
* pylint
* trigger tutorials execution in CI
* trigger tutorials execution on CI
* black
* remove embeddings from repr
* fix Document `__repr__`
* address feedback
* mypy
2022-09-19 18:18:32 +02:00
Daniel Bichuetti
e1f399284f
refactor: update dependencies and remove pins ( #3147 )
...
* refactor: remove azure-core, pydoc and hf-hub pins
* fix: remove extra-comma
* fix: force minimum version of azure forms recognizer
* refactor: allow newer ocr libs
* refactor: update more dependencies and container versions
* refactor: remove extra comment
* docs: pre-commit manual run
* refactor: remove unnecessary dependency
* tests: update weaviate container image version
2022-09-05 14:30:35 +02:00
James Briggs
9b1b03002f
update to PineconeDocumentStore to remove dependency on SQL db ( #2749 )
...
* update to PineconeDocumentStore to remove dependency on SQL db
* Update Documentation & Code Style
* typing fixes
* Update Documentation & Code Style
* fixed embedding generator to yield Documents
* Update Documentation & Code Style
* fixes for final typing issues
* fixes for pylint
* Update Documentation & Code Style
* uncomment pinecone tests
* added new params to docstrings
* Update Documentation & Code Style
* Update Documentation & Code Style
* Update haystack/document_stores/pinecone.py
Co-authored-by: Sara Zan <sarazanzo94@gmail.com>
* Update haystack/document_stores/pinecone.py
Co-authored-by: Sara Zan <sarazanzo94@gmail.com>
* Update Documentation & Code Style
* Update haystack/document_stores/pinecone.py
Co-authored-by: Sara Zan <sarazanzo94@gmail.com>
* Update haystack/document_stores/pinecone.py
Co-authored-by: Sara Zan <sarazanzo94@gmail.com>
* Update haystack/document_stores/pinecone.py
Co-authored-by: Sara Zan <sarazanzo94@gmail.com>
* Update haystack/document_stores/pinecone.py
Co-authored-by: Sara Zan <sarazanzo94@gmail.com>
* changes based on comments, updated errors and install
* Update Documentation & Code Style
* mypy
* implement simple filtering in pinecone mock
* typo
* typo in reverse
* account for missing meta key in filtering
* typo
* added metadata filtering to describe index
* added handling for users switching indexes in same doc store, and handling duplicate docs in write
* syntax tweaks
* added index option to document/embedding count calls
* labels implementation in progress
* added metadata fields to be indexed for pinecone tests
* further changes to mock
* WIP implementation of labels+multilabels
* switched to rely on labels namespace rather than filter
* simpler delete_labels
* label fixes, remove debug code
* Apply docstring fixes
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
* mypy
* pylint
* docs
* temporarily un-mock Pinecone
* Small Pinecone test suite
* pylint
* Add fake test key to pass the None check
* Add again fake test key to pass the None check
* Add Pinecone to default docstores and fix filters
* Fix field name
* Change field name
* Change field value
* Remove comments
* forgot to upgrade pyproject.toml
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sara Zan <sara.zanzottera@deepset.ai>
Co-authored-by: Sara Zan <sarazanzo94@gmail.com>
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
2022-08-24 13:27:15 +02:00
James Briggs
26c938a8e6
test: add meta fields for meta_config to be used during testing ( #3021 )
...
* added meta fields for meta_config to be used during realtime testing of PineconeDocumentStore
* Add documentation on metadata filtering in docstring
* docs
Co-authored-by: Sara Zan <sara.zanzottera@deepset.ai>
2022-08-12 10:27:56 +02:00
Zoltan Fedor
f4128d3581
Adding support for additional distance/similarity metrics for Weaviate ( #3001 )
...
* Adding support for additional distance metrics for Weaviate
Fixes #3000
* Updating the docs
* Fixing error texts
* Fixing issues raised by the review
* Addressing the last issue from the reviews - removing test `test_weaviate.py::test_similarity`
* [EMPTY] Re-trigger CI
* Fixing things based on review
* [EMPTY] Re-trigger CI
2022-08-11 09:48:21 +02:00
Massimiliano Pippi
e7627c3f8b
Use opensearch-py in OpenSearchDocumentStore ( #2691 )
...
* add Opensearch extras
* let OpenSearchDocumentStore use opensearch-py
* Update Documentation & Code Style
* fix a bug found after adding tests
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sara Zan <sara.zanzottera@deepset.ai>
2022-07-28 10:04:49 +02:00
Sara Zan
6b39fbd39c
Mocking Pinecone tests ( #2778 )
...
* Integrating the mock into conftest.py
* re-enable workflow
* delete_all
* Update Documentation & Code Style
* remove ValueError
* Add empty response
* wrong condition
* return response
* revert removal of delete_all
* change mock
* Update Documentation & Code Style
* test for rest api, to revert
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-07-14 20:03:33 +02:00
Malte Pietsch
ba08fc86f5
Add node to use OpenAI's GPT-3 for QA ( #2605 )
...
* first draft of openai node for QA
* Update Documentation & Code Style
* fix mypy. add node to inits
* Update Documentation & Code Style
* fix linter
* Adapt OpenAIGenerator to completions endpoint
* Update Documentation & Code Style
* Fix pylint
* Fix doc strings
* Make use of temperature
* Make use of api key in tests
* Adapt doc strings
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: ZanSara <sarazanzo94@gmail.com>
Co-authored-by: bogdankostic <bogdankostic@web.de>
2022-07-08 13:59:27 +02:00
bogdankostic
195aed942f
Add `update_document_meta` to `InMemoryDocumentStore` ( #2689 )
...
* Add update_document_meta to InMemoryDocumentStore
* Fix typo
* Update Documentation & Code Style
* Add update_document_meta to BaseDocumentStore
* Update Documentation & Code Style
* Fix mypy
* Update Documentation & Code Style
* Add update_document_meta to MockDocumentStore
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-07-07 15:44:07 +02:00
Patrick Deutschmann
1db3fd0942
Add support for Multi-Hop Dense Retrieval ( #2571 )
...
* Implement MDR
* Adapt conftest to new MDR signature
* Update Documentation & Code Style
* Change signature of queries param in batch methods of MDR like in #2575
* Update Documentation & Code Style
* Rename MultihopDenseRetriever to MultihopEmbeddingRetriever
* Fix filters in retrieve_batch
* Add docstring for MultihopEmbeddingRetriever.__init__
* Update Documentation & Code Style
* Revert forward signature of TextSimilarityHead
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-07-05 11:31:11 +02:00
Sara Zan
54518ac790
[CI Refactoring] Refactor `Document` fixtures in tests ( #2577 )
...
* Refactor document fixtures
* Add embedding files
* Update Documentation & Code Style
* Indentation issue
* Update Documentation & Code Style
* Fix type conversion in conftest.py
* Update Documentation & Code Style
* mypy on sql.py
* mypy on crawler.py
* mypy on pinecone.py
* Adapt retriever tests
* Update Documentation & Code Style
* mypy on crawler.py
* Update Documentation & Code Style
* mypy on crawler.py again
* Update Documentation & Code Style
* mypy fix was too rough
* Fix some more tests
* Update Documentation & Code Style
* Skip meaningless test on FilterRetriever
* Make embedding values less specific
* Update Documentation & Code Style
* Use stable IDs in retriever tests that depend on it
* Remove needless fixtures
* docs_with_ids
* Update Documentation & Code Style
* Typo
* Fix retriever tests
* Fix reader tests
* Update Documentation & Code Style
* Workaround #2626
* Update Documentation & Code Style
* Fix label generator tests
* Reorder vectors
* remove print
* Update Documentation & Code Style
* Update Documentation & Code Style
* git tags leftover
* Update Documentation & Code Style
* fix last failing test
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-10 18:22:48 +02:00