Sara Zan
e69492a28f
Tutorial 14 doc changes ( #2714 )
...
* let the bot apply changes in this pr
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-23 12:36:12 +02:00
Stefano Fiorucci
b01a7c2259
Add InMemoryKnowledgeGraph ( #2678 )
...
* draft for InMemoryKnowledgeGraph
* remove comments
* Update Documentation & Code Style
* fix import and signature
* Fix dependencies for in_memory_knowledge_graph
* updated tutorials
* Update Documentation & Code Style
* fix bug in notebook
* fix other notebook bug
* Update Documentation & Code Style
* improved tutorial notebook
* Update Documentation & Code Style
* better implementation of InMemoryKnowledgeGraph
* fix
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-22 19:16:33 +02:00
bogdankostic
b16430b61e
Tutorial 4: Set similarity to "cosine" in DocStore initialization ( #2673 )
...
* Set similarity to cosine in DocStore initialization
* Update Documentation & Code Style
* Set `scale_score` to `False`
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-20 18:47:09 +02:00
Massimiliano Pippi
79b287b568
Extract common code for ES and OS into a base class ( #2664 )
...
* extract common code for ES and OS into a base class
* Update Documentation & Code Style
* give the base class a more obvious name
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-20 09:47:44 +02:00
MichelBartels
964e6cdafb
Fix JoinAnswer/JoinNode ( #2612 )
...
* fix join nodes
* Update Documentation & Code Style
* fix unused import
* change arg order
* Update Documentation & Code Style
* fix kwargs check
* add warning when there is only one input node
* Update Documentation & Code Style
* fix type hint
* fix wrong import order
* Update Documentation & Code Style
* undo kwargs
* add accidentally deleted newline
* fix type hint
* fix type hint
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-17 16:29:15 +02:00
Sara Zan
a26c042994
Fix typo in code_and_docs.sh ( #2662 )
...
* Fix typo in code_and_docs.sh & install ffmpeg in autoformat.yml
* apt update to get ffmpeg
* Update Documentation & Code Style
* Add header and better error message
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-15 13:50:55 +02:00
Sara Zan
584e046642
AnswerToSpeech ( #2584 )
...
* Add new audio answer primitives
* Add AnswerToSpeech
* Add dependency group
* Update Documentation & Code Style
* Extract TextToSpeech in a helper class, create DocumentToSpeech and primitives
* Add tests
* Update Documentation & Code Style
* Add ability to compress audio and more tests
* Add audio group to test, all and all-gpu
* fix pylint
* Update Documentation & Code Style
* Accidental git tag
* Try pleasing mypy
* Update Documentation & Code Style
* fix pylint
* Add warning for missing OS library and support in CI
* Try fixing mypy
* Update Documentation & Code Style
* Add docs, simplify args for audio nodes and add tutorials
* Fix mypy
* Fix run_batch
* Feedback on tutorials
* fix mypy and pylint
* Fix mypy again
* Fix mypy yet again
* Fix the ci
* Fix dicts merge and install ffmpeg on CI
* Make the audio nodes import safe
* Trying to increase tolerance in audio test
* Fix import paths
* fix linter
* Update Documentation & Code Style
* Add audio libs in unit tests
* Update _text_to_speech.py
* Update answer_to_speech.py
* Use dedicated dataset & update telemetry
* Remove and use distilled roberta
* Revert special primitives so that the nodes run in indexing
* Improve tutorials and fix smaller bugs
* Update Documentation & Code Style
* Fix serialization issue
* Update Documentation & Code Style
* Improve tutorial
* Update Documentation & Code Style
* Update _text_to_speech.py
* Minor lg updates
* Minor lg updates to tutorial
* Making indexing work in tutorials
* Update Documentation & Code Style
* Improve docstrings
* Try to use GPU when available
* Update Documentation & Code Style
* Fix mypy and pylint
* Try to pass the device correctly
* Update Documentation & Code Style
* Use type of device
* use .cpu()
* Improve .ipynb
* update apt index to be able to download libsndfile1
* Fix SpeechDocument.from_dict()
* Change pip URL
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
2022-06-15 10:13:18 +02:00
James Briggs
2688135481
Pinecone unary queries upgrade ( #2657 )
...
* update query and response process for unary query update
* added metadata_config parameter
* Update Documentation & Code Style
Co-authored-by: James Briggs <jamesbriggs@Jamess-MacBook-Pro-2.local>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-15 09:45:39 +02:00
Sara Zan
54518ac790
[CI Refactoring] Refactor Document fixtures in tests ( #2577 )
...
* Refactor document fixtures
* Add embedding files
* Update Documentation & Code Style
* Indentation issue
* Update Documentation & Code Style
* Fix type conversion in conftest.py
* Update Documentation & Code Style
* mypy on sql.py
* mypy on crawler.py
* mypy on pinecone.py
* Adapt retriever tests
* Update Documentation & Code Style
* mypy on crawler.py
* Update Documentation & Code Style
* mypy on crawler.py again
* Update Documentation & Code Style
* mypy fix was too rough
* Fix some more tests
* Update Documentation & Code Style
* Skip meaningless test on FilterRetriever
* Make embedding values less specific
* Update Documentation & Code Style
* Use stable IDs in retriever tests that depend on it
* Remove needless fixtures
* docs_with_ids
* Update Documentation & Code Style
* Typo
* Fix retriever tests
* Fix reader tests
* Update Documentation & Code Style
* Workaround #2626
* Update Documentation & Code Style
* Fix label generator tests
* Reorder vectors
* remove print
* Update Documentation & Code Style
* Update Documentation & Code Style
* git tags leftover
* Update Documentation & Code Style
* fix last failing test
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-10 18:22:48 +02:00
Vladimir Blagojevic
b13c32eb9c
Add GPL API docs, unit tests update ( #2634 )
...
* Update test_label_generator.py
* GPL increase default batch size to 16
* GPL - API docs
* GPL - split unit tests
* Make devs aware of multilingual GPL
* Create separate train/save test
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-10 05:25:28 -04:00
Stefano Fiorucci
c178f60e3a
Make crawler extract also hidden text ( #2642 )
...
* make crawler extract also hidden text
* Update Documentation & Code Style
* try to adapt test for extract_hidden_text
* Update Documentation & Code Style
* fix test bug
* fix bug in test
* added test for hidden text
* Update Documentation & Code Style
* fix bug in test
* Update Documentation & Code Style
* fix test
* Update Documentation & Code Style
* fix other test bug
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-10 09:51:41 +02:00
Massimiliano Pippi
374155fd5c
Move Opensearch document store in its own module ( #2603 )
...
* move OpenSearchDocumentStore into its own Python module
* Update Documentation & Code Style
* mark test with (sigh) elasticsearch
* skip opensearch tests on windows
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-08 16:37:23 +02:00
Ryan Russell
c1b7948e10
Improve Docs Readability ( #2617 )
...
Signed-off-by: Ryan Russell <git@ryanrussell.org>
2022-06-03 09:57:40 +02:00
Julian Risch
3c6fcc3e42
Bump version to next release candidate ( #2627 )
...
* bump version to next release candidate
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-02 18:58:44 +02:00
Julian Risch
4ca331c0a7
Bump version to v1.5.0 and copy docs folder ( #2625 )
...
* bump version to v1.5.0 and copy docs folder
* Update Documentation & Code Style
* update links to v1.5.0
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-02 17:20:42 +02:00
Vladimir Blagojevic
e10a3fba74
Add Generative Pseudo Labeling ( #2388 )
2022-06-02 10:12:47 -04:00
bogdankostic
61d9429c25
Simplify loading of EmbeddingRetriever ( #2619 )
...
* Infer model format for EmbeddingRetriever automatically
* Update Documentation & Code Style
* Adapt conftest to automatic inference of model_format
* Update Documentation & Code Style
* Fix tests
* Update Documentation & Code Style
* Fix tests
* Adapt tutorials
* Update Documentation & Code Style
* Add test for similarity scores with sentence transformers
* Adapt doc string and warning message
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-02 15:05:29 +02:00
bogdankostic
0395533a78
Add run_batch for standard pipelines ( #2595 )
...
* Add run_batch for standard pipelines
* Update Documentation & Code Style
* Fix mypy
* Remove code duplication
* Fix linter
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-27 10:42:48 +02:00
tstadel
7caca41c5d
Support context matching in pipeline.eval() ( #2482 )
...
* calculate context pred metrics
* Update Documentation & Code Style
* extend doc_relevance_col values
* fix import order
* Update Documentation & Code Style
* fix mypy
* fix typings literal import
* add option for custom document_id_field
* Update Documentation & Code Style
* fix tests and dataframe col-order
* Update Documentation & Code Style
* rename content to context in eval dataframe
* add backward compatibility to EvaluationResult.load()
* Update Documentation & Code Style
* add docstrings
* Update Documentation & Code Style
* support sas
* Update Documentation & Code Style
* add answer_scope param
* Update Documentation & Code Style
* rework doc_relevance_col and keep document_id col in case of custom_document_id_field
* Update Documentation & Code Style
* improve docstrings
* Update Documentation & Code Style
* rename document_relevance_criterion into document_scope
* Update Documentation & Code Style
* add document_scope and answer_scope to print_eval_report
* support all new features in execute_eval_run()
* fix imports
* fix mypy
* Update Documentation & Code Style
* rename pred_label_sas_grid into pred_label_matrix
* update dataframe schema and sorting
* Update Documentation & Code Style
* pass through context_matching params and extend document_scope test
* Update Documentation & Code Style
* add answer_scope tests
* fix context_matching_threshold for document metrics
* shorten dataframe apply calls
* Update Documentation & Code Style
* fix queries getting lost if nothing was retrieved
* Update Documentation & Code Style
* Update Documentation & Code Style
* use document_id scopes
* Update Documentation & Code Style
* fix answer_scope literal
* Update Documentation & Code Style
* update the docs (lg changes)
* Update Documentation & Code Style
* update tutorial 5
* Update Documentation & Code Style
* fix tests
* Add minor lg updates
* final docstring changes
* fix single quotes in docstrings
* Update Documentation & Code Style
* dataframe scopes added for each column
* better docstrings for context_matching params
* Update Documentation & Code Style
* fix summarizer eval test
* Update Documentation & Code Style
* fix test
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: agnieszka-m <amarzec13@gmail.com>
2022-05-24 18:11:52 +02:00
bogdankostic
867695ad0c
Change signature of queries param in batch methods ( #2575 )
...
* Change signature of queries param in batch methods
* Update Documentation & Code Style
* Fix mypy
* Remove unused import
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-24 12:33:45 +02:00
Julian Risch
075ed7fbcb
Remove encoding option from PDFToTextOCRConverter ( #2553 )
...
* remove encoding option from PDFToTextOCRConverter
* Update Documentation & Code Style
* add unused 'encoding' param to PDFToTextOCRConverter
* Update Documentation & Code Style
* call run instead of convert to use ligature replacing
* Update Documentation & Code Style
* add text to check installed poppler version
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-24 11:31:32 +02:00
dimitrisna
5bda63a6c0
Add training checkpoint in retriever trainer ( #2543 )
...
* Update dense.py
* Update dense.py
* Update dense.py
* Update dense.py
* Update dense.py
* Update dense.py
* Update dense.py
* Update dense.py
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-24 09:51:26 +02:00
Agnieszka Marzec
ebd54b225b
Update Ray pipeline docs with validation info ( #2590 )
...
* Update Ray pipeline docs
* Add Sara's suggestion
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-24 09:29:52 +02:00
tstadel
0e83535108
Show search endpoint after deepset Cloud deployment ( #2569 )
...
* show try-out-message after deployment
* better messages
* Update Documentation & Code Style
* tests added
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-23 14:19:31 +02:00
Sara Zan
89bb1ca139
[CI refactoring] Improve autoformat.yml ( #2556 )
...
* Restructure autoformat to run a single script
* Reduce diff for autoformat.yml
* Reduce diff on linux_ci.yml
2022-05-18 20:02:43 +02:00
tstadel
f6e3a63906
Prevent losing names of utilized components when loaded from config ( #2525 )
...
* Prevent losing names of utilized components when loaded from config
* Update Documentation & Code Style
* update test
* fix failing tests
* Update Documentation & Code Style
* fix even more tests
* Update Documentation & Code Style
* incorporate review feedback
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-18 14:17:54 +02:00
tstadel
110b9c2b0a
Warnings for write operations of DeepsetCloudDocumentStore ( #2565 )
...
* log inputs to write operations
* Update Documentation & Code Style
* adjust tests
* simplify by using decorator for write operation functions
* Update Documentation & Code Style
* fix comma
* fix comma in test
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-17 17:53:55 +02:00
MichelBartels
a952ba240f
Include meta data when computing embeddings in EmbeddingRetriever ( #2559 )
...
* include meta data when calculating embeddings in EmbeddingRetriever
* Update Documentation & Code Style
* fix None meta field
* remove default values
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-17 12:37:04 +02:00
MichelBartels
686e9d24ef
Documenting output score of JoinDocuments when using concatenation ( #2561 )
...
* add documentation regarding the score of JoinDocuments when using concatenation
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-16 18:30:07 +02:00
Agnieszka Marzec
2d03a26045
Minor lg changes ( #2533 )
...
* Minor lg change
* Update Documentation & Code Style
* Fix missing articles
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-13 16:12:22 +02:00
Agnieszka Marzec
1ae5a1449b
Update run() and run_batch() params descriptions in API ( #2542 )
...
* Update run() and run_batch() params descriptions
* Update Documentation & Code Style
* Update api params descriptions
* Update Documentation & Code Style
* Fix typo
Co-authored-by: bogdankostic <bogdankostic@web.de>
* Add Bogdan's suggestions
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: bogdankostic <bogdankostic@web.de>
2022-05-13 15:11:01 +02:00
bogdankostic
738e008020
Add run_batch method to all nodes and Pipeline to allow batch querying ( #2481 )
...
* Add run_batch methods for batch querying
* Update Documentation & Code Style
* Fix mypy
* Update Documentation & Code Style
* Fix mypy
* Fix linter
* Fix tests
* Update Documentation & Code Style
* Fix tests
* Update Documentation & Code Style
* Fix mypy
* Fix rest api test
* Update Documentation & Code Style
* Add Doc strings
* Update Documentation & Code Style
* Add batch_size as attribute to nodes supporting batching
* Adapt error messages
* Adapt type of filters in retrievers
* Revert change about truncation_warning in summarizer
* Unify multiple_doc_lists tests
* Use smaller models in extractor tests
* Add return types to JoinAnswers and RouteDocuments
* Adapt return statements in reader's run_batch method
* Allow list of filters
* Adapt error messages
* Update Documentation & Code Style
* Fix tests
* Fix mypy
* Adapt print_questions
* Remove disabling warning about too many public methods
* Add flag for pylint to disable warning about too many public methods in pipelines/base.py and document_stores/base.py
* Add type check
* Update Documentation & Code Style
* Adapt tutorial 11
* Update Documentation & Code Style
* Add query_batch method for DCDocStore
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-11 11:11:00 +02:00
bogdankostic
5378a9ab48
Fix tutorials 4, 7 and 8 ( #2526 )
...
* Fix tutorials 4, 7 and 8
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-11 09:17:05 +02:00
bogdankostic
4581b91e83
Make DeepsetCloudDocumentStore work with non-existing index ( #2513 )
...
* Make DeepsetCloudDocumentStore work with non-existing index
* Update Documentation & Code Style
* Add tests
* Update Documentation & Code Style
* Fix tests, adapt warning messages + lowercase deepset
* Update Documentation & Code Style
* Fix typo in test
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-10 15:21:35 +02:00
Branden Chan
43bfea6f3d
Add sort arg to JoinAnswers ( #2436 )
...
* Add sort arg to JoinAnswers
* Update Documentation & Code Style
* Change naming and docstring
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-10 11:47:00 +02:00
Sara Zan
3d8bdf3cb6
Remove safe import from ElasticsearchDocumentStore ( #2522 )
...
* Update version to 1.4.1rc0
* Elasticsearch is not an optional dependency
* Fix import path
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-09 18:07:42 +02:00
Gabriel Altay
988568882a
fix small typo in Document doc string ( #2520 )
...
* fix small typo in Document doc string
Was going through the tutorial, then digging through the code and just noticed a small typo
* generate markdown file changes from docstrings
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2022-05-09 18:04:33 +02:00
Branden Chan
caf1336424
Adjust pydoc markdown config so methods shown with classes ( #2511 )
...
* add_member_class_prefix: true
* Update Documentation & Code Style
* Trigger redeploy
* Trigger redeploy
* Fix pydoc param
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-06 16:00:08 +02:00
Sara Zan
1ed407cb5a
Update version to 1.4.1rc0 ( #2509 )
...
* Update version to 1.4.1rc0
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-06 11:46:31 +02:00
Julian Risch
081b886aa1
Release v1.4.0 ( #2502 )
...
* delete unneeded files of last release
* add v1.4.0 docs with updated links
* upgrade version number
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-05 12:24:45 +02:00
MichelBartels
c7e39e5225
Replace TableTextRetriever with EmbeddingRetriever in Tutorial 15 ( #2479 )
...
* replace TableTextRetriever with EmbeddingRetriever in Tutorial 15
* Update Documentation & Code Style
* fix bug
* Update Documentation & Code Style
* update tutorial 15 outputs
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-20-212.eu-west-1.compute.internal>
2022-05-05 10:12:44 +02:00
MichelBartels
5d98810a17
Raise error if torch-scatter is not installed or wrong version is installed ( #2486 )
...
* automatically download correct torch-scatter version
* raise error if torch-scatter is not installed
* Update Documentation & Code Style
* catch all import errors and fix linter
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-05 10:12:10 +02:00
Julian Risch
1418f0c603
change milvus links from 2.0.0 to 2.0.x ( #2496 )
...
* change milvus links from 2.0.0 to 2.0.x
* Update Documentation & Code Style
* fix two broken links
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-04 18:30:50 +02:00
Sara Zan
f8e02310bf
Validate YAML files without loading the nodes ( #2438 )
...
* Remove BasePipeline and make a module for RayPipeline
* Can load pipelines from yaml, plenty of issues left
* Extract graph validation logic into _add_node_to_pipeline_graph & refactor load_from_config and add_node to use it
* Fix pipeline tests
* Move some tests out of test_pipeline.py and create MockDenseRetriever
* mypy and pylint (silencing too-many-public-methods)
* Fix issue found in some yaml files and in schema files
* Fix paths to YAML and fix some typos in Ray
* Fix eval tests
* Simplify MockDenseRetriever
* Fix Ray test
* Accidentally pushed merge conflict, fixed
* Typo in schemas
* Typo in _json_schema.py
* Slightly reduce noisiness of version validation warnings
* Fix version logs tests
* Fix version logs tests again
* remove seemingly unused file
* Add check and test to avoid adding the same node to the pipeline twice
* Update Documentation & Code Style
* Revert config to pipeline_config
* Remove unused import
* Complete reverting to pipeline_config
* Some more stray config=
* Update Documentation & Code Style
* Feedback
* Move back other_nodes tests into pipeline tests temporarily
* Update Documentation & Code Style
* Fixing tests
* Update Documentation & Code Style
* Fixing ray and standard pipeline tests
* Rename colliding load() methods in dense retrievers and faiss
* Update Documentation & Code Style
* Fix mypy on ray.py as well
* Add check for no root node
* Fix tests to use load_from_directory and load_index
* Try to workaround the disabled add_node of RayPipeline
* Update Documentation & Code Style
* Fix Ray test
* Fix FAISS tests
* Relax class check in _add_node_to_pipeline_graph
* Update Documentation & Code Style
* Try to fix mypy in ray.py
* unused import
* Try another fix for Ray
* Fix connector tests
* Update Documentation & Code Style
* Fix ray
* Update Documentation & Code Style
* use BaseComponent.load() in pipelines/base.py
* another round of feedback
* stray BaseComponent.load()
* Update Documentation & Code Style
* Fix FAISS tests too
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: tstadel <60758086+tstadel@users.noreply.github.com>
2022-05-04 17:39:06 +02:00
Sara Zan
01ea4bf21f
Change default encoding for PDFToTextConverter from Latin 1 to UTF-8 ( #2420 )
...
* Change default encoding for PDFToTextConverter
* Update Documentation & Code Style
* Improve docstring
* Update Documentation & Code Style
* Add list of ligatures to ignore and add the possibility to modify such list at need
* Add docstring
* Add tests
* Rename parameter
* Update Documentation & Code Style
* Move implementation into the base converter to make mypy happier
* Update Documentation & Code Style
* mypy and pylint
* mypy
* move encoding parameter to init of PDFToTextConverter
* Update Documentation & Code Style
* make utf8 default and fix mypy
* Update Documentation & Code Style
* Update Documentation & Code Style
* remove note on encoding in tutorial8
* Update Documentation & Code Style
* skip OCRConverter and test converter.run
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2022-05-04 17:01:45 +02:00
bogdankostic
a4e603ce87
Deprecate Milvus1DocumentStore ( #2495 )
...
* Add warning message
* Update doc string
* Update Documentation & Code Style
* Change DeprecationWarning to FutureWarning
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-04 15:09:57 +02:00
Julian Risch
970c476615
Align TransformersReader defaults with FARMReader ( #2490 )
...
* Align TransformersReader defaults with FARMReader
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-04 10:04:18 +02:00
Tuana Celik
b6e369d1ca
changing the name of the retrievers from es_retriever to retriever ( #2487 )
...
* changing the name of the retrievers from es_retriever to retriever
* Update Documentation & Code Style
* name fix 2
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-03 18:08:23 +02:00
tstadel
7d6b3fe954
Add flag to disable scaling scores to probabilities ( #2454 )
...
* add scale_scores_to_probabilities flag
* Update Documentation & Code Style
* fix tests
* fix sql mypy
* Update Documentation & Code Style
* fix responses
* Update Documentation & Code Style
* rename to scale_score_to_probability + docstrings
* use BaseDocumentStore.score_to_probability in elasticsearch and milvus2
* Update Documentation & Code Style
* fix tests
* Update Documentation & Code Style
* add tests
* improve naming
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-02 13:35:07 +02:00
Tuana Celik
e2b85e2913
Renaming the ElasticsearchFilterOnlyRetriever to FilterRetriever ( #2461 )
...
* Renaming the ElasticsearchFilterOnlyRetriever to FilterRetriever
* adding missed init file
* Update Documentation & Code Style
* fixed docstring
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-04-29 10:16:02 +02:00