Vladimir Blagojevic
b13c32eb9c
Add GPL API docs, unit tests update ( #2634 )
...
* Update test_label_generator.py
* GPL increase default batch size to 16
* GPL - API docs
* GPL - split unit tests
* Make devs aware of multilingual GPL
* Create separate train/save test
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-10 05:25:28 -04:00
Stefano Fiorucci
c178f60e3a
Make crawler extract also hidden text ( #2642 )
...
* make crawler extract also hidden text
* Update Documentation & Code Style
* try to adapt test for extract_hidden_text
* Update Documentation & Code Style
* fix test bug
* fix bug in test
* added test for hidden text
* Update Documentation & Code Style
* fix bug in test
* Update Documentation & Code Style
* fix test
* Update Documentation & Code Style
* fix other test bug
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-10 09:51:41 +02:00
Massimiliano Pippi
374155fd5c
Move Opensearch document store in its own module ( #2603 )
...
* move OpenSearchDocumentStore into its own Python module
* Update Documentation & Code Style
* mark test with (sigh) elasticsearch
* skip opensearch tests on windows
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-08 16:37:23 +02:00
Ryan Russell
c1b7948e10
Improve Docs Readability ( #2617 )
...
Signed-off-by: Ryan Russell <git@ryanrussell.org>
2022-06-03 09:57:40 +02:00
Julian Risch
3c6fcc3e42
Bump version to next release candidate ( #2627 )
...
* bump version to next release candidate
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-02 18:58:44 +02:00
Julian Risch
4ca331c0a7
Bump version to v1.5.0 and copy docs folder ( #2625 )
...
* bump version to v1.5.0 and copy docs folder
* Update Documentation & Code Style
* update links to v1.5.0
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-02 17:20:42 +02:00
Vladimir Blagojevic
e10a3fba74
Add Generative Pseudo Labeling ( #2388 )
2022-06-02 10:12:47 -04:00
bogdankostic
61d9429c25
Simplify loading of `EmbeddingRetriever` ( #2619 )
...
* Infer model format for EmbeddingRetriever automatically
* Update Documentation & Code Style
* Adapt conftest to automatic inference of model_format
* Update Documentation & Code Style
* Fix tests
* Update Documentation & Code Style
* Fix tests
* Adapt tutorials
* Update Documentation & Code Style
* Add test for similarity scores with sentence transformers
* Adapt doc string and warning message
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-02 15:05:29 +02:00
bogdankostic
0395533a78
Add `run_batch` for standard pipelines ( #2595 )
...
* Add run_batch for standard pipelines
* Update Documentation & Code Style
* Fix mypy
* Remove code duplication
* Fix linter
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-27 10:42:48 +02:00
tstadel
7caca41c5d
Support context matching in `pipeline.eval()` ( #2482 )
...
* calculate context pred metrics
* Update Documentation & Code Style
* extend doc_relevance_col values
* fix import order
* Update Documentation & Code Style
* fix mypy
* fix typings literal import
* add option for custom document_id_field
* Update Documentation & Code Style
* fix tests and dataframe col-order
* Update Documentation & Code Style
* rename content to context in eval dataframe
* add backward compatibility to EvaluationResult.load()
* Update Documentation & Code Style
* add docstrings
* Update Documentation & Code Style
* support sas
* Update Documentation & Code Style
* add answer_scope param
* Update Documentation & Code Style
* rework doc_relevance_col and keep document_id col in case of custom_document_id_field
* Update Documentation & Code Style
* improve docstrings
* Update Documentation & Code Style
* rename document_relevance_criterion into document_scope
* Update Documentation & Code Style
* add document_scope and answer_scope to print_eval_report
* support all new features in execute_eval_run()
* fix imports
* fix mypy
* Update Documentation & Code Style
* rename pred_label_sas_grid into pred_label_matrix
* update dataframe schema and sorting
* Update Documentation & Code Style
* pass through context_matching params and extend document_scope test
* Update Documentation & Code Style
* add answer_scope tests
* fix context_matching_threshold for document metrics
* shorten dataframe apply calls
* Update Documentation & Code Style
* fix queries getting lost if nothing was retrieved
* Update Documentation & Code Style
* Update Documentation & Code Style
* use document_id scopes
* Update Documentation & Code Style
* fix answer_scope literal
* Update Documentation & Code Style
* update the docs (lg changes)
* Update Documentation & Code Style
* update tutorial 5
* Update Documentation & Code Style
* fix tests
* Add minor lg updates
* final docstring changes
* fix single quotes in docstrings
* Update Documentation & Code Style
* dataframe scopes added for each column
* better docstrings for context_matching params
* Update Documentation & Code Style
* fix summarizer eval test
* Update Documentation & Code Style
* fix test
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: agnieszka-m <amarzec13@gmail.com>
2022-05-24 18:11:52 +02:00
bogdankostic
867695ad0c
Change signature of queries param in batch methods ( #2575 )
...
* Change signature of queries param in batch methods
* Update Documentation & Code Style
* Fix mypy
* Remove unused import
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-24 12:33:45 +02:00
Julian Risch
075ed7fbcb
Remove encoding option from PDFToTextOCRConverter ( #2553 )
...
* remove encoding option from PDFToTextOCRConverter
* Update Documentation & Code Style
* add unused 'encoding' param to PDFToTextOCRConverter
* Update Documentation & Code Style
* call run instead of convert to use ligature replacing
* Update Documentation & Code Style
* add text to check installed poppler version
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-24 11:31:32 +02:00
dimitrisna
5bda63a6c0
Add training checkpoint in retriever trainer ( #2543 )
...
* Update dense.py
* Update dense.py
* Update dense.py
* Update dense.py
* Update dense.py
* Update dense.py
* Update dense.py
* Update dense.py
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-24 09:51:26 +02:00
Agnieszka Marzec
ebd54b225b
Update Ray pipeline docs with validation info ( #2590 )
...
* Update Ray pipeline docs
* Add Sara's suggestion
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-24 09:29:52 +02:00
tstadel
0e83535108
Show search endpoint after deepset Cloud deployment ( #2569 )
...
* show try-out-message after deployment
* better messages
* Update Documentation & Code Style
* tests added
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-23 14:19:31 +02:00
tstadel
f6e3a63906
Prevent losing names of utilized components when loaded from config ( #2525 )
...
* Prevent losing names of utilized components when loaded from config
* Update Documentation & Code Style
* update test
* fix failing tests
* Update Documentation & Code Style
* fix even more tests
* Update Documentation & Code Style
* incorporate review feedback
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-18 14:17:54 +02:00
tstadel
110b9c2b0a
Warnings for write operations of `DeepsetCloudDocumentStore` ( #2565 )
...
* log inputs to write operations
* Update Documentation & Code Style
* adjust tests
* simplify by using decorator for write operation functions
* Update Documentation & Code Style
* fix comma
* fix comma in test
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-17 17:53:55 +02:00
MichelBartels
a952ba240f
Include meta data when computing embeddings in EmbeddingRetriever ( #2559 )
...
* include meta data when calculating embeddings in EmbeddingRetriever
* Update Documentation & Code Style
* fix None meta field
* remove default values
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-17 12:37:04 +02:00
MichelBartels
686e9d24ef
Documenting output score of JoinDocuments when using concatenation ( #2561 )
...
* add documentation regarding the score of JoinDocuments when using concatenation
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-16 18:30:07 +02:00
Agnieszka Marzec
2d03a26045
Minor lg changes ( #2533 )
...
* Minor lg change
* Update Documentation & Code Style
* Fix missing articles
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-13 16:12:22 +02:00
Agnieszka Marzec
1ae5a1449b
Update run() and run_batch() params descriptions in API ( #2542 )
...
* Update run() and run_batch() params descriptions
* Update Documentation & Code Style
* Update api params descriptions
* Update Documentation & Code Style
* Fix typo
Co-authored-by: bogdankostic <bogdankostic@web.de>
* Add Bogdan's suggestions
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: bogdankostic <bogdankostic@web.de>
2022-05-13 15:11:01 +02:00
bogdankostic
738e008020
Add `run_batch` method to all nodes and `Pipeline` to allow batch querying ( #2481 )
...
* Add run_batch methods for batch querying
* Update Documentation & Code Style
* Fix mypy
* Update Documentation & Code Style
* Fix mypy
* Fix linter
* Fix tests
* Update Documentation & Code Style
* Fix tests
* Update Documentation & Code Style
* Fix mypy
* Fix rest api test
* Update Documentation & Code Style
* Add Doc strings
* Update Documentation & Code Style
* Add batch_size as attribute to nodes supporting batching
* Adapt error messages
* Adapt type of filters in retrievers
* Revert change about truncation_warning in summarizer
* Unify multiple_doc_lists tests
* Use smaller models in extractor tests
* Add return types to JoinAnswers and RouteDocuments
* Adapt return statements in reader's run_batch method
* Allow list of filters
* Adapt error messages
* Update Documentation & Code Style
* Fix tests
* Fix mypy
* Adapt print_questions
* Remove disabling warning about too many public methods
* Add flag for pylint to disable warning about too many public methods in pipelines/base.py and document_stores/base.py
* Add type check
* Update Documentation & Code Style
* Adapt tutorial 11
* Update Documentation & Code Style
* Add query_batch method for DCDocStore
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-11 11:11:00 +02:00
bogdankostic
4581b91e83
Make `DeepsetCloudDocumentStore` work with non-existing index ( #2513 )
...
* Make DeepsetCloudDocumentStore work with non-existing index
* Update Documentation & Code Style
* Add tests
* Update Documentation & Code Style
* Fix tests, adapt warning messages + lowercase deepset
* Update Documentation & Code Style
* Fix typo in test
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-10 15:21:35 +02:00
Branden Chan
43bfea6f3d
Add sort arg to JoinAnswers ( #2436 )
...
* Add sort arg to JoinAnswers
* Update Documentation & Code Style
* Change naming and docstring
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-10 11:47:00 +02:00
Sara Zan
3d8bdf3cb6
Remove safe import from `ElasticsearchDocumentStore` ( #2522 )
...
* Update version to 1.4.1rc0
* Elasticsearch is not an optional dependency
* Fix import path
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-09 18:07:42 +02:00
Gabriel Altay
988568882a
fix small typo in Document doc string ( #2520 )
...
* fix small typo in Document doc string
Was going through the tutorial, then digging through the code and just noticed a small typo
* generate markdown file changes from docstrings
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2022-05-09 18:04:33 +02:00
Branden Chan
caf1336424
Adjust pydoc markdown config so methods shown with classes ( #2511 )
...
* add_member_class_prefix: true
* Update Documentation & Code Style
* Trigger redeploy
* Trigger redeploy
* Fix pydoc param
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-06 16:00:08 +02:00
Sara Zan
1ed407cb5a
Update version to 1.4.1rc0 ( #2509 )
...
* Update version to 1.4.1rc0
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-06 11:46:31 +02:00
Julian Risch
081b886aa1
Release v1.4.0 ( #2502 )
...
* delete unneeded files of last release
* add v1.4.0 docs with updated links
* upgrade version number
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-05 12:24:45 +02:00
Julian Risch
1418f0c603
change milvus links from 2.0.0 to 2.0.x ( #2496 )
...
* change milvus links from 2.0.0 to 2.0.x
* Update Documentation & Code Style
* fix two broken links
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-04 18:30:50 +02:00
Sara Zan
f8e02310bf
Validate YAML files without loading the nodes ( #2438 )
...
* Remove BasePipeline and make a module for RayPipeline
* Can load pipelines from yaml, plenty of issues left
* Extract graph validation logic into _add_node_to_pipeline_graph & refactor load_from_config and add_node to use it
* Fix pipeline tests
* Move some tests out of test_pipeline.py and create MockDenseRetriever
* mypy and pylint (silencing too-many-public-methods)
* Fix issue found in some yaml files and in schema files
* Fix paths to YAML and fix some typos in Ray
* Fix eval tests
* Simplify MockDenseRetriever
* Fix Ray test
* Accidentally pushed merge conflict, fixed
* Typo in schemas
* Typo in _json_schema.py
* Slightly reduce noisiness of version validation warnings
* Fix version logs tests
* Fix version logs tests again
* remove seemingly unused file
* Add check and test to avoid adding the same node to the pipeline twice
* Update Documentation & Code Style
* Revert config to pipeline_config
* Remove unused import
* Complete reverting to pipeline_config
* Some more stray config=
* Update Documentation & Code Style
* Feedback
* Move back other_nodes tests into pipeline tests temporarily
* Update Documentation & Code Style
* Fixing tests
* Update Documentation & Code Style
* Fixing ray and standard pipeline tests
* Rename colliding load() methods in dense retrievers and faiss
* Update Documentation & Code Style
* Fix mypy on ray.py as well
* Add check for no root node
* Fix tests to use load_from_directory and load_index
* Try to workaround the disabled add_node of RayPipeline
* Update Documentation & Code Style
* Fix Ray test
* Fix FAISS tests
* Relax class check in _add_node_to_pipeline_graph
* Update Documentation & Code Style
* Try to fix mypy in ray.py
* unused import
* Try another fix for Ray
* Fix connector tests
* Update Documentation & Code Style
* Fix ray
* Update Documentation & Code Style
* use BaseComponent.load() in pipelines/base.py
* another round of feedback
* stray BaseComponent.load()
* Update Documentation & Code Style
* Fix FAISS tests too
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: tstadel <60758086+tstadel@users.noreply.github.com>
2022-05-04 17:39:06 +02:00
Sara Zan
01ea4bf21f
Change default encoding for `PDFToTextConverter` from `Latin 1` to `UTF-8` ( #2420 )
...
* Change default encoding for PDFToTextConverter
* Update Documentation & Code Style
* Improve docstring
* Update Documentation & Code Style
* Add list of ligatures to ignore and add the possibility to modify such list at need
* Add docstring
* Add tests
* Rename parameter
* Update Documentation & Code Style
* Move implementation into the base converter to make mypy happier
* Update Documentation & Code Style
* mypy and pylint
* mypy
* move encoding parameter to init of PDFToTextConverter
* Update Documentation & Code Style
* make utf8 default and fix mypy
* Update Documentation & Code Style
* Update Documentation & Code Style
* remove note on encoding in tutorial8
* Update Documentation & Code Style
* skip OCRConverter and test converter.run
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2022-05-04 17:01:45 +02:00
bogdankostic
a4e603ce87
Deprecate `Milvus1DocumentStore` ( #2495 )
...
* Add warning message
* Update doc string
* Update Documentation & Code Style
* Change DeprecationWarning to FutureWarning
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-04 15:09:57 +02:00
Julian Risch
970c476615
Align TransformersReader defaults with FARMReader ( #2490 )
...
* Align TransformersReader defaults with FARMReader
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-04 10:04:18 +02:00
tstadel
7d6b3fe954
Add flag to disable scaling scores to probabilities ( #2454 )
...
* add scale_scores_to_probabilities flag
* Update Documentation & Code Style
* fix tests
* fix sql mypy
* Update Documentation & Code Style
* fix responses
* Update Documentation & Code Style
* rename to scale_score_to_probability + docstrings
* use BaseDocumentStore.score_to_probability in elasticsearch and milvus2
* Update Documentation & Code Style
* fix tests
* Update Documentation & Code Style
* add tests
* improve naming
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-02 13:35:07 +02:00
Tuana Celik
e2b85e2913
Renaming the ElasticsearchFilterOnlyRetriever to FilterRetriever ( #2461 )
...
* Renaming the ElasticsearchFilterOnlyRetriever to FilterRetriever
* adding missed init file
* Update Documentation & Code Style
* fixed docstring
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-04-29 10:16:02 +02:00
Malte Pietsch
766e75370c
Update docs of DeepsetCloudDocumentStore ( #2460 )
...
* Update docs of DeepsetCloudDocumentStore
* Update Documentation & Code Style
* Update docstring
Co-authored-by: tstadel <60758086+tstadel@users.noreply.github.com>
* Update Documentation & Code Style
* move DEFAULT_API_ENDPOINT
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: tstadel <60758086+tstadel@users.noreply.github.com>
2022-04-27 19:40:39 +02:00
tstadel
7498c7c6fb
Fix and use delete_index instead of delete_documents in tests ( #2453 )
...
* use delete_index instead of delete_documents in tests
* fix delete_index
* fix delete_index() in memory and milvus
* fix imports
* fix memory keyerrors
* Update Documentation & Code Style
* increase timeout for pinecone tests to 60 minutes
* clean get_document_store()
* use recreate_index in tests
* Update Documentation & Code Style
* fix tests
* fix remaining tests
* log index deleted
* fix test_eval_pipeline
* simplify existing index detection in weaviate
* delete label_index on recreate_index for pinecone and milvus
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-04-26 19:06:30 +02:00
Tuana Celik
d49e92e21c
ElasticsearchRetriever to BM25Retriever ( #2423 )
...
* change class names to bm25
* Update Documentation & Code Style
* Update Documentation & Code Style
* Update Documentation & Code Style
* Add back all_terms_must_match
* fix syntax
* Update Documentation & Code Style
* Update Documentation & Code Style
* Creating a wrapper for old ES retriever with deprecated wrapper
* Update Documentation & Code Style
* New method for deprecating old ESRetriever
* New attempt for deprecating the ESRetriever
* Reverting to the simplest solution - warning logged
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sara Zan <sara.zanzottera@deepset.ai>
2022-04-26 16:09:39 +02:00
tstadel
60ff46e4e1
Log evaluation results to MLflow ( #2337 )
...
* track eval results in mlflow
* Update Documentation & Code Style
* add pipeline.yaml and environment info
* improve logging to mlflow
* Update Documentation & Code Style
* introduce ExperimentTracker
* Update Documentation & Code Style
* move modeling.utils.logger to utils.experiment_tracking
* renaming: tracker and TrackingHead
* Update Documentation & Code Style
* refactor env tracking
* fix pylint findings
* Update Documentation & Code Style
* rename MLFlowTrackingHead to MLflowTrackingHead
* implement dataset hash
* Update Documentation & Code Style
* set docstrings
* Update Documentation & Code Style
* introduce PipelineBundle and Corpus
* Update Documentation & Code Style
* support reusing index
* Update Documentation & Code Style
* rename Corpus to FileCorpus
* fix Corpus -> FileCorpus
* Update Documentation & Code Style
* resolve cyclic dependencies
* fix linter issues
* Update Documentation & Code Style
* remove helper classes
* Update Documentation & Code Style
* fix imports
* fix another unused import
* update docstrings
* Update Documentation & Code Style
* simplify usage of experiment tracking tools
* fix Literal import
* revert schema changes
* Update Documentation & Code Style
* always end run
* Update Documentation & Code Style
* fix mypy issue
* rename to execute_eval_run
* Update Documentation & Code Style
* fix merge of get_or_create_env_meta_data
* improve docstrings
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-04-25 20:14:48 +02:00
Adrien Wald
c401e86099
Use `ElasticsearchDocumentStore.get_all_documents` in `ElasticsearchFilterOnlyRetriever.retrieve` ( #2151 )
...
* use get_all_documents in ElasticsearchFilterOnlyRetriever.retrieve
* Update Documentation & Code Style
* add test case for es_filter_only retriever
* Update Documentation & Code Style
* fix test by adding empty string for query
* Update Documentation & Code Style
* add explicit name of argument "query"
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2022-04-25 09:53:48 +02:00
tstadel
25475a68c7
Match answer sorting in `QuestionAnsweringHead` with `FARMReader` ( #2414 )
...
* match no_answer confidence
* Update Documentation & Code Style
* test added
* Update Documentation & Code Style
* fix tests
* Update Documentation & Code Style
* apply penalties of scores to confidences too
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-04-21 11:24:39 +02:00
Sara Zan
07d7ecbff1
Make `python-magic` fully optional ( #2412 )
...
* Add windows specific package for python-magic
* Disable some tests on Windows and add explanatory warning in case of issues with libmagic
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-04-20 09:18:02 +02:00
tstadel
e862400256
Prevent Stackoverflow on Windows CI ( #2426 )
...
* prevent stackoverflow on windows ci
* Update Documentation & Code Style
* fix is_windows condition
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: ZanSara <sarazanzo94@gmail.com>
2022-04-19 16:10:39 +02:00
Sara Zan
4eec2dc45e
Change YAML version exception into a warning ( #2385 )
...
* Change exception into warning, add strict_version param, and remove compatibility between schemas
* Simplify update_json_schema
* Rename unstable into master
* Prevent validate_config from changing the config to validate
* Fix version validation and add tests
* Rename master into ignore
* Complete parameter rename
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-04-19 16:08:08 +02:00
Sara Zan
ba9c976bfe
Update `pdftotext` link ( #2432 )
...
* Update pdftotext link
* Update Documentation & Code Style
* Update Tutorial8_Preprocessing.ipynb
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-04-19 14:30:18 +02:00
Sara Zan
929c685cda
Forbid usage of `*args` and `**kwargs` in any node's `__init__` ( #2362 )
...
* Add failing test
* Remove `**kwargs` from docstores' `__init__` functions (#2407 )
* Remove kwargs from ESDocStore subclasses
* Remove kwargs from subclasses of SQLDocumentStore
* Remove kwargs from Weaviate
* Revert change in pinecone
* Fix tests
* Fix retriever test with weaviate
* Change Exception into DocumentStoreError
* Update Documentation & Code Style
* Remove `**kwargs` from `FARMReader` (#2413 )
* Remove FARMReader kwargs without trying to replace them functionally
* Update Documentation & Code Style
* enforce same index values before and after saving/loading eval dataframes (#2398 )
* Add tests for missing `__init__` and `super().__init__()` in custom nodes (#2350 )
* Add tests for missing init and super
* Update Documentation & Code Style
* change in with endswith
* Move test in pipeline.py and change test in pipeline_yaml.py
* Update Documentation & Code Style
* Use caplog to test the warning
* Update Documentation & Code Style
* move tests into test_pipeline and use get_config
* Update Documentation & Code Style
* Unmock version name
* Improve variadic args test
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-04-14 16:42:02 +02:00
Sara Zan
96a538b182
Pylint (import related warnings) and REST API improvements ( #2326 )
...
* remove duplicate imports
* fix ungrouped-imports
* Fix wrong-import-position
* Fix unused-import
* pyproject.toml
* Working on wrong-import-order
* Solve wrong-import-order
* fix Pool import
* Move open_search_index_to_document_store and elasticsearch_index_to_document_store in elasticsearch.py
* remove Converter from modeling
* Fix mypy issues on adaptive_model.py
* create es_converter.py
* remove converter import
* change import path in tests
* Restructure REST API to not rely on global vars from search.py and improve tests
* Fix openapi generator
* Move variable initialization
* Change type of FilterRequest.filters
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-04-12 16:41:05 +02:00
Sara Zan
4862bbcd73
Add `devices` alongside `use_gpu` in `FARMReader` ( #2294 )
...
* Make initialize_device_settings take a devices list, and change signature of FARMReader
* reintroduce use_gpu and propagate devices to other methods
* fix typing for initialize_device_settings
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-04-12 14:21:25 +02:00
tstadel
8342a6c1d6
Fix eval discrepancies ( #2381 )
...
* fix eval discrepancies
* Update Documentation & Code Style
* fix reader eval comparison
* Update Documentation & Code Style
* slightly improve messed up top_n_f1 func
* add no_answer hint to reader.eval metrics
* fix tut5
* Update Documentation & Code Style
* correct doc_relevance_col in tests
* Update Documentation & Code Style
* redefine recall metrics for no_answers
* fix bugs in EvalAnswers
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-04-12 09:24:22 +02:00