* Testing black on ui/
* Applying black on docstores
* Add latest docstring and tutorial changes
* Create a single GH action for Black and docs to reduce commit noise to the minimum, slightly refactor the OpenAPI action too
* Remove comments
* Relax constraints on pydoc-markdown
* Temporarily split Black from the docs. pydoc-markdown was obsolete and needs a separate PR to upgrade
* Fix a couple of bugs
* Add a type: ignore that was missing somehow
* Give path to black
* Apply Black
* Apply Black
* Relocate a couple of type: ignore
* Update documentation
* Make Linux CI run after applying Black
* Triggering Black
* Apply Black
* Remove dependency, does not work well
* Manually remove doubled trailing commas
* Update documentation
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Files moved, imports all broken
* Fix most imports and docstrings into
* Fix the paths to the modules in the API docs
* Add latest docstring and tutorial changes
* Add a few pipelines that were lost in the imports
* Fix a bunch of mypy warnings
* Add latest docstring and tutorial changes
* Create a file_classifier module
* Add docs for file_classifier
* Fixed most circular imports, now the REST API can start
* Add latest docstring and tutorial changes
* Tackling more mypy issues
* Reintroduce from FARM and fix last mypy issues hopefully
* Re-enable old-style imports
* Fix some more imports from the top-level package in an attempt to sort out circular imports
* Fix some imports in tests to new-style to prevent failed class equalities from breaking tests
* Change document_store into document_stores
* Update imports in tutorials
* Add latest docstring and tutorial changes
* Probably fixes summarizer tests
* Improve the old-style import allowing module imports (should work)
* Try to fix the docs
* Remove dedicated KnowledgeGraph page from autodocs
* Remove dedicated GraphRetriever page from autodocs
* Fix generate_docstrings.sh with an updated list of yaml files to look for
* Fix some more modules in the docs
* Fix the document stores docs too
* Fix a small issue on Tutorial14
* Add latest docstring and tutorial changes
* Add deprecation warning to old-style imports
* Remove stray folder and import Dict into dense.py
* Change import path for MLFlowLogger
* Add old loggers path to the import path aliases
* Fix debug output of convert_ipynb.py
* Fix circular import on BaseRetriever
* Missed one merge block
* re-run tutorial 5
* Fix imports in tutorial 5
* Re-enable squad_to_dpr CLI from the root package and move get_batches_from_generator into document_stores.base
* Add latest docstring and tutorial changes
* Fix typo in utils __init__
* Fix a few more imports
* Fix benchmarks too
* New-style imports in test_knowledge_graph
* Rollback setup.py
* Rollback squad_to_dpr too
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Added support for Multi-GPU inference to DPR including benchmark
* fixed multi gpu
* added batch size to benchmark to better reflect multi gpu capabilities
* remove unnecessary entry in config.json
* fixed typos
* fixed config name
* update benchmark to use DEVICES constant
* changed multi gpu parameters and updated docstring
* adds silent fallback on cpu
* update doc string, warning and config
Co-authored-by: Michel Bartels <kontakt@michelbartels.com>
Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
* initial test cml
* Update cml.yaml
* WIP test workflow
* switch to general ubuntu ami
* switch to general ubuntu ami
* disable gpu for tests
* rm gpu infos
* rm gpu infos
* update token env
* switch github token
* add postgres
* test db connection
* fix typo
* remove tty
* add sleep for db
* debug runner
* debug removal postgres
* debug: reset to working commit
* debug: change github token
* switch to new bot token
* debug token
* add back postgres
* adjust network runner docker
* add elastic
* fix typo
* adjust working dir
* fix benchmark execution
* enable s3 downloads
* add query benchmark. fix path
* add saving of markdown files
* cat md files. add faiss+dpr. increase n_queries
* switch to GPU instance
* switch availability zone
* switch to public aws DL ami
* increase volume size
* rm faiss. fix error logging
* save markdown files
* add reader benchmarks
* add download of squad data
* correct reader metric normalization
* fix newlines between reports
* fix max_docs for reader eval data. remove max_docs from ci run config
* fix mypy. switch workflow trigger
* try trigger for label
* try trigger for label
* change trigger syntax
* debug machine shutdown with test workflow
* add es and postgres to test workflow
* Revert "add es and postgres to test workflow"
This reverts commit 6f038d3d7f12eea924b54529e61b192858eaa9d5.
* Revert "debug machine shutdown with test workflow"
This reverts commit db70eabae8850b88e1d61fd79b04d4f49d54990a.
* fix typo in action. set benchmark config back to original
* add time and perf benchmark for es
* Add retriever benchmarking
* Add Reader benchmarking
* add nq to squad conversion
* add conversion stats
* clean benchmarks
* Add link to dataset
* Update imports
* add first support for neg psgs
* Refactor test
* set max_seq_len
* cleanup benchmark
* begin retriever speed benchmarking
* Add support for retriever query index benchmarking
* improve reader eval, retriever speed benchmarking
* improve retriever speed benchmarking
* Add retriever accuracy benchmark
* Add neg doc shuffling
* Add top_n
* 3x speedup of SQL. add postgres docker run. make shuffle neg a param. add more logging
* Add models to sweep
* add option for faiss index type
* remove unneeded line
* change faiss to faiss_flat
* begin automatic benchmark script
* remove existing postgres docker for benchmarking
* Add data processing scripts
* Remove shuffle in script because data is already shuffled
* switch hnsw setup from 256 to 128
* change es similarity to dot product by default
* Error includes stack trace
* Change ES default timeout
* remove delete_docs() from timing for indexing
* Add support for website export
* update website on push to benchmarks
* add complete benchmarks results
* new json format
* removed NaN as it is not a valid JSON token
* versioning for docs
* unsaved changes
* cleaning
* cleaning
* Edit format of benchmarks data
* update also jsons in v0.4.0
Co-authored-by: brandenchan <brandenchan@icloud.com>
Co-authored-by: deepset <deepset@Crenolape.localdomain>
Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
* fix benchmarking for faiss hnsw queries. do sql calls in update_embeddings() as batches
* update benchmarks for hnsw 128,20,80
* don't delete full index in delete_all_documents()
* update texts for charts
* update recall column for retriever
* change scale and add units to desc
* add units to legend
* add axis titles. update desc
* add html tags
Co-authored-by: deepset <deepset@Crenolape.localdomain>
Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
Co-authored-by: PiffPaffM <markuspaff.mp@gmail.com>