

Haystack — Docstrings Generation


Setup Pydoc-Markdown

Pydoc-Markdown is a tool and library for creating Python API documentation in Markdown format. It is based on lib2to3, which lets it parse your Python code without executing it.

Pydoc-Markdown can be installed from PyPI (see its Get Started guide):

$ pipx install 'pydoc-markdown>=3.0.0,<4.0.0'
$ pydoc-markdown --version

Configuration

Pydoc-Markdown reads its configuration from a .yml file located in the current working directory. Our files contain three main sections:

  • loader: A list of plugins that load API objects from Python source files.
    • type: Loader for Python source files
    • search_path: Location of source files
    • ignore_when_discovered: Define which files should be ignored
  • processor: A list of plugins that process API objects to modify their docstrings (e.g. to adapt them from a documentation format to Markdown or to remove items that should not be rendered into the documentation).
    • ignore_when_discovered: Define which API objects should be ignored
    • documented_only: Only include API objects that have a docstring
    • do_not_filter_modules: Do not filter module objects
    • skip_empty_modules: Skip modules without content
  • renderer: A plugin that produces the output files.
    • type: Define the renderer you want to use. We use the Markdown renderer because it can be configured in great detail.
    • descriptive_class_title: Remove the word "Object" from class titles.
    • filename: Name of the generated file
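
As a minimal sketch of how these three sections fit together, the snippet below writes a hypothetical configuration file. The file name pydoc-markdown-example.yml, the search path ../src, the output name example.md and all option values are placeholders chosen for illustration, not the values from our own files. The top-level keys are spelled loaders and processors, which is how we understand the pydoc-markdown 3.x schema; check the existing .yml files before copying anything.

```shell
# Write an illustrative config file; every value below is an assumption.
cat > pydoc-markdown-example.yml <<'EOF'
loaders:
  - type: python
    search_path: [../src]               # placeholder: location of the source files
    ignore_when_discovered: ['__init__']
processors:
  - type: filter
    documented_only: true               # keep only objects that have a docstring
    do_not_filter_modules: false
    skip_empty_modules: true
renderer:
  type: markdown
  descriptive_class_title: false
  filename: example.md                  # placeholder: name of the generated file
EOF
echo "wrote pydoc-markdown-example.yml"
```

The quoted heredoc delimiter ('EOF') keeps the shell from expanding anything inside the YAML, so the file is written exactly as shown.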

Generate Docstrings

Each .yml file generates one Markdown file. Run the matching command below to generate the output you need:

  • Document store: pydoc-markdown pydoc-markdown-document-store.yml
  • File converters: pydoc-markdown pydoc-markdown-file-converters.yml
  • Preprocessor: pydoc-markdown pydoc-markdown-preprocessor.yml
  • Reader: pydoc-markdown pydoc-markdown-reader.yml
  • Retriever: pydoc-markdown pydoc-markdown-retriever.yml
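
To regenerate every file in one go, the commands above could be wrapped in a small loop. This is only a sketch: generate_all is a hypothetical helper, not part of Pydoc-Markdown, and it assumes that every configuration file in the current directory follows the pydoc-markdown-*.yml naming shown in the list. Passing echo as the command gives a dry run that only prints the file names.

```shell
# generate_all [COMMAND]
# Run COMMAND (default: pydoc-markdown) once per config file in the current directory.
generate_all() {
    cmd="${1:-pydoc-markdown}"
    count=0
    for config in pydoc-markdown-*.yml; do
        # An unmatched glob stays literal in POSIX sh, so skip names that do not exist.
        [ -e "$config" ] || continue
        "$cmd" "$config" || return 1
        count=$((count + 1))
    done
    echo "processed $count config file(s)"
}

# Dry run: print each config file name instead of invoking pydoc-markdown.
generate_all echo

# Real run (requires pydoc-markdown on PATH):
# generate_all
```

The helper stops at the first failing file, so a broken configuration is noticed immediately rather than being buried in later output.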