Markus Paff 2531c8e061
Add versioning docs (#495)
Co-authored-by: brandenchan <brandenchan@icloud.com>
Co-authored-by: deepset <deepset@Crenolape.localdomain>
Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
2020-10-19 11:46:51 +02:00


*******************************************************
# Haystack — Docstrings Generation
*******************************************************
Setup Pydoc-Markdown
============
Pydoc-Markdown is a tool and library for creating Python API documentation in Markdown format. It is based on lib2to3, which lets it parse your Python code without executing it ([link](https://pydoc-markdown.readthedocs.io/en/latest/)).
Pydoc-Markdown can be installed from PyPI ([Get Started](https://pydoc-markdown.readthedocs.io/en/latest/docs/getting-started/)):
```
$ pipx install 'pydoc-markdown>=3.0.0,<4.0.0'
$ pydoc-markdown --version
```
Configuration
============
Pydoc-Markdown reads its configuration from a `.yml` file located in the current working directory. Our files contain three main sections:
- **loader**: A list of plugins that load API objects from Python source files.
    - **type**: Loader for Python source files
    - **search_path**: Location of the source files
    - **ignore_when_discovered**: Defines which files should be ignored
- **processor**: A list of plugins that process API objects to modify their docstrings (e.g. to adapt them from a documentation format to Markdown or to remove items that should not be rendered into the documentation).
    - **ignore_when_discovered**: Defines which API objects should be ignored
    - **documented_only**: Include only documented API objects
    - **do_not_filter_modules**: Do not filter module objects
    - **skip_empty_modules**: Skip modules without content
- **renderer**: A plugin that produces the output files.
    - **type**: Defines the renderer to use. We use the Markdown renderer because it can be configured in great detail.
    - **descriptive_class_title**: Removes the word "Object" from class titles.
    - **filename**: Name of the generated file
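
As a rough illustration of how the three sections fit together, the sketch below writes a hypothetical `pydoc-markdown-example.yml` with a shell heredoc. The file name, the `search_path` value, and the output `filename` are placeholders. The plural keys `loaders` and `processors` and the `type: filter` entry follow the Pydoc-Markdown 3.x schema as we understand it. Treat them as assumptions and compare them with the existing `.yml` files in this directory before reusing them.

```shell
# Sketch only: writes a hypothetical example config into the current directory.
# Paths, file names and option values are placeholders, not the project's settings.
cat > pydoc-markdown-example.yml <<'EOF'
loaders:
  - type: python                          # loader for Python source files
    search_path: [../path/to/package]     # placeholder location of the sources
    ignore_when_discovered: ['__init__']  # files to skip
processors:
  - type: filter                          # assumed processor type for the options below
    documented_only: true                 # keep only documented API objects
    do_not_filter_modules: false
    skip_empty_modules: true              # drop modules without content
renderer:
  type: markdown
  descriptive_class_title: false          # no "Object" suffix in class titles
  filename: example.md                    # placeholder name of the generated file
EOF
```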
Generate Docstrings
============
Each `.yml` file generates one Markdown file. Run the matching command below for every component whose documentation you need:
- **Document store**: `pydoc-markdown pydoc-markdown-document-store.yml`
- **File converters**: `pydoc-markdown pydoc-markdown-file-converters.yml`
- **Preprocessor**: `pydoc-markdown pydoc-markdown-preprocessor.yml`
- **Reader**: `pydoc-markdown pydoc-markdown-reader.yml`
- **Retriever**: `pydoc-markdown pydoc-markdown-retriever.yml`
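
To regenerate all files in one pass, the commands above can be wrapped in a small loop. This is a minimal sketch, assuming it is run from the directory that holds the `.yml` files. The `generate_docs` function and the `PYDOC_CMD` variable are hypothetical helpers introduced here, not part of Pydoc-Markdown. Setting `PYDOC_CMD=echo` gives a dry run that only prints the config file names.

```shell
#!/usr/bin/env bash
# Sketch: run Pydoc-Markdown once per component config listed above.
# PYDOC_CMD is a hypothetical override used for dry runs (e.g. PYDOC_CMD=echo).
PYDOC_CMD="${PYDOC_CMD:-pydoc-markdown}"

generate_docs() {
    for component in document-store file-converters preprocessor reader retriever; do
        config="pydoc-markdown-${component}.yml"
        # Skip configs that are missing instead of failing on them.
        if [ ! -f "$config" ]; then
            echo "skipping ${config}: not found" >&2
            continue
        fi
        # Stop at the first config that fails to render.
        "$PYDOC_CMD" "$config" || return 1
    done
}

generate_docs
```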