Haystack — Docstrings Generation
Setup Pydoc-Markdown
Pydoc-Markdown is a tool and library for creating Python API documentation in Markdown format. It is based on lib2to3, which allows it to parse your Python code without executing it.
Pydoc-Markdown can be installed from PyPI:
$ pipx install 'pydoc-markdown>=3.0.0,<4.0.0'
$ pydoc-markdown --version
Configuration
Pydoc-Markdown reads its configuration from a .yml file located in the current working directory. Our files contain three main sections:
- loader: A list of plugins that load API objects from Python source files.
  - type: Loader for Python source files
  - search_path: Location of the source files
  - ignore_when_discovered: Defines which files should be ignored
- processor: A list of plugins that process API objects to modify their docstrings (e.g. to adapt them from a documentation format to Markdown or to remove items that should not be rendered into the documentation).
  - ignore_when_discovered: Defines which API objects should be ignored
  - documented_only: Only include documented API objects
  - do_not_filter_modules: Do not filter module objects
  - skip_empty_modules: Skip modules without content
- renderer: A plugin that produces the output files.
  - type: Defines which renderer to use. We use the Markdown renderer because it can be configured in great detail.
  - descriptive_class_title: Remove the word "Object" from class titles.
  - filename: File name of the generated file
Generate Docstrings
Every .yml file will generate a new markdown file. Run one of the following commands to generate the needed output:
- Document store:
pydoc-markdown pydoc-markdown-document-store.yml
- File converters:
pydoc-markdown pydoc-markdown-file-converters.yml
- Preprocessor:
pydoc-markdown pydoc-markdown-preprocessor.yml
- Reader:
pydoc-markdown pydoc-markdown-reader.yml
- Retriever:
pydoc-markdown pydoc-markdown-retriever.yml
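To regenerate all five files in one go, the commands above can be wrapped in a small shell function. This is only a sketch: the function name `generate_all` and the `PYDOC_CMD` override variable are hypothetical helpers and not part of Pydoc-Markdown. It assumes it is run from the directory that contains the .yml files.

```shell
# Config files taken from the commands listed above.
CONFIGS="pydoc-markdown-document-store.yml
pydoc-markdown-file-converters.yml
pydoc-markdown-preprocessor.yml
pydoc-markdown-reader.yml
pydoc-markdown-retriever.yml"

# Run the generator once per config file and stop at the first failure.
# PYDOC_CMD is a hypothetical override (e.g. for a dry run); it defaults
# to the pydoc-markdown executable installed earlier.
generate_all() {
  cmd="${PYDOC_CMD:-pydoc-markdown}"
  for cfg in $CONFIGS; do
    echo "Generating docs from $cfg"
    "$cmd" "$cfg" || return 1
  done
}
```

Calling `generate_all` runs every config in turn. Setting `PYDOC_CMD=echo` first gives a dry run that only prints the file names that would be processed.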