
# Benchmarks

Run the benchmarks with the following command:

`python run.py [--reader] [--retriever_index] [--retriever_query] [--ci] [--update-json]`
You can specify which components and processes to benchmark with the following flags. Example invocations are shown after the list.

* `--reader` triggers the speed and accuracy benchmarks for the reader. Here we simply use the SQuAD dev set.

* `--retriever_index` triggers indexing benchmarks.

* `--retriever_query` triggers querying benchmarks (embeddings are loaded from file instead of being computed on the fly).

* `--ci` causes the benchmarks to run on a smaller slice of each dataset and a smaller subset of Retrievers / Readers / DocumentStores.

* `--update-json` causes the script to update the JSON files in docs/_src/benchmarks so that the benchmarks shown on the website are updated.
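
Since every flag is optional, they can be combined freely. For example (these particular combinations are illustrative, not prescribed):

```bash
# Full run: reader speed/accuracy plus retriever indexing and querying,
# then refresh the JSON files consumed by the website benchmarks
python run.py --reader --retriever_index --retriever_query --update-json

# Quick sanity check on a smaller data slice and component subset (e.g. in CI)
python run.py --retriever_query --ci
```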