haystack/test/samples/pipeline/test.haystack-pipeline.yml
Sara Zan f8e02310bf
Validate YAML files without loading the nodes (#2438)
* Remove BasePipeline and make a module for RayPipeline

* Can load pipelines from yaml, plenty of issues left

* Extract graph validation logic into _add_node_to_pipeline_graph & refactor load_from_config and add_node to use it

* Fix pipeline tests

* Move some tests out of test_pipeline.py and create MockDenseRetriever

* myoy and pylint (silencing too-many-public-methods)

* Fix issue found in some yaml files and in schema files

* Fix paths to YAML and fix some typos in Ray

* Fix eval tests

* Simplify MockDenseRetriever

* Fix Ray test

* Accidentally pushed merge coinflict, fixed

* Typo in schemas

* Typo in _json_schema.py

* Slightly reduce noisyness of version validation warnings

* Fix version logs tests

* Fix version logs tests again

* remove seemingly unused file

* Add check and test to avoid adding the same node to the pipeline twice

* Update Documentation & Code Style

* Revert config to pipeline_config

* Remo0ve unused import

* Complete reverting to pipeline_config

* Some more stray config=

* Update Documentation & Code Style

* Feedback

* Move back other_nodes tests into pipeline tests temporarily

* Update Documentation & Code Style

* Fixing tests

* Update Documentation & Code Style

* Fixing ray and standard pipeline tests

* Rename colliding load() methods in dense retrievers and faiss

* Update Documentation & Code Style

* Fix mypy on ray.py as well

* Add check for no root node

* Fix tests to use load_from_directory and load_index

* Try to workaround the disabled add_node of RayPipeline

* Update Documentation & Code Style

* Fix Ray test

* Fix FAISS tests

* Relax class check in _add_node_to_pipeline_graph

* Update Documentation & Code Style

* Try to fix mypy in ray.py

* unused import

* Try another fix for Ray

* Fix connector tests

* Update Documentation & Code Style

* Fix ray

* Update Documentation & Code Style

* use BaseComponent.load() in pipelines/base.py

* another round of feedback

* stray BaseComponent.load()

* Update Documentation & Code Style

* Fix FAISS tests too

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: tstadel <60758086+tstadel@users.noreply.github.com>
2022-05-04 17:39:06 +02:00

91 lines
2.1 KiB
YAML

version: ignore
components:
- name: Reader
type: FARMReader
params:
no_ans_boost: -10
model_name_or_path: deepset/roberta-base-squad2
num_processes: 0
- name: ESRetriever
type: BM25Retriever
params:
document_store: DocumentStore
- name: DocumentStore
type: ElasticsearchDocumentStore
params:
index: haystack_test
label_index: haystack_test_label
- name: PDFConverter
type: PDFToTextConverter
params:
remove_numeric_tables: false
- name: TextConverter
type: TextConverter
- name: Preprocessor
type: PreProcessor
params:
clean_whitespace: true
- name: IndexTimeDocumentClassifier
type: TransformersDocumentClassifier
params:
batch_size: 16
use_gpu: false
- name: QueryTimeDocumentClassifier
type: TransformersDocumentClassifier
params:
use_gpu: false
pipelines:
- name: query_pipeline
nodes:
- name: ESRetriever
inputs: [Query]
- name: Reader
inputs: [ESRetriever]
- name: query_pipeline_with_document_classifier
nodes:
- name: ESRetriever
inputs: [Query]
- name: QueryTimeDocumentClassifier
inputs: [ESRetriever]
- name: Reader
inputs: [QueryTimeDocumentClassifier]
- name: indexing_pipeline
nodes:
- name: PDFConverter
inputs: [File]
- name: Preprocessor
inputs: [PDFConverter]
- name: ESRetriever
inputs: [Preprocessor]
- name: DocumentStore
inputs: [ESRetriever]
- name: indexing_text_pipeline
nodes:
- name: TextConverter
inputs: [File]
- name: Preprocessor
inputs: [TextConverter]
- name: ESRetriever
inputs: [Preprocessor]
- name: DocumentStore
inputs: [ESRetriever]
- name: indexing_pipeline_with_classifier
nodes:
- name: PDFConverter
inputs: [File]
- name: Preprocessor
inputs: [PDFConverter]
- name: IndexTimeDocumentClassifier
inputs: [Preprocessor]
- name: ESRetriever
inputs: [IndexTimeDocumentClassifier]
- name: DocumentStore
inputs: [ESRetriever]