haystack/test/samples/pipeline/ray.advanced.haystack-pipeline.yml
Zoltan Fedor aafa017c17
Refactoring the Raypipeline.run method - merging it with the Pipeline.run (#2981)
* Refactoring the `Raypipeline.run` method - merging it with the `Pipeline.run`

This is to fix #2968

* Bug: variable `i` was already in use

* Removing unused imports

* Removing unused import

* [EMPTY] Re-trigger CI

* Addressing concerns raised pre-review

- Removing the attempt to try to make it without the need for `JoinDocuments` - it is okey to fail without `JoinDocuments` for certain pipelines.

* Refactoring based on reviews
2022-08-11 09:50:14 +02:00

74 lines
1.9 KiB
YAML

version: ignore
extras: ray
components:
- name: DocumentStore
type: ElasticsearchDocumentStore
params:
index: haystack_test
label_index: haystack_test_label
- name: ESRetriever1
type: BM25Retriever
params:
document_store: DocumentStore
- name: ESRetriever2
# type: TfidfRetriever # can't use TfidfRetriever until https://github.com/deepset-ai/haystack/pull/2984 isn't merged
type: BM25Retriever
params:
document_store: DocumentStore
- name: Reader
type: FARMReader
params:
no_ans_boost: -10
model_name_or_path: deepset/roberta-base-squad2
num_processes: 0
- name: PDFConverter
type: PDFToTextConverter
params:
remove_numeric_tables: false
- name: Preprocessor
type: PreProcessor
params:
clean_whitespace: true
- name: IndexTimeDocumentClassifier
type: TransformersDocumentClassifier
params:
batch_size: 16
use_gpu: false
- name: QueryTimeDocumentClassifier
type: TransformersDocumentClassifier
params:
use_gpu: false
- name: JoinDocuments
params: {}
type: JoinDocuments
pipelines:
- name: ray_query_pipeline
nodes:
- name: ESRetriever1
inputs: [ Query ]
serve_deployment_kwargs:
num_replicas: 2
version: Twenty
ray_actor_options:
# num_gpus: 0.25 # we have no GPU to test this
num_cpus: 0.25
max_concurrent_queries: 17
- name: ESRetriever2
inputs: [ Query ]
serve_deployment_kwargs:
num_replicas: 2
version: Twenty
ray_actor_options:
# num_gpus: 0.25 # we have no GPU to test this
num_cpus: 0.25
max_concurrent_queries: 15
- name: JoinDocuments
inputs:
- ESRetriever1
- ESRetriever2
- name: Reader
inputs: [ JoinDocuments ]