For some pipelines, both the PaddleOCR CLI and Python API support specifying multiple inference devices. If multiple devices are specified, an instance of the underlying pipeline class is created on each device during initialization, and the incoming inputs are processed in parallel across these instances. For example, for the document image preprocessing pipeline:
```bash
paddleocr doc_preprocessor \
--input input_images/ \
--device 'gpu:0,1,2,3' \
--use_doc_orientation_classify True \
--use_doc_unwarping True \
--save_path ./output
```
```python
from paddleocr import DocPreprocessor

pipeline = DocPreprocessor(device="gpu:0,1,2,3")
output = pipeline.predict(
    input="input_images/",
    use_doc_orientation_classify=True,
    use_doc_unwarping=True,
)
```
Both examples above use four GPUs (numbered 0, 1, 2, and 3) to perform parallel inference on the images in the `input_images/` directory.
When specifying multiple devices, the inference interface remains the same as in single-device usage. Please refer to the pipeline usage tutorials to check whether a specific pipeline supports multiple inference devices.
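Because the interface is unchanged, the results returned by a multi-device call can be iterated over and saved exactly as in the single-device case. The sketch below assumes the result objects expose the usual `print`, `save_to_img`, and `save_to_json` methods; adjust it to the methods actually provided by your pipeline's result type.

```python
from paddleocr import DocPreprocessor

# Same predict() interface as with a single device; only `device` differs.
pipeline = DocPreprocessor(device="gpu:0,1,2,3")
output = pipeline.predict(
    input="input_images/",
    use_doc_orientation_classify=True,
    use_doc_unwarping=True,
)

for res in output:
    # Assumed result methods, mirroring typical single-device usage.
    res.print()
    res.save_to_img(save_path="./output")
    res.save_to_json(save_path="./output")
```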
## Example of Multi-Process Parallel Inference
In addition to PaddleOCR's built-in multi-device parallel inference, users can implement parallelism themselves by wrapping PaddleOCR pipeline API calls to suit their specific scenario, potentially achieving a better speedup. Below is an example of using Python's `multiprocessing` module to perform multi-GPU, multi-instance parallel processing of the files in an input directory.
```python
import argparse
import sys
from multiprocessing import Manager, Process
from pathlib import Path
from queue import Empty
import paddleocr


def load_pipeline(class_name: str, device: str):
    if not hasattr(paddleocr, class_name):
        raise ValueError(f"Class {class_name} not found in paddleocr module.")