---
comments: true
---

# Document Image Orientation Classification Module Tutorial

## 1. Overview

The Document Image Orientation Classification Module identifies the orientation of a document image so that it can be corrected in post-processing. When scanning documents or photographing ID cards, the capture device is sometimes rotated to obtain a clearer image, so the resulting images can have different orientations. Standard OCR pipelines may not handle such images effectively. Using image classification, the orientation of a document or ID image containing text regions can be determined in advance and adjusted, which improves OCR accuracy.

## 2. Supported Models List

| Model | Model Download Links | Top-1 Acc (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Size (MB) | Description |
|---|---|---|---|---|---|---|
| PP-LCNet_x1_0_doc_ori | Inference Model/Pretrained Model | 99.06 | 2.62 / 0.59 | 3.24 / 1.19 | 7 | A document image classification model based on PP-LCNet_x1_0, with four categories: 0°, 90°, 180°, and 270°. |

The two inference modes reported above are configured as follows:

| Mode | GPU Configuration | CPU Configuration | Acceleration Technology Combination |
|---|---|---|---|
| Normal Mode | FP32 Precision / No TRT Acceleration | FP32 Precision / 8 Threads | PaddleInference |
| High-Performance Mode | Optimal combination of precision type and acceleration strategy | FP32 Precision / 8 Threads | Optimal backend selected (Paddle/OpenVINO/TRT, etc.) |

| Parameter | Description | Type | Default |
|---|---|---|---|
| `model_name` | Model name. If set to `None`, `PP-LCNet_x1_0_doc_ori` will be used. | `str\|None` | `None` |
| `model_dir` | Model storage path. | `str\|None` | `None` |
| `device` | Device for inference. For example: `"cpu"`, `"gpu"`, `"npu"`, `"gpu:0"`, `"gpu:0,1"`. If multiple devices are specified, parallel inference will be performed. By default, GPU 0 is used if available; otherwise, CPU is used. | `str\|None` | `None` |
| `enable_hpi` | Whether to enable high-performance inference. | `bool` | `False` |
| `use_tensorrt` | Whether to use the Paddle Inference TensorRT subgraph engine. If the model does not support acceleration through TensorRT, setting this flag will not enable acceleration. For Paddle with CUDA version 11.8, the compatible TensorRT version is 8.x (x>=6), and it is recommended to install TensorRT 8.6.1.6. | `bool` | `False` |
| `precision` | Computation precision when using the TensorRT subgraph engine in Paddle Inference. Options: `"fp32"`, `"fp16"`. | `str` | `"fp32"` |
| `enable_mkldnn` | Whether to enable MKL-DNN acceleration for inference. If MKL-DNN is unavailable or the model does not support it, acceleration will not be used even if this flag is set. | `bool` | `True` |
| `mkldnn_cache_capacity` | MKL-DNN cache capacity. | `int` | `10` |
| `cpu_threads` | Number of threads to use for inference on CPUs. | `int` | `10` |
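
As a rough sketch of how the parameters above fit together, the helper below assembles and validates a keyword-argument dictionary before a model is created. The helper `build_model_kwargs` is hypothetical, and the commented-out call assumes the `DocImgOrientationClassification` class from the `paddleocr` package, which the table above does not name; treat both as assumptions rather than the library's prescribed usage.

```python
from typing import Any, Dict, Optional

# Precision options listed in the parameter table.
VALID_PRECISIONS = {"fp32", "fp16"}


def build_model_kwargs(
    device: Optional[str] = None,
    use_tensorrt: bool = False,
    precision: str = "fp32",
    cpu_threads: int = 10,
) -> Dict[str, Any]:
    """Hypothetical helper: validate and collect constructor arguments."""
    if precision not in VALID_PRECISIONS:
        raise ValueError(f"precision must be one of {sorted(VALID_PRECISIONS)}")
    if cpu_threads < 1:
        raise ValueError("cpu_threads must be a positive integer")
    return {
        "model_name": "PP-LCNet_x1_0_doc_ori",
        "device": device,  # None lets the library pick GPU 0 or fall back to CPU
        "use_tensorrt": use_tensorrt,
        "precision": precision,
        "cpu_threads": cpu_threads,
    }


kwargs = build_model_kwargs(device="gpu:0", use_tensorrt=True, precision="fp16")
print(kwargs)

# Assumed usage (requires paddleocr to be installed; the class name is an assumption):
# from paddleocr import DocImgOrientationClassification
# model = DocImgOrientationClassification(**kwargs)
```

Validating `precision` up front surfaces a typo such as `"fp8"` before any model files are loaded.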

| Parameter | Description | Type | Default |
|---|---|---|---|
| `input` | Input data to be predicted. Required. Supports multiple input types. | `Python Var\|str\|list` | |
| `batch_size` | Batch size, positive integer. | `int` | `1` |
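
To illustrate what a batch size means for a list input, the sketch below groups a list of inputs into consecutive batches. It shows the general idea only: `iter_batches` is a hypothetical helper and the file names are placeholders, not the library's internal implementation.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def iter_batches(items: Sequence[T], batch_size: int = 1) -> Iterator[List[T]]:
    """Yield consecutive groups of at most `batch_size` items."""
    if not isinstance(batch_size, int) or batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Placeholder file names, used only for illustration.
paths = ["a.jpg", "b.jpg", "c.jpg"]
batches = list(iter_batches(paths, batch_size=2))
print(batches)  # [['a.jpg', 'b.jpg'], ['c.jpg']]
```

The last batch may be smaller than `batch_size` when the number of inputs is not a multiple of it.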

| Method | Method Description | Parameter | Parameter Type | Parameter Description | Default Value |
|---|---|---|---|---|---|
| `print()` | Print the result to the terminal. | `format_json` | `bool` | Whether to format the output as indented JSON. | `True` |
| | | `indent` | `int` | Indentation level used to pretty-print the JSON output. Only takes effect when `format_json` is `True`. | `4` |
| | | `ensure_ascii` | `bool` | Whether to escape non-ASCII characters as Unicode escape sequences. If `True`, all non-ASCII characters are escaped; if `False`, the original characters are kept. Only takes effect when `format_json` is `True`. | `False` |
| `save_to_json()` | Save the result as a JSON file. | `save_path` | `str` | Path to save the file. If it is a directory, the saved file name follows the name of the input file. | `None` |
| | | `indent` | `int` | Indentation level used to pretty-print the JSON output. Only takes effect when `format_json` is `True`. | `4` |
| | | `ensure_ascii` | `bool` | Whether to escape non-ASCII characters as Unicode escape sequences. If `True`, all non-ASCII characters are escaped; if `False`, the original characters are kept. Only takes effect when `format_json` is `True`. | `False` |
| `save_to_img()` | Save the result as an image file. | `save_path` | `str` | Path to save the file. If it is a directory, the saved file name follows the name of the input file. | `None` |
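
The effect of `indent` and `ensure_ascii` can be seen with Python's standard `json` module, whose `json.dumps` accepts parameters with the same names. The sketch below assumes the result methods follow the same semantics; the `result` dictionary is a made-up example, not actual model output.

```python
import json

# Made-up result used only to demonstrate the formatting options.
result = {"label_names": ["180"], "note": "文档"}

# indent=4 spreads the JSON over several lines; ensure_ascii=False keeps
# the original non-ASCII characters.
pretty = json.dumps(result, indent=4, ensure_ascii=False)

# ensure_ascii=True replaces every non-ASCII character with a \uXXXX escape.
escaped = json.dumps(result, indent=4, ensure_ascii=True)

# Without indent, the output stays on a single line.
compact = json.dumps(result, ensure_ascii=False)

print(pretty)
print(escaped)
print(compact)
```

Both the escaped and unescaped forms decode to the same data, so `ensure_ascii` only affects how readable the saved file is.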

| Attribute | Description |
|---|---|
| `json` | Get the prediction result in JSON format. |
| `img` | Get the visualization image as a `dict`. |
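
Once an orientation class is known, the correction step mentioned in the overview amounts to rotating the image by a multiple of 90°. The following is a minimal sketch on a small 2D grid standing in for an image. It assumes the predicted label is a string such as `"180"` and that it gives the counterclockwise angle needed to restore the upright image; both are assumptions that should be checked against a sample prediction. `rotate_ccw` and `correct_orientation` are hypothetical helpers.

```python
from typing import List, Sequence

Grid = List[List[int]]


def rotate_ccw(grid: Sequence[Sequence[int]], quarter_turns: int) -> Grid:
    """Rotate a 2D grid counterclockwise by quarter_turns * 90 degrees."""
    out = [list(row) for row in grid]
    for _ in range(quarter_turns % 4):
        # Transpose, then reverse the row order: one 90-degree CCW turn.
        out = [list(col) for col in zip(*out)][::-1]
    return out


def correct_orientation(grid: Sequence[Sequence[int]], label_name: str) -> Grid:
    """Assumed convention: the label is the CCW angle that restores the image."""
    angle = int(label_name)
    if angle not in (0, 90, 180, 270):
        raise ValueError(f"unexpected orientation label: {label_name!r}")
    return rotate_ccw(grid, angle // 90)


upright = [[1, 2, 3],
           [4, 5, 6]]

# Simulate an upside-down capture, then undo it using the label "180".
upside_down = rotate_ccw(upright, 2)
restored = correct_orientation(upside_down, "180")
print(restored == upright)  # True
```

For real images, the same quarter-turn logic could be applied with an array library; the direction convention only matters for the 90° and 270° classes, since a 180° rotation is its own inverse.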