---
comments: true
---

# Seal Text Detection Module Tutorial

## I. Overview

The seal text detection module typically outputs multi-point bounding boxes around text regions, which are then passed as inputs to the distortion correction and text recognition modules for subsequent processing to identify the textual content of the seal. Recognizing seal text is an integral part of document processing and finds applications in various scenarios such as contract comparison, inventory access auditing, and invoice reimbursement verification. The seal text detection module serves as a subtask within OCR (Optical Character Recognition), responsible for locating and marking the regions containing seal text within an image. The performance of this module directly impacts the accuracy and efficiency of the entire seal text OCR system.

## II. Supported Model List
| Model Name | Model Download Link | Hmean(%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (MB) | Description |
|---|---|---|---|---|---|---|
| PP-OCRv4_server_seal_det | Inference Model/Training Model | 98.40 | 124.64 / 91.57 | 545.68 / 439.86 | 109 | The server-side seal text detection model of PP-OCRv4 offers higher accuracy and is suitable for deployment on better-equipped servers. |
| PP-OCRv4_mobile_seal_det | Inference Model/Training Model | 96.36 | 9.70 / 3.56 | 50.38 / 19.64 | 4.6 | The mobile-side seal text detection model of PP-OCRv4 offers greater efficiency and is suitable for deployment on end-side devices. |
| Mode | GPU Configuration | CPU Configuration | Acceleration Technology Combination |
|---|---|---|---|
| Normal Mode | FP32 Precision / No TRT Acceleration | FP32 Precision / 8 Threads | PaddleInference |
| High-Performance Mode | Optimal combination of pre-selected precision types and acceleration strategies | FP32 Precision / 8 Threads | Pre-selected optimal backend (Paddle/OpenVINO/TRT, etc.) |
| Parameter | Description | Type | Default |
|---|---|---|---|
| `model_name` | Model name. If set to `None`, `PP-OCRv4_mobile_seal_det` will be used. | `str\|None` | `None` |
| `model_dir` | Model storage path. | `str\|None` | `None` |
| `device` | Device for inference. For example: `"cpu"`, `"gpu"`, `"npu"`, `"gpu:0"`, `"gpu:0,1"`. If multiple devices are specified, parallel inference will be performed. By default, GPU 0 is used if available; otherwise, the CPU is used. | `str\|None` | `None` |
| `enable_hpi` | Whether to enable high-performance inference. | `bool` | `False` |
| `use_tensorrt` | Whether to use the Paddle Inference TensorRT subgraph engine. If the model does not support acceleration through TensorRT, setting this flag will not enable acceleration. For Paddle with CUDA version 11.8, the compatible TensorRT version is 8.x (x>=6), and it is recommended to install TensorRT 8.6.1.6. For Paddle with CUDA version 12.6, the compatible TensorRT version is 10.x (x>=5), and it is recommended to install TensorRT 10.5.0.18. | `bool` | `False` |
| `precision` | Computation precision when using the TensorRT subgraph engine in Paddle Inference. Options: `"fp32"`, `"fp16"`. | `str` | `"fp32"` |
| `enable_mkldnn` | Whether to enable MKL-DNN acceleration for inference. If MKL-DNN is unavailable or the model does not support it, acceleration will not be used even if this flag is set. | `bool` | `True` |
| `mkldnn_cache_capacity` | MKL-DNN cache capacity. | `int` | `10` |
| `cpu_threads` | Number of threads to use for inference on CPUs. | `int` | `10` |
| `limit_side_len` | Limit on the side length of the input image for detection. `int` specifies the value. If set to `None`, the model's default configuration will be used. | `int\|None` | `None` |
| `limit_type` | Type of image side length limitation. `"min"` ensures the shortest side of the image is no less than `limit_side_len`; `"max"` ensures the longest side is no greater than `limit_side_len`. If set to `None`, the model's default configuration will be used. | `str\|None` | `None` |
| `thresh` | Pixel score threshold. Pixels in the output probability map with scores greater than this threshold are considered text pixels. Accepts any float value greater than 0. If set to `None`, the model's default configuration will be used. | `float\|None` | `None` |
| `box_thresh` | If the average score of all pixels inside a bounding box is greater than this threshold, the result is considered a text region. Accepts any float value greater than 0. If set to `None`, the model's default configuration will be used. | `float\|None` | `None` |
| `unclip_ratio` | Expansion ratio for the Vatti clipping algorithm, used to expand the text region. Accepts any float value greater than 0. If set to `None`, the model's default configuration will be used. | `float\|None` | `None` |
| `input_shape` | Input image size for the model in the format `(C, H, W)`. If set to `None`, the model's default size will be used. | `tuple\|None` | `None` |
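As a minimal instantiation sketch, assuming the module is exposed as a `SealTextDetection` class in the `paddleocr` package (the class name is an assumption; verify it against your installed version), the parameters above might be passed like this:

```python
# Minimal instantiation sketch -- the `SealTextDetection` class name is an
# assumption; check the class exported by your installed paddleocr version.
from paddleocr import SealTextDetection

model = SealTextDetection(
    model_name="PP-OCRv4_server_seal_det",  # None -> PP-OCRv4_mobile_seal_det
    device="gpu:0",      # "cpu", "npu", or "gpu:0,1" for parallel inference
    limit_side_len=736,  # example value; None keeps the model's default
    limit_type="min",    # shortest side will be no less than limit_side_len
)
```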
| Parameter | Description | Type | Default |
|---|---|---|---|
| `input` | Input data to be predicted. Required. Supports multiple input types: a Python variable (e.g., `numpy.ndarray` image data), a `str` (e.g., a local image file path or URL), or a `list` of the above. | `Python Var\|str\|list` | |
| `batch_size` | Batch size; can be set to any positive integer. | `int` | `1` |
| `limit_side_len` | Same meaning as the instantiation parameter. If set to `None`, the instantiation value is used; otherwise, this parameter takes precedence. | `int\|None` | `None` |
| `limit_type` | Same meaning as the instantiation parameter. If set to `None`, the instantiation value is used; otherwise, this parameter takes precedence. | `str\|None` | `None` |
| `thresh` | Same meaning as the instantiation parameter. If set to `None`, the instantiation value is used; otherwise, this parameter takes precedence. | `float\|None` | `None` |
| `box_thresh` | Same meaning as the instantiation parameter. If set to `None`, the instantiation value is used; otherwise, this parameter takes precedence. | `float\|None` | `None` |
| `unclip_ratio` | Same meaning as the instantiation parameter. If set to `None`, the instantiation value is used; otherwise, this parameter takes precedence. | `float\|None` | `None` |
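Continuing the instantiation sketch above, and assuming inference is performed through a `predict()` method that accepts the parameters in this table (the method name is an assumption based on the result methods listed below), a call might look like this:

```python
# Prediction sketch, continuing from the instantiation example above.
# `predict()` is assumed to be the inference entry point; "seal_sample.png"
# is a hypothetical local image path -- a numpy.ndarray, URL, or list of
# inputs should also be accepted per the table above.
output = model.predict(
    "seal_sample.png",
    batch_size=1,
    box_thresh=0.6,  # per-call override of the instantiation-time value
)
```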
| Method | Method Description | Parameter | Parameter Type | Parameter Description | Default Value |
|---|---|---|---|---|---|
| `print()` | Print the result to the terminal | `format_json` | `bool` | Whether to format the output content using `JSON` indentation | `True` |
| | | `indent` | `int` | Specify the indentation level to beautify the output `JSON` data, making it more readable. This is only effective when `format_json` is `True` | `4` |
| | | `ensure_ascii` | `bool` | Control whether to escape non-`ASCII` characters to `Unicode`. When set to `True`, all non-`ASCII` characters will be escaped; `False` retains the original characters. This is only effective when `format_json` is `True` | `False` |
| `save_to_json()` | Save the result as a file in JSON format | `save_path` | `str` | The file path for saving. When it is a directory, the saved file name will be consistent with the input file name | `None` |
| | | `indent` | `int` | Specify the indentation level to beautify the output `JSON` data, making it more readable. This is only effective when `format_json` is `True` | `4` |
| | | `ensure_ascii` | `bool` | Control whether to escape non-`ASCII` characters to `Unicode`. When set to `True`, all non-`ASCII` characters will be escaped; `False` retains the original characters. This is only effective when `format_json` is `True` | `False` |
| `save_to_img()` | Save the result as a file in image format | `save_path` | `str` | The file path for saving. When it is a directory, the saved file name will be consistent with the input file name | `None` |
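Continuing the sketch, each element of the prediction output can be handled with the result methods above:

```python
# Result-handling sketch: `output` is assumed to be iterable, yielding one
# result object per input image.
for res in output:
    res.print(format_json=True, indent=4, ensure_ascii=False)
    res.save_to_json(save_path="./output/")  # saved file name follows the input file name
    res.save_to_img(save_path="./output/")   # visualization of the detected seal text regions
```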
| Attribute | Attribute Description |
|---|---|
| `json` | Get the prediction result in `json` format |
| `img` | Get the visualization image in `dict` format |
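The same result objects also expose the attributes above; a brief access sketch, continuing the examples:

```python
# Attribute-access sketch, continuing from the examples above.
for res in output:
    prediction = res.json  # prediction result in JSON format
    images = res.img       # visualization image(s) returned as a dict
```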