.. role:: raw-html(raw)
    :format: html

Models
======

Depending on your needs, ``Unstructured`` provides OCR-based and Transformer-based models to detect elements in documents. These models are useful for detecting complex layouts and predicting element types.

**Basic usage:**

.. code:: python

    from unstructured.partition.auto import partition

    elements = partition(filename=filename,
                         strategy="hi_res",
                         hi_res_model_name="yolox")

.. note::
    * To use any model with ``partition``, set the ``strategy`` to ``hi_res`` as shown above.
    * To maintain consistency between the ``unstructured`` and ``unstructured-api`` libraries, we are deprecating the ``model_name`` parameter. Please use the ``hi_res_model_name`` parameter when specifying a model.
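
For instance, a call that used the deprecated parameter maps one-to-one onto the new one:

.. code:: python

    # Deprecated: partition(filename=filename, strategy="hi_res", model_name="yolox")
    # Preferred:
    elements = partition(filename=filename, strategy="hi_res", hi_res_model_name="yolox")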

:raw-html:`<br />`
**List of Available Models:**

* ``detectron2_onnx`` is a Computer Vision model by Facebook AI that provides object detection and segmentation algorithms with ONNX Runtime. It is the fastest model with the ``hi_res`` strategy.
* ``yolox`` is a single-stage real-time object detector that modifies YOLOv3 with a DarkNet53 backbone.
* ``yolox_quantized`` runs faster than YoloX, with speed closer to Detectron2.
* ``chipper`` (beta version) is Unstructured's in-house image-to-text model, based on transformer-based Visual Document Understanding (VDU) models.


Using a Non-Default Model
^^^^^^^^^^^^^^^^^^^^^^^^^

``Unstructured`` will download the model specified in the ``UNSTRUCTURED_HI_RES_MODEL_NAME`` environment variable. If that variable is not defined, it downloads the default model.

There are three ways to use a non-default model:

1. Store the model name in the environment variable

.. code:: python

    import os
    from unstructured.partition.pdf import partition_pdf

    os.environ["UNSTRUCTURED_HI_RES_MODEL_NAME"] = "yolox"

    out_yolox = partition_pdf("example-docs/layout-parser-paper-fast.pdf", strategy="hi_res")


2. Pass the model name to the ``partition`` function.

.. code:: python

    from unstructured.partition.auto import partition

    filename = "example-docs/layout-parser-paper-fast.pdf"

    elements = partition(filename=filename,
                         strategy="hi_res",
                         hi_res_model_name="yolox")

3. Use the `unstructured-inference <url_>`_ library.

.. _url: https://github.com/Unstructured-IO/unstructured-inference

.. code:: python

    from unstructured_inference.models.base import get_model
    from unstructured_inference.inference.layout import DocumentLayout

    model = get_model("yolox")
    layout = DocumentLayout.from_file("sample-docs/layout-parser-paper.pdf", detection_model=model)
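
The returned ``DocumentLayout`` can then be inspected page by page; this short sketch assumes the ``pages`` and ``elements`` attributes exposed by ``unstructured-inference``:

.. code:: python

    # Each page of the layout holds the elements detected on that page.
    for page in layout.pages:
        for element in page.elements:
            print(element)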


Bring Your Own Models
^^^^^^^^^^^^^^^^^^^^^

**Utilizing Layout Detection Model Zoo**

In the `LayoutParser <layout_>`_ library, you can use various pre-trained models available in the `model zoo <modelzoo_>`_ for document layout analysis. Here's a guide to leveraging this feature with the ``UnstructuredDetectronModel`` class in the ``unstructured-inference`` library.

The ``UnstructuredDetectronModel`` class in ``unstructured_inference.models.detectron2`` uses the ``faster_rcnn_R_50_FPN_3x`` model pretrained on ``DocLayNet``, but any model in the model zoo can be used by passing different construction parameters. ``UnstructuredDetectronModel`` is a light wrapper around LayoutParser's ``Detectron2LayoutModel`` object and accepts the same arguments.

.. _modelzoo: https://layout-parser.readthedocs.io/en/latest/notes/modelzoo.html

.. _layout: https://layout-parser.readthedocs.io/en/latest/api_doc/models.html#layoutparser.models.Detectron2LayoutModel
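
For example, a minimal sketch of loading a different model from the model zoo might look like the following; the ``lp://`` config path, the ``initialize`` call, and the keyword arguments mirror LayoutParser's ``Detectron2LayoutModel`` and should be treated as assumptions to verify against your installed versions:

.. code:: python

    from unstructured_inference.models.detectron2 import UnstructuredDetectronModel

    # The lp:// config path and the label map below are assumptions taken
    # from LayoutParser's model zoo conventions.
    model = UnstructuredDetectronModel()
    model.initialize(
        config_path="lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config",
        label_map={0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"},
    )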

**Using Your Own Object Detection Model**

To seamlessly integrate your custom detection and extraction models into the ``unstructured_inference`` pipeline, start by wrapping your model in the ``UnstructuredObjectDetectionModel`` class. This class acts as an intermediary between your detection model and the Unstructured workflow.

Ensure your ``UnstructuredObjectDetectionModel`` subclass implements two vital methods:

1. The ``predict`` method, which should accept a ``PIL.Image.Image`` and return a list of ``LayoutElements``, communicating your model's results to the pipeline.
2. The ``initialize`` method, which loads and preps your model for inference, guaranteeing its readiness for incoming tasks.

It's important that your model's outputs, specifically from the ``predict`` method, integrate smoothly with the ``DocumentLayout`` class for optimal performance. A minimal sketch of such a subclass is shown below.
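
The import paths, the ``LayoutElement.from_coords`` constructor, and the detector-specific calls below are assumptions to adapt to your own model and your installed version of ``unstructured-inference``.

.. code:: python

    from typing import List

    from PIL import Image

    # Import paths are assumptions about the unstructured_inference package layout.
    from unstructured_inference.inference.layoutelement import LayoutElement
    from unstructured_inference.models.unstructuredmodel import (
        UnstructuredObjectDetectionModel,
    )


    class MyDetectionModel(UnstructuredObjectDetectionModel):
        """Hypothetical wrapper around a custom object detector."""

        def initialize(self, weights_path: str):
            # Load the underlying model once so predict() is cheap to call.
            self.model = my_load_detector(weights_path)  # hypothetical loader

        def predict(self, image: Image.Image) -> List[LayoutElement]:
            # Translate the detector's raw (x1, y1, x2, y2, label) boxes into
            # LayoutElements so DocumentLayout can consume them downstream.
            boxes = self.model.detect(image)  # hypothetical detector API
            return [
                LayoutElement.from_coords(x1, y1, x2, y2, type=label)
                for (x1, y1, x2, y2, label) in boxes
            ]

The wrapped model can then be passed to ``DocumentLayout.from_file`` via the ``detection_model`` argument, as in the ``unstructured-inference`` example above.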