mirror of
				https://github.com/Unstructured-IO/unstructured.git
				synced 2025-10-31 01:54:25 +00:00 
			
		
		
		
	 096d23bc28
			
		
	
	
		096d23bc28
		
			
		
	
	
	
	
		
			
			### Summary This PR is the second part of the "layout analysis" refactor to move it from unstructured-inference repo to unstructured repo, the first part is done in https://github.com/Unstructured-IO/unstructured-inference/pull/305. This PR adds logic to support annotating `inferred` and `extracted` elements. ### Testing ``` PYTHONPATH=. python examples/layout-analysis/visualization.py <file_path> <strategy> <document_type> ``` e.g. ``` PYTHONPATH=. python examples/layout-analysis/visualization.py example-docs/layout-parser-paper-fast.pdf hi_res pdf ```
Analyzing Layout Elements
This directory contains examples of how to analyze layout elements.
How to run
Run pip install -r requirements.txt to install the Python dependencies.
Visualization
- Python script (visualization.py)
$ PYTHONPATH=. python examples/layout-analysis/visualization.py <file_path> <strategy>
The strategy can be one of "auto", "hi_res", "ocr_only", or "fast". For example,
$ PYTHONPATH=. python examples/layout-analysis/visualization.py example-docs/loremipsum.pdf hi_res