mirror of
https://github.com/Unstructured-IO/unstructured.git
synced 2025-07-10 18:45:54 +00:00

Analyzing Layout Elements
This directory contains examples of how to analyze layout elements.
How to run
Install the Python dependencies:
$ pip install -r requirements.txt
Visualization
- Python script (visualization.py)
$ PYTHONPATH=. python examples/layout-analysis/visualization.py <file_path> <strategy>
The strategy can be one of "auto", "hi_res", "ocr_only", or "fast". For example,
$ PYTHONPATH=. python examples/layout-analysis/visualization.py example-docs/loremipsum.pdf hi_res
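The two positional arguments in the commands above can be checked before any analysis starts. The following is a minimal sketch of such a check, not the actual contents of visualization.py: the function name `parse_cli` and the help texts are hypothetical, and only the four strategy names are taken from this README.

```python
import argparse

# Strategy names as listed in this README; everything else is illustrative.
STRATEGIES = ("auto", "hi_res", "ocr_only", "fast")


def parse_cli(argv):
    """Parse <file_path> <strategy> and reject unknown strategies.

    Hypothetical helper: the real script may handle its arguments differently.
    """
    parser = argparse.ArgumentParser(
        description="Sketch of argument handling for a layout visualization script."
    )
    parser.add_argument("file_path", help="path to the document to analyze")
    parser.add_argument(
        "strategy",
        choices=STRATEGIES,
        help="one of: " + ", ".join(STRATEGIES),
    )
    return parser.parse_args(argv)
```

Calling `parse_cli(["example-docs/loremipsum.pdf", "hi_res"])` returns a namespace with `file_path` and `strategy` set. An unsupported strategy makes argparse print a usage message and exit, which catches typos early.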