cragwolfe 19fc1fcc72
feat: convenience unstructured-get-json.sh update (#3971)
* the script now supports:
   * the --vlm flag, to process the document with the VLM strategy
   * the optional --vlm-model and --vlm-provider args
* optionally also writes .html outputs by converting the unstructured .json output
   * optionally opens those .html outputs in a browser
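
A minimal sketch of how flags like these could be parsed in bash, for illustration only. It is not the actual unstructured-get-json.sh source: the function name `parse_args`, the variable names, and the mapping of flags to strategy strings are all assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; names and strategy strings are assumptions,
# not taken from the real unstructured-get-json.sh.
parse_args() {
  STRATEGY=""
  VLM_MODEL=""
  VLM_PROVIDER=""
  WRITE_HTML=false
  OPEN_HTML=false
  INPUT_FILE=""
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --fast)       STRATEGY="fast" ;;
      --hi-res)     STRATEGY="hi_res" ;;
      --ocr-only)   STRATEGY="ocr_only" ;;
      --vlm)        STRATEGY="vlm" ;;
      --vlm-model|--vlm-provider)
        # These two flags consume the following argument as their value.
        if [ "$#" -lt 2 ]; then
          echo "missing value for $1" >&2
          return 1
        fi
        if [ "$1" = "--vlm-model" ]; then
          VLM_MODEL="$2"
        else
          VLM_PROVIDER="$2"
        fi
        shift
        ;;
      --write-html) WRITE_HTML=true ;;
      --open-html)  OPEN_HTML=true ;;
      -*)           echo "unknown option: $1" >&2; return 1 ;;
      *)            INPUT_FILE="$1" ;;  # positional argument: the document
    esac
    shift
  done
}
```

With this sketch, `parse_args --vlm --vlm-provider openai --vlm-model gpt-4o layout-parser-paper-p2.pdf` leaves `STRATEGY` set to `vlm` and the model and provider in their own variables, so the later request-building step can add them only when they are non-empty.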
   
Tested with:
```
unstructured-get-json.sh --write-html --open-html --fast layout-parser-paper-p2.pdf
unstructured-get-json.sh --write-html --open-html --hi-res layout-parser-paper-p2.pdf
unstructured-get-json.sh --write-html --open-html --ocr-only layout-parser-paper-p2.pdf
unstructured-get-json.sh --write-html --open-html --vlm layout-parser-paper-p2.pdf
unstructured-get-json.sh --write-html --open-html --vlm --vlm-provider openai --vlm-model gpt-4o layout-parser-paper-p2.pdf
unstructured-get-json.sh --write-html --open-html --vlm --vlm-provider vertexai --vlm-model gemini-2.0-flash-001 layout-parser-paper-p2.pdf
unstructured-get-json.sh --write-html --open-html --vlm --vlm-provider anthropic --vlm-model claude-3-5-sonnet-20241022 layout-parser-paper-p2.pdf
```

[layout-parser-paper-p2.pdf](https://github.com/user-attachments/files/19514007/layout-parser-paper-p2.pdf)
2025-03-31 09:45:01 -07:00