3 Commits

Author SHA1 Message Date
cragwolfe
bd8a74d686
chore: shell scripts default indent of 2 instead of 4 (#2287)
Given the tendency for shell scripts to easily enter into a few levels
of indentation and long line lengths, update the default to 2 spaces.
2023-12-19 07:48:21 +00:00
Roman Isecke
76efcf4dd7
chore: add shfmt (#2246)
### Description
Given all the shell files that now exist in the repo, would be nice to
have linting/formatting around them (in addition to the existing
shellcheck which doesn't do anything to format the shell code). This PR
introduces `shfmt` to both check for changes and apply formatting when
the associated make targets are called.
2023-12-12 01:04:15 +00:00
cragwolfe
5fa40850f4
feat: convenience script to post files to the API (#2083)
Usage: ./unstructured-get-json.sh [options] <file>"
                                                                                                                                                       
Options:                                                                                                                                                             
  --api-key KEY   Specify the API key for authentication. Set the env var $UNST_API_KEY to skip providing this option.                                               
  --hi-res        hi_res strategy: Enable high-resolution processing, with layout segmentation and OCR                                                               
  --fast          fast strategy: No OCR, just extract embedded text                                                                                                  
  --ocr-only      ocr_only strategy: Perform OCR (Optical Character Recognition) only. No layout segmentation.                                                       
  --tables        Enable table extraction: tables are represented as html in metadata                                                                                
  --coordinates   Include coordinates in the output                                                                                                                  
  --trace         Enable trace logging for debugging, useful to cut and paste the executed curl call                                                                 
  --verbose       Enable verbose logging including printing first 8 elements to stdout                                                                               
  --s3            Write the resulting output to s3 (like a pastebin)                                                                                                 
  --help          Display this help and exit.                                                                                                                        
                                                                                                                                                                     
Arguments:                                                                                                                                                           
  <file>          File to send to the API.                                                                                                                           
                                                                                                                                                                     
The script requires a <file>, the document to post to the Unstructured API.                                                                                          
The .json result is written to ~/tmp/unst-outputs/ -- this path is echoed and copied to your clipboard.
2023-11-15 22:58:28 -08:00