Roman Isecke ed7f991ab9
Add s3 writer (#1223)
### Description
Extend the s3 CLI code to also support writing to S3. Writers are added as
optional subcommands of the parent command, each with its own arguments.
A custom `click.Group` is introduced to add custom formatting and text
to the help messages.
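
The pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the names `WriterGroup`, `source`, and `s3_writer`, the `--output-dir` default, and the "Writers (optional)" heading are all hypothetical. It assumes only the public `click` API (`click.Group`, `format_commands`, `invoke_without_command`).

```python
import click


class WriterGroup(click.Group):
    """Hypothetical Group subclass that customizes the subcommand help section."""

    def format_commands(self, ctx, formatter):
        # List subcommands under a custom heading instead of click's default one.
        rows = []
        for name in self.list_commands(ctx):
            cmd = self.get_command(ctx, name)
            if cmd is None or cmd.hidden:
                continue
            rows.append((name, cmd.get_short_help_str()))
        if rows:
            with formatter.section("Writers (optional)"):
                formatter.write_dl(rows)


# invoke_without_command=True keeps the writer subcommand optional.
@click.group(cls=WriterGroup, invoke_without_command=True)
@click.option("--remote-url", required=True, help="Source location to read from.")
@click.option("--output-dir", default="structured-output", help="Local output directory.")
@click.pass_context
def source(ctx, remote_url, output_dir):
    """Read documents from a source; optionally chain a writer subcommand."""
    ctx.ensure_object(dict)
    ctx.obj["source_url"] = remote_url
    if ctx.invoked_subcommand is None:
        click.echo(f"read {remote_url} -> {output_dir} (no writer)")


# The writer is a subcommand with its own, independent --remote-url option.
@source.command(name="s3")
@click.option("--remote-url", required=True, help="Destination location to write to.")
@click.pass_context
def s3_writer(ctx, remote_url):
    """Write structured output to an S3-style destination."""
    click.echo(f"read {ctx.obj['source_url']} -> write {remote_url}")
```

Because the writer is a subcommand, the same option name (`--remote-url`) can appear twice on one command line, once for the reader and once for the writer, without conflict.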

To limit the scope of this PR, most existing files were left untouched and
new files were added for the new flow instead. This allowed _only_ the
s3 connector to be updated without breaking any of the other connectors.
2023-08-31 22:19:53 +00:00


#!/usr/bin/env bash
# Processes 3 PDFs from s3://utic-dev-tech-fixtures/small-pdf-set/
# through Unstructured's library in 2 processes.
# Structured outputs are stored locally in s3-small-batch-output/ and
# written by the s3 writer to s3://utic-dev-tech-fixtures/small-pdf-set-output
SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )
cd "$SCRIPT_DIR"/../../.. || exit 1
PYTHONPATH=. ./unstructured/ingest/main.py \
    s3 \
        --remote-url s3://utic-dev-tech-fixtures/small-pdf-set/ \
        --anonymous \
        --output-dir s3-small-batch-output \
        --num-processes 2 \
        --verbose \
    s3 \
        --anonymous \
        --remote-url s3://utic-dev-tech-fixtures/small-pdf-set-output