cragwolfe bd8a74d686
chore: shell scripts default indent of 2 instead of 4 (#2287)
Given the tendency for shell scripts to easily enter into a few levels
of indentation and long line lengths, update the default to 2 spaces.
2023-12-19 07:48:21 +00:00

32 lines
1.0 KiB
Bash
Executable File

#!/usr/bin/env bash
# Processes Outlook emails through Unstructured's library. Does not download attachments.
# Structured outputs are stored in outlook-output/
# NOTE: This script is not ready to run as-is!
# You must supply an Azure AD app client ID, client secret, tenant ID, and user email
# before running.
# To get the credentials for your Azure AD app, follow these steps:
# https://learn.microsoft.com/en-us/graph/auth-register-app-v2
# https://learn.microsoft.com/en-us/graph/auth-v2-service
# Assign the necessary permissions for the application to read mail.
# https://learn.microsoft.com/en-us/graph/permissions-reference
SCRIPT_DIR=$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" &>/dev/null && pwd)
cd "$SCRIPT_DIR"/../../.. || exit 1
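# Optional pre-flight check: an illustrative sketch, not part of the original
# script. It only warns (on stderr) about credential variables that are unset
# or empty and does not stop the run. The helper name `warn_missing_env` is
# hypothetical; the variable names are the ones passed to the outlook
# connector below.
warn_missing_env() {
  local name missing=0
  for name in "$@"; do
    # ${!name} is Bash indirect expansion: the value of the variable named by $name.
    if [[ -z "${!name:-}" ]]; then
      echo "warning: $name is not set" >&2
      missing=1
    fi
  done
  return "$missing"
}
warn_missing_env MS_CLIENT_ID MS_CLIENT_CRED MS_TENANT_ID MS_USER_EMAIL || true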
PYTHONPATH=. ./unstructured/ingest/main.py \
  outlook \
  --client-id "$MS_CLIENT_ID" \
  --client-cred "$MS_CLIENT_CRED" \
  --tenant "$MS_TENANT_ID" \
  --user-email "$MS_USER_EMAIL" \
  --outlook-folders "Inbox,Sent Items" \
  --output-dir outlook-output \
  --num-processes 2 \
  --recursive \
  --verbose