Roman Isecke 59e850bbd9
Roman/downstream connector cli subcommand (#1302)
### Description
Update all other connectors to use the new downstream architecture that
was recently introduced for the s3 connector.

Closes #1313 and #1311
2023-09-11 11:40:56 -04:00


#!/usr/bin/env bash
# Processes Outlook emails through Unstructured's library. Does not download attachments.
# Structured outputs are stored in outlook-output/
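# NOTE: this script is not ready to run as-is.
# Before running, set MS_CLIENT_ID, MS_CLIENT_CRED, MS_TENANT_ID, and MS_USER_EMAIL to your
# Azure AD app's client ID, client secret, tenant ID, and the email address of the mailbox to read.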
# To get the credentials for your Azure AD app, follow these steps:
# https://learn.microsoft.com/en-us/graph/auth-register-app-v2
# https://learn.microsoft.com/en-us/graph/auth-v2-service
# Assign the necessary permissions for the application to read mail:
# https://learn.microsoft.com/en-us/graph/permissions-reference
SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )
cd "$SCRIPT_DIR"/../../.. || exit 1
PYTHONPATH=. ./unstructured/ingest/main.py \
  outlook \
  --client-id "$MS_CLIENT_ID" \
  --client-cred "$MS_CLIENT_CRED" \
  --tenant "$MS_TENANT_ID" \
  --user-email "$MS_USER_EMAIL" \
  --outlook-folders Inbox,"Sent Items" \
  --output-dir outlook-output \
  --num-processes 2 \
  --recursive \
  --verbose
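
Because the script reads its credentials from four environment variables, a pre-flight check can catch a missing value before the ingest CLI is invoked with an empty argument. The following is a minimal sketch, not part of the original script: `missing_env_vars` and `check_outlook_env` are hypothetical helper names, and the check assumes the variables are exported into the environment.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of unstructured): print the name of every
# given environment variable that is unset or empty, one per line.
missing_env_vars() {
  local name
  for name in "$@"; do
    # printenv prints nothing for an unset variable, so this test covers
    # both the unset and the empty case.
    if [ -z "$(printenv "$name")" ]; then
      printf '%s\n' "$name"
    fi
  done
}

# Hypothetical pre-flight check for the four variables the script passes
# to the outlook subcommand. Returns 1 and reports on stderr if any is missing.
check_outlook_env() {
  local missing
  missing=$(missing_env_vars MS_CLIENT_ID MS_CLIENT_CRED MS_TENANT_ID MS_USER_EMAIL)
  if [ -n "$missing" ]; then
    echo "Missing required environment variables: $(printf '%s' "$missing" | tr '\n' ' ')" >&2
    return 1
  fi
}
```

One way to use it would be to call `check_outlook_env || exit 1` just before the `main.py` invocation, so the script stops with a readable message instead of failing inside the connector.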