mirror of
https://github.com/Unstructured-IO/unstructured.git
synced 2025-12-16 01:34:56 +00:00
### Description
This adds a destination connector that writes content to the Databricks
Unity Catalog Volumes service. An internal account can currently be used
for manual testing, but there is no dedicated test account, so this
connector is not being added to the automated ingest tests that run in
CI.
To test locally:
```shell
#!/usr/bin/env bash

# Write to a unique volume path so repeated runs do not collide.
path="testpath/$(uuidgen)"

# Partition a local PDF and write the results to a Databricks volume.
PYTHONPATH=. python ./unstructured/ingest/main.py local \
  --num-processes 4 \
  --output-dir azure-test \
  --strategy fast \
  --verbose \
  --input-path example-docs/fake-memo.pdf \
  --recursive \
  databricks-volumes \
  --catalog "utic-dev-tech-fixtures" \
  --volume "small-pdf-set" \
  --volume-path "$path" \
  --username "$DATABRICKS_USERNAME" \
  --password "$DATABRICKS_PASSWORD" \
  --host "$DATABRICKS_HOST"
```
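The command above reads its credentials from three environment variables. As a minimal sketch, a pre-flight check along the following lines could catch an unset variable before the ingest run starts. The function name `check_databricks_env` is hypothetical and not part of the repository; only the three variable names come from the command above.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the repository): verify that the
# credentials referenced by the ingest command are set and non-empty.
check_databricks_env() {
  local name status=0
  for name in DATABRICKS_USERNAME DATABRICKS_PASSWORD DATABRICKS_HOST; do
    # ${!name} is bash indirect expansion: the value of the variable
    # whose name is stored in $name.
    if [ -z "${!name:-}" ]; then
      echo "missing environment variable: $name" >&2
      status=1
    fi
  done
  return "$status"
}
```

Calling `check_databricks_env || exit 1` before the ingest command would stop the script early with a list of the missing variables, rather than failing later during authentication.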
28 lines
906 B
ReStructuredText
Destination Connectors
======================

Connect to your favorite data storage platforms for effortless batch processing of your files.
We are constantly adding new data connectors, and if you don't see your favorite platform, let us know
in our community `Slack <https://short.unstructured.io/pzw05l7>`_.

.. toctree::
   :maxdepth: 1

   destination_connectors/azure
   destination_connectors/azure_cognitive_search
   destination_connectors/box
   destination_connectors/chroma
   destination_connectors/databricks_volumes
   destination_connectors/delta_table
   destination_connectors/dropbox
   destination_connectors/elasticsearch
   destination_connectors/gcs
   destination_connectors/mongodb
   destination_connectors/pinecone
   destination_connectors/opensearch
   destination_connectors/qdrant
   destination_connectors/s3
   destination_connectors/sql
   destination_connectors/weaviate