---
title: ADLS Datalake | OpenMetadata Database Connector Guide
slug: /connectors/database/adls-datalake
---
{% connectorDetailsHeader
name="ADLS Datalake"
stage="PROD"
platform="OpenMetadata"
availableFeatures=["Metadata", "Data Profiler", "Data Quality", "Sample Data"]
unavailableFeatures=["Query Usage", "Lineage", "Column-level Lineage", "Owners", "dbt", "Tags", "Stored Procedures"]
/ %}
In this section, we provide guides and references for using the ADLS Datalake connector.

Configure and schedule Datalake metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [Troubleshooting](/connectors/database/adls-datalake/troubleshooting)
{% partial file="/v1.7/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/adls-datalake/yaml"} /%}
## Requirements
{% note %}
The ADLS Datalake connector supports extracting metadata from the following file types: `JSON`, `CSV`, `TSV`, and `Parquet`.
{% /note %}
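
To check ahead of time which files in a container fall under the supported types, a small pre-filter can help. The following is a minimal sketch, not the connector's own matching logic: the extension set is inferred from the file types listed in the note, and `is_supported` and `filter_supported` are hypothetical helper names.

```python
from pathlib import PurePosixPath

# Assumption: extensions inferred from the file types named in the note above.
# The connector's real matching rules may differ (for example, for compressed files).
SUPPORTED_EXTENSIONS = {".json", ".csv", ".tsv", ".parquet"}


def is_supported(blob_name: str) -> bool:
    """Return True if the blob name ends in one of the listed extensions."""
    # Blob names use forward slashes, so PurePosixPath parses them on any OS.
    return PurePosixPath(blob_name).suffix.lower() in SUPPORTED_EXTENSIONS


def filter_supported(blob_names):
    """Keep only the blob names whose final extension is in the supported set."""
    return [name for name in blob_names if is_supported(name)]
```

Only the final extension is compared, case-insensitively, so a name such as `logs/app.json.gz` is treated as unsupported in this sketch.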
### ADLS Permissions
To extract metadata from Azure ADLS (Storage Account - StorageV2), you will need an **App Registration** with the following
permissions on the Storage Account:
- Storage Blob Data Reader
- Storage Queue Data Reader
## Metadata Ingestion
{% partial
file="/v1.7/connectors/metadata-ingestion-ui.md"
variables={
connector: "Datalake",
selectServicePath: "/images/v1.7/connectors/datalake/select-service.png",
addNewServicePath: "/images/v1.7/connectors/datalake/add-new-service.png",
serviceConnectionPath: "/images/v1.7/connectors/datalake/service-connection.png",
}
/%}
{% stepsContainer %}
{% extraContent parentTagName="stepsContainer" %}
#### Connection Details for Azure
- **Azure Credentials**
  - **Client ID**: Client ID of the data storage account
  - **Client Secret**: Client Secret of the account
  - **Tenant ID**: Tenant ID under which the data storage account falls
  - **Account Name**: Account Name of the data storage
- **Required Roles**

  Please make sure the following roles are associated with the data storage account:
  - `Storage Blob Data Reader`
  - `Storage Queue Data Reader`

The current authentication approach is based on `app registration`. Reach out to us on [Slack](https://slack.open-metadata.org/) if you need another authentication method.
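
Before filling in the form, it can be useful to confirm that all four credential values and both roles are in place. The sketch below is illustrative only: the dictionary keys (`clientId`, `clientSecret`, `tenantId`, `accountName`) and the helpers `missing_fields` and `missing_roles` are hypothetical names chosen for this example, and the list of assigned roles is assumed to be supplied by you (for instance, copied from the Azure portal).

```python
# Hypothetical key names mirroring the four connection fields listed above.
REQUIRED_FIELDS = ("clientId", "clientSecret", "tenantId", "accountName")

# Role names taken from the "Required Roles" list above.
REQUIRED_ROLES = {"Storage Blob Data Reader", "Storage Queue Data Reader"}


def missing_fields(credentials: dict) -> list:
    """Return the required fields that are absent, None, or blank."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = credentials.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


def missing_roles(assigned_roles) -> list:
    """Return the required roles not present in assigned_roles, sorted by name."""
    return sorted(REQUIRED_ROLES - set(assigned_roles))
```

An empty list from both helpers means nothing in this checklist is missing; it does not validate the credentials against Azure.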
{% partial file="/v1.7/connectors/database/advanced-configuration.md" /%}
{% /extraContent %}
{% partial file="/v1.7/connectors/test-connection.md" /%}
{% partial file="/v1.7/connectors/database/configure-ingestion.md" /%}
{% partial file="/v1.7/connectors/ingestion-schedule-and-deploy.md" /%}
{% /stepsContainer %}
{% partial file="/v1.7/connectors/database/related.md" /%}