mirror of
https://github.com/datahub-project/datahub.git
synced 2025-07-04 15:50:14 +00:00
Prerequisites
- Get your Databricks instance's workspace URL
- Create a Databricks Service Principal
  - You can skip this step and use your own account to get things running quickly, but we strongly recommend creating a dedicated service principal for production use.
- Generate a Databricks personal access token by following the Databricks documentation.
- Provision your service principal:
  - To ingest your workspace's metadata and lineage, your service principal must have all of the following:
    - One of: metastore admin role, ownership of, or `USE CATALOG` privilege on any catalogs you want to ingest
    - One of: metastore admin role, ownership of, or `USE SCHEMA` privilege on any schemas you want to ingest
    - Ownership of or `SELECT` privilege on any tables and views you want to ingest
    - Ownership documentation
    - Privileges documentation
  - To ingest the legacy `hive_metastore` catalog (`include_hive_metastore`, enabled by default), your service principal must have all of the following:
    - `READ_METADATA` and `USAGE` privileges on the `hive_metastore` catalog
    - `READ_METADATA` and `USAGE` privileges on the schemas you want to ingest
    - `READ_METADATA` and `USAGE` privileges on the tables and views you want to ingest
    - Hive Metastore Privileges documentation
  - To ingest your workspace's notebooks and their lineage, your service principal must have `CAN_READ` privileges on the folders containing the notebooks you want to ingest (see the Databricks guide).
  - To ingest usage statistics (`include_usage_statistics`, enabled by default), your service principal must have `CAN_MANAGE` permissions on any SQL Warehouses you want to ingest (see the Databricks guide).
  - To ingest `profiling` information with `method: ge`, you need `SELECT` privileges on all profiled tables.
  - To ingest `profiling` information with `method: analyze` and `call_analyze: true` (enabled by default), your service principal must have ownership of, or `MODIFY` privilege on, any tables you want to profile.
    - Alternatively, you can run ANALYZE TABLE yourself on any tables you want to profile, then set `call_analyze` to `false`. You will still need `SELECT` privilege on those tables to fetch the results.
- Check the starter recipe below and replace `workspace_url` and `token` with your information from the previous steps.
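
The privilege requirements listed above can be turned into SQL `GRANT` statements. The following is a minimal sketch, not an official DataHub or Databricks tool: the helper names (`quote`, `grant`, `unity_catalog_grants`, `hive_metastore_grants`) and the example principal, catalog, schema, and table names are hypothetical. The statements follow the general Databricks `GRANT <privileges> ON <securable> <name> TO <principal>` form and should be checked against the Databricks privileges documentation before you run them in your workspace.

```python
from typing import List, Sequence


def quote(name: str) -> str:
    """Backtick-quote one identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def grant(privileges: Sequence[str], kind: str, path: Sequence[str], principal: str) -> str:
    """Render one GRANT statement for a securable (CATALOG, SCHEMA, or TABLE)."""
    target = ".".join(quote(part) for part in path)
    return f"GRANT {', '.join(privileges)} ON {kind} {target} TO {quote(principal)};"


def unity_catalog_grants(
    principal: str,
    catalog: str,
    schema: str,
    tables: Sequence[str],
    call_analyze: bool = False,
) -> List[str]:
    """Grants for metadata and lineage ingestion of one Unity Catalog schema.

    With call_analyze=True, MODIFY is added on each table, matching the
    requirement for profiling with `method: analyze` and `call_analyze: true`.
    """
    table_privileges = ["SELECT"] + (["MODIFY"] if call_analyze else [])
    statements = [
        grant(["USE CATALOG"], "CATALOG", [catalog], principal),
        grant(["USE SCHEMA"], "SCHEMA", [catalog, schema], principal),
    ]
    for table in tables:
        statements.append(grant(table_privileges, "TABLE", [catalog, schema, table], principal))
    return statements


def hive_metastore_grants(principal: str, schema: str, tables: Sequence[str]) -> List[str]:
    """Grants for the legacy hive_metastore catalog, as listed in the prerequisites."""
    privileges = ["READ_METADATA", "USAGE"]
    statements = [
        grant(privileges, "CATALOG", ["hive_metastore"], principal),
        grant(privileges, "SCHEMA", ["hive_metastore", schema], principal),
    ]
    for table in tables:
        statements.append(grant(privileges, "TABLE", ["hive_metastore", schema, table], principal))
    return statements


if __name__ == "__main__":
    # Hypothetical names; replace them with your own principal and objects.
    for statement in unity_catalog_grants("datahub-sp", "main", "sales", ["orders"]):
        print(statement)
    for statement in hive_metastore_grants("datahub-sp", "default", ["events"]):
        print(statement)
```

The generated statements can be pasted into a Databricks SQL editor by a user who is allowed to grant these privileges. Notebook folder (`CAN_READ`) and SQL Warehouse (`CAN_MANAGE`) permissions are not SQL grants and are not covered by this sketch.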