
---
title: Ingest dbt CLI
slug: /connectors/ingestion/workflows/dbt/ingest-dbt-cli
---
# dbt Workflow CLI

Learn how to configure the dbt workflow from the CLI to ingest dbt data from your data sources.
## CLI Configuration
Once the metadata ingestion runs correctly and we are able to explore the service Entities, we can add the dbt information. This will populate the dbt tab on the Table Entity Page.

We can create a workflow that obtains the dbt information from the dbt files and feeds it to OpenMetadata. The dbt Ingestion workflow will be in charge of obtaining this data.
### 1. Create the workflow configuration
Configure the `dbt.yaml` file, keeping only one of the required sources (local, http, gcs, s3). The dbt files should be present at the mentioned source, and the workflow should have the necessary permissions to access them.
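For reference, dbt writes these artifacts to its `target/` directory: `dbt run` produces `manifest.json` and `run_results.json`, and `dbt docs generate` produces `catalog.json`. A typical local layout (the project name is illustrative):

```
my-dbt-project/
└── target/
    ├── manifest.json      # dbt run / dbt compile
    ├── catalog.json       # dbt docs generate
    └── run_results.json   # dbt run / dbt test
```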
Enter the name of your database service from OpenMetadata in the `serviceName` key in the yaml.
```yaml
source:
  type: dbt
  serviceName: service_name
  sourceConfig:
    config:
      type: DBT
      # For dbt, choose one of the Cloud, Local, HTTP, S3 or GCS configurations
      # dbtConfigSource:
      # # For Cloud
      #   dbtCloudAuthToken: token
      #   dbtCloudAccountId: ID
      #   dbtCloudProjectId: project_id
      # # For Local
      #   dbtCatalogFilePath: path-to-catalog.json
      #   dbtManifestFilePath: path-to-manifest.json
      #   dbtRunResultsFilePath: path-to-run_results.json
      # # For HTTP
      #   dbtCatalogHttpPath: http://path-to-catalog.json
      #   dbtManifestHttpPath: http://path-to-manifest.json
      #   dbtRunResultsHttpPath: http://path-to-run_results.json
      # # For S3
      #   dbtSecurityConfig: # These are modeled after all AWS credentials
      #     awsAccessKeyId: KEY
      #     awsSecretAccessKey: SECRET
      #     awsRegion: us-east-2
      #   dbtPrefixConfig:
      #     dbtBucketName: bucket
      #     dbtObjectPrefix: "dbt/"
      # # For GCS (inline credentials)
      #   dbtSecurityConfig: # These are modeled after all GCS credentials
      #     gcsConfig:
      #       type: service_account
      #       projectId: project-id
      #       privateKeyId: private-key-id
      #       privateKey: |
      #         -----BEGIN PRIVATE KEY-----
      #         Super secret key
      #         -----END PRIVATE KEY-----
      #       clientEmail: client@mail.com
      #       clientId: 1234
      #       authUri: https://accounts.google.com/o/oauth2/auth (default)
      #       tokenUri: https://oauth2.googleapis.com/token (default)
      #       authProviderX509CertUrl: https://www.googleapis.com/oauth2/v1/certs (default)
      #       clientX509CertUrl: https://cert.url (URI)
      # # For GCS (path to a credentials file)
      #   dbtSecurityConfig: # These are modeled after all GCS credentials
      #     gcsConfig: path-to-credentials-file.json
      #   dbtPrefixConfig:
      #     dbtBucketName: bucket
      #     dbtObjectPrefix: "dbt/"
      # dbtUpdateDescriptions: true or false
      # includeTags: true or false
      # dbtClassificationName: dbtTags
sink:
  type: metadata-rest
  config: {}
workflowConfig:
  openMetadataServerConfig:
    hostPort: http://localhost:8585/api
    authProvider: no-auth
```
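The example above uses `authProvider: no-auth`. If your OpenMetadata server has authentication enabled, the `workflowConfig` must carry the matching provider and credentials instead. A minimal sketch for a JWT-based setup (the token value is a placeholder; adjust to your server's security configuration):

```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: http://localhost:8585/api
    authProvider: openmetadata
    securityConfig:
      jwtToken: "<bot-jwt-token>"
```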
### 2. Run the dbt ingestion
After saving the YAML config, we will run the command for the dbt ingestion:

```bash
metadata ingest -c <path-to-yaml>
```
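The same workflow can also be triggered from Python, for example inside an Airflow task. A minimal sketch, assuming an `openmetadata-ingestion` release where the `Workflow` class lives at `metadata.ingestion.api.workflow` (the module path has moved across versions, so check the one matching your server) and a config file named `dbt.yaml`:

```python
import yaml

from metadata.ingestion.api.workflow import Workflow

# Load the same YAML config created in step 1
with open("dbt.yaml") as f:  # hypothetical path to your config
    workflow_config = yaml.safe_load(f)

workflow = Workflow.create(workflow_config)
workflow.execute()            # run the dbt ingestion
workflow.raise_from_status()  # fail if any step reported errors
workflow.print_status()
workflow.stop()
```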