---
title: Python SDK for DBT
slug: /sdk/python/ingestion/dbt
---

# Python SDK for DBT
This guide shows how DBT integrates with the Python SDK.

It walks through configuring DBT in OpenMetadata and shows how the Python SDK parses the DBT `manifest.json`, `catalog.json`, and `run_results.json` files.

## Adding DBT configuration in JSON Config
The example below shows the YAML config of the Redshift connector, set up to fetch the DBT files from an AWS S3 bucket.

For details on fetching the DBT files from other sources, such as GCS or a file server, see [Locate the DBT Files](/sdk/python/ingestion/dbt#locate-the-dbt-files).

```yaml
source:
  type: redshift
  serviceName: aws_redshift
  serviceConnection:
    config:
      type: Redshift
      hostPort: cluster.name.region.redshift.amazonaws.com:5439
      username: username
      password: strong_password
      database: dev
  sourceConfig:
    config:
      dbtConfigSource:
        dbtSecurityConfig:
          awsAccessKeyId: <AWS Access Key Id>
          awsSecretAccessKey: <AWS Secret Access Key>
          awsRegion: <AWS Region>
        dbtPrefixConfig:
          dbtBucketName: <Bucket Name>
          dbtObjectPrefix: <Path of the folder in which dbt files are stored>
sink:
  type: metadata-rest
  config: {}
workflowConfig:
  openMetadataServerConfig:
    hostPort: <OpenMetadata host and port>
    authProvider: <OpenMetadata auth provider>
```

Fill in the AWS S3 details in the config above:

- `dbtConfigSource`: Details of the AWS S3 resource
- `dbtSecurityConfig`: AWS credentials and region used to access the bucket
- `dbtPrefixConfig`: Bucket name and path of the dbt files in the bucket
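
Before running the workflow, it can help to check that the S3 fields are filled in. The snippet below is a minimal sketch, not part of the SDK: the helper `missing_s3_fields` is hypothetical, and it assumes that every field shown in the example config is required, which may not hold for setups that rely on other AWS authentication mechanisms. It takes the mapping that holds `dbtSecurityConfig` and `dbtPrefixConfig` as a plain dictionary.

```python
from typing import Any, Dict, List

# Field names are taken from the example config above.
# Treating all of them as required is an assumption made for this sketch.
REQUIRED_S3_FIELDS = {
    "dbtSecurityConfig": ["awsAccessKeyId", "awsSecretAccessKey", "awsRegion"],
    "dbtPrefixConfig": ["dbtBucketName", "dbtObjectPrefix"],
}


def missing_s3_fields(dbt_config: Dict[str, Any]) -> List[str]:
    """Return dotted paths of S3 fields that are absent or empty (hypothetical helper)."""
    missing = []
    for section, fields in REQUIRED_S3_FIELDS.items():
        section_values = dbt_config.get(section) or {}
        for field_name in fields:
            if not section_values.get(field_name):
                missing.append(f"{section}.{field_name}")
    return missing


# Illustrative values only; the object prefix is deliberately left out.
dbt_config = {
    "dbtSecurityConfig": {
        "awsAccessKeyId": "my-key-id",
        "awsSecretAccessKey": "my-secret",
        "awsRegion": "us-east-1",
    },
    "dbtPrefixConfig": {"dbtBucketName": "my-bucket"},
}

print(missing_s3_fields(dbt_config))
```

With the sample values above, the only reported gap is `dbtPrefixConfig.dbtObjectPrefix`.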

## Locate the DBT Files
The `get_dbt_details` method takes the DBT source config and detects the source type (GCS, S3, local, or file server) from the fields it contains.

```python
from metadata.utils.dbt_config import get_dbt_details

# Excerpt from inside the source class, hence the use of `self`.
# The result is indexed as (catalog, manifest, run_results).
dbt_details = get_dbt_details(self.source_config.dbtConfigSource)
self.dbt_catalog = dbt_details[0] if dbt_details else None
self.dbt_manifest = dbt_details[1] if dbt_details else None
self.dbt_run_results = dbt_details[2] if dbt_details else None
```

After scanning the bucket or path provided, the `manifest`, `catalog`, and `run_results` files are fetched.
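
To illustrate what detecting the source type from the config fields can look like, here is a simplified sketch. It is not the implementation of `get_dbt_details`. The function `detect_dbt_source` is hypothetical, and apart from the S3 fields shown in the example config, the field names it checks (`gcsConfig`, `dbtManifestHttpPath`, `dbtManifestFilePath`) are assumptions made for illustration.

```python
from typing import Any, Dict, Optional


def detect_dbt_source(dbt_config: Optional[Dict[str, Any]]) -> Optional[str]:
    """Guess the DBT file source from the fields present (hypothetical helper)."""
    if not dbt_config:
        return None

    security = dbt_config.get("dbtSecurityConfig") or {}

    # Bucket-based sources carry a prefix config plus cloud credentials.
    if "dbtPrefixConfig" in dbt_config:
        if "awsAccessKeyId" in security or "awsRegion" in security:
            return "s3"
        if "gcsConfig" in security:  # assumed field name
            return "gcs"

    # Path-based sources point directly at the files (assumed field names).
    if "dbtManifestHttpPath" in dbt_config:
        return "file server"
    if "dbtManifestFilePath" in dbt_config:
        return "local"

    return None


s3_config = {
    "dbtSecurityConfig": {"awsAccessKeyId": "my-key-id", "awsRegion": "us-east-1"},
    "dbtPrefixConfig": {"dbtBucketName": "my-bucket", "dbtObjectPrefix": "dbt/"},
}

print(detect_dbt_source(s3_config))
```

Returning `None` for an empty or unrecognized config mirrors the `if dbt_details else None` guards in the snippet above, so callers can skip DBT processing when nothing is configured.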

## Parsing the DBT Files
The `_parse_data_model` method parses the fetched manifest, catalog, and run_results files and converts the DBT data into `DataModel` objects.

```python
from metadata.ingestion.source.database.dbt_source import _parse_data_model

# Parse the fetched DBT files into data models
_parse_data_model()
```
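
The sketch below gives a rough idea of the kind of work this parsing step involves. It is not the SDK's `_parse_data_model`: the `SimpleDataModel` class and the `parse_models` function are hypothetical stand-ins. It relies only on the standard layout of the dbt artifacts, where `manifest.json` holds a `nodes` mapping with `resource_type`, `name`, `description`, and `depends_on` entries, and `catalog.json` holds per-node `columns` with a `type`.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class SimpleDataModel:
    """Hypothetical stand-in for OpenMetadata's DataModel, for illustration only."""

    name: str
    description: str
    upstream: List[str] = field(default_factory=list)
    column_types: Dict[str, str] = field(default_factory=dict)


def parse_models(
    manifest: Dict[str, Any], catalog: Dict[str, Any]
) -> Dict[str, SimpleDataModel]:
    """Combine manifest and catalog entries into one record per dbt model."""
    models = {}
    catalog_nodes = catalog.get("nodes", {})
    for unique_id, node in manifest.get("nodes", {}).items():
        # The manifest also lists tests, seeds, etc.; keep models only.
        if node.get("resource_type") != "model":
            continue
        catalog_columns = catalog_nodes.get(unique_id, {}).get("columns", {})
        models[unique_id] = SimpleDataModel(
            name=node["name"],
            description=node.get("description", ""),
            upstream=node.get("depends_on", {}).get("nodes", []),
            column_types={
                col: meta.get("type", "") for col, meta in catalog_columns.items()
            },
        )
    return models


# Tiny in-memory samples shaped like the dbt artifacts (illustrative values).
manifest = {
    "nodes": {
        "model.shop.orders": {
            "resource_type": "model",
            "name": "orders",
            "description": "One row per order",
            "depends_on": {"nodes": ["model.shop.stg_orders"]},
        },
        "test.shop.not_null_orders_id": {
            "resource_type": "test",
            "name": "not_null_orders_id",
        },
    }
}
catalog = {
    "nodes": {
        "model.shop.orders": {
            "columns": {"order_id": {"type": "integer"}, "status": {"type": "varchar"}}
        }
    }
}

models = parse_models(manifest, catalog)
for unique_id, model in models.items():
    print(unique_id, model.upstream, model.column_types)
```

In practice the two dictionaries would come from loading the fetched `manifest.json` and `catalog.json` with `json.load`.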

## Extracting the DBT data
The extracted models are shown in the `DBT` tab of the OpenMetadata UI.

<Image
src={"/images/sdk/python/ingestion/extracting-dbt-data.png"}
alt="Extracting DBT data"
/>