mirror of
https://github.com/datahub-project/datahub.git
synced 2025-07-04 23:57:03 +00:00
### Setup

The artifacts used by this source are:

- [dbt manifest file](https://docs.getdbt.com/reference/artifacts/manifest-json)
  - This file contains model, source, test, and lineage data.
- [dbt catalog file](https://docs.getdbt.com/reference/artifacts/catalog-json)
  - This file contains schema data.
  - dbt does not record schema data for ephemeral models. DataHub will therefore show ephemeral models in the lineage, but without an associated schema.
- [dbt sources file](https://docs.getdbt.com/reference/artifacts/sources-json)
  - This file contains metadata for sources with freshness checks.
  - We transfer dbt's freshness checks to DataHub's last-modified fields.
  - Note that this file is optional. If it is not specified, we use the time of ingestion as a proxy for the last-modified time.
- [dbt run_results file](https://docs.getdbt.com/reference/artifacts/run-results-json)
  - This file contains metadata from the result of a dbt run, e.g. `dbt test`.
  - When provided, we transfer dbt test run results into assertion run events, which gives a timeline of test runs on the dataset.

To generate these files, we recommend the following workflow for dbt build and DataHub ingestion.

```sh
# Writes target/sources.json
dbt source snapshot-freshness
dbt build
# Back up the run results from `dbt build`, since `dbt docs generate` writes its own run_results.json
cp target/run_results.json target/run_results_backup.json
# Writes target/catalog.json
dbt docs generate
# Restore the run results from `dbt build`
cp target/run_results_backup.json target/run_results.json

# Run datahub ingestion, pointing at the files in the target/ directory
```
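
The backup-and-restore steps around `dbt docs generate` can also be wrapped in a small helper. The following is a minimal sketch, not something shipped with dbt or DataHub: the function name `with_run_results_preserved` is hypothetical, and it assumes the artifacts live in the directory passed as the first argument.

```sh
# Hypothetical helper (not part of dbt or DataHub): run a command while
# preserving the existing run_results.json in the given target directory.
with_run_results_preserved() {
  target_dir=$1
  shift
  # Back up the current run results; fail early if there are none.
  cp "$target_dir/run_results.json" "$target_dir/run_results_backup.json" || return 1
  # Run the wrapped command and remember its exit status.
  rc=0
  "$@" || rc=$?
  # Restore the backup even if the wrapped command failed.
  cp "$target_dir/run_results_backup.json" "$target_dir/run_results.json" || return 1
  return "$rc"
}
```

With this helper, the three middle steps of the workflow could be written as `with_run_results_preserved target dbt docs generate`. The helper returns the exit status of the wrapped command, so a failing `dbt docs generate` is still visible to the caller.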

The necessary artifact files will then appear in the `target/` directory of your dbt project.
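
Before running ingestion, it can help to confirm which artifact files are actually present. The following is a minimal sketch: the function name `check_dbt_artifacts` is hypothetical, and treating `manifest.json` and `catalog.json` as required (with `sources.json` and `run_results.json` as optional) is an assumption of this example based on the descriptions above, not a statement of DataHub's validation rules.

```sh
# Hypothetical helper: report which dbt artifact files exist in a directory.
# Assumption: manifest.json and catalog.json are treated as required here,
# sources.json and run_results.json as optional.
check_dbt_artifacts() {
  target_dir=${1:-target}
  missing=0
  for name in manifest.json catalog.json; do
    if [ -f "$target_dir/$name" ]; then
      echo "found: $name"
    else
      echo "missing: $name"
      missing=1
    fi
  done
  for name in sources.json run_results.json; do
    if [ -f "$target_dir/$name" ]; then
      echo "found: $name"
    else
      echo "not found (optional): $name"
    fi
  done
  # Non-zero exit status if any file treated as required is absent.
  return "$missing"
}
```

Calling `check_dbt_artifacts target` from the root of the dbt project prints one line per artifact and returns a non-zero status if a file treated as required is absent.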

We also have guides on handling more complex dbt orchestration techniques and multi-project setups below.

:::note Entity is in manifest but missing from catalog

This warning usually appears when the `catalog.json` file was not generated by a `dbt docs generate` command.
Most other dbt commands generate a partial catalog file, which may impact the completeness of the metadata ingested into DataHub.

Following the above workflow should ensure that the catalog file is generated correctly.

:::