42 lines
2.0 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

### Setup
The artifacts used by this source are:
- [dbt manifest file](https://docs.getdbt.com/reference/artifacts/manifest-json)
- This file contains model, source, tests and lineage data.
- [dbt catalog file](https://docs.getdbt.com/reference/artifacts/catalog-json)
- This file contains schema data.
- dbt does not record schema data for Ephemeral models, as such datahub will show Ephemeral models in the lineage, however there will be no associated schema for Ephemeral models
- [dbt sources file](https://docs.getdbt.com/reference/artifacts/sources-json)
- This file contains metadata for sources with freshness checks.
- We transfer dbt's freshness checks to DataHub's last-modified fields.
- Note that this file is optional if not specified, we'll use time of ingestion instead as a proxy for time last-modified.
- [dbt run_results file](https://docs.getdbt.com/reference/artifacts/run-results-json)
- This file contains metadata from the result of a dbt run, e.g. dbt test
- When provided, we transfer dbt test run results into assertion run events to see a timeline of test runs on the dataset
To generate these files, we recommend this workflow for dbt build and datahub ingestion.
```sh
dbt source snapshot-freshness
dbt build
cp target/run_results.json target/run_results_backup.json
dbt docs generate
cp target/run_results_backup.json target/run_results.json
# Run datahub ingestion, pointing at the files in the target/ directory
```
The necessary artifact files will then appear in the `target/` directory of your dbt project.
We also have guides on handling more complex dbt orchestration techniques and multi-project setups below.
:::note Entity is in manifest but missing from catalog
This warning usually appears when the catalog.json file was not generated by a `dbt docs generate` command.
Most other dbt commands generate a partial catalog file, which may impact the completeness of the metadata in ingested into DataHub.
Following the above workflow should ensure that the catalog file is generated correctly.
:::