mirror of https://github.com/datahub-project/datahub.git (synced 2025-11-01 11:19:05 +00:00)
docs(ingest): Add instructions to install required dependency (#2995)
parent aa253f5b3b
commit 2712f5587e
@@ -979,8 +979,11 @@ If you're simply looking to run ingestion on a schedule, take a look at these sa
 The Airflow lineage backend is only supported in Airflow 1.10.15+ and 2.0.2+.

 :::

-1. First, you must configure an Airflow hook for Datahub. We support both a Datahub REST hook and a Kafka-based hook, but you only need one.
+1. You need to install the required dependency in your airflow. See https://registry.astronomer.io/providers/datahub/modules/datahublineagebackend
+```shell
+pip install acryl-datahub[airflow]
+```
+2. You must configure an Airflow hook for Datahub. We support both a Datahub REST hook and a Kafka-based hook, but you only need one.

 ```shell
 # For REST-based:
@@ -989,7 +992,7 @@ The Airflow lineage backend is only supported in Airflow 1.10.15+ and 2.0.2+.
 airflow connections add --conn-type 'datahub_kafka' 'datahub_kafka_default' --conn-host 'broker:9092' --conn-extra '{}'
 ```

-2. Add the following lines to your `airflow.cfg` file.
+3. Add the following lines to your `airflow.cfg` file.
 ```ini
 [lineage]
 backend = datahub_provider.lineage.datahub.DatahubLineageBackend
@@ -1005,8 +1008,8 @@ The Airflow lineage backend is only supported in Airflow 1.10.15+ and 2.0.2+.
 - `capture_ownership_info` (defaults to true): If true, the owners field of the DAG will be capture as a DataHub corpuser.
 - `capture_tags_info` (defaults to true): If true, the tags field of the DAG will be captured as DataHub tags.
 - `graceful_exceptions` (defaults to true): If set to true, most runtime errors in the lineage backend will be suppressed and will not cause the overall task to fail. Note that configuration issues will still throw exceptions.
-3. Configure `inlets` and `outlets` for your Airflow operators. For reference, look at the sample DAG in [`lineage_backend_demo.py`](./src/datahub_provider/example_dags/lineage_backend_demo.py), or reference [`lineage_backend_taskflow_demo.py`](./src/datahub_provider/example_dags/lineage_backend_taskflow_demo.py) if you're using the [TaskFlow API](https://airflow.apache.org/docs/apache-airflow/stable/concepts/taskflow.html).
-4. [optional] Learn more about [Airflow lineage](https://airflow.apache.org/docs/apache-airflow/stable/lineage.html), including shorthand notation and some automation.
+4. Configure `inlets` and `outlets` for your Airflow operators. For reference, look at the sample DAG in [`lineage_backend_demo.py`](./src/datahub_provider/example_dags/lineage_backend_demo.py), or reference [`lineage_backend_taskflow_demo.py`](./src/datahub_provider/example_dags/lineage_backend_taskflow_demo.py) if you're using the [TaskFlow API](https://airflow.apache.org/docs/apache-airflow/stable/concepts/taskflow.html).
+5. [optional] Learn more about [Airflow lineage](https://airflow.apache.org/docs/apache-airflow/stable/lineage.html), including shorthand notation and some automation.

 ### Emitting lineage via a separate operator
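
Note on the `airflow.cfg` step touched by this change: the hunks above show only the first lines of the `[lineage]` section and the three documented options, so the following is a rough sketch rather than the project's own tooling. It renders a `[lineage]` section with Python's standard `configparser`. The option name `datahub_kwargs` and the key `datahub_conn_id` are assumptions, since neither is visible in this diff. The connection id `datahub_kafka_default` is taken from the `airflow connections add` line above, and `build_lineage_section` is a hypothetical helper name.

```python
import configparser
import io
import json


def build_lineage_section(conn_id: str = "datahub_kafka_default") -> str:
    """Render an airflow.cfg [lineage] section as text (illustrative only)."""
    # The three capture/exception options are documented in the diff above,
    # each defaulting to true. "datahub_conn_id" is an ASSUMED key name.
    kwargs = {
        "datahub_conn_id": conn_id,
        "capture_ownership_info": True,
        "capture_tags_info": True,
        "graceful_exceptions": True,
    }
    # Disable interpolation so the JSON value is written verbatim.
    parser = configparser.ConfigParser(interpolation=None)
    parser["lineage"] = {
        "backend": "datahub_provider.lineage.datahub.DatahubLineageBackend",
        # "datahub_kwargs" is an ASSUMED option name (not shown in this diff).
        "datahub_kwargs": json.dumps(kwargs),
    }
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    # Print the section so it can be reviewed before copying into airflow.cfg.
    print(build_lineage_section())
```

Generating the section this way keeps the JSON value on one line and serializes the booleans as lowercase `true`. Check the option names against the full README before relying on the output.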