mirror of https://github.com/datahub-project/datahub.git (synced 2025-11-04 12:51:23 +00:00)
	docs(ingest): Add instructions to install required dependency (#2995)
parent aa253f5b3b
commit 2712f5587e
@@ -979,8 +979,11 @@ If you're simply looking to run ingestion on a schedule, take a look at these sa
The Airflow lineage backend is only supported in Airflow 1.10.15+ and 2.0.2+.
:::
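The version requirement in the note above can be checked before enabling the backend. The following is a minimal sketch, not part of DataHub or Airflow: the helper names `parse_version` and `supports_lineage_backend` are hypothetical, and the check assumes plain `MAJOR.MINOR.PATCH` version strings.

```python
import re
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Extract the leading MAJOR.MINOR.PATCH numbers from a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


def supports_lineage_backend(version: str) -> bool:
    """Return True for Airflow 1.10.15+ (1.x line) or 2.0.2+ (2.x and later)."""
    parsed = parse_version(version)
    if parsed[0] == 1:
        # 1.x releases: the backend needs at least 1.10.15.
        return parsed >= (1, 10, 15)
    # 2.x and later: the backend needs at least 2.0.2 (2.0.0 and 2.0.1 are excluded).
    return parsed >= (2, 0, 2)


if __name__ == "__main__":
    for candidate in ("1.10.14", "1.10.15", "2.0.1", "2.0.2"):
        print(candidate, supports_lineage_backend(candidate))
```

In a real environment you would likely pass `airflow.__version__` to the helper instead of a hard-coded string.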
1. First, you must configure an Airflow hook for Datahub. We support both a Datahub REST hook and a Kafka-based hook, but you only need one.
1. Install the required dependency in your Airflow environment. See https://registry.astronomer.io/providers/datahub/modules/datahublineagebackend
  ```shell
    pip install 'acryl-datahub[airflow]'
  ```
2. You must configure an Airflow hook for Datahub. We support both a Datahub REST hook and a Kafka-based hook, but you only need one.
   ```shell
   # For REST-based:
@@ -989,7 +992,7 @@ The Airflow lineage backend is only supported in Airflow 1.10.15+ and 2.0.2+.
   airflow connections add  --conn-type 'datahub_kafka' 'datahub_kafka_default' --conn-host 'broker:9092' --conn-extra '{}'
   ```
2. Add the following lines to your `airflow.cfg` file.
3. Add the following lines to your `airflow.cfg` file.
   ```ini
   [lineage]
   backend = datahub_provider.lineage.datahub.DatahubLineageBackend
@@ -1005,8 +1008,8 @@ The Airflow lineage backend is only supported in Airflow 1.10.15+ and 2.0.2+.
   - `capture_ownership_info` (defaults to true): If true, the owners field of the DAG will be captured as a DataHub corpuser.
   - `capture_tags_info` (defaults to true): If true, the tags field of the DAG will be captured as DataHub tags.
   - `graceful_exceptions` (defaults to true): If set to true, most runtime errors in the lineage backend will be suppressed and will not cause the overall task to fail. Note that configuration issues will still throw exceptions.
3. Configure `inlets` and `outlets` for your Airflow operators. For reference, look at the sample DAG in [`lineage_backend_demo.py`](./src/datahub_provider/example_dags/lineage_backend_demo.py), or reference [`lineage_backend_taskflow_demo.py`](./src/datahub_provider/example_dags/lineage_backend_taskflow_demo.py) if you're using the [TaskFlow API](https://airflow.apache.org/docs/apache-airflow/stable/concepts/taskflow.html).
4. [optional] Learn more about [Airflow lineage](https://airflow.apache.org/docs/apache-airflow/stable/lineage.html), including shorthand notation and some automation.
4. Configure `inlets` and `outlets` for your Airflow operators. For reference, look at the sample DAG in [`lineage_backend_demo.py`](./src/datahub_provider/example_dags/lineage_backend_demo.py), or reference [`lineage_backend_taskflow_demo.py`](./src/datahub_provider/example_dags/lineage_backend_taskflow_demo.py) if you're using the [TaskFlow API](https://airflow.apache.org/docs/apache-airflow/stable/concepts/taskflow.html).
5. [optional] Learn more about [Airflow lineage](https://airflow.apache.org/docs/apache-airflow/stable/lineage.html), including shorthand notation and some automation.
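The `airflow.cfg` step above can also be generated rather than typed by hand. The following is a rough sketch using only the Python standard library. The `backend` value and the three option names come from the steps above; the `datahub_kwargs` key that carries them as JSON is an assumption, since this diff does not show that part of the file, and `render_lineage_section` is a hypothetical helper name.

```python
import configparser
import io
import json

# Backend class path as shown in the airflow.cfg snippet above.
BACKEND = "datahub_provider.lineage.datahub.DatahubLineageBackend"


def render_lineage_section(
    capture_ownership_info: bool = True,
    capture_tags_info: bool = True,
    graceful_exceptions: bool = True,
) -> str:
    """Render a [lineage] section as text, ready to paste into airflow.cfg."""
    # Interpolation is disabled so the JSON value is written out verbatim.
    config = configparser.ConfigParser(interpolation=None)
    config["lineage"] = {
        "backend": BACKEND,
        # Assumption: options are passed as a JSON object under `datahub_kwargs`.
        "datahub_kwargs": json.dumps(
            {
                "capture_ownership_info": capture_ownership_info,
                "capture_tags_info": capture_tags_info,
                "graceful_exceptions": graceful_exceptions,
            }
        ),
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    print(render_lineage_section(graceful_exceptions=False))
```

Setting `graceful_exceptions=False` here is only for illustration; with the default of true, most runtime errors in the lineage backend are suppressed rather than failing the task.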
### Emitting lineage via a separate operator