---
description: >-
  Learn how to capture lineage information directly from Airflow DAGs using the
  OpenMetadata Lineage Backend.
---
# Airflow Lineage
## Introduction
Obtaining metadata should be as simple as possible, and developers should be able to keep using their tools without any major changes.
We can directly use [Airflow code](https://airflow.apache.org/docs/apache-airflow/stable/lineage.html#lineage-backend) to help us track data lineage.
What we want to achieve through this backend is the ability to link OpenMetadata Table Entities and the pipelines that have those instances as inputs or outputs.
Being able to control and monitor these relationships can play a major role in helping discover and communicate issues to your company data practitioners and stakeholders.
This document will guide you through the installation, configuration and internals of the process to help you unlock as much value as possible from within your Airflow pipelines.
## Quickstart
### Installation
The Lineage Backend can be installed directly in the Airflow instances as part of the usual OpenMetadata Python distribution:
{% tabs %}
{% tab title="Install Using PyPI" %}
```bash
pip install 'openmetadata-ingestion'
```
{% endtab %}
{% endtabs %}
### Adding Lineage Config
After the installation, we need to update the Airflow configuration to register the Lineage Backend in `airflow.cfg`.
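A minimal `airflow.cfg` sketch is shown below. The backend class path and the OpenMetadata-specific keys (`airflow_service_name`, `openmetadata_api_endpoint`, `jwt_token`) are assumptions based on the `airflow-provider-openmetadata` package and may differ across releases, so check them against your installed version; the endpoint and token are placeholders for your own deployment:

```ini
# Hedged sketch -- verify key names against your installed
# airflow-provider-openmetadata version.
[lineage]
backend = airflow_provider_openmetadata.lineage.backend.OpenMetadataLineageBackend
airflow_service_name = airflow
openmetadata_api_endpoint = http://localhost:8585/api
jwt_token = <your-openmetadata-jwt-token>
```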
The backend will pick up lineage from any task whose operator contains the attributes `inlets` and `outlets`. When creating our tasks, we can pass either of these two parameters as follows:
```python
from airflow.operators.bash import BashOperator

BashOperator(
    task_id="print_date",
    bash_command="date",
    outlets={
        "tables": ["bigquery_gcp.shopify.dim_address"]
    },
)
```
Note how in this example we define a Python `dict` with the key `tables` and a list as its value. This list should contain the
FQDN of tables ingested through any of our connectors or APIs.
When each task is processed, we will use the OpenMetadata client to add the lineage information
(upstream for inlets and downstream for outlets) between the Pipeline and Table Entities.
It is important to get the naming right, as we will fetch the Table Entity by its FQDN. If no lineage information is
specified, we will just ingest the Pipeline Entity without adding anything further.
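Assuming the three-level `service.database.table` layout used in the example above (newer OpenMetadata releases may add a schema level), a small helper to build such an FQDN could look like this; all names here are illustrative:

```python
def table_fqdn(service: str, database: str, table: str) -> str:
    # Dot-separated fully qualified name the backend uses to look up
    # the Table Entity; every component must match the ingested metadata.
    return ".".join((service, database, table))

# Reproduces the outlet declared in the BashOperator example.
print(table_fqdn("bigquery_gcp", "shopify", "dim_address"))
```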
You can see another example [here](https://github.com/open-metadata/OpenMetadata/blob/main/ingestion/examples/airflow_lineage/openmetadata_airflow_lineage_example.py).
## Lineage Callback
One of the downsides of the Lineage Backend is that it does not get executed when a task fails.
To still capture metadata from the workflow in that case, we can configure the OpenMetadata lineage callback.
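A hedged sketch of how such a callback could be wired into a DAG's `default_args` is shown below. The import path and callback name are assumptions based on the `airflow-provider-openmetadata` package and may differ between versions, so check them against your installation:

```python
# Assumed import path -- verify against your installed
# airflow-provider-openmetadata version.
from airflow_provider_openmetadata.lineage.callback import failure_callback

default_args = {
    "owner": "openmetadata",
    # Invoked by Airflow when a task fails, so lineage is still reported.
    "on_failure_callback": failure_callback,
}
```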