lineage-docs (#13472)
@ -0,0 +1,45 @@
---
title: How Column-Level Lineage Works
slug: /how-to-guides/openmetadata/data-lineage/column
---

# How Column-Level Lineage Works

OpenMetadata supports rich column-level lineage for understanding the relationships between tables and for performing impact analysis. Users can manually edit both table-level and column-level lineage to capture any information that is not automatically surfaced.

{% image
src="/images/v1.1/how-to-guides/lineage/lineage1.png"
alt="Column-Level Data Lineage in OpenMetadata"
caption="Column-Level Data Lineage in OpenMetadata"
/%}

{% note noteType="Tip" %} **Quick Tip:** Drill down to view all the available columns for a table when viewing column-level lineage. {% /note %}

You can generate the column-level lineage automatically by running the **Lineage Ingestion**.

{% image
src="/images/v1.1/how-to-guides/lineage/ingestion.png"
alt="Lineage Ingestion"
caption="Lineage Ingestion"
/%}

## Manually Edit Column Level Lineage

OpenMetadata supports manual editing of both table-level and column-level lineage. To edit the lineage for individual columns, click the edit option at the top right. Use the anchor points on either side of the columns to create links and trace individual columns through their lineage. You can also add new tables that have columns you want to trace, and connect the relevant columns to the current lineage.

{% image
src="/images/v1.1/how-to-guides/lineage/column1.png"
alt="Manually Edit Column Level Lineage"
caption="Manually Edit Column Level Lineage"
/%}
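The column links described above can be thought of as a directed graph over columns, which is what makes impact analysis possible. The following is a minimal Python sketch of that idea, assuming a hypothetical in-memory list of column edges. The column names and the `downstream_columns` helper are illustrative only and are not part of OpenMetadata.

```python
from collections import defaultdict, deque

# Hypothetical column-level edges: (source column, target column).
# The fully qualified names are made up for illustration.
COLUMN_EDGES = [
    ("raw.orders.amount", "staging.orders.amount_usd"),
    ("raw.orders.currency", "staging.orders.amount_usd"),
    ("staging.orders.amount_usd", "mart.revenue.total_usd"),
    ("raw.orders.id", "staging.orders.order_id"),
]


def downstream_columns(edges, start):
    """Return every column reachable from `start`, i.e. the columns
    that could be affected if `start` changes."""
    graph = defaultdict(list)
    for source, target in edges:
        graph[source].append(target)

    seen = set()
    queue = deque([start])
    while queue:
        column = queue.popleft()
        for target in graph[column]:
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return sorted(seen)


impacted = downstream_columns(COLUMN_EDGES, "raw.orders.currency")
print(impacted)
```

In this toy graph, a change to `raw.orders.currency` reaches both the staging column and the mart column, while a column with no outgoing links has no downstream impact.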

Watch the video on editing column-level lineage.
{% youtube videoId="HTkbTvi2H9c" start="0:00" end="00:51" /%}

{%inlineCallout
color="violet-70"
bold="Manual Lineage"
icon="MdArrowForward"
href="/how-to-guides/openmetadata/data-lineage/manual"%}
Edit the table and column level lineage manually.
{%/inlineCallout%}
|
||||
@ -0,0 +1,63 @@
---
title: Explore the Lineage View
slug: /how-to-guides/openmetadata/data-lineage/explore
---

# Explore the Lineage View

The OpenMetadata UI displays end-to-end lineage at both the table and column levels. OpenMetadata supports lineage for Databases, Dashboards, and Pipelines. Search for a data asset and expand the graph to unfold its lineage. The graph displays the upstream and downstream edges for each node. The lineage details include the SQL query, pipeline information, and column lineage.

In the lineage view shown in the example below, the table on the left is the parent or **Source** node, and the table on the right is the **Target** node. You can also identify the target node by the arrow attached to it. The arrow connecting the data assets or tables is the **Edge**. Clicking on an edge connecting a source and a target displays all the edge information: the Source, Target, Description, and SQL Query. In this example, the target table is of the type View, so the edge displays the SQL query used to generate the view. The SQL query shows how the target table was generated from the source table.

{% image
src="/images/v1.1/how-to-guides/lineage/edge.png"
alt="Edge Information: Source and Target"
caption="Edge Information: Source and Target"
/%}

{% note noteType="Tip" %} **Tip:** Metadata ingestion also brings in the view lineage if the database has views (data assets of the type View). {% /note %}

You can set up the **Lineage Config** to display the required number of Upstream and Downstream Nodes, as well as the Nodes per layer. You can set up to **3** Upstream and Downstream Nodes.
{% image
src="/images/v1.1/how-to-guides/lineage/nodes.png"
alt="Lineage Config"
caption="Lineage Config"
/%}
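To make the effect of the Lineage Config limits concrete, here is a small Python sketch of a depth-limited walk over a toy lineage graph. It assumes the Upstream and Downstream values count hops from the selected asset. The graph, the asset names, and the `invert` and `neighbours_within` helpers are hypothetical and do not reflect how the UI is implemented.

```python
from collections import deque

# Hypothetical table-level lineage: source -> list of targets.
DOWNSTREAM = {
    "raw_orders": ["stg_orders"],
    "stg_orders": ["fct_orders"],
    "fct_orders": ["orders_dashboard"],
    "orders_dashboard": [],
}


def invert(graph):
    """Build the upstream view (target -> sources) from the downstream one."""
    upstream = {node: [] for node in graph}
    for source, targets in graph.items():
        for target in targets:
            upstream.setdefault(target, []).append(source)
    return upstream


def neighbours_within(graph, start, depth):
    """Breadth-first walk that stops after `depth` hops from `start`.
    Returns a mapping of node -> number of hops."""
    found = {}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == depth:
            continue
        for nxt in graph.get(node, []):
            if nxt not in found and nxt != start:
                found[nxt] = hops + 1
                queue.append((nxt, hops + 1))
    return found


# One hop downstream of stg_orders, then three hops upstream of the dashboard.
print(neighbours_within(DOWNSTREAM, "stg_orders", depth=1))
print(neighbours_within(invert(DOWNSTREAM), "orders_dashboard", depth=3))
```

With a depth of 1, only the immediate neighbour is returned. With a depth of 3, the walk from the dashboard reaches all the way back to `raw_orders`.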

Click on a data asset to view its details:
- Users can view the Source, Name of the Data Asset, Description, Owner (Team/User details), Tier, and Usage information for the data asset.
- Based on the **type of data asset** (Table, Topic, Dashboard, Pipeline, ML Model, Container), the quick preview provides additional information. For example, for `tables`, the type of table, the number of queries, and columns are displayed.
- The **data quality and profiler metrics** display the details on the Tests Passed, Aborted, and Failed.
- Users can view all the **tags** associated with the data asset.
- The **Schema** provides the details on the column names, type of column, and column description.

{% image
src="/images/v1.1/how-to-guides/lineage/lineage2.png"
alt="Quick Glance at the Data Asset from Lineage View"
caption="Quick Glance at the Data Asset from Lineage View"
/%}

Clicking on a table displays its list of columns and the column-level lineage.
{% image
src="/images/v1.1/how-to-guides/lineage/lineage1.png"
alt="Column-Level Data Lineage in OpenMetadata"
caption="Column-Level Data Lineage in OpenMetadata"
/%}

In the case of **Pipelines**, the lineage is first ingested from the databases. Then, when setting up the pipeline ingestion, we specify the database service name. That way, we display the lineage of the database tables connected via pipelines. If lineage is created through a pipeline, the pipeline is displayed in the Edge information.

{% image
src="/images/v1.1/how-to-guides/lineage/pipeline.png"
alt="Database and Pipeline Lineage"
caption="Database and Pipeline Lineage"
/%}

Similarly, for a **Dashboard**, the lineage is first ingested from the databases. Then, when setting up the dashboard ingestion, the data models and charts are ingested. That way, we display the lineage of the database tables connected through the dashboard data models.

{%inlineCallout
color="violet-70"
bold="Column-Level Lineage"
icon="MdArrowForward"
href="/how-to-guides/openmetadata/data-lineage/column"%}
Explore and edit the rich column-level lineage.
{%/inlineCallout%}
|
||||
@ -5,8 +5,45 @@ slug: /how-to-guides/openmetadata/data-lineage

# Overview of Data Lineage

OpenMetadata tracks data lineage, showing how data moves through the organization's systems. Users can visualize how data is transformed and where it is used, helping with data traceability and impact analysis. OpenMetadata supports lineage for Database, Dashboard, and Pipelines.

{% image
src="/images/v1.1/how-to-guides/lineage/lineage1.png"
alt="Data Lineage in OpenMetadata"
caption="Data Lineage in OpenMetadata"
/%}

Watch the video on data lineage to understand the different options for automatically extracting lineage from data warehouses such as Snowflake and dashboard services such as Metabase. The video also covers creating lineage programmatically with the Python SDK.

{% youtube videoId="jEbN1tt89H0" start="0:00" end="41:43" /%}

{%inlineCalloutContainer%}
{%inlineCallout
color="violet-70"
bold="Lineage Workflow"
icon="MdPolyline"
href="/how-to-guides/openmetadata/data-lineage/workflow"%}
Configure a lineage workflow right from the UI.
{%/inlineCallout%}
{%inlineCallout
color="violet-70"
bold="Explore Lineage"
icon="MdPolyline"
href="/how-to-guides/openmetadata/data-lineage/explore"%}
Explore the rich lineage view in OpenMetadata.
{%/inlineCallout%}
{%inlineCallout
color="violet-70"
bold="Column-Level Lineage"
icon="MdViewColumn"
href="/how-to-guides/openmetadata/data-lineage/column"%}
Explore and edit the rich column-level lineage.
{%/inlineCallout%}
{%inlineCallout
color="violet-70"
bold="Manual Lineage"
icon="MdPolyline"
href="/how-to-guides/openmetadata/data-lineage/manual"%}
Edit the table and column level lineage manually.
{%/inlineCallout%}
{%/inlineCalloutContainer%}
|
||||
@ -0,0 +1,42 @@
---
title: How to Manually Add or Edit Lineage
slug: /how-to-guides/openmetadata/data-lineage/manual
---

# How to Manually Add or Edit Lineage

Edit lineage to provide a richer understanding of the provenance of data. The OpenMetadata no-code editor provides a drag-and-drop interface. Drop tables, topics, pipelines, dashboards, ML models, and containers onto the lineage graph. You may add new edges or delete existing edges to better represent data lineage.

OpenMetadata supports manual editing of both table-level and column-level lineage. You build the lineage by creating edges: connect the source node of the lineage to the destination node.

Once you have ingested your database and dashboard services:
- Start by picking one database service, and select a table. In the data asset details page, navigate to the Lineage Tab.
- Click on the Edit option to enable the lineage editor.
- Select the type of data asset (table, topic, dashboard, ML model, container, pipeline) to connect to as the destination.

{% image
src="/images/v1.1/how-to-guides/lineage/l1.png"
alt="Data Asset: Lineage Tab"
caption="Data Asset: Lineage Tab"
/%}

- Search and select the relevant data asset.
- Create an edge between these two data assets.

{% image
src="/images/v1.1/how-to-guides/lineage/l2.png"
alt="Link the Table to the Dashboard to Add Lineage Manually"
caption="Link the Table to the Dashboard to Add Lineage Manually"
/%}

- You can also expand a table to view the available columns.
- Link the relevant columns together by connecting the column edges to trace column-level lineage.

{% image
src="/images/v1.1/how-to-guides/lineage/l3.png"
alt="Column-Level Lineage"
caption="Column-Level Lineage"
/%}
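The edge created in the steps above can also be expressed as data. The sketch below assembles such an edge in Python, with optional column links. The field names (`edge`, `fromEntity`, `toEntity`, `lineageDetails`, `columnsLineage`, `fromColumns`, `toColumn`) are assumptions modeled on the shape a lineage REST request might take, and the IDs and column names are placeholders. Verify the exact schema against the API documentation for your OpenMetadata version before sending anything to a server.

```python
import json


def build_lineage_edge(from_id, from_type, to_id, to_type, column_links=None):
    """Assemble a lineage edge between two data assets.

    `column_links` maps a target column to the list of source columns
    feeding it; leave it empty for table-level lineage only.
    """
    edge = {
        "fromEntity": {"id": from_id, "type": from_type},
        "toEntity": {"id": to_id, "type": to_type},
    }
    if column_links:
        edge["lineageDetails"] = {
            "columnsLineage": [
                {"fromColumns": sources, "toColumn": target}
                for target, sources in column_links.items()
            ]
        }
    return {"edge": edge}


# Placeholder IDs and column names, for illustration only.
payload = build_lineage_edge(
    from_id="00000000-0000-0000-0000-000000000001",
    from_type="table",
    to_id="00000000-0000-0000-0000-000000000002",
    to_type="table",
    column_links={
        "svc.db.schema.target_table.total": [
            "svc.db.schema.source_table.price",
            "svc.db.schema.source_table.quantity",
        ]
    },
)
print(json.dumps(payload, indent=2))
```

The sketch only builds and prints the structure, so it can be run without a server. A table-to-dashboard edge would use the same function without `column_links`.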

Watch the video about lineage (13:30 to 15:50)
{% youtube videoId="jEbN1tt89H0" start="13:30" end="15:48" /%}
|
||||
@ -0,0 +1,89 @@
---
title: How to Deploy a Lineage Workflow
slug: /how-to-guides/openmetadata/data-lineage/workflow
---

# How to Deploy a Lineage Workflow

Lineage data can be ingested from your data sources right from the OpenMetadata UI. Currently, the lineage workflow is supported for a limited set of connectors, such as [BigQuery](/connectors/database/bigquery), [Snowflake](/connectors/database/snowflake), [MSSQL](/connectors/database/mssql), [Redshift](/connectors/database/redshift), [Clickhouse](/connectors/database/clickhouse), [Postgres](/connectors/database/postgres), and [Databricks](/connectors/database/databricks).

{% note noteType="Tip" %} **Tip:** Trace the upstream and downstream dependencies with Lineage. {% /note %}

## View Lineage from Metadata Ingestion
Once the metadata ingestion runs correctly and we are able to explore the service Entities, we can add the view lineage information for the data assets. This populates the Lineage tab on the data asset page. During the Metadata Ingestion workflow, we identify whether a Table is a View. For sources where we can obtain the query that generates the View, we bring in the view lineage along with the metadata. After all Tables have been ingested in the workflow, the queries that generate Views are parsed. During query parsing, we obtain the source and target tables, check whether those Tables exist in OpenMetadata, and finally create the lineage relationship between the involved Entities.

If the database has views, the view lineage is generated automatically, along with the column-level lineage. In such a case, the table type is **View**, as shown in the example below.
{% image
src="/images/v1.1/how-to-guides/lineage/view.png"
alt="View Lineage through Metadata Ingestion"
caption="View Lineage through Metadata Ingestion"
/%}
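The three steps described in this section (parse the view query, look up the tables, create the relationships) can be sketched in a few lines of Python. This is a deliberately naive illustration: regular expressions stand in for real SQL parsing and only handle simple statements. The table names, the `KNOWN_TABLES` set, and the `view_lineage` helper are hypothetical.

```python
import re

# Tables assumed to already exist in the catalog (hypothetical names).
KNOWN_TABLES = {"sales.orders", "sales.customers", "sales.order_summary"}

# Matches "CREATE [OR REPLACE] VIEW <name> AS" and captures the view name.
VIEW_RE = re.compile(
    r"create\s+(?:or\s+replace\s+)?view\s+([\w.]+)\s+as\s", re.IGNORECASE
)
# Captures the table name that follows FROM or JOIN.
SOURCE_RE = re.compile(r"\b(?:from|join)\s+([\w.]+)", re.IGNORECASE)


def view_lineage(query, known_tables):
    """Return (source, target) pairs for a simple CREATE VIEW statement,
    keeping only tables that are present in `known_tables`."""
    match = VIEW_RE.search(query)
    if not match:
        return []
    target = match.group(1)
    if target not in known_tables:
        return []
    body = query[match.end():]
    sources = []
    for name in SOURCE_RE.findall(body):
        if name in known_tables and name not in sources:
            sources.append(name)
    return [(source, target) for source in sources]


query = """
CREATE VIEW sales.order_summary AS
SELECT c.name, SUM(o.amount) AS total
FROM sales.orders o
JOIN sales.customers c ON c.id = o.customer_id
JOIN sales.unknown_table u ON u.id = o.id
GROUP BY c.name
"""
print(view_lineage(query, KNOWN_TABLES))
```

The table `sales.unknown_table` is skipped because it is not in the known set, which mirrors the lookup step: an edge is only created when both ends exist in the catalog.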

## Lineage Ingestion from UI
Apart from the Metadata ingestion, we can create a workflow that obtains the query log and table creation information from the underlying database and feeds it to OpenMetadata. The Lineage Ingestion is in charge of obtaining this data. The metadata ingestion only brings in the View lineage queries, whereas the lineage ingestion workflow brings in all the queries that can be used to generate lineage information.

### 1. Add a Lineage Ingestion

Navigate to **Settings >> Services**. Select the required service.
{% image
src="/images/v1.1/how-to-guides/lineage/wkf1.png"
alt="Select a Service"
caption="Select a Service"
/%}

Go to the **Ingestions** tab. Click on **Add Ingestion** and select **Add Lineage Ingestion**.
{% image
src="/images/v1.1/how-to-guides/lineage/wkf2.png"
alt="Add a Lineage Ingestion"
caption="Add a Lineage Ingestion"
/%}

### 2. Configure the Lineage Ingestion

Here you can enter the Lineage Ingestion details:
{% image
src="/images/v1.1/how-to-guides/lineage/wkf3.png"
alt="Configure the Lineage Ingestion"
caption="Configure the Lineage Ingestion"
/%}

### Lineage Options

**Query Log Duration:** Specify the duration in days for which the lineage workflow should capture lineage data from the query logs. For example, if you specify 2 as the value for the duration, the workflow will capture lineage information for the 2 **days** (48 hours) prior to when the ingestion workflow is run.

**Parsing Timeout Limit:** Specify the timeout limit for parsing the SQL queries to perform the lineage analysis. This must be specified in **seconds**.

**Result Limit:** Set the maximum number of query log results to process at a time. This is specified as a **number of rows**.

**Filter Condition:** We execute a query on the query history table of the respective data source to perform the query analysis and extract the lineage and usage information. This field is useful when you want to exclude some queries from this analysis. Here you can specify a SQL condition that will be applied to the query history result set. You can read more about [Usage Query Filtering here](/connectors/ingestion/workflows/usage/filter-query-set).
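As a worked example of the Query Log Duration arithmetic, the Python sketch below computes the time window covered by a run. The dictionary keys, the sample values, and the `query_log_window` helper are hypothetical and only mirror the form fields above. They are not the workflow's actual configuration schema, and the filter condition uses a made-up column name.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical option values; the keys mirror the form fields above
# but are not guaranteed to match the workflow's configuration schema.
lineage_options = {
    "query_log_duration_days": 2,
    "parsing_timeout_seconds": 300,
    "result_limit_rows": 1000,
    "filter_condition": "query_text NOT ILIKE '%temp%'",
}


def query_log_window(duration_days, run_at):
    """Return the (start, end) interval of query logs to read:
    `duration_days` days back from the moment the workflow runs."""
    return run_at - timedelta(days=duration_days), run_at


run_at = datetime(2023, 10, 5, 12, 0, tzinfo=timezone.utc)
start, end = query_log_window(lineage_options["query_log_duration_days"], run_at)
print(start.isoformat(), "->", end.isoformat())
print((end - start).total_seconds() / 3600, "hours")
```

A duration of 2 with a run at noon on 5 October yields a window that starts at noon on 3 October, which is the 48 hours mentioned above.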

### 3. Schedule and Deploy

After clicking Next, you will be redirected to the Scheduling form. This is the same as for the Metadata Ingestion. Select your desired schedule and click on Deploy. The lineage pipeline will then be added to the Service Ingestions.
{% image
src="/images/v1.1/how-to-guides/lineage/wkf4.png"
alt="Schedule and Deploy the Lineage Ingestion"
caption="Schedule and Deploy the Lineage Ingestion"
/%}

## dbt Ingestion

We can also generate lineage through [dbt ingestion](/connectors/ingestion/workflows/dbt/ingest-dbt-ui). The dbt workflow can fetch queries that carry lineage information. For a dbt ingestion pipeline, the paths to the Catalog and Manifest files must be specified. We also fetch the column-level lineage through dbt.

You can learn more about [lineage ingestion here](/connectors/ingestion/lineage).
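To show why the Manifest file carries lineage information, here is a short Python sketch that turns the dependencies recorded in a manifest into upstream and downstream pairs. It assumes the manifest layout in which each entry under `nodes` lists its parents under `depends_on.nodes`. The excerpt and the `manifest_edges` helper are hypothetical, and this is not a description of how the OpenMetadata dbt workflow is implemented.

```python
# A trimmed-down, hypothetical excerpt of a dbt manifest: each node lists
# the nodes it depends on under depends_on.nodes.
manifest = {
    "nodes": {
        "model.shop.stg_orders": {
            "depends_on": {"nodes": ["source.shop.raw.orders"]}
        },
        "model.shop.fct_orders": {
            "depends_on": {
                "nodes": ["model.shop.stg_orders", "model.shop.stg_customers"]
            }
        },
        "model.shop.stg_customers": {"depends_on": {"nodes": []}},
    }
}


def manifest_edges(manifest):
    """Turn dbt dependencies into sorted (upstream, downstream) pairs."""
    edges = []
    for node_id, node in manifest.get("nodes", {}).items():
        for parent in node.get("depends_on", {}).get("nodes", []):
            edges.append((parent, node_id))
    return sorted(edges)


for upstream, downstream in manifest_edges(manifest):
    print(f"{upstream} -> {downstream}")
```

In practice the dictionary would be loaded from the manifest file with `json.load`. The in-memory excerpt keeps the sketch self-contained.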

## Query Logs using CSV File

Lineage ingestion is supported for a few connectors, as mentioned earlier. For the unsupported connectors, you can set up [Lineage Workflows using Query Logs](/connectors/ingestion/workflows/lineage/lineage-workflow-query-logs) using a CSV file.
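A query log file of this kind can be produced with Python's standard `csv` module. The sketch below is illustrative only: the column names (`query_text`, `database_name`, `schema_name`) are assumptions, so check the linked guide for the exact header the workflow expects. The sample queries and the `render_query_log` helper are hypothetical.

```python
import csv
import io

# Column names are assumptions; check the linked guide for the exact
# header the query-log workflow expects.
FIELDNAMES = ["query_text", "database_name", "schema_name"]

rows = [
    {
        "query_text": "INSERT INTO sales.order_summary SELECT * FROM sales.orders",
        "database_name": "shop",
        "schema_name": "sales",
    },
    {
        "query_text": "CREATE TABLE sales.top_customers AS SELECT id, name FROM sales.customers",
        "database_name": "shop",
        "schema_name": "sales",
    },
]


def render_query_log(rows):
    """Serialise query-log rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = render_query_log(rows)
print(csv_text)
```

Using `csv.DictWriter` rather than manual string joining matters here because SQL text often contains commas, and the writer quotes such fields so that they survive a round trip.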

## Manual Lineage

Lineage can also be added and edited manually in OpenMetadata. For more information, refer to [adding lineage manually](/how-to-guides/openmetadata/data-lineage/manual).

{%inlineCallout
color="violet-70"
bold="Explore Lineage"
icon="MdArrowForward"
href="/how-to-guides/openmetadata/data-lineage/explore"%}
Explore the rich lineage view in OpenMetadata.
{%/inlineCallout%}
|
||||
@ -623,6 +623,14 @@ site_menu:
    url: /how-to-guides/openmetadata/data-quality-profiler
  - category: How to Guides / The Six Pillars of OpenMetadata / Data Lineage
    url: /how-to-guides/openmetadata/data-lineage
  - category: How to Guides / The Six Pillars of OpenMetadata / Data Lineage / How to Deploy a Lineage Workflow
    url: /how-to-guides/openmetadata/data-lineage/workflow
  - category: How to Guides / The Six Pillars of OpenMetadata / Data Lineage / Explore the Lineage View
    url: /how-to-guides/openmetadata/data-lineage/explore
  - category: How to Guides / The Six Pillars of OpenMetadata / Data Lineage / How Column-Level Lineage Works
    url: /how-to-guides/openmetadata/data-lineage/column
  - category: How to Guides / The Six Pillars of OpenMetadata / Data Lineage / How to Manually Add or Edit Lineage
    url: /how-to-guides/openmetadata/data-lineage/manual
  - category: How to Guides / The Six Pillars of OpenMetadata / Data Insights
    url: /how-to-guides/openmetadata/data-insights
  - category: How to Guides / The Six Pillars of OpenMetadata / Data Governance
BIN openmetadata-docs/images/v1.1/how-to-guides/lineage/column1.png (normal file, 739 KiB)
BIN openmetadata-docs/images/v1.1/how-to-guides/lineage/edge.png (normal file, 379 KiB)
BIN (file name missing in source) (619 KiB)
BIN openmetadata-docs/images/v1.1/how-to-guides/lineage/l1.png (normal file, 727 KiB)
BIN openmetadata-docs/images/v1.1/how-to-guides/lineage/l2.png (normal file, 702 KiB)
BIN openmetadata-docs/images/v1.1/how-to-guides/lineage/l3.png (normal file, 793 KiB)
BIN openmetadata-docs/images/v1.1/how-to-guides/lineage/lineage1.png (normal file, 1.7 MiB)
BIN openmetadata-docs/images/v1.1/how-to-guides/lineage/lineage2.png (normal file, 990 KiB)
BIN openmetadata-docs/images/v1.1/how-to-guides/lineage/lineage3.png (normal file, 950 KiB)
BIN openmetadata-docs/images/v1.1/how-to-guides/lineage/nodes.png (normal file, 64 KiB)
BIN openmetadata-docs/images/v1.1/how-to-guides/lineage/pipeline.png (normal file, 951 KiB)
BIN openmetadata-docs/images/v1.1/how-to-guides/lineage/view.png (normal file, 501 KiB)
BIN openmetadata-docs/images/v1.1/how-to-guides/lineage/wkf1.png (normal file, 948 KiB)
BIN openmetadata-docs/images/v1.1/how-to-guides/lineage/wkf2.png (normal file, 667 KiB)
BIN openmetadata-docs/images/v1.1/how-to-guides/lineage/wkf3.png (normal file, 770 KiB)
BIN openmetadata-docs/images/v1.1/how-to-guides/lineage/wkf4.png (normal file, 639 KiB)