Updated documentation for Test Suite (#7167)

* Added part 1 of test suite doc

* Added documentation for custom TestCase
Teddy 2022-09-03 18:19:55 +02:00 committed by GitHub
parent dc84cdbc8e
commit 7ae1b15d88
81 changed files with 477 additions and 358 deletions

View File

@ -441,6 +441,10 @@ site_menu:
url: /openmetadata/ingestion/workflows/profiler
- category: OpenMetadata / Ingestion / Workflows / Profiler / Metrics
url: /openmetadata/ingestion/workflows/profiler/metrics
- category: OpenMetadata / Ingestion / Workflows / Data Quality
url: /openmetadata/ingestion/workflows/data-quality
- category: OpenMetadata / Ingestion / Workflows / Data Quality / Tests
url: /openmetadata/ingestion/workflows/data-quality/tests
- category: OpenMetadata / Ingestion / Lineage
url: /openmetadata/ingestion/lineage
- category: OpenMetadata / Ingestion / Lineage / Edit Data Lineage Manually

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the Athena connector.
Configure and schedule Athena metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -361,7 +361,7 @@ with DAG(
Note that from connector to connector, this recipe will always be the same.
By updating the YAML configuration, you will be able to extract metadata from different sources.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the Athena connector.
Configure and schedule Athena metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -314,7 +314,7 @@ metadata ingest -c <path-to-yaml>
Note that from connector to connector, this recipe will always be the same. By updating the YAML configuration,
you will be able to extract metadata from different sources.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the Athena connector.
Configure and schedule Athena metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
If you don't want to use the OpenMetadata Ingestion container to configure the workflows via the UI, then you can check
@ -221,7 +221,7 @@ caption="Edit and Deploy the Ingestion Pipeline"
From the Connection tab, you can also Edit the Service if needed.
## Data Profiler and Quality Tests
## Data Profiler
<Tile
icon="schema"

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the AzureSQL connector.
Configure and schedule AzureSQL metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -357,7 +357,7 @@ with DAG(
Note that from connector to connector, this recipe will always be the same.
By updating the YAML configuration, you will be able to extract metadata from different sources.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the AzureSQL connector.
Configure and schedule AzureSQL metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -310,7 +310,7 @@ metadata ingest -c <path-to-yaml>
Note that from connector to connector, this recipe will always be the same. By updating the YAML configuration,
you will be able to extract metadata from different sources.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the AzureSQL connector.
Configure and schedule AzureSQL metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
If you don't want to use the OpenMetadata Ingestion container to configure the workflows via the UI, then you can check
@ -218,7 +218,7 @@ caption="Edit and Deploy the Ingestion Pipeline"
From the Connection tab, you can also Edit the Service if needed.
## Data Profiler and Quality Tests
## Data Profiler
<Tile
icon="schema"

View File

@ -11,7 +11,7 @@ Configure and schedule BigQuery metadata and profiler workflows from the OpenMet
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Query Usage and Lineage Ingestion](#query-usage-and-lineage-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -518,7 +518,7 @@ pip3 install --upgrade 'openmetadata-ingestion[bigquery-usage]'
For the usage workflow creation, the Airflow file will look the same as for the metadata ingestion. Updating the YAML configuration will be enough.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -11,7 +11,7 @@ Configure and schedule BigQuery metadata and profiler workflows from the OpenMet
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Query Usage and Lineage Ingestion](#query-usage-and-lineage-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -475,7 +475,7 @@ After saving the YAML config, we will run the command the same way we did for th
metadata ingest -c <path-to-yaml>
```
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -11,7 +11,7 @@ Configure and schedule BigQuery metadata and profiler workflows from the OpenMet
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Query Usage and Lineage Ingestion](#query-usage-and-lineage-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
If you don't want to use the OpenMetadata Ingestion container to configure the workflows via the UI, then you can check
@ -268,7 +268,7 @@ text="Learn more about how to configure the Usage Workflow to ingest Query and L
link="/openmetadata/ingestion/workflows/usage"
/>
## Data Profiler and Quality Tests
## Data Profiler
<Tile
icon="schema"

View File

@ -11,7 +11,7 @@ Configure and schedule Clickhouse metadata and profiler workflows from the OpenM
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Query Usage and Lineage Ingestion](#query-usage-and-lineage-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -443,7 +443,7 @@ pip3 install --upgrade 'openmetadata-ingestion[clickhouse-usage]'
For the usage workflow creation, the Airflow file will look the same as for the metadata ingestion. Updating the YAML configuration will be enough.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -11,7 +11,7 @@ Configure and schedule Clickhouse metadata and profiler workflows from the OpenM
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Query Usage and Lineage Ingestion](#query-usage-and-lineage-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -400,7 +400,7 @@ After saving the YAML config, we will run the command the same way we did for th
metadata ingest -c <path-to-yaml>
```
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -11,7 +11,7 @@ Configure and schedule Clickhouse metadata and profiler workflows from the OpenM
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Query Usage and Lineage Ingestion](#query-usage-and-lineage-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
If you don't want to use the OpenMetadata Ingestion container to configure the workflows via the UI, then you can check
@ -226,7 +226,7 @@ text="Learn more about how to configure the Usage Workflow to ingest Query and L
link="/openmetadata/ingestion/workflows/usage"
/>
## Data Profiler and Quality Tests
## Data Profiler
<Tile
icon="schema"

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the Databricks connecto
Configure and schedule Databricks metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -353,7 +353,7 @@ with DAG(
Note that from connector to connector, this recipe will always be the same.
By updating the YAML configuration, you will be able to extract metadata from different sources.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the Databricks connecto
Configure and schedule Databricks metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -306,7 +306,7 @@ metadata ingest -c <path-to-yaml>
Note that from connector to connector, this recipe will always be the same. By updating the YAML configuration,
you will be able to extract metadata from different sources.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the Databricks connecto
Configure and schedule Databricks metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
If you don't want to use the OpenMetadata Ingestion container to configure the workflows via the UI, then you can check
@ -217,7 +217,7 @@ caption="Edit and Deploy the Ingestion Pipeline"
From the Connection tab, you can also Edit the Service if needed.
## Data Profiler and Quality Tests
## Data Profiler
<Tile
icon="schema"

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the DB2 connector.
Configure and schedule DB2 metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -354,7 +354,7 @@ with DAG(
Note that from connector to connector, this recipe will always be the same.
By updating the YAML configuration, you will be able to extract metadata from different sources.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the DB2 connector.
Configure and schedule DB2 metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -307,7 +307,7 @@ metadata ingest -c <path-to-yaml>
Note that from connector to connector, this recipe will always be the same. By updating the YAML configuration,
you will be able to extract metadata from different sources.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the DB2 connector.
Configure and schedule DB2 metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
If you don't want to use the OpenMetadata Ingestion container to configure the workflows via the UI, then you can check
@ -216,7 +216,7 @@ caption="Edit and Deploy the Ingestion Pipeline"
From the Connection tab, you can also Edit the Service if needed.
## Data Profiler and Quality Tests
## Data Profiler
<Tile
icon="schema"

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the Druid connector.
Configure and schedule Druid metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -355,7 +355,7 @@ with DAG(
Note that from connector to connector, this recipe will always be the same.
By updating the YAML configuration, you will be able to extract metadata from different sources.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the Druid connector.
Configure and schedule Druid metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -308,7 +308,7 @@ metadata ingest -c <path-to-yaml>
Note that from connector to connector, this recipe will always be the same. By updating the YAML configuration,
you will be able to extract metadata from different sources.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the Druid connector.
Configure and schedule Druid metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
If you don't want to use the OpenMetadata Ingestion container to configure the workflows via the UI, then you can check
@ -215,7 +215,7 @@ caption="Edit and Deploy the Ingestion Pipeline"
From the Connection tab, you can also Edit the Service if needed.
## Data Profiler and Quality Tests
## Data Profiler
<Tile
icon="schema"

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the Hive connector.
Configure and schedule Hive metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -354,7 +354,7 @@ with DAG(
Note that from connector to connector, this recipe will always be the same.
By updating the YAML configuration, you will be able to extract metadata from different sources.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the Hive connector.
Configure and schedule Hive metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -307,7 +307,7 @@ metadata ingest -c <path-to-yaml>
Note that from connector to connector, this recipe will always be the same. By updating the YAML configuration,
you will be able to extract metadata from different sources.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the Hive connector.
Configure and schedule Hive metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
If you don't want to use the OpenMetadata Ingestion container to configure the workflows via the UI, then you can check
@ -217,7 +217,7 @@ caption="Edit and Deploy the Ingestion Pipeline"
From the Connection tab, you can also Edit the Service if needed.
## Data Profiler and Quality Tests
## Data Profiler
<Tile
icon="schema"

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the MariaDB connector.
Configure and schedule MariaDB metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -353,7 +353,7 @@ with DAG(
Note that from connector to connector, this recipe will always be the same.
By updating the YAML configuration, you will be able to extract metadata from different sources.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the MariaDB connector.
Configure and schedule MariaDB metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -306,7 +306,7 @@ metadata ingest -c <path-to-yaml>
Note that from connector to connector, this recipe will always be the same. By updating the YAML configuration,
you will be able to extract metadata from different sources.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the MariaDB connector.
Configure and schedule MariaDB metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
If you don't want to use the OpenMetadata Ingestion container to configure the workflows via the UI, then you can check
@ -216,7 +216,7 @@ caption="Edit and Deploy the Ingestion Pipeline"
From the Connection tab, you can also Edit the Service if needed.
## Data Profiler and Quality Tests
## Data Profiler
<Tile
icon="schema"

View File

@ -11,7 +11,7 @@ Configure and schedule MSSQL metadata and profiler workflows from the OpenMetada
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Query Usage and Lineage Ingestion](#query-usage-and-lineage-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -442,7 +442,7 @@ pip3 install --upgrade 'openmetadata-ingestion[mssql-usage]'
For the usage workflow creation, the Airflow file will look the same as for the metadata ingestion. Updating the YAML configuration will be enough.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -11,7 +11,7 @@ Configure and schedule MSSQL metadata and profiler workflows from the OpenMetada
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Query Usage and Lineage Ingestion](#query-usage-and-lineage-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -399,7 +399,7 @@ After saving the YAML config, we will run the command the same way we did for th
metadata ingest -c <path-to-yaml>
```
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -11,7 +11,7 @@ Configure and schedule MSSQL metadata and profiler workflows from the OpenMetada
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Query Usage and Lineage Ingestion](#query-usage-and-lineage-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
If you don't want to use the OpenMetadata Ingestion container to configure the workflows via the UI, then you can check
@ -229,7 +229,7 @@ text="Learn more about how to configure the Usage Workflow to ingest Query and L
link="/openmetadata/ingestion/workflows/usage"
/>
## Data Profiler and Quality Tests
## Data Profiler
<Tile
icon="schema"

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the MySQL connector.
Configure and schedule MySQL metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -353,7 +353,7 @@ with DAG(
Note that from connector to connector, this recipe will always be the same.
By updating the YAML configuration, you will be able to extract metadata from different sources.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the MySQL connector.
Configure and schedule MySQL metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -306,7 +306,7 @@ metadata ingest -c <path-to-yaml>
Note that from connector to connector, this recipe will always be the same. By updating the YAML configuration,
you will be able to extract metadata from different sources.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the MySQL connector.
Configure and schedule MySQL metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
If you don't want to use the OpenMetadata Ingestion container to configure the workflows via the UI, then you can check
@ -218,7 +218,7 @@ caption="Edit and Deploy the Ingestion Pipeline"
From the Connection tab, you can also Edit the Service if needed.
## Data Profiler and Quality Tests
## Data Profiler
<Tile
icon="schema"

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the Oracle connector.
Configure and schedule Oracle metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -357,7 +357,7 @@ with DAG(
Note that from connector to connector, this recipe will always be the same.
By updating the YAML configuration, you will be able to extract metadata from different sources.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the Oracle connector.
Configure and schedule Oracle metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -310,7 +310,7 @@ metadata ingest -c <path-to-yaml>
Note that from connector to connector, this recipe will always be the same. By updating the YAML configuration,
you will be able to extract metadata from different sources.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the Oracle connector.
Configure and schedule Oracle metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
If you don't want to use the OpenMetadata Ingestion container to configure the workflows via the UI, then you can check
@ -217,7 +217,7 @@ caption="Edit and Deploy the Ingestion Pipeline"
From the Connection tab, you can also Edit the Service if needed.
## Data Profiler and Quality Tests
## Data Profiler
<Tile
icon="schema"

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the Postgres connector.
Configure and schedule Postgres metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -353,7 +353,7 @@ with DAG(
Note that from connector to connector, this recipe will always be the same.
By updating the YAML configuration, you will be able to extract metadata from different sources.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the Postgres connector.
Configure and schedule Postgres metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -306,7 +306,7 @@ metadata ingest -c <path-to-yaml>
Note that from connector to connector, this recipe will always be the same. By updating the YAML configuration,
you will be able to extract metadata from different sources.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the PostgreSQL connecto
Configure and schedule PostgreSQL metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
If you don't want to use the OpenMetadata Ingestion container to configure the workflows via the UI, then you can check
@ -216,7 +216,7 @@ caption="Edit and Deploy the Ingestion Pipeline"
From the Connection tab, you can also Edit the Service if needed.
## Data Profiler and Quality Tests
## Data Profiler
<Tile
icon="schema"

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the Presto connector.
Configure and schedule Presto metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -358,7 +358,7 @@ with DAG(
Note that from connector to connector, this recipe will always be the same.
By updating the YAML configuration, you will be able to extract metadata from different sources.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the Presto connector.
Configure and schedule Presto metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -311,7 +311,7 @@ metadata ingest -c <path-to-yaml>
Note that from connector to connector, this recipe will always be the same. By updating the YAML configuration,
you will be able to extract metadata from different sources.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the Presto connector.
Configure and schedule Presto metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
If you don't want to use the OpenMetadata Ingestion container to configure the workflows via the UI, then you can check
@ -217,7 +217,7 @@ caption="Edit and Deploy the Ingestion Pipeline"
From the Connection tab, you can also Edit the Service if needed.
## Data Profiler and Quality Tests
## Data Profiler
<Tile
icon="schema"

View File

@ -11,7 +11,7 @@ Configure and schedule Redshift metadata and profiler workflows from the OpenMet
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Query Usage and Lineage Ingestion](#query-usage-and-lineage-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -451,7 +451,7 @@ pip3 install --upgrade 'openmetadata-ingestion[redshift-usage]'
For the usage workflow creation, the Airflow file will look the same as for the metadata ingestion. Updating the YAML configuration will be enough.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -11,7 +11,7 @@ Configure and schedule Redshift metadata and profiler workflows from the OpenMet
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Query Usage and Lineage Ingestion](#query-usage-and-lineage-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -408,7 +408,7 @@ After saving the YAML config, we will run the command the same way we did for th
metadata ingest -c <path-to-yaml>
```
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -11,7 +11,7 @@ Configure and schedule Redshift metadata and profiler workflows from the OpenMet
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Query Usage and Lineage Ingestion](#query-usage-and-lineage-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
If you don't want to use the OpenMetadata Ingestion container to configure the workflows via the UI, then you can check
@ -232,7 +232,7 @@ text="Learn more about how to configure the Usage Workflow to ingest Query and L
link="/openmetadata/ingestion/workflows/usage"
/>
## Data Profiler and Quality Tests
## Data Profiler
<Tile
icon="schema"

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the Salesforce connecto
Configure and schedule Salesforce metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -356,7 +356,7 @@ with DAG(
Note that from connector to connector, this recipe will always be the same.
By updating the YAML configuration, you will be able to extract metadata from different sources.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the Salesforce connecto
Configure and schedule Salesforce metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -309,7 +309,7 @@ metadata ingest -c <path-to-yaml>
Note that from connector to connector, this recipe will always be the same. By updating the YAML configuration,
you will be able to extract metadata from different sources.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the Salesforce connecto
Configure and schedule Salesforce metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
If you don't want to use the OpenMetadata Ingestion container to configure the workflows via the UI, then you can check
@ -218,7 +218,7 @@ caption="Edit and Deploy the Ingestion Pipeline"
From the Connection tab, you can also Edit the Service if needed.
## Data Profiler and Quality Tests
## Data Profiler
<Tile
icon="schema"

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the Singlestore connect
Configure and schedule Singlestore metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -353,7 +353,7 @@ with DAG(
Note that from connector to connector, this recipe will always be the same.
By updating the YAML configuration, you will be able to extract metadata from different sources.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the Singlestore connect
Configure and schedule Singlestore metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -306,7 +306,7 @@ metadata ingest -c <path-to-yaml>
Note that from connector to connector, this recipe will always be the same. By updating the YAML configuration,
you will be able to extract metadata from different sources.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the Singlestore connect
Configure and schedule Singlestore metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
If you don't want to use the OpenMetadata Ingestion container to configure the workflows via the UI, then you can check
@ -215,7 +215,7 @@ caption="Edit and Deploy the Ingestion Pipeline"
From the Connection tab, you can also Edit the Service if needed.
## Data Profiler and Quality Tests
## Data Profiler
<Tile
icon="schema"

View File

@ -11,7 +11,7 @@ Configure and schedule Snowflake metadata and profiler workflows from the OpenMe
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Query Usage and Lineage Ingestion](#query-usage-and-lineage-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -463,7 +463,7 @@ pip3 install --upgrade 'openmetadata-ingestion[snowflake-usage]'
For the usage workflow creation, the Airflow file will look the same as for the metadata ingestion. Updating the YAML configuration will be enough.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -11,7 +11,7 @@ Configure and schedule Snowflake metadata and profiler workflows from the OpenMe
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Query Usage and Lineage Ingestion](#query-usage-and-lineage-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -419,7 +419,7 @@ After saving the YAML config, we will run the command the same way we did for th
metadata ingest -c <path-to-yaml>
```
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -11,7 +11,7 @@ Configure and schedule Snowflake metadata and profiler workflows from the OpenMe
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Query Usage and Lineage Ingestion](#query-usage-and-lineage-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
If you don't want to use the OpenMetadata Ingestion container to configure the workflows via the UI, then you can check
@ -239,7 +239,7 @@ text="Learn more about how to configure the Usage Workflow to ingest Query and L
link="/openmetadata/ingestion/workflows/usage"
/>
## Data Profiler and Quality Tests
## Data Profiler
<Tile
icon="schema"

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the Trino connector.
Configure and schedule Trino metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -361,7 +361,7 @@ with DAG(
Note that from connector to connector, this recipe will always be the same.
By updating the YAML configuration, you will be able to extract metadata from different sources.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the Trino connector.
Configure and schedule Trino metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -314,7 +314,7 @@ metadata ingest -c <path-to-yaml>
Note that from connector to connector, this recipe will always be the same. By updating the YAML configuration,
you will be able to extract metadata from different sources.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the Trino connector.
Configure and schedule Trino metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
If you don't want to use the OpenMetadata Ingestion container to configure the workflows via the UI, then you can check
@ -217,7 +217,7 @@ caption="Edit and Deploy the Ingestion Pipeline"
From the Connection tab, you can also Edit the Service if needed.
## Data Profiler and Quality Tests
## Data Profiler
<Tile
icon="schema"

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the Vertica connector.
Configure and schedule Vertica metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -353,7 +353,7 @@ with DAG(
Note that from connector to connector, this recipe will always be the same.
By updating the YAML configuration, you will be able to extract metadata from different sources.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the Vertica connector.
Configure and schedule Vertica metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
## Requirements
@ -306,7 +306,7 @@ metadata ingest -c <path-to-yaml>
Note that from connector to connector, this recipe will always be the same. By updating the YAML configuration,
you will be able to extract metadata from different sources.
## Data Profiler and Quality Tests
## Data Profiler
The Data Profiler workflow will be using the `orm-profiler` processor.
While the `serviceConnection` will still be the same to reach the source system, the `sourceConfig` will be

View File

@ -10,7 +10,7 @@ In this section, we provide guides and references to use the Vertica connector.
Configure and schedule Vertica metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler and Quality Tests](#data-profiler-and-quality-tests)
- [Data Profiler](#data-profiler)
- [DBT Integration](#dbt-integration)
If you don't want to use the OpenMetadata Ingestion container to configure the workflows via the UI, then you can check
@ -216,7 +216,7 @@ caption="Edit and Deploy the Ingestion Pipeline"
From the Connection tab, you can also Edit the Service if needed.
## Data Profiler and Quality Tests
## Data Profiler
<Tile
icon="schema"

View File

@ -1,237 +0,0 @@
---
title: Data Quality
slug: /openmetadata/data-quality
---
# Data Quality
Learn how you can use OpenMetadata to define Data Quality tests and measure your data reliability.
## Requirements
### OpenMetadata (version 0.10 or later)
You must have a running deployment of OpenMetadata to use this guide. OpenMetadata includes the following services:
* OpenMetadata server supporting the metadata APIs and user interface
* Elasticsearch for metadata search and discovery
* MySQL as the backing store for all metadata
* Airflow for metadata ingestion workflows
To deploy OpenMetadata, check out the [deployment guide](/deployment).
### Python (version 3.8.0 or later)
Please use the following command to check the version of Python you have.
```
python3 --version
```
## Building Trust
OpenMetadata aims to be the place where all users share and collaborate around data. One of the main benefits of ingesting metadata into OpenMetadata is making assets discoverable.
However, we need to ask ourselves: what happens after a user stumbles upon our assets? From there, we can help other teams use the data by adding proper descriptions with up-to-date information, or even examples of how to extract information properly.
What is imperative, though, is to build **trust**. For example, users might find a Table that looks useful for their use case, but how can they be sure it correctly follows its SLAs? What issues has this source undergone in the past? Data Quality & tests play a significant role in making any asset trustworthy. Being able to show the Entity information together with its reliability will help our users think, "This is safe to use".
This section will show you how to configure and run Data Profiling and Quality pipelines with the supported tests.
## Data Profiling
### Workflows
The **Ingestion Framework** currently supports two types of workflows:
* **Ingestion:** Captures metadata from the sources and updates the Entities' instances. This is a lightweight process that can be scheduled to get fast feedback on metadata changes in our sources. This workflow handles both the metadata ingestion and the usage and lineage information from the sources, when available.
* **Profiling:** Extracts metrics from SQL sources and sets up and runs Data Quality tests. It requires previous executions of the Ingestion Pipeline. This is a more time-consuming workflow that will run metrics and compare their results against the configured tests of both Tables and Columns.
<Note>
Note that you can set `source.config.data_profiler_enabled` to `"true"` or `"false"` in the ingestion pipeline configuration to also run the profiler during the metadata ingestion. This, however, **does not support** Quality Tests.
</Note>
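As a minimal sketch of where this flag lives, here is the relevant fragment of the ingestion JSON configuration; the connector `type` and any other `config` keys shown are placeholders:
```json
"source": {
    "type": "mysql",
    "config": {
        "data_profiler_enabled": "true"
    }
}
```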
### Profiling Overview
#### Requirements
The source layer of the Profiling workflow is the OpenMetadata API. Based on the source configuration, this process lists the tables to be profiled.
#### Description
The steps of the **Profiling** pipeline are the following:
1. First, use the source configuration to create a connection.
2. Next, iterate over the selected tables and schemas that the Ingestion has previously recorded in OpenMetadata.
3. Run a default set of metrics against all the table's columns. (We will add more customization options in future releases.)
4. Finally, compare the metrics' results against the configured Data Quality tests.
<Note>
Note that all the results are published to the OpenMetadata API, both the Profiling results and the test executions. This allows users to track the evolution of the data and its reliability directly in the UI.
</Note>
You can take a look at the supported metrics and tests here:
<TileContainer>
<Tile
icon="manage_accounts"
title="Metrics"
text="Supported metrics"
link={"/openmetadata/data-quality/metrics"}
size="half"
/>
<Tile
icon="manage_accounts"
title="Tests"
text="Supported tests and how to configure them"
link={"/openmetadata/data-quality/tests"}
size="half"
/>
</TileContainer>
## How to Add Tests
Tests are part of the Table Entity. We can add new tests to a Table from the UI or directly use the JSON configuration of the workflows.
<Note>
Note that in order to add tests and run the Profiler workflow, the metadata should have already been ingested.
</Note>
### Add Tests in the UI
To create a new test, we can go to the _Table_ page under the _Data Quality_ tab:
<Image
src={"/images/openmetadata/data-quality/data-quality-tab.png"}
alt="Data Quality Tab in the Table Page"
caption="Data Quality Tab in the Table Page"
/>
Clicking on _Add Test_ will offer us two options: **Table Test** or **Column Test**. A Table Test runs on metrics from the whole table, such as the number of rows or columns, while Column Tests are specific to each column's values.
#### Add Table Tests
Adding a Table Test will show us the following view:
<Image
src={"/images/openmetadata/data-quality/table-test.png"}
alt="Add a Table Test"
caption="Add a Table Test"
/>
* **Test Type**: It allows us to specify the test we want to configure.
* **Description**: To explain why the test is necessary and what scenarios we want to validate.
* **Value**: Different tests will show different values here. For example, `tableColumnCountToEqual` requires us to specify the number of columns we expect. Other tests present forms where we need to add values such as `min` and `max`, while others require no value at all, such as tests validating that there are no nulls in a column (see the sketch after this list).
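As a rough sketch, the same `tableColumnCountToEqual` test could be expressed in the workflow's `processor` configuration described further below; the `columnCount` field name here is an assumption, so check the [supported tests](/openmetadata/data-quality/tests) for the exact definition:
```json
{
    "testCase": {
        "config": {
            "columnCount": 5
        },
        "tableTestType": "tableColumnCountToEqual"
    }
}
```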
#### Add Column Tests
Adding a Column Test will have a similar view:
<Image
src={"/images/openmetadata/data-quality/column-test.png"}
alt="Add Column Test"
caption="Add Column Test"
/>
The Column Test form will be similar to the Table Test one. The only difference is the **Column Name** field, where we need to select the column we will be targeting for the test.
<Note>
You can review the supported tests [here](/openmetadata/data-quality/tests). We will keep expanding the support for new tests in the upcoming releases.
</Note>
Once tests are added, we will be able to see them in the _Data Quality_ tab:
<Image
src={"/images/openmetadata/data-quality/created-tests.png"}
alt="Freshly created tests"
caption="Freshly created tests"
/>
Note how the tests are grouped into Table and Column tests. All tests from the same column are also grouped together. From this view, we can both edit and delete tests if needed.
In the global Table information at the top, we will also be able to see how many Table Tests have been configured.
### Add Tests with the JSON Config
In the [connectors](/openmetadata/connectors) documentation for each source, we showcase how to run the Profiler Workflow using the Airflow SDK or the `metadata` CLI. When configuring the JSON configuration for the workflow, we can add tests as well.
Any tests added to the JSON configuration will also be reflected in the Data Quality tab. This JSON configuration can be used for both the Airflow SDK and to run the workflow with the CLI.
You can find further information on how to prepare the JSON configuration for each of the sources. However, adding any number of tests is a matter of updating the `processor` configuration as follows:
```json
"processor": {
"type": "orm-profiler",
"config": {
"test_suite": {
"name": "<Test Suite name>",
"tests": [
{
"table": "<Table FQN>",
"table_tests": [
{
"testCase": {
"config": {
"value": 100
},
"tableTestType": "tableRowCountToEqual"
}
}
],
"column_tests": [
{
"columnName": "<Column Name>",
"testCase": {
"config": {
"minValue": 0,
"maxValue": 99
},
"columnTestType": "columnValuesToBeBetween"
}
}
]
}
]
}
}
},
```
`tests` is a list of test definitions that will be applied to the `table`, identified by its FQN. For each table, one can then define a list of `table_tests` and `column_tests`. Review the supported tests and their definitions to learn how to configure the different cases [here](/openmetadata/data-quality/tests).
## How to Run Tests
Both the Profiler and Tests are executed in the Profiler Workflow. All the results will be available through the UI in the _Profiler_ and _Data Quality_ tabs.
<Image
src={"/images/openmetadata/data-quality/test-results.png"}
alt="Tests results in the Data Quality tab"
caption="Tests results in the Data Quality tab"
/>
To learn how to prepare and run the Profiler Workflow for a given source, you can take a look at the documentation for that specific [connector](/openmetadata/connectors).
## Where are the Tests stored?
Once you create a Test definition for a Table or any of its Columns, that Test becomes part of the Table Entity. This means that it does not matter where you create tests from (JSON configuration or UI): once the test is registered in OpenMetadata, it will always be executed as part of the Profiler Workflow.
You can check what tests an Entity has configured in the **Data Quality** tab of the UI, or by using the API:
```python
from metadata.ingestion.ometa.ometa_api import OpenMetadata
from metadata.ingestion.ometa.openmetadata_rest import MetadataServerConfig
from metadata.generated.schema.entity.data.table import Table

# Point the client to the OpenMetadata server API
server_config = MetadataServerConfig(api_endpoint="http://localhost:8585/api")
metadata = OpenMetadata(server_config)

# Fetch the Table Entity by its FQDN, including its configured tests
table = metadata.get_by_name(entity=Table, fqdn="FQDN", fields=["tests"])
```
You can then check `table.tableTests` or, for each column, `column.columnTests` to get the test information.
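For example, a minimal sketch (reusing the `table` object fetched above) to print every configured test could look like this:
```python
# Minimal sketch: list the tests configured at the table and column level.
# Assumes `table` was fetched as shown above with fields=["tests"].
for table_test in table.tableTests or []:
    print(table_test)

for column in table.columns:
    for column_test in column.columnTests or []:
        print(column_test)
```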

View File

@ -0,0 +1,352 @@
---
title: Data Quality
slug: /openmetadata/ingestion/workflows/data-quality
---
# Data Quality
Learn how you can use OpenMetadata to define Data Quality tests and measure your data reliability.
## Requirements
### OpenMetadata (version 0.12 or later)
You must have a running deployment of OpenMetadata to use this guide. OpenMetadata includes the following services:
* OpenMetadata server supporting the metadata APIs and user interface
* Elasticsearch for metadata search and discovery
* MySQL as the backing store for all metadata
* Airflow for metadata ingestion workflows
To deploy OpenMetadata, check out the [deployment guide](/deployment).
### Python (version 3.8.0 or later)
Please use the following command to check the version of Python you have.
```
python3 --version
```
## Building Trust with Data Quality
OpenMetadata is where all users share and collaborate around data. It is where you make your assets discoverable; with data quality you make these assets **trustable**.
This section will show you how to configure and run Data Quality pipelines with the OpenMetadata built-in tests.
## Main Concepts
### Test Suite
Test Suites are containers allowing you to group related Test Cases together. Once configured, a Test Suite can easily be deployed to execute all the Test Cases it contains.
### Test Definition
Test Definitions are generic elements specific to a test, such as:
- test name
- column name
- data type
### Test Cases
A Test Case specifies a Test Definition and defines the condition a test must meet to be successful (e.g. `max=n`). One Test Definition can be linked to multiple Test Cases.
## Adding Tests Through the UI
**Note:** you will need to make sure you have the right permissions in OpenMetadata to create a test.
### Step 1: Creating a Test Suite
From your table, click on the `profiler` tab. From there you will be able to create table tests by clicking on the purple `Add Test` button at the top, or column tests by clicking on the white `Add Test` button.
<Image
src={"/images/openmetadata/ingestion/workflows/data-quality/profiler-tab-view.png"}
alt="Write your first test"
caption="Write your first test"
/>
On the next page you will be able to either select an existing Test Suite or create a new one. If you select an existing one, your Test Case will automatically be added to that Test Suite.
<Image
src={"/images/openmetadata/ingestion/workflows/data-quality/test-suite-page.png"}
alt="Create test suite"
caption="Create test suite"
/>
### Step 2: Create a Test Case
On the next page, you will create a Test Case. You will need to select a Test Definition from the drop-down menu and specify the parameters of your Test Case.
**Note:** the Test Case name needs to be unique across the whole platform. A warning message will be shown if your Test Case name is not unique.
<Image
src={"/images/openmetadata/ingestion/workflows/data-quality/test-case-page.png"}
alt="Create test case"
caption="Create test case"
/>
### Step 3: Add Ingestion Workflow
If you have created a new Test Suite, you will see a purple `Add Ingestion` button after clicking `Submit`. This will allow you to schedule the execution of your Test Suite. If you have selected an existing Test Suite, you are all set.
After clicking `Add Ingestion`, you will be able to select an execution schedule for your Test Suite (note that you can edit this later). Once you have selected the desired scheduling time, click `Submit` and you are all set.
<Image
src={"/images/openmetadata/ingestion/workflows/data-quality/ingestion-page.png"}
alt="Create ingestion workflow"
caption="Create ingestion workflow"
/>
## Adding Tests with the YAML Config
When creating a YAML config for a test workflow, the source configuration is very simple.
```yaml
source:
type: TestSuite
serviceName: <your_service_name>
sourceConfig:
config:
type: TestSuite
```
The only section you need to modify here is the `serviceName` key. Note that this name needs to be unique across the Test Suite names on the OpenMetadata platform.
Once you have defined your source configuration, you'll need to define the processor configuration.
```yaml
processor:
type: "orm-test-runner"
config:
testSuites:
- name: [test_suite_name]
description: [test suite description]
testCases:
- name: [test_case_name]
description: [test case description]
testDefinitionName: [test definition name*]
entityLink: ["<#E::table::fqn> or <#E::table::fqn::columns::column_name>"]
parameterValues:
- name: [column parameter name]
value: [value]
- ...
```
The processor type should be set to `orm-test-runner`. For accepted test definition names and parameter value names, refer to the [tests page](/openmetadata/ingestion/workflows/data-quality/tests).
`sink` and `workflowConfig` will have the same settings as the ingestion and profiler workflows.
### Full `yaml` config example
```yaml
source:
type: TestSuite
serviceName: MyAwesomeTestSuite
sourceConfig:
config:
type: TestSuite
processor:
type: "orm-test-runner"
config:
testSuites:
- name: test_suite_one
description: this is a test testSuite to confirm test suite workflow works as expected
testCases:
- name: a_column_test
description: A test case
testDefinitionName: columnValuesToBeBetween
entityLink: "<#E::table::local_redshift.dev.dbt_jaffle.customers::columns::number_of_orders>"
parameterValues:
- name: minValue
value: 2
- name: maxValue
value: 20
sink:
type: metadata-rest
config: {}
workflowConfig:
openMetadataServerConfig:
hostPort: http://localhost:8585/api
authProvider: no-auth
```
### How to Run Tests
To run the tests from the CLI, execute the following command:
```bash
metadata test -c /path/to/my/config.yaml
```
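If you orchestrate your pipelines with Airflow, a hedged sketch of scheduling this same CLI command with a `BashOperator` could look like the following (the DAG name, schedule, and config path are illustrative assumptions):
```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# Illustrative DAG that simply shells out to the documented CLI command.
with DAG(
    dag_id="test_suite_workflow",  # assumed name
    start_date=datetime(2022, 9, 1),
    schedule_interval="@daily",  # adjust to your needs
    catchup=False,
) as dag:
    run_test_suite = BashOperator(
        task_id="run_test_suite",
        bash_command="metadata test -c /path/to/my/config.yaml",
    )
```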
## How to Visualize Test Results
### From the Test Suite View
From the home page, click on the Test Suite menu in the left panel.
<Image
src={"/images/openmetadata/ingestion/workflows/data-quality/test-suite-home-page.png"}
alt="Test suite home page"
caption="Test suite home page"
/>
This will bring you to the Test Suite page where you can select a specific Test Suite.
<Image
src={"/images/openmetadata/ingestion/workflows/data-quality/test-suite-landing.png"}
alt="Test suite landing page"
caption="Test suite landing page"
/>
From there you can select a Test Suite and visualize the results associated with this specific Test Suite.
<Image
src={"/images/openmetadata/ingestion/workflows/data-quality/test-suite-results.png"}
alt="Test suite results page"
caption="Test suite results page"
/>
### From a Table Entity
Navigate to your table and click on the `profiler` tab. From there you'll be able to see test results at the table or column level.
#### Table Level Test Results
In the top panel, click on the white `Data Quality` button. This will bring you to a summary of all your quality tests at the table level.
<Image
src={"/images/openmetadata/ingestion/workflows/data-quality/table-results-entity.png"}
alt="Test suite results table"
caption="Test suite results table"
/>
#### Column Level Test Results
On the profiler page, click on a specific column name. This will bring you to a new page where you can click the white `Quality Test` button to see all the test results related to your column.
<Image
src={"/images/openmetadata/ingestion/workflows/data-quality/colum-level-test-results.png"}
alt="Test suite results table"
caption="Test suite results table"
/>
## Adding Custom Tests
While OpenMetadata provides out-of-the-box tests, you may want to write test results from your own custom quality test suite. This is very easy to do using the API.
### Creating a `TestDefinition`
First, you'll need to create a Test Definition for your test. You can use the `/api/v1/testDefinition` endpoint with a POST request to create your Test Definition. At a minimum, you will need to pass the following data in the body of your request.
```json
{
"description": "<you test definition description>",
"entityType": "<TABLE or COLUMN>",
"name": "<your_test_name>",
"testPlatforms": ["<any of OpenMetadata,GreatExpectations, DBT, Deequ, Soda, Other>"],
"parameterDefinition": [
{
"name": "<name>"
},
{
"name": "<name>"
}
]
}
```
Here is a complete cURL request:
```bash
curl --request POST 'http://localhost:8585/api/v1/testDefinition' \
--header 'Content-Type: application/json' \
--data-raw '{
"description": "A demo custom test",
"entityType": "TABLE",
"name": "demo_test_definition",
"testPlatforms": ["Soda", "DBT"],
"parameterDefinition": [{
"name": "ColumnOne"
}]
}'
```
Make sure to keep the `UUID` from the response as you will need it to create the Test Case.
### Creating a `TestSuite`
You'll also need to create a Test Suite for your Test Case -- note that you can also use an existing one if you want to. You can use the `/api/v1/testSuite` endpoint with a POST request to create your Test Suite. At a minimum, you will need to pass the following data in the body of your request.
```json
{
"name": "<test_suite_name>",
"description": "<test suite description>"
}
```
Here is a complete cURL request:
```bash
curl --request POST 'http://localhost:8585/api/v1/testSuite' \
--header 'Content-Type: application/json' \
--data-raw '{
"name": "<test_suite_name>",
"description": "<test suite description>"
}'
```
Make sure to keep the `UUID` from the response as you will need it to create the Test Case.
### Creating a `TestCase`
Once you have your Test Definition created, you can create a Test Case -- which is a specification of your Test Definition. You can use the `/api/v1/testCase` endpoint with a POST request to create your Test Case. At a minimum, you will need to pass the following data in the body of your request.
```json
{
"entityLink": "<#E::table::fqn> or <#E::table::fqn::columns::column name>",
"name": "<test_case_name>",
"testDefinition": {
"id": "<test definition UUID>",
"type": "testDefinition"
},
"testSuite": {
"id": "<test suite UUID>",
"type": "testSuite"
}
}
```
**Important:** for `entityLink`, make sure to include the opening and closing `<>`.
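If you are scripting these calls, a small hypothetical helper (ours, not part of the OpenMetadata SDK) can reduce `entityLink` formatting mistakes:
```python
from typing import Optional

def entity_link(table_fqn: str, column_name: Optional[str] = None) -> str:
    """Build an entityLink string for a table or one of its columns."""
    if column_name:
        return f"<#E::table::{table_fqn}::columns::{column_name}>"
    return f"<#E::table::{table_fqn}>"

print(entity_link("local_redshift.dev.dbt_jaffle.customers"))
# <#E::table::local_redshift.dev.dbt_jaffle.customers>
```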
Here is a complete cURL request:
```bash
curl --request POST 'http://localhost:8585/api/v1/testCase' \
--header 'Content-Type: application/json' \
--data-raw '{
"entityLink": "<#E::table::local_redshift.dev.dbt_jaffle.customers>",
"name": "custom_test_Case",
"testDefinition": {
"id": "1f3ce6f5-67be-45db-8314-2ee42d73239f",
"type": "testDefinition"
},
"testSuite": {
"id": "3192ed9b-5907-475d-a623-1b3a1ef4a2f6",
"type": "testSuite"
},
"parameterValues": [
{
"name": "colName",
"value": 10
}
]
}'
```
Make sure to keep the Test Case `fullyQualifiedName` from the response, as you will need it to write the Test Case Results.
### Writing `TestCaseResults`
Once you have your Test Case created, you can write your results to it. You can use the `/api/v1/testCase/{test FQN}/testCaseResult` endpoint with a PUT request to add Test Case Results. At a minimum, you will need to pass the following data in the body of your request.
```json
{
"result": "<result message>",
"testCaseStatus": "<Success or Failed or Aborted>",
"timestamp": <Unix timestamp>,
"testResultValue": [
{
"value": "<value>"
}
]
}
```
Here is a complete cURL request:
```bash
curl --location --request PUT 'http://localhost:8585/api/v1/testCase/local_redshift.dev.dbt_jaffle.customers.custom_test_Case/testCaseResult' \
--header 'Content-Type: application/json' \
--data-raw '{
"result": "found 1 values expected n",
"testCaseStatus": "Success",
"timestamp": 1662129151,
"testResultValue": [{
"value": "10"
}]
}'
```
You will now be able to see your test results in the Test Suite or on the table entity.
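If you prefer Python over cURL, here is a hedged end-to-end sketch of the same four calls using the `requests` library; the payloads mirror the examples above, and the names, values, and the `fullyQualifiedName` lookup are illustrative assumptions:
```python
import time

import requests

BASE = "http://localhost:8585/api/v1"

# 1. Create the Test Definition and keep its UUID
test_definition = requests.post(
    f"{BASE}/testDefinition",
    json={
        "description": "A demo custom test",
        "entityType": "TABLE",
        "name": "demo_test_definition",
        "testPlatforms": ["Soda", "DBT"],
        "parameterDefinition": [{"name": "ColumnOne"}],
    },
).json()

# 2. Create (or reuse) the Test Suite and keep its UUID
test_suite = requests.post(
    f"{BASE}/testSuite",
    json={"name": "demo_test_suite", "description": "A demo test suite"},
).json()

# 3. Create the Test Case, linking both UUIDs
test_case = requests.post(
    f"{BASE}/testCase",
    json={
        "entityLink": "<#E::table::local_redshift.dev.dbt_jaffle.customers>",
        "name": "custom_test_Case",
        "testDefinition": {"id": test_definition["id"], "type": "testDefinition"},
        "testSuite": {"id": test_suite["id"], "type": "testSuite"},
        "parameterValues": [{"name": "colName", "value": 10}],
    },
).json()

# 4. Write a result against the Test Case FQN
requests.put(
    f"{BASE}/testCase/{test_case['fullyQualifiedName']}/testCaseResult",
    json={
        "result": "found 1 values expected n",
        "testCaseStatus": "Success",
        "timestamp": int(time.time()),
        "testResultValue": [{"value": "10"}],
    },
)
```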

View File

@ -1,6 +1,6 @@
---
title: Tests
slug: /openmetadata/data-quality/tests
slug: /openmetadata/ingestion/workflows/data-quality/tests
---
# Tests
