Mirror of https://github.com/open-metadata/OpenMetadata.git (synced 2025-11-26 15:46:17 +00:00)
Docs: Updating broken links and Missing Docs (#17642)
* Doc: Adding SSL Docs for Messaging & Dashboard
* Docs: Updating Broken Links in Docs

Co-authored-by: Prajwal Pandit <prajwalpandit@Prajwals-MacBook-Air.local>
This commit is contained in: parent ee11760576 · commit 4ccfe886a4
@ -77,7 +77,7 @@ If instead we use a local file path that contains the metastore information (e.g
|
||||
To update the `Derby` information. More information about this in a great [SO thread](https://stackoverflow.com/questions/38377188/how-to-get-rid-of-derby-log-metastore-db-from-spark-shell).
|
||||
|
||||
- You can find all supported configurations [here](https://spark.apache.org/docs/latest/configuration.html)
|
||||
- If you need further information regarding the Hive metastore, you can find it [here](https://spark.apache.org/docs/3.0.0-preview/sql-data-sources-hive-tables.html),
|
||||
- If you need further information regarding the Hive metastore, you can find it [here](https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html),
|
||||
and in The Internals of Spark SQL [book](https://jaceklaskowski.gitbooks.io/mastering-spark-sql/content/spark-sql-hive-metastore.html).
|
||||
|
||||
**Metastore Database**
|
||||
|
||||
@ -100,7 +100,7 @@ To update the `Derby` information. More information about this in a great [SO th
|
||||
|
||||
- You can find all supported configurations [here](https://spark.apache.org/docs/latest/configuration.html)
|
||||
- If you need further information regarding the Hive metastore, you can find
|
||||
it [here](https://spark.apache.org/docs/3.0.0-preview/sql-data-sources-hive-tables.html), and in The Internals of
|
||||
it [here](https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html), and in The Internals of
|
||||
Spark SQL [book](https://jaceklaskowski.gitbooks.io/mastering-spark-sql/content/spark-sql-hive-metastore.html).
|
||||
|
||||
|
||||
|
||||
@ -151,7 +151,7 @@ Refer to the code [here](https://github.com/open-metadata/OpenMetadata/blob/main
|
||||
|
||||
The fields for `Dbt Cloud Account Id`, `Dbt Cloud Project Id` and `Dbt Cloud Job Id` should be numeric values.
|
||||
|
||||
To know how to get the values for `Dbt Cloud Account Id`, `Dbt Cloud Project Id` and `Dbt Cloud Job Id` fields check [here](/connectors/ingestion/workflows/dbt/ingest-dbt-yaml).
|
||||
To know how to get the values for `Dbt Cloud Account Id`, `Dbt Cloud Project Id` and `Dbt Cloud Job Id` fields check [here](/connectors/ingestion/workflows/dbt/run-dbt-workflow-externally).
|
||||
|
||||
{% /note %}
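
For orientation, this is roughly how those three numeric fields sit inside a dbt workflow's `dbtConfigSource`. Treat it as a sketch only: the surrounding key names are taken from the dbt configuration schema as we understand it and should be verified against the dbt documentation linked above, and every value is a placeholder.

```yaml
# Sketch only — verify key names against the dbt config schema; all values are placeholders.
dbtConfigSource:
  dbtConfigType: cloud
  dbtCloudAuthToken: <auth-token>
  dbtCloudAccountId: 123456   # numeric Dbt Cloud Account Id
  dbtCloudProjectId: 654321   # numeric Dbt Cloud Project Id
  dbtCloudJobId: 112233       # numeric Dbt Cloud Job Id
```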
|
||||
|
||||
|
||||
@ -51,7 +51,7 @@ GRANT SELECT ON <schema_name>.* to <username>;
|
||||
```
|
||||
|
||||
### Profiler & Data Quality
|
||||
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
|
||||
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
|
||||
|
||||
### Usage & Lineage
|
||||
For the usage and lineage workflow, the user will need `SELECT` privilege. You can find more information on the usage workflow [here](/connectors/ingestion/workflows/usage) and the lineage workflow [here](/connectors/ingestion/workflows/lineage).
|
||||
|
||||
@ -48,7 +48,7 @@ GRANT SELECT ON <schema_name>.* to <username>;
|
||||
```
|
||||
|
||||
### Profiler & Data Quality
|
||||
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
|
||||
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
|
||||
|
||||
### Usage & Lineage
|
||||
For the usage and lineage workflow, the user will need `SELECT` privilege. You can find more information on the usage workflow [here](/connectors/ingestion/workflows/usage) and the lineage workflow [here](/connectors/ingestion/workflows/lineage).
|
||||
|
||||
@ -23,8 +23,8 @@ Configure and schedule Databricks metadata and profiler workflows from the OpenM
|
||||
- [Unity Catalog](#unity-catalog)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Query Usage](/connectors/ingestion/workflows/usage)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [Lineage](/connectors/ingestion/lineage)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
|
||||
|
||||
@ -16,8 +16,8 @@ In this section, we provide guides and references to use the Datalake connector.
|
||||
Configure and schedule Datalake metadata and profiler workflows from the OpenMetadata UI:
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
|
||||
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/datalake/yaml"} /%}
|
||||
|
||||
|
||||
@ -33,8 +33,8 @@ Configure and schedule DB2 metadata and profiler workflows from the OpenMetadata
|
||||
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
|
||||
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/db2/yaml"} /%}
|
||||
@ -65,7 +65,7 @@ GRANT SELECT ON SYSCAT.VIEWS TO USER_NAME;
|
||||
|
||||
### Profiler & Data Quality
|
||||
|
||||
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
|
||||
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
|
||||
|
||||
## Metadata Ingestion
|
||||
{% partial
|
||||
|
||||
@ -48,7 +48,7 @@ GRANT SELECT ON SYSCAT.VIEWS TO USER_NAME;
|
||||
```
|
||||
|
||||
### Profiler & Data Quality
|
||||
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
|
||||
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
|
||||
|
||||
### Python Requirements
|
||||
|
||||
|
||||
@ -104,7 +104,7 @@ If instead we use a local file path that contains the metastore information (e.g
|
||||
To update the `Derby` information. More information about this in a great [SO thread](https://stackoverflow.com/questions/38377188/how-to-get-rid-of-derby-log-metastore-db-from-spark-shell).
|
||||
|
||||
- You can find all supported configurations [here](https://spark.apache.org/docs/latest/configuration.html)
|
||||
- If you need further information regarding the Hive metastore, you can find it [here](https://spark.apache.org/docs/3.0.0-preview/sql-data-sources-hive-tables.html),
|
||||
- If you need further information regarding the Hive metastore, you can find it [here](https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html),
|
||||
and in The Internals of Spark SQL [book](https://jaceklaskowski.gitbooks.io/mastering-spark-sql/content/spark-sql-hive-metastore.html).
|
||||
|
||||
**Metastore Database**
|
||||
|
||||
@ -109,7 +109,7 @@ To update the `Derby` information. More information about this in a great [SO th
|
||||
|
||||
- You can find all supported configurations [here](https://spark.apache.org/docs/latest/configuration.html)
|
||||
- If you need further information regarding the Hive metastore, you can find
|
||||
it [here](https://spark.apache.org/docs/3.0.0-preview/sql-data-sources-hive-tables.html), and in The Internals of
|
||||
it [here](https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html), and in The Internals of
|
||||
Spark SQL [book](https://jaceklaskowski.gitbooks.io/mastering-spark-sql/content/spark-sql-hive-metastore.html).
|
||||
|
||||
|
||||
|
||||
@ -17,7 +17,7 @@ Configure and schedule DomoDatabase metadata and profiler workflows from the Ope
|
||||
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
|
||||
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/domo-database/yaml"} /%}
|
||||
|
||||
@ -17,8 +17,8 @@ Configure and schedule Doris metadata and profiler workflows from the OpenMetada
|
||||
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
- [Enable Security](#securing-doris-connection-with-ssl-in-openmetadata)
|
||||
|
||||
|
||||
@ -16,8 +16,8 @@ In this section, we provide guides and references to use the Druid connector.
|
||||
Configure and schedule Druid metadata and profiler workflows from the OpenMetadata UI:
|
||||
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
|
||||
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/athena/yaml"} /%}
|
||||
|
||||
@ -18,8 +18,8 @@ Configure and schedule Greenplum metadata and profiler workflows from the OpenMe
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Query Usage](/connectors/ingestion/workflows/usage)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [Lineage](/connectors/ingestion/lineage)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
- [Enable Security](#securing-greenplum-connection-with-ssl-in-openmetadata)
|
||||
|
||||
@ -17,8 +17,8 @@ In this section, we provide guides and references to use the Hive connector.
|
||||
Configure and schedule Hive metadata and profiler workflows from the OpenMetadata UI:
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
- [Enable Security](#securing-hive-connection-with-ssl-in-openmetadata)
|
||||
|
||||
@ -31,7 +31,7 @@ Configure and schedule Hive metadata and profiler workflows from the OpenMetadat
|
||||
To extract metadata, the user used in the connection needs to be able to perform `SELECT`, `SHOW`, and `DESCRIBE` operations in the database/schema where the metadata needs to be extracted from.
|
||||
|
||||
### Profiler & Data Quality
|
||||
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
|
||||
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
|
||||
|
||||
## Metadata Ingestion
|
||||
|
||||
|
||||
@ -15,8 +15,8 @@ In this section, we provide guides and references to use the Impala connector.
|
||||
|
||||
Configure and schedule Impala metadata and profiler workflows from the OpenMetadata UI:
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
- [Enable Security](#securing-impala-connection-with-ssl-in-openmetadata)
|
||||
|
||||
|
||||
@ -17,8 +17,8 @@ Configure and schedule MariaDB metadata and profiler workflows from the OpenMeta
|
||||
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
|
||||
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/mariadb/yaml"} /%}
|
||||
@ -43,7 +43,7 @@ GRANT SELECT ON world.hello TO '<username>';
|
||||
```
|
||||
|
||||
### Profiler & Data Quality
|
||||
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
|
||||
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
|
||||
|
||||
## Metadata Ingestion
|
||||
|
||||
|
||||
@ -70,7 +70,7 @@ To fetch the metadata from MongoDB to OpenMetadata, the MongoDB user must have a
|
||||
To deploy OpenMetadata, check the Deployment guides.
|
||||
{%/inlineCallout%}
|
||||
|
||||
[Profiler deployment](/connectors/ingestion/workflows/profiler)
|
||||
[Profiler deployment](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
|
||||
### Limitations
|
||||
|
||||
|
||||
@ -292,7 +292,7 @@ workflowConfig:
|
||||
|
||||
{% /codePreview %}
|
||||
|
||||
- You can learn more about how to configure and run the Profiler Workflow to extract Profiler data and execute the Data Quality from [here](/connectors/ingestion/workflows/profiler)
|
||||
- You can learn more about how to configure and run the Profiler Workflow to extract Profiler data and execute the Data Quality from [here](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
|
||||
|
||||
|
||||
|
||||
@ -19,8 +19,8 @@ Configure and schedule MSSQL metadata and profiler workflows from the OpenMetada
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Query Usage](/connectors/ingestion/workflows/usage)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [Lineage](/connectors/ingestion/lineage)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
|
||||
|
||||
@ -17,8 +17,8 @@ Configure and schedule MySQL metadata and profiler workflows from the OpenMetada
|
||||
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
- [Enable Security](#securing-mysql-connection-with-ssl-in-openmetadata)
|
||||
|
||||
@ -45,7 +45,7 @@ GRANT SELECT ON world.hello TO '<username>';
|
||||
```
|
||||
|
||||
### Profiler & Data Quality
|
||||
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
|
||||
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
|
||||
|
||||
## Metadata Ingestion
|
||||
|
||||
|
||||
@ -17,8 +17,8 @@ Configure and schedule Oracle metadata and profiler workflows from the OpenMetad
|
||||
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [Lineage](/connectors/ingestion/lineage)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
|
||||
|
||||
@ -16,8 +16,8 @@ In this section, we provide guides and references to use the PinotDB connector.
|
||||
Configure and schedule PinotDB metadata and profiler workflows from the OpenMetadata UI:
|
||||
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
|
||||
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/pinotdb/yaml"} /%}
|
||||
|
||||
@ -18,8 +18,8 @@ Configure and schedule PostgreSQL metadata and profiler workflows from the OpenM
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Query Usage](/connectors/ingestion/workflows/usage)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [Lineage](/connectors/ingestion/lineage)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
- [Enable Security](#securing-postgres-connection-with-ssl-in-openmetadata)
|
||||
|
||||
@ -17,8 +17,8 @@ Configure and schedule Presto metadata and profiler workflows from the OpenMetad
|
||||
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
|
||||
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/presto/yaml"} /%}
|
||||
@ -30,7 +30,7 @@ Configure and schedule Presto metadata and profiler workflows from the OpenMetad
|
||||
To extract metadata, the user needs to be able to perform `SHOW CATALOGS`, `SHOW TABLES`, and `SHOW COLUMNS FROM` on the catalogs/tables you wish to extract metadata from and have `SELECT` permission on the `INFORMATION_SCHEMA`. Access to resources will be different based on the connector used. You can find more details in the Presto documentation website [here](https://prestodb.io/docs/current/connector.html). You can also get more information regarding system access control in Presto [here](https://prestodb.io/docs/current/security/built-in-system-access-control.html).
|
||||
|
||||
### Profiler & Data Quality
|
||||
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
|
||||
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
|
||||
|
||||
## Metadata Ingestion
|
||||
|
||||
|
||||
@ -19,8 +19,8 @@ Configure and schedule Redshift metadata and profiler workflows from the OpenMet
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Incremental Extraction](/connectors/ingestion/workflows/metadata/incremental-extraction/redshift)
|
||||
- [Query Usage](/connectors/ingestion/workflows/usage)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [Lineage](/connectors/ingestion/lineage)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
- [Enable Security](#securing-redshift-connection-with-ssl-in-openmetadata)
|
||||
@ -42,7 +42,7 @@ GRANT SELECT ON TABLE svv_table_info to test_user;
|
||||
```
|
||||
|
||||
### Profiler & Data Quality
|
||||
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
|
||||
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
|
||||
|
||||
### Usage & Lineage
|
||||
For the usage and lineage workflow, the user will need `SELECT` privilege on `STL_QUERY` table. You can find more information on the usage workflow [here](/connectors/ingestion/workflows/usage) and the lineage workflow [here](/connectors/ingestion/workflows/lineage).
|
||||
|
||||
@ -18,8 +18,8 @@ Configure and schedule SAP Hana metadata and profiler workflows from the OpenMet
|
||||
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
|
||||
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/sap-hana/yaml"} /%}
|
||||
@ -57,7 +57,7 @@ The same applies to the `_SYS_REPO` schema, required for lineage extraction.
|
||||
|
||||
### Profiler & Data Quality
|
||||
|
||||
Executing the profiler Workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. The user should also be allowed to view information in `tables` for all objects in the database. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
|
||||
Executing the profiler Workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. The user should also be allowed to view information in `tables` for all objects in the database. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
|
||||
|
||||
## Metadata Ingestion
|
||||
|
||||
|
||||
@ -17,8 +17,8 @@ Configure and schedule Singlestore metadata and profiler workflows from the Open
|
||||
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
|
||||
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/singlestore/yaml"} /%}
|
||||
@ -44,7 +44,7 @@ GRANT SELECT ON world.hello TO '<username>';
|
||||
```
|
||||
|
||||
### Profiler & Data Quality
|
||||
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
|
||||
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
|
||||
|
||||
## Metadata Ingestion
|
||||
|
||||
|
||||
@ -20,8 +20,8 @@ Configure and schedule Snowflake metadata and profiler workflows from the OpenMe
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Incremental Extraction](/connectors/ingestion/workflows/metadata/incremental-extraction/snowflake)
|
||||
- [Query Usage](/connectors/ingestion/workflows/usage)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [Lineage](/connectors/ingestion/lineage)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
|
||||
|
||||
@ -17,8 +17,8 @@ Configure and schedule Presto metadata and profiler workflows from the OpenMetad
|
||||
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
|
||||
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/sqlite/yaml"} /%}
|
||||
|
||||
@ -0,0 +1,75 @@
---
title: Teradata
slug: /connectors/database/teradata
---

{% connectorDetailsHeader
name="Teradata"
stage="BETA"
platform="OpenMetadata"
availableFeatures=["Metadata", "Data Profiler"]
unavailableFeatures=["Query Usage", "Data Quality", "Owners", "Tags", "Stored Procedures", "Lineage", "Column-level Lineage", "dbt"]
/ %}

In this section, we provide guides and references to use the Teradata connector.

Configure and schedule Teradata metadata and profiler workflows from the OpenMetadata UI:

- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality/configure)
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/greenplum/yaml"} /%}
|
||||
|
||||
## Requirements

{%inlineCallout icon="description" bold="OpenMetadata 1.5 or later" href="/deployment"%}
To deploy OpenMetadata, check the Deployment guides.
{%/inlineCallout%}

The connector was tested on Teradata DBS version 17.20. Since there are no significant changes in the metadata objects between versions, it should also work with 15.x and 16.x.

## Metadata Ingestion

By default, all valid users in Teradata have full access to metadata objects, so there are no specific requirements on user privileges.

{% partial
  file="/v1.5/connectors/metadata-ingestion-ui.md"
  variables={
    connector: "Teradata",
    selectServicePath: "/images/v1.5/connectors/teradata/select-service.png",
    addNewServicePath: "/images/v1.5/connectors/teradata/add-new-service.png",
    serviceConnectionPath: "/images/v1.5/connectors/teradata/service-connection.png",
  }
/%}

{% stepsContainer %}

{% extraContent parentTagName="stepsContainer" %}

#### Connection Details

- **Username**: Specify the user to connect to Teradata.
- **Password**: Password to connect to Teradata.
- **Logmech**: Specifies the logon authentication method. Possible values are TD2 (the default), JWT, LDAP, KRB5 for Kerberos, or TDNEGO.
- **LOGDATA**: Specifies additional data needed by a logon mechanism, such as a secure token, Distinguished Name, or a domain/realm name. LOGDATA values are specific to each logon mechanism.
- **Host and Port**: Enter the fully qualified hostname and port number (the default port for Teradata is 1025) for your Teradata deployment in the Host and Port field.
- **Transaction Mode**: Specifies the transaction mode for the connection. Possible values are DEFAULT (the default), ANSI, or TERA.
- **Teradata Database Account**: Specifies an account string to override the default account string defined for the database user. Accounts are used by the database for workload management and resource usage monitoring.
- **Connection Options** and **Connection Arguments**: additional connection parameters. For more information, please refer to the teradatasql [docs](https://pypi.org/project/teradatasql/).
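
As a rough illustration of how the core fields map onto a service connection, here is a minimal sketch in the same shape as the YAML example on the [Run the Teradata Connector Externally](/connectors/database/teradata/yaml) page. All values are placeholders, and any keys beyond `username`, `password`, and `hostPort` should be checked against the Teradata connection schema.

```yaml
# Minimal sketch with placeholder values — not a working deployment.
source:
  type: teradata
  serviceName: example_teradata
  serviceConnection:
    config:
      type: Teradata
      username: td_user              # the connecting user described above
      password: <password>
      hostPort: teradata-host:1025   # Teradata's default port is 1025
```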
{% partial file="/v1.5/connectors/database/advanced-configuration.md" /%}

{% /extraContent %}

{% partial file="/v1.5/connectors/test-connection.md" /%}

{% partial file="/v1.5/connectors/database/configure-ingestion.md" /%}

{% partial file="/v1.5/connectors/ingestion-schedule-and-deploy.md" /%}

{% /stepsContainer %}

{% partial file="/v1.5/connectors/troubleshooting.md" /%}

{% partial file="/v1.5/connectors/database/related.md" /%}

@ -0,0 +1,117 @@
---
title: Run the Teradata Connector Externally
slug: /connectors/database/teradata/yaml
---

{% connectorDetailsHeader
name="Teradata"
stage="BETA"
platform="OpenMetadata"
availableFeatures=["Metadata", "Data Profiler", "Data Quality"]
unavailableFeatures=["Query Usage", "Owners", "Tags", "Stored Procedures", "Lineage", "Column-level Lineage", "dbt"]
/ %}

In this section, we provide guides and references to use the Teradata connector.

Configure and schedule Teradata metadata and profiler workflows externally:

- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](#data-profiler)
- [Data Quality](#data-quality)

{% partial file="/v1.5/connectors/external-ingestion-deployment.md" /%}

## Requirements

### Python Requirements

{% partial file="/v1.5/connectors/python-requirements.md" /%}

To run the Teradata ingestion, you will need to install:

```bash
pip3 install "openmetadata-ingestion[teradata]"
```
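
If you want to sanity-check the installation, an import check like the one below should succeed; this is an optional sketch, assuming the `teradata` extra pulls in the `teradatasql` driver.

```bash
# Optional check — assumes the teradata extra installs the teradatasql driver.
python3 -c "import teradatasql; print(teradatasql.__name__)"
```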

## Metadata Ingestion

All connectors are defined as JSON Schemas.
[Here](https://github.com/open-metadata/OpenMetadata/blob/main/openmetadata-spec/src/main/resources/json/schema/entity/services/connections/database/teradataConnection.json)
you can find the structure to create a connection to Teradata.

In order to create and run a Metadata Ingestion workflow, we will follow
the steps to create a YAML configuration able to connect to the source,
process the Entities if needed, and reach the OpenMetadata server.

The workflow is modeled around the following
[JSON Schema](https://github.com/open-metadata/OpenMetadata/blob/main/openmetadata-spec/src/main/resources/json/schema/metadataIngestion/workflow.json)

### 1. Define the YAML Config

This is a sample config for Teradata:

{% codePreview %}

{% codeInfoContainer %}

#### Source Configuration - Service Connection

{% codeInfo srNumber=1 %}

**username**: Specify the user to connect to Teradata.

{% /codeInfo %}

{% codeInfo srNumber=2 %}

**password**: User password to connect to Teradata.

{% /codeInfo %}

{% codeInfo srNumber=3 %}

**hostPort**: Enter the fully qualified hostname and port number for your Teradata deployment in the Host and Port field.

{% /codeInfo %}

{% /codeInfoContainer %}

{% codeBlock fileName="filename.yaml" %}

```yaml {% isCodeBlock=true %}
source:
  type: teradata
  serviceName: example_teradata
  serviceConnection:
    config:
      type: Teradata
```
```yaml {% srNumber=1 %}
      username: username
```
```yaml {% srNumber=2 %}
      password: <password>
```
```yaml {% srNumber=3 %}
      hostPort: teradata:1025
```

{% partial file="/v1.5/connectors/yaml/database/source-config.md" /%}

{% partial file="/v1.5/connectors/yaml/ingestion-sink.md" /%}

{% partial file="/v1.5/connectors/yaml/workflow-config.md" /%}

{% /codeBlock %}

{% /codePreview %}

{% partial file="/v1.5/connectors/yaml/ingestion-cli.md" /%}

{% partial file="/v1.5/connectors/yaml/data-profiler.md" variables={connector: "teradata"} /%}

{% partial file="/v1.5/connectors/yaml/data-quality.md" /%}

@ -17,8 +17,8 @@ Configure and schedule Trino metadata and profiler workflows from the OpenMetada
|
||||
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
|
||||
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/trino/yaml"} /%}
|
||||
@ -33,7 +33,7 @@ Access to resources will be based on the user access permission to access specif
|
||||
|
||||
### Profiler & Data Quality
|
||||
|
||||
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
|
||||
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
|
||||
|
||||
## Metadata Ingestion
|
||||
{% partial
|
||||
|
||||
@ -18,7 +18,7 @@ Configure and schedule Unity Catalog metadata workflow from the OpenMetadata UI:
|
||||
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Query Usage](/connectors/ingestion/workflows/usage)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [Lineage](/connectors/ingestion/lineage)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
|
||||
|
||||
@ -18,8 +18,8 @@ Configure and schedule Vertica metadata and profiler workflows from the OpenMeta
|
||||
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
|
||||
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/vertica/yaml"} /%}
|
||||
|
||||
@ -117,6 +117,6 @@ alt="Run Great Expectations checkpoint"
|
||||
/%}
|
||||
|
||||
### List of Great Expectations Supported Test
|
||||
We currently only support a certain number of Great Expectations tests. The full list can be found in the [Tests](/connectors/ingestion/workflows/data-quality/tests) section.
|
||||
We currently only support a certain number of Great Expectations tests. The full list can be found in the [Tests](/how-to-guides/data-quality-observability/quality/tests) section.
|
||||
|
||||
If a test is not supported, there is no need to worry about the execution of your Great Expectations test. We will simply skip the tests that are not supported and continue the execution of your test suite.
|
||||
@ -30,14 +30,14 @@ Learn more about how to ingest metadata from dozens of connectors.
|
||||
{%inlineCallout
|
||||
bold="Metadata Profiler"
|
||||
icon="cable"
|
||||
href="/connectors/ingestion/workflows/profiler"%}
|
||||
href="/how-to-guides/data-quality-observability/profiler/workflow"%}
|
||||
To get metrics from your Tables.
|
||||
{%/inlineCallout%}
|
||||
|
||||
{%inlineCallout
|
||||
bold="Metadata Data Quality Tests"
|
||||
icon="cable"
|
||||
href="/connectors/ingestion/workflows/data-quality"%}
|
||||
href="/how-to-guides/data-quality-observability/quality"%}
|
||||
To run automated Quality Tests on your Tables.
|
||||
{%/inlineCallout%}
|
||||
|
||||
|
||||
@ -151,7 +151,7 @@ Refer to the code [here](https://github.com/open-metadata/OpenMetadata/blob/main
|
||||
|
||||
The fields for `Dbt Cloud Account Id`, `Dbt Cloud Project Id` and `Dbt Cloud Job Id` should be numeric values.
|
||||
|
||||
To know how to get the values for `Dbt Cloud Account Id`, `Dbt Cloud Project Id` and `Dbt Cloud Job Id` fields check [here](/connectors/ingestion/workflows/dbt/ingest-dbt-yaml).
|
||||
To know how to get the values for `Dbt Cloud Account Id`, `Dbt Cloud Project Id` and `Dbt Cloud Job Id` fields check [here](/connectors/ingestion/workflows/dbt/run-dbt-workflow-externally).
|
||||
|
||||
{% /note %}
|
||||
|
||||
|
||||
@ -40,14 +40,14 @@ Configure dbt metadata
|
||||
{%inlineCallout
|
||||
icon="fit_screen"
|
||||
bold="Data Profiler"
|
||||
href="/connectors/ingestion/workflows/profiler"%}
|
||||
href="/how-to-guides/data-quality-observability/profiler/workflow"%}
|
||||
Compute metrics and ingest sample data.
|
||||
{%/inlineCallout%}
|
||||
|
||||
{%inlineCallout
|
||||
icon="fit_screen"
|
||||
bold="Data Quality"
|
||||
href="/connectors/ingestion/workflows/data-quality"%}
|
||||
href="/how-to-guides/data-quality-observability/quality"%}
|
||||
Monitor your data and avoid surprises.
|
||||
{%/inlineCallout%}
|
||||
|
||||
|
||||
@ -304,6 +304,6 @@ processor:
|
||||
- Bumped up ElasticSearch version for Docker and Kubernetes OpenMetadata Dependencies Helm Chart to `7.16.3`
|
||||
|
||||
### Data Quality Migration
|
||||
With 1.1.0 version we are migrating existing test cases defined in a test suite to the corresponding table, with this change you might need to recreate the pipelines for the test suites, since due to this restructuring the existing ones are removed from Test Suites - more details about the new data quality can be found [here](/connectors/ingestion/workflows/data-quality).
|
||||
With 1.1.0 version we are migrating existing test cases defined in a test suite to the corresponding table, with this change you might need to recreate the pipelines for the test suites, since due to this restructuring the existing ones are removed from Test Suites - more details about the new data quality can be found [here](/how-to-guides/data-quality-observability/quality).
|
||||
|
||||
As a user you will need to redeploy data quality workflows. You can go to `Quality > By Tables` to view the tables with test cases that need a workflow to be set up.
|
||||
|
||||
@ -93,7 +93,7 @@ Then, you can prepare `Run Configurations` to execute the ingestion as you would
|
||||
{% image src="/images/v1.5/developers/contribute/build-code-and-run-tests/pycharm-run-config.png" alt="PyCharm run config" caption=" " /%}
|
||||
|
||||
Note that in the example we are preparing a configuration to run and test Superset. In order to understand how to run
|
||||
ingestions via the CLI, you can refer to each connector's [docs](/connectors/dashboard/superset/cli).
|
||||
ingestions via the CLI, you can refer to each connector's [docs](/connectors/dashboard/superset/yaml).
|
||||
|
||||
The important part is that we are not running a script, but rather a `module`: `metadata`. Based on this, we can work as
|
||||
we would usually do with the CLI for any ingestion, profiler, or test workflow.
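
For reference, the terminal equivalent of such a run configuration is simply the `metadata` CLI pointed at a workflow YAML; the path below is a placeholder for whichever config you are testing.

```bash
# Rough CLI equivalent of the PyCharm run configuration — the YAML path is hypothetical.
metadata ingest -c /path/to/superset.yaml
```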
|
||||
|
||||
@ -147,7 +147,7 @@ OpenMetadata supports MySQL version `8.0.0` and up.
|
||||
$$
|
||||
|
||||
### Profiler & Data Quality
|
||||
Executing the profiler Workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. The user should also be allowed to view information in `tables` for all objects in the database. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
|
||||
Executing the profiler Workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. The user should also be allowed to view information in `tables` for all objects in the database. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
|
||||
|
||||
You can find further information on the MySQL connector in the [docs](/connectors/database/mysql).
|
||||
|
||||
|
||||
@ -153,7 +153,7 @@ By connecting to a database service, you can ingest the databases, schemas, tabl
|
||||
/%}
|
||||
|
||||
{% note %}
|
||||
**Note:** Once you’ve run a metadata ingestion pipeline, you can create separate pipelines to bring in [**Usage**](/connectors/ingestion/workflows/usage), [**Lineage**](/connectors/ingestion/workflows/lineage), [**dbt**](/connectors/ingestion/workflows/dbt), or to run [**Profiler**](/connectors/ingestion/workflows/profiler). To add ingestion pipelines, select the required type of ingestion and enter the required details.
|
||||
**Note:** Once you’ve run a metadata ingestion pipeline, you can create separate pipelines to bring in [**Usage**](/connectors/ingestion/workflows/usage), [**Lineage**](/connectors/ingestion/workflows/lineage), [**dbt**](/connectors/ingestion/workflows/dbt), or to run [**Profiler**](/how-to-guides/data-quality-observability/profiler/workflow). To add ingestion pipelines, select the required type of ingestion and enter the required details.
|
||||
{% /note %}
|
||||
|
||||
{% image
|
||||
|
||||
@ -31,7 +31,7 @@ alt="Column Data provides information"
|
||||
caption="Column Data provides information"
|
||||
/%}
|
||||
|
||||
You can read more about [Auto PII Tagging](/connectors/ingestion/auto_tagging) here.
|
||||
You can read more about [Auto PII Tagging](/how-to-guides/data-quality-observability/profiler/auto-pii-tagging) here.
|
||||
|
||||
## Tag Mapping
|
||||
|
||||
|
||||
@ -0,0 +1,117 @@
---
title: Run Data Insights using Airflow SDK
slug: /how-to-guides/data-insights/airflow-sdk
---

# Run Data Insights using Airflow SDK

## 1. Define the YAML Config

This is a sample config for Data Insights:

```yaml
source:
  type: dataInsight
  serviceName: OpenMetadata
  sourceConfig:
    config:
      type: MetadataToElasticSearch
processor:
  type: data-insight-processor
  config: {}
sink:
  type: elasticsearch
  config:
    es_host: localhost
    es_port: 9200
    recreate_indexes: false
workflowConfig:
  loggerLevel: DEBUG
  openMetadataServerConfig:
    hostPort: '<OpenMetadata host and port>'
    authProvider: openmetadata
    securityConfig:
      jwtToken: '{bot_jwt_token}'
```

### Source Configuration - Source Config

- To send the metadata to OpenMetadata, it needs to be specified as `type: MetadataToElasticSearch`.

### Processor Configuration

- To send the metadata to OpenMetadata, it needs to be specified as `type: data-insight-processor`.

### Workflow Configuration

The main property here is the `openMetadataServerConfig`, where you can define the host and security provider of your OpenMetadata installation.

For a simple, local installation using our docker containers, this looks like:

```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: openmetadata
    securityConfig:
      jwtToken: '{bot_jwt_token}'
```

We support different security providers. You can find their definitions [here](https://github.com/open-metadata/OpenMetadata/tree/main/openmetadata-spec/src/main/resources/json/schema/security/client).
You can find the different implementations of the ingestion below.

## 2. Prepare the Data Insights DAG

Create a Python file in your Airflow DAGs directory with the following contents:

```python
import pathlib
import yaml
from datetime import timedelta
from airflow import DAG
from metadata.workflow.data_insight import DataInsightWorkflow
from metadata.workflow.workflow_output_handler import print_status

try:
    from airflow.operators.python import PythonOperator
except ModuleNotFoundError:
    from airflow.operators.python_operator import PythonOperator

from metadata.config.common import load_config_file
from airflow.utils.dates import days_ago

default_args = {
    "owner": "user_name",
    "email": ["username@org.com"],
    "email_on_failure": False,
    "retries": 3,
    "retry_delay": timedelta(minutes=5),
    "execution_timeout": timedelta(minutes=60)
}

config = """
<your YAML configuration>
"""


def metadata_ingestion_workflow():
    workflow_config = yaml.safe_load(config)
    workflow = DataInsightWorkflow.create(workflow_config)
    workflow.execute()
    workflow.raise_from_status()
    print_status(workflow)
    workflow.stop()


with DAG(
    "sample_data",
    default_args=default_args,
    description="An example DAG which runs an OpenMetadata ingestion workflow",
    start_date=days_ago(1),
    is_paused_upon_creation=False,
    schedule_interval='*/5 * * * *',
    catchup=False,
) as dag:
    ingest_task = PythonOperator(
        task_id="ingest_using_recipe",
        python_callable=metadata_ingestion_workflow,
    )
```
@ -28,7 +28,7 @@ To have cost analysis data available you will need to execute the below workflow
|
||||
2. **Profiler Workflow**:
|
||||
- Purpose: Gather size information (in bytes) for data assets.
|
||||
- Description: The Profiler Workflow is responsible for obtaining the size of data assets in bytes. This information is vital for generating the size-related data used in the Cost Analysis charts. It helps in assessing the resource consumption and cost implications of each asset.
|
||||
- Click [here](/connectors/ingestion/workflows/profiler) for documentation on the profiler workflow.
|
||||
- Click [here](/how-to-guides/data-quality-observability/profiler/workflow) for documentation on the profiler workflow.
|
||||
|
||||
3. **Data Insights Workflow**:
|
||||
- Purpose: Aggregate information from Usage Workflow and Profiler Workflow.
|
||||
|
||||
@ -0,0 +1,90 @@
---
title: Run Elasticsearch Reindex using Airflow SDK
slug: /how-to-guides/data-insights/elasticsearch-reindex
---

# Run Elasticsearch Reindex using Airflow SDK

## 1. Define the YAML Config

This is a sample config for Elasticsearch Reindex:

```yaml
source:
  type: metadata_elasticsearch
  serviceName: openMetadata
  serviceConnection:
    config:
      type: MetadataES
  sourceConfig:
    config: {}
sink:
  type: elasticsearch
  config:
    es_host: localhost
    es_port: 9200
    recreate_indexes: true
workflowConfig:
  openMetadataServerConfig:
    hostPort: http://localhost:8585/api
    authProvider: openmetadata
    securityConfig:
      jwtToken: "eyJraWQiOiJHYjM4OWEtOWY3Ni1nZGpzLWE5MmotMDI0MmJrOTQzNTYiLCJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJzdWIiOiJhZG1pbiIsImlzQm90IjpmYWxzZSwiaXNzIjoib3Blbi1tZXRhZGF0YS5vcmciLCJpYXQiOjE2NjM5Mzg0NjIsImVtYWlsIjoiYWRtaW5Ab3Blbm1ldGFkYXRhLm9yZyJ9.tS8um_5DKu7HgzGBzS1VTA5uUjKWOCU0B_j08WXBiEC0mr0zNREkqVfwFDD-d24HlNEbrqioLsBuFRiwIWKc1m_ZlVQbG7P36RUxhuv2vbSp80FKyNM-Tj93FDzq91jsyNmsQhyNv_fNr3TXfzzSPjHt8Go0FMMP66weoKMgW2PbXlhVKwEuXUHyakLLzewm9UMeQaEiRzhiTMU3UkLXcKbYEJJvfNFcLwSl9W8JCO_l0Yj3ud-qt_nQYEZwqW6u5nfdQllN133iikV4fM5QZsMCnm8Rq1mvLR0y9bmJiD7fwM1tmJ791TUWqmKaTnP49U493VanKpUAfzIiOiIbhg"
```

### 2. Prepare the Ingestion DAG

Create a Python file in your Airflow DAGs directory with the following contents:

```python
|
||||
import pathlib
|
||||
import yaml
|
||||
from datetime import timedelta
|
||||
from airflow import DAG
|
||||
|
||||
try:
|
||||
from airflow.operators.python import PythonOperator
|
||||
except ModuleNotFoundError:
|
||||
from airflow.operators.python_operator import PythonOperator
|
||||
|
||||
from metadata.config.common import load_config_file
|
||||
from metadata.workflow.metadata import MetadataWorkflow
|
||||
from metadata.workflow.workflow_output_handler import print_status
|
||||
from airflow.utils.dates import days_ago
|
||||
|
||||
default_args = {
|
||||
"owner": "user_name",
|
||||
"email": ["username@org.com"],
|
||||
"email_on_failure": False,
|
||||
"retries": 3,
|
||||
"retry_delay": timedelta(minutes=5),
|
||||
"execution_timeout": timedelta(minutes=60)
|
||||
}
|
||||
|
||||
config = """
|
||||
<your YAML configuration>
|
||||
"""
|
||||
|
||||
def metadata_ingestion_workflow():
|
||||
workflow_config = yaml.safe_load(config)
|
||||
workflow = MetadataWorkflow.create(workflow_config)
|
||||
workflow.execute()
|
||||
workflow.raise_from_status()
|
||||
print_status(workflow)
|
||||
workflow.stop()
|
||||
|
||||
with DAG(
|
||||
"sample_data",
|
||||
default_args=default_args,
|
||||
description="An example DAG which runs a OpenMetadata ingestion workflow",
|
||||
start_date=days_ago(1),
|
||||
is_paused_upon_creation=False,
|
||||
schedule_interval='*/5 * * * *',
|
||||
catchup=False,
|
||||
) as dag:
|
||||
ingest_task = PythonOperator(
|
||||
task_id="ingest_using_recipe",
|
||||
python_callable=metadata_ingestion_workflow,
|
||||
)
|
||||
```
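If you want to validate the configuration before handing the DAG over to the Airflow scheduler, one option is to call the task function directly. A minimal sketch, appended to the bottom of the same DAG file; this is only a local convenience and not part of the documented workflow:

```python
# Hypothetical local run: executes the reindex workflow once, outside Airflow,
# so configuration errors in the YAML surface immediately.
if __name__ == "__main__":
    metadata_ingestion_workflow()
```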
|
||||
@ -0,0 +1,69 @@
|
||||
---
|
||||
title: Run Data Insights using Metadata CLI
|
||||
slug: /how-to-guides/data-insights/metadata-cli
|
||||
---
|
||||
|
||||
# Run Data Insights using Metadata CLI
|
||||
|
||||
## 1. Define the YAML Config
|
||||
|
||||
This is a sample config for Data Insights:
|
||||
|
||||
```yaml
|
||||
source:
|
||||
type: dataInsight
|
||||
serviceName: OpenMetadata
|
||||
sourceConfig:
|
||||
config:
|
||||
type: MetadataToElasticSearch
|
||||
processor:
|
||||
type: data-insight-processor
|
||||
config: {}
|
||||
sink:
|
||||
type: elasticsearch
|
||||
config:
|
||||
es_host: localhost
|
||||
es_port: 9200
|
||||
recreate_indexes: false
|
||||
workflowConfig:
|
||||
loggerLevel: DEBUG
|
||||
openMetadataServerConfig:
|
||||
hostPort: '<OpenMetadata host and port>'
|
||||
authProvider: openmetadata
|
||||
securityConfig:
|
||||
jwtToken: '{bot_jwt_token}'
|
||||
```
|
||||
|
||||
### Source Configuration - Source Config
|
||||
|
||||
- To send the metadata to Elasticsearch, the source config needs to be specified as `type: MetadataToElasticSearch`.
|
||||
|
||||
### Processor Configuration
|
||||
|
||||
- The processor needs to be specified as `type: data-insight-processor`.
|
||||
|
||||
### Workflow Configuration
|
||||
|
||||
The main property here is the `openMetadataServerConfig`, where you can define the host and security provider of your OpenMetadata installation.
|
||||
|
||||
For a simple, local installation using our docker containers, this looks like:
|
||||
|
||||
```yaml
|
||||
workflowConfig:
|
||||
openMetadataServerConfig:
|
||||
hostPort: 'http://localhost:8585/api'
|
||||
authProvider: openmetadata
|
||||
securityConfig:
|
||||
jwtToken: '{bot_jwt_token}'
|
||||
```
|
||||
|
||||
We support different security providers. You can find their definitions [here](https://github.com/open-metadata/OpenMetadata/tree/main/openmetadata-spec/src/main/resources/json/schema/security/client).
|
||||
You can find the different implementations of the ingestion below.
|
||||
|
||||
## 2. Run with the CLI
|
||||
|
||||
First, we will need to save the YAML file. Afterward, and with all requirements installed, we can run:
|
||||
|
||||
```bash
|
||||
metadata insight -c <path-to-yaml>
|
||||
```
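Before invoking the CLI, it can be worth confirming that the saved file parses as valid YAML and contains the expected top-level sections. A small sketch, assuming the config above was saved as `data_insight.yaml` (the filename is just an example):

```python
import yaml

# Hypothetical pre-flight check; "data_insight.yaml" is a placeholder for whatever
# path you pass to `metadata insight -c`.
with open("data_insight.yaml") as config_file:
    config = yaml.safe_load(config_file)

# The sample config above defines exactly these four sections.
assert {"source", "processor", "sink", "workflowConfig"} <= set(config)
print("Config looks structurally valid")
```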
|
||||
@ -72,7 +72,7 @@ After clicking Next, you will be redirected to the Scheduling form. This will be
|
||||
|
||||
## dbt Ingestion
|
||||
|
||||
We can also generate lineage through [dbt ingestion](/connectors/ingestion/workflows/dbt/ingest-dbt-ui). The dbt workflow can fetch queries that carry lineage information. For a dbt ingestion pipeline, the path to the Catalog and Manifest files must be specified. We also fetch the column level lineage through dbt.
|
||||
We can also generate lineage through [dbt ingestion](/connectors/ingestion/workflows/dbt/configure-dbt-workflow-from-ui). The dbt workflow can fetch queries that carry lineage information. For a dbt ingestion pipeline, the path to the Catalog and Manifest files must be specified. We also fetch the column level lineage through dbt.
|
||||
|
||||
You can learn more about [lineage ingestion here](/connectors/ingestion/lineage).
|
||||
|
||||
|
||||
@ -66,7 +66,7 @@ alt="Column Data provides information"
|
||||
caption="Column Data provides information"
|
||||
/%}
|
||||
|
||||
You can read more about [Auto PII Tagging](/connectors/ingestion/auto_tagging) here.
|
||||
You can read more about [Auto PII Tagging](/how-to-guides/data-quality-observability/profiler/auto-pii-tagging) here.
|
||||
|
||||
{%inlineCallout
|
||||
color="violet-70"
|
||||
|
||||
@ -380,6 +380,10 @@ site_menu:
|
||||
url: /connectors/database/synapse/yaml
|
||||
- category: Connectors / Database / Synapse / Troubleshooting
|
||||
url: /connectors/database/synapse/troubleshooting
|
||||
- category: Connectors / Database / Teradata
|
||||
url: /connectors/database/teradata
|
||||
- category: Connectors / Database / Teradata / Run Externally
|
||||
url: /connectors/database/teradata/yaml
|
||||
- category: Connectors / Database / Trino
|
||||
url: /connectors/database/trino
|
||||
- category: Connectors / Database / Trino / Run Externally
|
||||
@ -657,8 +661,6 @@ site_menu:
|
||||
url: /connectors/ingestion/lineage/spark-lineage
|
||||
- category: Connectors / Ingestion / Versioning
|
||||
url: /connectors/ingestion/versioning
|
||||
- category: Connectors / Ingestion / Auto Tagging
|
||||
url: /connectors/ingestion/auto_tagging
|
||||
- category: Connectors / Ingestion / Versioning / Change Feeds
|
||||
url: /connectors/ingestion/versioning/change-feeds
|
||||
- category: Connectors / Ingestion / Versioning / Change Events
|
||||
@ -839,6 +841,12 @@ site_menu:
|
||||
url: /how-to-guides/data-insights/ingestion
|
||||
- category: How-to Guides / Data Insights / Key Performance Indicators (KPI)
|
||||
url: /how-to-guides/data-insights/kpi
|
||||
- category: How-to Guides / Data Insights / Run Data Insights using Airflow SDK
|
||||
url: /how-to-guides/data-insights/airflow-sdk
|
||||
- category: How-to Guides / Data Insights / Run Data Insights using Metadata CLI
|
||||
url: /how-to-guides/data-insights/metadata-cli
|
||||
- category: How-to Guides / Data Insights / Run Elasticsearch Reindex using Airflow SDK
|
||||
url: /how-to-guides/data-insights/elasticsearch-reindex
|
||||
- category: How-to Guides / Data Insights / Data Insights Report
|
||||
url: /how-to-guides/data-insights/report
|
||||
- category: How-to Guides / Data Insights / Cost Analysis
|
||||
|
||||
@ -51,7 +51,7 @@ GRANT SELECT ON <schema_name>.* to <username>;
|
||||
```
|
||||
|
||||
### Profiler & Data Quality
|
||||
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
|
||||
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
|
||||
|
||||
### Usage & Lineage
|
||||
For the usage and lineage workflow, the user will need `SELECT` privilege. You can find more information on the usage workflow [here](/connectors/ingestion/workflows/usage) and the lineage workflow [here](/connectors/ingestion/workflows/lineage).
|
||||
|
||||
@ -48,7 +48,7 @@ GRANT SELECT ON <schema_name>.* to <username>;
|
||||
```
|
||||
|
||||
### Profiler & Data Quality
|
||||
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
|
||||
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
|
||||
|
||||
### Usage & Lineage
|
||||
For the usage and lineage workflow, the user will need `SELECT` privilege. You can find more information on the usage workflow [here](/connectors/ingestion/workflows/usage) and the lineage workflow [here](/connectors/ingestion/workflows/lineage).
|
||||
|
||||
@ -23,8 +23,8 @@ Configure and schedule Databricks metadata and profiler workflows from the OpenM
|
||||
- [Unity Catalog](#unity-catalog)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Query Usage](/connectors/ingestion/workflows/usage)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [Lineage](/connectors/ingestion/lineage)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
|
||||
|
||||
@ -16,8 +16,8 @@ In this section, we provide guides and references to use the Datalake connector.
|
||||
Configure and schedule Datalake metadata and profiler workflows from the OpenMetadata UI:
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
|
||||
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/datalake/yaml"} /%}
|
||||
|
||||
|
||||
@ -33,8 +33,8 @@ Configure and schedule DB2 metadata and profiler workflows from the OpenMetadata
|
||||
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
|
||||
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/db2/yaml"} /%}
|
||||
@ -65,7 +65,7 @@ GRANT SELECT ON SYSCAT.VIEWS TO USER_NAME;
|
||||
|
||||
### Profiler & Data Quality
|
||||
|
||||
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
|
||||
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
|
||||
|
||||
## Metadata Ingestion
|
||||
{% partial
|
||||
|
||||
@ -48,7 +48,7 @@ GRANT SELECT ON SYSCAT.VIEWS TO USER_NAME;
|
||||
```
|
||||
|
||||
### Profiler & Data Quality
|
||||
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
|
||||
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
|
||||
|
||||
### Python Requirements
|
||||
|
||||
|
||||
@ -104,7 +104,7 @@ If instead we use a local file path that contains the metastore information (e.g
|
||||
To update the `Derby` information. More information about this in a great [SO thread](https://stackoverflow.com/questions/38377188/how-to-get-rid-of-derby-log-metastore-db-from-spark-shell).
|
||||
|
||||
- You can find all supported configurations [here](https://spark.apache.org/docs/latest/configuration.html)
|
||||
- If you need further information regarding the Hive metastore, you can find it [here](https://spark.apache.org/docs/3.0.0-preview/sql-data-sources-hive-tables.html),
|
||||
- If you need further information regarding the Hive metastore, you can find it [here](https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html),
|
||||
and in The Internals of Spark SQL [book](https://jaceklaskowski.gitbooks.io/mastering-spark-sql/content/spark-sql-hive-metastore.html).
|
||||
|
||||
**Metastore Database**
|
||||
|
||||
@ -109,7 +109,7 @@ To update the `Derby` information. More information about this in a great [SO th
|
||||
|
||||
- You can find all supported configurations [here](https://spark.apache.org/docs/latest/configuration.html)
|
||||
- If you need further information regarding the Hive metastore, you can find
|
||||
it [here](https://spark.apache.org/docs/3.0.0-preview/sql-data-sources-hive-tables.html), and in The Internals of
|
||||
it [here](https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html), and in The Internals of
|
||||
Spark SQL [book](https://jaceklaskowski.gitbooks.io/mastering-spark-sql/content/spark-sql-hive-metastore.html).
|
||||
|
||||
|
||||
|
||||
@ -17,7 +17,7 @@ Configure and schedule DomoDatabase metadata and profiler workflows from the Ope
|
||||
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
|
||||
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/domo-database/yaml"} /%}
|
||||
|
||||
@ -17,8 +17,8 @@ Configure and schedule Doris metadata and profiler workflows from the OpenMetada
|
||||
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
- [Enable Security](#securing-doris-connection-with-ssl-in-openmetadata)
|
||||
|
||||
|
||||
@ -16,8 +16,8 @@ In this section, we provide guides and references to use the Druid connector.
|
||||
Configure and schedule Druid metadata and profiler workflows from the OpenMetadata UI:
|
||||
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
|
||||
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/athena/yaml"} /%}
|
||||
|
||||
@ -18,8 +18,8 @@ Configure and schedule Greenplum metadata and profiler workflows from the OpenMe
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Query Usage](/connectors/ingestion/workflows/usage)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [Lineage](/connectors/ingestion/lineage)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
- [Enable Security](#securing-greenplum-connection-with-ssl-in-openmetadata)
|
||||
|
||||
@ -17,8 +17,8 @@ In this section, we provide guides and references to use the Hive connector.
|
||||
Configure and schedule Hive metadata and profiler workflows from the OpenMetadata UI:
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
- [Enable Security](#securing-hive-connection-with-ssl-in-openmetadata)
|
||||
|
||||
@ -31,7 +31,7 @@ Configure and schedule Hive metadata and profiler workflows from the OpenMetadat
|
||||
To extract metadata, the user used in the connection needs to be able to perform `SELECT`, `SHOW`, and `DESCRIBE` operations in the database/schema where the metadata needs to be extracted from.
|
||||
|
||||
### Profiler & Data Quality
|
||||
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
|
||||
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
|
||||
|
||||
## Metadata Ingestion
|
||||
|
||||
|
||||
@ -15,8 +15,8 @@ In this section, we provide guides and references to use the Impala connector.
|
||||
|
||||
Configure and schedule Impala metadata and profiler workflows from the OpenMetadata UI:
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
- [Enable Security](#securing-impala-connection-with-ssl-in-openmetadata)
|
||||
|
||||
|
||||
@ -17,8 +17,8 @@ Configure and schedule MariaDB metadata and profiler workflows from the OpenMeta
|
||||
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
|
||||
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/mariadb/yaml"} /%}
|
||||
@ -43,7 +43,7 @@ GRANT SELECT ON world.hello TO '<username>';
|
||||
```
|
||||
|
||||
### Profiler & Data Quality
|
||||
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
|
||||
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
|
||||
|
||||
## Metadata Ingestion
|
||||
|
||||
|
||||
@ -70,7 +70,7 @@ To fetch the metadata from MongoDB to OpenMetadata, the MongoDB user must have a
|
||||
To deploy OpenMetadata, check the Deployment guides.
|
||||
{%/inlineCallout%}
|
||||
|
||||
[Profiler deployment](/connectors/ingestion/workflows/profiler)
|
||||
[Profiler deployment](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
|
||||
### Limitations
|
||||
|
||||
|
||||
@ -292,7 +292,7 @@ workflowConfig:
|
||||
|
||||
{% /codePreview %}
|
||||
|
||||
- You can learn more about how to configure and run the Profiler Workflow to extract Profiler data and execute the Data Quality from [here](/connectors/ingestion/workflows/profiler)
|
||||
- You can learn more about how to configure and run the Profiler Workflow to extract Profiler data and execute the Data Quality from [here](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
|
||||
|
||||
|
||||
|
||||
@ -19,8 +19,8 @@ Configure and schedule MSSQL metadata and profiler workflows from the OpenMetada
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Query Usage](/connectors/ingestion/workflows/usage)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [Lineage](/connectors/ingestion/lineage)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
|
||||
|
||||
@ -17,8 +17,8 @@ Configure and schedule MySQL metadata and profiler workflows from the OpenMetada
|
||||
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
- [Enable Security](#securing-mysql-connection-with-ssl-in-openmetadata)
|
||||
|
||||
@ -45,7 +45,7 @@ GRANT SELECT ON world.hello TO '<username>';
|
||||
```
|
||||
|
||||
### Profiler & Data Quality
|
||||
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
|
||||
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
|
||||
|
||||
## Metadata Ingestion
|
||||
|
||||
|
||||
@ -17,8 +17,8 @@ Configure and schedule Oracle metadata and profiler workflows from the OpenMetad
|
||||
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [Lineage](/connectors/ingestion/lineage)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
|
||||
|
||||
@ -16,8 +16,8 @@ In this section, we provide guides and references to use the PinotDB connector.
|
||||
Configure and schedule PinotDB metadata and profiler workflows from the OpenMetadata UI:
|
||||
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
|
||||
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/pinotdb/yaml"} /%}
|
||||
|
||||
@ -18,8 +18,8 @@ Configure and schedule PostgreSQL metadata and profiler workflows from the OpenM
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Query Usage](/connectors/ingestion/workflows/usage)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [Lineage](/connectors/ingestion/lineage)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
- [Enable Security](#securing-postgres-connection-with-ssl-in-openmetadata)
|
||||
|
||||
@ -17,8 +17,8 @@ Configure and schedule Presto metadata and profiler workflows from the OpenMetad
|
||||
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
|
||||
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/presto/yaml"} /%}
|
||||
@ -30,7 +30,7 @@ Configure and schedule Presto metadata and profiler workflows from the OpenMetad
|
||||
To extract metadata, the user needs to be able to perform `SHOW CATALOGS`, `SHOW TABLES`, and `SHOW COLUMNS FROM` on the catalogs/tables you wish to extract metadata from and have `SELECT` permission on the `INFORMATION_SCHEMA`. Access to resources will be different based on the connector used. You can find more details in the Presto documentation website [here](https://prestodb.io/docs/current/connector.html). You can also get more information regarding system access control in Presto [here](https://prestodb.io/docs/current/security/built-in-system-access-control.html).
|
||||
|
||||
### Profiler & Data Quality
|
||||
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
|
||||
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
|
||||
|
||||
## Metadata Ingestion
|
||||
|
||||
|
||||
@ -19,8 +19,8 @@ Configure and schedule Redshift metadata and profiler workflows from the OpenMet
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Incremental Extraction](/connectors/ingestion/workflows/metadata/incremental-extraction/redshift)
|
||||
- [Query Usage](/connectors/ingestion/workflows/usage)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [Lineage](/connectors/ingestion/lineage)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
- [Enable Security](#securing-redshift-connection-with-ssl-in-openmetadata)
|
||||
@ -42,7 +42,7 @@ GRANT SELECT ON TABLE svv_table_info to test_user;
|
||||
```
|
||||
|
||||
### Profiler & Data Quality
|
||||
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
|
||||
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
|
||||
|
||||
### Usage & Lineage
|
||||
For the usage and lineage workflow, the user will need `SELECT` privilege on `STL_QUERY` table. You can find more information on the usage workflow [here](/connectors/ingestion/workflows/usage) and the lineage workflow [here](/connectors/ingestion/workflows/lineage).
|
||||
|
||||
@ -18,8 +18,8 @@ Configure and schedule SAP Hana metadata and profiler workflows from the OpenMet
|
||||
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
|
||||
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/sap-hana/yaml"} /%}
|
||||
@ -57,7 +57,7 @@ The same applies to the `_SYS_REPO` schema, required for lineage extraction.
|
||||
|
||||
### Profiler & Data Quality
|
||||
|
||||
Executing the profiler Workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. The user should also be allowed to view information in `tables` for all objects in the database. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
|
||||
Executing the profiler Workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. The user should also be allowed to view information in `tables` for all objects in the database. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
|
||||
|
||||
## Metadata Ingestion
|
||||
|
||||
|
||||
@ -17,8 +17,8 @@ Configure and schedule Singlestore metadata and profiler workflows from the Open
|
||||
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
|
||||
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/singlestore/yaml"} /%}
|
||||
@ -44,7 +44,7 @@ GRANT SELECT ON world.hello TO '<username>';
|
||||
```
|
||||
|
||||
### Profiler & Data Quality
|
||||
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
|
||||
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
|
||||
|
||||
## Metadata Ingestion
|
||||
|
||||
|
||||
@ -20,8 +20,8 @@ Configure and schedule Snowflake metadata and profiler workflows from the OpenMe
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Incremental Extraction](/connectors/ingestion/workflows/metadata/incremental-extraction/snowflake)
|
||||
- [Query Usage](/connectors/ingestion/workflows/usage)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [Lineage](/connectors/ingestion/lineage)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
|
||||
|
||||
@ -17,8 +17,8 @@ Configure and schedule Presto metadata and profiler workflows from the OpenMetad
|
||||
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
|
||||
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/sqlite/yaml"} /%}
|
||||
|
||||
@ -0,0 +1,75 @@
|
||||
---
|
||||
title: Teradata
|
||||
slug: /connectors/database/teradata
|
||||
---
|
||||
|
||||
{% connectorDetailsHeader
|
||||
name="Teradata"
|
||||
stage="BETA"
|
||||
platform="OpenMetadata"
|
||||
availableFeatures=["Metadata", "Data Profiler"]
|
||||
unavailableFeatures=["Query Usage", "Data Quality", "Owners", "Tags", "Stored Procedures", "Lineage", "Column-level Lineage", "dbt"]
|
||||
/ %}
|
||||
|
||||
In this section, we provide guides and references to use the Teradata connector.
|
||||
|
||||
Configure and schedule Teradata metadata and profiler workflows from the OpenMetadata UI:
|
||||
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality/configure)
|
||||
|
||||
{% partial file="/v1.6/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/greenplum/yaml"} /%}
|
||||
|
||||
## Requirements
|
||||
{%inlineCallout icon="description" bold="OpenMetadata 1.6 or later" href="/deployment"%}
|
||||
To deploy OpenMetadata, check the Deployment guides.
|
||||
{%/inlineCallout%}
|
||||
|
||||
The connector was tested on Teradata DBS version 17.20. Since there are no significant changes in metadata objects, it should also work with versions 15.x and 16.x.
|
||||
|
||||
|
||||
## Metadata Ingestion
|
||||
|
||||
By default, all valid users in a Teradata DB have full access to metadata objects, so there are no specific requirements for user privileges.
|
||||
|
||||
{% partial
|
||||
file="/v1.6/connectors/metadata-ingestion-ui.md"
|
||||
variables={
|
||||
connector: "Teradata",
|
||||
selectServicePath: "/images/v1.6/connectors/teradata/select-service.png",
|
||||
addNewServicePath: "/images/v1.6/connectors/teradata/add-new-service.png",
|
||||
serviceConnectionPath: "/images/v1.6/connectors/teradata/service-connection.png",
|
||||
}
|
||||
/%}
|
||||
|
||||
{% stepsContainer %}
|
||||
{% extraContent parentTagName="stepsContainer" %}
|
||||
|
||||
#### Connection Details
|
||||
|
||||
- **Username**: Specify the User to connect to Teradata.
|
||||
- **Password**: Password to connect to Teradata
|
||||
- **Logmech**: Specifies the logon authentication method. Possible values are TD2 (the default), JWT, LDAP, KRB5 for Kerberos, or TDNEGO.
|
||||
- **LOGDATA**: Specifies additional data needed by a logon mechanism, such as a secure token, Distinguished Name, or a domain/realm name. LOGDATA values are specific to each logon mechanism.
|
||||
- **Host and Port**: Enter the fully qualified hostname and port number (default port for Teradata is 1025) for your Teradata deployment in the Host and Port field.
|
||||
- **Transaction Mode**: Specifies the transaction mode for the connection. Possible values are DEFAULT (the default), ANSI, or TERA.
|
||||
- **Teradata Database Account**: Specifies an account string to override the default account string defined for the database user. Accounts are used by the database for workload management and resource usage monitoring.
|
||||
- **Connection Options** and **Connection Arguments**: additional connection parameters. For more information please view teradatasql [docs](https://pypi.org/project/teradatasql/).
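Before creating the service, it can help to confirm that the values above actually open a session against Teradata. A minimal sketch using the `teradatasql` driver mentioned above; the host, credentials, and query are placeholders, not values required by OpenMetadata:

```python
import teradatasql

# Hypothetical connectivity check: replace host, user, and password with the
# values you plan to enter in the connection form (the default Teradata port is 1025).
with teradatasql.connect(host="teradata.example.com", user="dbc", password="dbc") as conn:
    with conn.cursor() as cur:
        cur.execute("SELECT CURRENT_DATE")
        print(cur.fetchone())
```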
|
||||
|
||||
{% partial file="/v1.6/connectors/database/advanced-configuration.md" /%}
|
||||
|
||||
{% /extraContent %}
|
||||
|
||||
{% partial file="/v1.6/connectors/test-connection.md" /%}
|
||||
|
||||
{% partial file="/v1.6/connectors/database/configure-ingestion.md" /%}
|
||||
|
||||
{% partial file="/v1.6/connectors/ingestion-schedule-and-deploy.md" /%}
|
||||
|
||||
{% /stepsContainer %}
|
||||
|
||||
{% partial file="/v1.6/connectors/troubleshooting.md" /%}
|
||||
|
||||
{% partial file="/v1.6/connectors/database/related.md" /%}
|
||||
@ -0,0 +1,117 @@
|
||||
---
|
||||
title: Run the Teradata Connector Externally
|
||||
slug: /connectors/database/teradata/yaml
|
||||
---
|
||||
|
||||
{% connectorDetailsHeader
|
||||
name="Teradata"
|
||||
stage="BETA"
|
||||
platform="OpenMetadata"
|
||||
availableFeatures=["Metadata", "Data Profiler", "Data Quality"]
|
||||
unavailableFeatures=["Query Usage", "Owners", "Tags", "Stored Procedures", "Lineage", "Column-level Lineage", "dbt"]
|
||||
/ %}
|
||||
|
||||
In this section, we provide guides and references to use the Teradata connector.
|
||||
|
||||
Configure and schedule Teradata metadata and profiler workflows from the OpenMetadata UI:
|
||||
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](#data-profiler)
|
||||
- [Data Quality](#data-quality)
|
||||
|
||||
|
||||
{% partial file="/v1.6/connectors/external-ingestion-deployment.md" /%}
|
||||
|
||||
## Requirements
|
||||
|
||||
### Python Requirements
|
||||
|
||||
{% partial file="/v1.6/connectors/python-requirements.md" /%}
|
||||
|
||||
To run the Teradata ingestion, you will need to install:
|
||||
|
||||
```bash
|
||||
pip3 install "openmetadata-ingestion[teradata]"
|
||||
```
|
||||
|
||||
## Metadata Ingestion
|
||||
|
||||
All connectors are defined as JSON Schemas.
|
||||
[Here](https://github.com/open-metadata/OpenMetadata/blob/main/openmetadata-spec/src/main/resources/json/schema/entity/services/connections/database/teradataConnection.json)
|
||||
you can find the structure to create a connection to Teradata.
|
||||
|
||||
In order to create and run a Metadata Ingestion workflow, we will follow
|
||||
the steps to create a YAML configuration able to connect to the source,
|
||||
process the Entities if needed, and reach the OpenMetadata server.
|
||||
|
||||
The workflow is modeled around the following
|
||||
[JSON Schema](https://github.com/open-metadata/OpenMetadata/blob/main/openmetadata-spec/src/main/resources/json/schema/metadataIngestion/workflow.json)
|
||||
|
||||
### 1. Define the YAML Config
|
||||
|
||||
This is a sample config for Teradata:
|
||||
|
||||
{% codePreview %}
|
||||
|
||||
{% codeInfoContainer %}
|
||||
|
||||
#### Source Configuration - Service Connection
|
||||
|
||||
{% codeInfo srNumber=1 %}
|
||||
|
||||
**username**: Specify the User to connect to Teradata.
|
||||
|
||||
{% /codeInfo %}
|
||||
{% codeInfo srNumber=2 %}
|
||||
|
||||
**password**: User password to connect to Teradata
|
||||
|
||||
{% /codeInfo %}
|
||||
|
||||
{% codeInfo srNumber=3 %}
|
||||
|
||||
**hostPort**: Enter the fully qualified hostname and port number for your Teradata deployment in the Host and Port field.
|
||||
|
||||
{% /codeInfo %}
|
||||
|
||||
|
||||
|
||||
|
||||
{% /codeInfoContainer %}
|
||||
|
||||
{% codeBlock fileName="filename.yaml" %}
|
||||
|
||||
```yaml {% isCodeBlock=true %}
|
||||
source:
|
||||
type: teradata
|
||||
serviceName: example_teradata
|
||||
serviceConnection:
|
||||
config:
|
||||
type: Teradata
|
||||
```
|
||||
```yaml {% srNumber=1 %}
|
||||
username: username
|
||||
```
|
||||
```yaml {% srNumber=2 %}
|
||||
password: <password>
|
||||
```
|
||||
```yaml {% srNumber=3 %}
|
||||
hostPort: teradata:1025
|
||||
```
|
||||
|
||||
{% partial file="/v1.6/connectors/yaml/database/source-config.md" /%}
|
||||
|
||||
{% partial file="/v1.6/connectors/yaml/ingestion-sink.md" /%}
|
||||
|
||||
{% partial file="/v1.6/connectors/yaml/workflow-config.md" /%}
|
||||
|
||||
{% /codeBlock %}
|
||||
|
||||
{% /codePreview %}
|
||||
|
||||
{% partial file="/v1.6/connectors/yaml/ingestion-cli.md" /%}
|
||||
|
||||
{% partial file="/v1.6/connectors/yaml/data-profiler.md" variables={connector: "teradata"} /%}
|
||||
|
||||
{% partial file="/v1.6/connectors/yaml/data-quality.md" /%}
|
||||
@ -17,8 +17,8 @@ Configure and schedule Trino metadata and profiler workflows from the OpenMetada
|
||||
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
|
||||
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/trino/yaml"} /%}
|
||||
@ -33,7 +33,7 @@ Access to resources will be based on the user access permission to access specif
|
||||
|
||||
### Profiler & Data Quality
|
||||
|
||||
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
|
||||
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
|
||||
|
||||
## Metadata Ingestion
|
||||
{% partial
|
||||
|
||||
@ -18,7 +18,7 @@ Configure and schedule Unity Catalog metadata workflow from the OpenMetadata UI:
|
||||
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Query Usage](/connectors/ingestion/workflows/usage)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [Lineage](/connectors/ingestion/lineage)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
|
||||
|
||||
@ -18,8 +18,8 @@ Configure and schedule Vertica metadata and profiler workflows from the OpenMeta
|
||||
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
- [Data Profiler](/connectors/ingestion/workflows/profiler)
|
||||
- [Data Quality](/connectors/ingestion/workflows/data-quality)
|
||||
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
|
||||
- [Data Quality](/how-to-guides/data-quality-observability/quality)
|
||||
- [dbt Integration](/connectors/ingestion/workflows/dbt)
|
||||
|
||||
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/vertica/yaml"} /%}
|
||||
|
||||
@ -117,6 +117,6 @@ alt="Run Great Expectations checkpoint"
|
||||
/%}
|
||||
|
||||
### List of Great Expectations Supported Test
|
||||
We currently only support a certain number of Great Expectations tests. The full list can be found in the [Tests](/connectors/ingestion/workflows/data-quality/tests) section.
|
||||
We currently only support a certain number of Great Expectations tests. The full list can be found in the [Tests](/how-to-guides/data-quality-observability/quality/tests) section.
|
||||
|
||||
If a test is not supported, there is no need to worry about the execution of your Great Expectations test. We will simply skip the tests that are not supported and continue the execution of your test suite.
|
||||
@ -30,14 +30,14 @@ Learn more about how to ingest metadata from dozens of connectors.
|
||||
{%inlineCallout
|
||||
bold="Metadata Profiler"
|
||||
icon="cable"
|
||||
href="/connectors/ingestion/workflows/profiler"%}
|
||||
href="/how-to-guides/data-quality-observability/profiler/workflow"%}
|
||||
To get metrics from your Tables.
|
||||
{%/inlineCallout%}
|
||||
|
||||
{%inlineCallout
|
||||
bold="Metadata Data Quality Tests"
|
||||
icon="cable"
|
||||
href="/connectors/ingestion/workflows/data-quality"%}
|
||||
href="/how-to-guides/data-quality-observability/quality"%}
|
||||
To run automated Quality Tests on your Tables.
|
||||
{%/inlineCallout%}
|
||||
|
||||
|
||||
@ -151,7 +151,7 @@ Refer to the code [here](https://github.com/open-metadata/OpenMetadata/blob/main
|
||||
|
||||
The fields for `Dbt Cloud Account Id`, `Dbt Cloud Project Id` and `Dbt Cloud Job Id` should be numeric values.
|
||||
|
||||
To know how to get the values for `Dbt Cloud Account Id`, `Dbt Cloud Project Id` and `Dbt Cloud Job Id` fields check [here](/connectors/ingestion/workflows/dbt/ingest-dbt-yaml).
|
||||
To know how to get the values for `Dbt Cloud Account Id`, `Dbt Cloud Project Id` and `Dbt Cloud Job Id` fields check [here](/connectors/ingestion/workflows/dbt/run-dbt-workflow-externally).
|
||||
|
||||
{% /note %}
|
||||
|
||||
|
||||
@ -40,14 +40,14 @@ Configure dbt metadata
|
||||
{%inlineCallout
|
||||
icon="fit_screen"
|
||||
bold="Data Profiler"
|
||||
href="/connectors/ingestion/workflows/profiler"%}
|
||||
href="/how-to-guides/data-quality-observability/profiler/workflow"%}
|
||||
Compute metrics and ingest sample data.
|
||||
{%/inlineCallout%}
|
||||
|
||||
{%inlineCallout
|
||||
icon="fit_screen"
|
||||
bold="Data Quality"
|
||||
href="/connectors/ingestion/workflows/data-quality"%}
|
||||
href="/how-to-guides/data-quality-observability/quality"%}
|
||||
Monitor your data and avoid surprises.
|
||||
{%/inlineCallout%}
|
||||
|
||||
|
||||
@ -304,6 +304,6 @@ processor:
|
||||
- Bumped up ElasticSearch version for Docker and Kubernetes OpenMetadata Dependencies Helm Chart to `7.16.3`
|
||||
|
||||
### Data Quality Migration
|
||||
With 1.1.0 version we are migrating existing test cases defined in a test suite to the corresponding table, with this change you might need to recreate the pipelines for the test suites, since due to this restructuring the existing ones are removed from Test Suites - more details about the new data quality can be found [here](/connectors/ingestion/workflows/data-quality).
|
||||
With the 1.1.0 version we are migrating existing test cases defined in a test suite to the corresponding table. With this change you might need to recreate the pipelines for the test suites, since due to this restructuring the existing ones are removed from Test Suites - more details about the new data quality can be found [here](/how-to-guides/data-quality-observability/quality).
|
||||
|
||||
As a user you will need to redeploy data quality workflows. You can go to `Quality > By Tables` to view the tables with test cases that need a workflow to be set up.
|
||||
|
||||
@ -93,7 +93,7 @@ Then, you can prepare `Run Configurations` to execute the ingestion as you would
|
||||
{% image src="/images/v1.5/developers/contribute/build-code-and-run-tests/pycharm-run-config.png" alt="PyCharm run config" caption=" " /%}
|
||||
|
||||
Note that in the example we are preparing a configuration to run and test Superset. In order to understand how to run
|
||||
ingestions via the CLI, you can refer to each connector's [docs](/connectors/dashboard/superset/cli).
|
||||
ingestions via the CLI, you can refer to each connector's [docs](/connectors/dashboard/superset/yaml).
|
||||
|
||||
The important part is that we are not running a script, but rather a `module`: `metadata`. Based on this, we can work as
|
||||
we would usually do with the CLI for any ingestion, profiler, or test workflow.
|
||||
|
||||
@ -147,7 +147,7 @@ OpenMetadata supports MySQL version `8.0.0` and up.
|
||||
$$
|
||||
|
||||
### Profiler & Data Quality
|
||||
Executing the profiler Workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. The user should also be allowed to view information in `tables` for all objects in the database. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
|
||||
Executing the profiler Workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. The user should also be allowed to view information in `tables` for all objects in the database. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
|
||||
|
||||
You can find further information on the MySQL connector in the [docs](/connectors/database/mysql).
|
||||
|
||||
|
||||
@ -153,7 +153,7 @@ By connecting to a database service, you can ingest the databases, schemas, tabl
|
||||
/%}
|
||||
|
||||
{% note %}
|
||||
**Note:** Once you’ve run a metadata ingestion pipeline, you can create separate pipelines to bring in [**Usage**](/connectors/ingestion/workflows/usage), [**Lineage**](/connectors/ingestion/workflows/lineage), [**dbt**](/connectors/ingestion/workflows/dbt), or to run [**Profiler**](/connectors/ingestion/workflows/profiler). To add ingestion pipelines, select the required type of ingestion and enter the required details.
|
||||
**Note:** Once you’ve run a metadata ingestion pipeline, you can create separate pipelines to bring in [**Usage**](/connectors/ingestion/workflows/usage), [**Lineage**](/connectors/ingestion/workflows/lineage), [**dbt**](/connectors/ingestion/workflows/dbt), or to run [**Profiler**](/how-to-guides/data-quality-observability/profiler/workflow). To add ingestion pipelines, select the required type of ingestion and enter the required details.
|
||||
{% /note %}
|
||||
|
||||
{% image
|
||||
|
||||
@ -31,7 +31,7 @@ alt="Column Data provides information"
|
||||
caption="Column Data provides information"
|
||||
/%}
|
||||
|
||||
You can read more about [Auto PII Tagging](/connectors/ingestion/auto_tagging) here.
|
||||
You can read more about [Auto PII Tagging](/how-to-guides/data-quality-observability/profiler/auto-pii-tagging) here.
|
||||
|
||||
## Tag Mapping
|
||||
|
||||
|
||||
@ -0,0 +1,117 @@
|
||||
---
|
||||
title: Run Data Insights using Airflow SDK
|
||||
slug: /how-to-guides/data-insights/airflow-sdk
|
||||
---
|
||||
|
||||
# Run Data Insights using Airflow SDK
|
||||
|
||||
## 1. Define the YAML Config
|
||||
|
||||
This is a sample config for Data Insights:
|
||||
|
||||
```yaml
|
||||
source:
|
||||
type: dataInsight
|
||||
serviceName: OpenMetadata
|
||||
sourceConfig:
|
||||
config:
|
||||
type: MetadataToElasticSearch
|
||||
processor:
|
||||
type: data-insight-processor
|
||||
config: {}
|
||||
sink:
|
||||
type: elasticsearch
|
||||
config:
|
||||
es_host: localhost
|
||||
es_port: 9200
|
||||
recreate_indexes: false
|
||||
workflowConfig:
|
||||
loggerLevel: DEBUG
|
||||
openMetadataServerConfig:
|
||||
hostPort: '<OpenMetadata host and port>'
|
||||
authProvider: openmetadata
|
||||
securityConfig:
|
||||
jwtToken: '{bot_jwt_token}'
|
||||
```
|
||||
|
||||
### Source Configuration - Source Config
|
||||
|
||||
- To send the metadata to Elasticsearch, the source config needs to be specified as `type: MetadataToElasticSearch`.
|
||||
|
||||
### Processor Configuration
|
||||
|
||||
- The processor needs to be specified as `type: data-insight-processor`.
|
||||
|
||||
### Workflow Configuration
|
||||
|
||||
The main property here is the `openMetadataServerConfig`, where you can define the host and security provider of your OpenMetadata installation.
|
||||
|
||||
For a simple, local installation using our docker containers, this looks like:
|
||||
|
||||
```yaml
|
||||
workflowConfig:
|
||||
openMetadataServerConfig:
|
||||
hostPort: 'http://localhost:8585/api'
|
||||
authProvider: openmetadata
|
||||
securityConfig:
|
||||
jwtToken: '{bot_jwt_token}'
|
||||
```
|
||||
|
||||
We support different security providers. You can find their definitions [here](https://github.com/open-metadata/OpenMetadata/tree/main/openmetadata-spec/src/main/resources/json/schema/security/client).
|
||||
You can find the different implementations of the ingestion below.
|
||||
|
||||
## 2. Prepare the Data Insights DAG

Create a Python file in your Airflow DAGs directory with the following contents:

```python
import yaml
from datetime import timedelta

from airflow import DAG
from airflow.utils.dates import days_ago

from metadata.workflow.data_insight import DataInsightWorkflow
from metadata.workflow.workflow_output_handler import print_status

try:
    from airflow.operators.python import PythonOperator
except ModuleNotFoundError:
    from airflow.operators.python_operator import PythonOperator


default_args = {
    "owner": "user_name",
    "email": ["username@org.com"],
    "email_on_failure": False,
    "retries": 3,
    "retry_delay": timedelta(minutes=5),
    "execution_timeout": timedelta(minutes=60),
}

# Paste the YAML configuration defined in step 1
config = """
<your YAML configuration>
"""


def metadata_ingestion_workflow():
    # Build the Data Insights workflow from the YAML config and run it
    workflow_config = yaml.safe_load(config)
    workflow = DataInsightWorkflow.create(workflow_config)
    workflow.execute()
    workflow.raise_from_status()
    print_status(workflow)
    workflow.stop()


with DAG(
    "sample_data",
    default_args=default_args,
    description="An example DAG which runs an OpenMetadata ingestion workflow",
    start_date=days_ago(1),
    is_paused_upon_creation=False,
    schedule_interval='*/5 * * * *',
    catchup=False,
) as dag:
    ingest_task = PythonOperator(
        task_id="ingest_using_recipe",
        python_callable=metadata_ingestion_workflow,
    )
```

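Once the file is in the DAGs directory, Airflow will parse and schedule it. If you want to kick off a run by hand instead of waiting for the schedule, the standard Airflow 2 CLI can be used, assuming the `sample_data` DAG id from the example above:

```bash
# Confirm the DAG was parsed and registered
airflow dags list

# Trigger a manual run of the Data Insights DAG
airflow dags trigger sample_data
```
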
@ -28,7 +28,7 @@ To have cost analysis data available you will need to execute the below workflow
2. **Profiler Workflow**:
   - Purpose: Gather size information (in bytes) for data assets.
   - Description: The Profiler Workflow is responsible for obtaining the size of data assets in bytes. This information is vital for generating the size-related data used in the Cost Analysis charts. It helps in assessing the resource consumption and cost implications of each asset.
   - Click [here](/connectors/ingestion/workflows/profiler) for documentation on the profiler workflow.
   - Click [here](/how-to-guides/data-quality-observability/profiler/workflow) for documentation on the profiler workflow.

3. **Data Insights Workflow**:
   - Purpose: Aggregate information from Usage Workflow and Profiler Workflow.

@ -0,0 +1,90 @@
---
title: Run Elasticsearch Reindex using Airflow SDK
slug: /how-to-guides/data-insights/elasticsearch-reindex
---

# Run Elasticsearch Reindex using Airflow SDK

## 1. Define the YAML Config

This is a sample config for Elasticsearch Reindex:

```yaml
source:
  type: metadata_elasticsearch
  serviceName: openMetadata
  serviceConnection:
    config:
      type: MetadataES
  sourceConfig:
    config: {}
sink:
  type: elasticsearch
  config:
    es_host: localhost
    es_port: 9200
    recreate_indexes: true
workflowConfig:
  openMetadataServerConfig:
    hostPort: http://localhost:8585/api
    authProvider: openmetadata
    securityConfig:
      jwtToken: "eyJraWQiOiJHYjM4OWEtOWY3Ni1nZGpzLWE5MmotMDI0MmJrOTQzNTYiLCJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJzdWIiOiJhZG1pbiIsImlzQm90IjpmYWxzZSwiaXNzIjoib3Blbi1tZXRhZGF0YS5vcmciLCJpYXQiOjE2NjM5Mzg0NjIsImVtYWlsIjoiYWRtaW5Ab3Blbm1ldGFkYXRhLm9yZyJ9.tS8um_5DKu7HgzGBzS1VTA5uUjKWOCU0B_j08WXBiEC0mr0zNREkqVfwFDD-d24HlNEbrqioLsBuFRiwIWKc1m_ZlVQbG7P36RUxhuv2vbSp80FKyNM-Tj93FDzq91jsyNmsQhyNv_fNr3TXfzzSPjHt8Go0FMMP66weoKMgW2PbXlhVKwEuXUHyakLLzewm9UMeQaEiRzhiTMU3UkLXcKbYEJJvfNFcLwSl9W8JCO_l0Yj3ud-qt_nQYEZwqW6u5nfdQllN133iikV4fM5QZsMCnm8Rq1mvLR0y9bmJiD7fwM1tmJ791TUWqmKaTnP49U493VanKpUAfzIiOiIbhg"
```

## 2. Prepare the Ingestion DAG

Create a Python file in your Airflow DAGs directory with the following contents:

```python
import yaml
from datetime import timedelta

from airflow import DAG
from airflow.utils.dates import days_ago

from metadata.workflow.metadata import MetadataWorkflow
from metadata.workflow.workflow_output_handler import print_status

try:
    from airflow.operators.python import PythonOperator
except ModuleNotFoundError:
    from airflow.operators.python_operator import PythonOperator


default_args = {
    "owner": "user_name",
    "email": ["username@org.com"],
    "email_on_failure": False,
    "retries": 3,
    "retry_delay": timedelta(minutes=5),
    "execution_timeout": timedelta(minutes=60),
}

# Paste the YAML configuration defined in step 1
config = """
<your YAML configuration>
"""


def metadata_ingestion_workflow():
    # Build the reindex workflow from the YAML config and run it
    workflow_config = yaml.safe_load(config)
    workflow = MetadataWorkflow.create(workflow_config)
    workflow.execute()
    workflow.raise_from_status()
    print_status(workflow)
    workflow.stop()


with DAG(
    "sample_data",
    default_args=default_args,
    description="An example DAG which runs an OpenMetadata ingestion workflow",
    start_date=days_ago(1),
    is_paused_upon_creation=False,
    schedule_interval='*/5 * * * *',
    catchup=False,
) as dag:
    ingest_task = PythonOperator(
        task_id="ingest_using_recipe",
        python_callable=metadata_ingestion_workflow,
    )
```

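After a successful run, a quick way to sanity-check the reindex is to ask Elasticsearch which indexes exist and how many documents they hold, using the host and port configured in the sink above:

```bash
# List all indexes with their document counts and sizes
curl -s 'http://localhost:9200/_cat/indices?v'
```
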
@ -0,0 +1,69 @@
---
title: Run Data Insights using Metadata CLI
slug: /how-to-guides/data-insights/metadata-cli
---

# Run Data Insights using Metadata CLI

## 1. Define the YAML Config

This is a sample config for Data Insights:

```yaml
source:
  type: dataInsight
  serviceName: OpenMetadata
  sourceConfig:
    config:
      type: MetadataToElasticSearch
processor:
  type: data-insight-processor
  config: {}
sink:
  type: elasticsearch
  config:
    es_host: localhost
    es_port: 9200
    recreate_indexes: false
workflowConfig:
  loggerLevel: DEBUG
  openMetadataServerConfig:
    hostPort: '<OpenMetadata host and port>'
    authProvider: openmetadata
    securityConfig:
      jwtToken: '{bot_jwt_token}'
```

### Source Configuration - Source Config

- To send the metadata to Elasticsearch, the source config needs to be specified as `type: MetadataToElasticSearch`.

### Processor Configuration

- The processor needs to be specified as `type: data-insight-processor`.

### Workflow Configuration

The main property here is the `openMetadataServerConfig`, where you can define the host and security provider of your OpenMetadata installation.

For a simple, local installation using our Docker containers, this looks like:

```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: openmetadata
    securityConfig:
      jwtToken: '{bot_jwt_token}'
```

We support different security providers. You can find their definitions [here](https://github.com/open-metadata/OpenMetadata/tree/main/openmetadata-spec/src/main/resources/json/schema/security/client).
You can find the different implementations of the ingestion below.

## 2. Run with the CLI

First, we will need to save the YAML file. Afterward, with all the requirements installed, we can run:

```bash
metadata insight -c <path-to-yaml>
```

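Because this is a plain CLI invocation, it can also be scheduled with any external scheduler if you are not running it through Airflow. As a sketch only, a crontab entry might look like the following; the YAML path and log file location are placeholders for your environment:

```bash
# Run the Data Insights workflow every day at 01:00
0 1 * * * metadata insight -c /path/to/data-insight.yaml >> /var/log/data-insight.log 2>&1
```
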
@ -72,7 +72,7 @@ After clicking Next, you will be redirected to the Scheduling form. This will be

## dbt Ingestion

We can also generate lineage through [dbt ingestion](/connectors/ingestion/workflows/dbt/ingest-dbt-ui). The dbt workflow can fetch queries that carry lineage information. For a dbt ingestion pipeline, the path to the Catalog and Manifest files must be specified. We also fetch the column level lineage through dbt.
We can also generate lineage through [dbt ingestion](/connectors/ingestion/workflows/dbt/configure-dbt-workflow-from-ui). The dbt workflow can fetch queries that carry lineage information. For a dbt ingestion pipeline, the path to the Catalog and Manifest files must be specified. We also fetch the column level lineage through dbt.

You can learn more about [lineage ingestion here](/connectors/ingestion/lineage).

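For reference, this is a minimal sketch of how the Catalog and Manifest file paths might be supplied when the dbt source is configured through YAML rather than the UI. The field names below are assumptions based on the local dbt configuration option; check the dbt ingestion docs linked above for the exact schema.

```yaml
sourceConfig:
  config:
    type: DBT
    dbtConfigSource:
      # Assumed local-file dbt config; paths are placeholders
      dbtConfigType: local
      dbtCatalogFilePath: /path/to/catalog.json
      dbtManifestFilePath: /path/to/manifest.json
```
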
@ -66,7 +66,7 @@ alt="Column Data provides information"
caption="Column Data provides information"
/%}

You can read more about [Auto PII Tagging](/connectors/ingestion/auto_tagging) here.
You can read more about [Auto PII Tagging](/how-to-guides/data-quality-observability/profiler/auto-pii-tagging) here.

{%inlineCallout
color="violet-70"