Docs: Updating broken links and Missing Docs (#17642)

* Doc: Adding SSL Docs for Messaging & Dashboard

* Docs: Updating Broken Links in Docs

---------

Co-authored-by: Prajwal Pandit <prajwalpandit@Prajwals-MacBook-Air.local>
Prajwal214 2024-09-02 10:22:51 +05:30 committed by GitHub
parent ee11760576
commit 4ccfe886a4
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
101 changed files with 1111 additions and 159 deletions

View File

@ -77,7 +77,7 @@ If instead we use a local file path that contains the metastore information (e.g
To update the `Derby` information. More information about this in a great [SO thread](https://stackoverflow.com/questions/38377188/how-to-get-rid-of-derby-log-metastore-db-from-spark-shell).
- You can find all supported configurations [here](https://spark.apache.org/docs/latest/configuration.html)
- If you need further information regarding the Hive metastore, you can find it [here](https://spark.apache.org/docs/3.0.0-preview/sql-data-sources-hive-tables.html),
- If you need further information regarding the Hive metastore, you can find it [here](https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html),
and in The Internals of Spark SQL [book](https://jaceklaskowski.gitbooks.io/mastering-spark-sql/content/spark-sql-hive-metastore.html).
**Metastore Database**

View File

@ -100,7 +100,7 @@ To update the `Derby` information. More information about this in a great [SO th
- You can find all supported configurations [here](https://spark.apache.org/docs/latest/configuration.html)
- If you need further information regarding the Hive metastore, you can find
it [here](https://spark.apache.org/docs/3.0.0-preview/sql-data-sources-hive-tables.html), and in The Internals of
it [here](https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html), and in The Internals of
Spark SQL [book](https://jaceklaskowski.gitbooks.io/mastering-spark-sql/content/spark-sql-hive-metastore.html).

View File

@ -151,7 +151,7 @@ Refer to the code [here](https://github.com/open-metadata/OpenMetadata/blob/main
The fields for `Dbt Cloud Account Id`, `Dbt Cloud Project Id` and `Dbt Cloud Job Id` should be numeric values.
To learn how to get the values for the `Dbt Cloud Account Id`, `Dbt Cloud Project Id`, and `Dbt Cloud Job Id` fields, check [here](/connectors/ingestion/workflows/dbt/ingest-dbt-yaml).
To learn how to get the values for the `Dbt Cloud Account Id`, `Dbt Cloud Project Id`, and `Dbt Cloud Job Id` fields, check [here](/connectors/ingestion/workflows/dbt/run-dbt-workflow-externally).
{% /note %}

View File

@ -51,7 +51,7 @@ GRANT SELECT ON <schema_name>.* to <username>;
```
### Profiler & Data Quality
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
### Usage & Lineage
For the usage and lineage workflow, the user will need `SELECT` privilege. You can find more information on the usage workflow [here](/connectors/ingestion/workflows/usage) and the lineage workflow [here](/connectors/ingestion/workflows/lineage).

View File

@ -48,7 +48,7 @@ GRANT SELECT ON <schema_name>.* to <username>;
```
### Profiler & Data Quality
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
### Usage & Lineage
For the usage and lineage workflow, the user will need `SELECT` privilege. You can find more information on the usage workflow [here](/connectors/ingestion/workflows/usage) and the lineage workflow [here](/connectors/ingestion/workflows/lineage).

View File

@ -23,8 +23,8 @@ Configure and schedule Databricks metadata and profiler workflows from the OpenM
- [Unity Catalog](#unity-catalog)
- [Metadata Ingestion](#metadata-ingestion)
- [Query Usage](/connectors/ingestion/workflows/usage)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [Lineage](/connectors/ingestion/lineage)
- [dbt Integration](/connectors/ingestion/workflows/dbt)

View File

@ -16,8 +16,8 @@ In this section, we provide guides and references to use the Datalake connector.
Configure and schedule Datalake metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/datalake/yaml"} /%}

View File

@ -33,8 +33,8 @@ Configure and schedule DB2 metadata and profiler workflows from the OpenMetadata
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/db2/yaml"} /%}
@ -65,7 +65,7 @@ GRANT SELECT ON SYSCAT.VIEWS TO USER_NAME;
### Profiler & Data Quality
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
## Metadata Ingestion
{% partial

View File

@ -48,7 +48,7 @@ GRANT SELECT ON SYSCAT.VIEWS TO USER_NAME;
```
### Profiler & Data Quality
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
### Python Requirements

View File

@ -104,7 +104,7 @@ If instead we use a local file path that contains the metastore information (e.g
To update the `Derby` information. More information about this in a great [SO thread](https://stackoverflow.com/questions/38377188/how-to-get-rid-of-derby-log-metastore-db-from-spark-shell).
- You can find all supported configurations [here](https://spark.apache.org/docs/latest/configuration.html)
- If you need further information regarding the Hive metastore, you can find it [here](https://spark.apache.org/docs/3.0.0-preview/sql-data-sources-hive-tables.html),
- If you need further information regarding the Hive metastore, you can find it [here](https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html),
and in The Internals of Spark SQL [book](https://jaceklaskowski.gitbooks.io/mastering-spark-sql/content/spark-sql-hive-metastore.html).
**Metastore Database**

View File

@ -109,7 +109,7 @@ To update the `Derby` information. More information about this in a great [SO th
- You can find all supported configurations [here](https://spark.apache.org/docs/latest/configuration.html)
- If you need further information regarding the Hive metastore, you can find
it [here](https://spark.apache.org/docs/3.0.0-preview/sql-data-sources-hive-tables.html), and in The Internals of
it [here](https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html), and in The Internals of
Spark SQL [book](https://jaceklaskowski.gitbooks.io/mastering-spark-sql/content/spark-sql-hive-metastore.html).

View File

@ -17,7 +17,7 @@ Configure and schedule DomoDatabase metadata and profiler workflows from the Ope
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/domo-database/yaml"} /%}

View File

@ -17,8 +17,8 @@ Configure and schedule Doris metadata and profiler workflows from the OpenMetada
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
- [Enable Security](#securing-doris-connection-with-ssl-in-openmetadata)

View File

@ -16,8 +16,8 @@ In this section, we provide guides and references to use the Druid connector.
Configure and schedule Druid metadata and profiler workflows from the OpenMetadata UI:
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/druid/yaml"} /%}

View File

@ -18,8 +18,8 @@ Configure and schedule Greenplum metadata and profiler workflows from the OpenMe
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Query Usage](/connectors/ingestion/workflows/usage)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [Lineage](/connectors/ingestion/lineage)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
- [Enable Security](#securing-greenplum-connection-with-ssl-in-openmetadata)

View File

@ -17,8 +17,8 @@ In this section, we provide guides and references to use the Hive connector.
Configure and schedule Hive metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
- [Enable Security](#securing-hive-connection-with-ssl-in-openmetadata)
@ -31,7 +31,7 @@ Configure and schedule Hive metadata and profiler workflows from the OpenMetadat
To extract metadata, the user used in the connection needs to be able to perform `SELECT`, `SHOW`, and `DESCRIBE` operations in the database/schema where the metadata needs to be extracted from.
### Profiler & Data Quality
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
## Metadata Ingestion

View File

@ -15,8 +15,8 @@ In this section, we provide guides and references to use the Impala connector.
Configure and schedule Impala metadata and profiler workflows from the OpenMetadata UI:
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
- [Enable Security](#securing-impala-connection-with-ssl-in-openmetadata)

View File

@ -17,8 +17,8 @@ Configure and schedule MariaDB metadata and profiler workflows from the OpenMeta
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/mariadb/yaml"} /%}
@ -43,7 +43,7 @@ GRANT SELECT ON world.hello TO '<username>';
```
### Profiler & Data Quality
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
## Metadata Ingestion

View File

@ -70,7 +70,7 @@ To fetch the metadata from MongoDB to OpenMetadata, the MongoDB user must have a
To deploy OpenMetadata, check the Deployment guides.
{%/inlineCallout%}
[Profiler deployment](/connectors/ingestion/workflows/profiler)
[Profiler deployment](/how-to-guides/data-quality-observability/profiler/workflow)
### Limitations

View File

@ -292,7 +292,7 @@ workflowConfig:
{% /codePreview %}
- You can learn more about how to configure and run the Profiler Workflow to extract Profiler data and execute Data Quality tests from [here](/connectors/ingestion/workflows/profiler)
- You can learn more about how to configure and run the Profiler Workflow to extract Profiler data and execute Data Quality tests from [here](/how-to-guides/data-quality-observability/profiler/workflow)

View File

@ -19,8 +19,8 @@ Configure and schedule MSSQL metadata and profiler workflows from the OpenMetada
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Query Usage](/connectors/ingestion/workflows/usage)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [Lineage](/connectors/ingestion/lineage)
- [dbt Integration](/connectors/ingestion/workflows/dbt)

View File

@ -17,8 +17,8 @@ Configure and schedule MySQL metadata and profiler workflows from the OpenMetada
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
- [Enable Security](#securing-mysql-connection-with-ssl-in-openmetadata)
@ -45,7 +45,7 @@ GRANT SELECT ON world.hello TO '<username>';
```
### Profiler & Data Quality
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
## Metadata Ingestion

View File

@ -17,8 +17,8 @@ Configure and schedule Oracle metadata and profiler workflows from the OpenMetad
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [Lineage](/connectors/ingestion/lineage)
- [dbt Integration](/connectors/ingestion/workflows/dbt)

View File

@ -16,8 +16,8 @@ In this section, we provide guides and references to use the PinotDB connector.
Configure and schedule PinotDB metadata and profiler workflows from the OpenMetadata UI:
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/pinotdb/yaml"} /%}

View File

@ -18,8 +18,8 @@ Configure and schedule PostgreSQL metadata and profiler workflows from the OpenM
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Query Usage](/connectors/ingestion/workflows/usage)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [Lineage](/connectors/ingestion/lineage)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
- [Enable Security](#securing-postgres-connection-with-ssl-in-openmetadata)

View File

@ -17,8 +17,8 @@ Configure and schedule Presto metadata and profiler workflows from the OpenMetad
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/presto/yaml"} /%}
@ -30,7 +30,7 @@ Configure and schedule Presto metadata and profiler workflows from the OpenMetad
To extract metadata, the user needs to be able to perform `SHOW CATALOGS`, `SHOW TABLES`, and `SHOW COLUMNS FROM` on the catalogs/tables you wish to extract metadata from and have `SELECT` permission on the `INFORMATION_SCHEMA`. Access to resources will be different based on the connector used. You can find more details in the Presto documentation website [here](https://prestodb.io/docs/current/connector.html). You can also get more information regarding system access control in Presto [here](https://prestodb.io/docs/current/security/built-in-system-access-control.html).
### Profiler & Data Quality
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
## Metadata Ingestion

View File

@ -19,8 +19,8 @@ Configure and schedule Redshift metadata and profiler workflows from the OpenMet
- [Metadata Ingestion](#metadata-ingestion)
- [Incremental Extraction](/connectors/ingestion/workflows/metadata/incremental-extraction/redshift)
- [Query Usage](/connectors/ingestion/workflows/usage)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [Lineage](/connectors/ingestion/lineage)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
- [Enable Security](#securing-redshift-connection-with-ssl-in-openmetadata)
@ -42,7 +42,7 @@ GRANT SELECT ON TABLE svv_table_info to test_user;
```
### Profiler & Data Quality
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
### Usage & Lineage
For the usage and lineage workflow, the user will need `SELECT` privilege on `STL_QUERY` table. You can find more information on the usage workflow [here](/connectors/ingestion/workflows/usage) and the lineage workflow [here](/connectors/ingestion/workflows/lineage).

View File

@ -18,8 +18,8 @@ Configure and schedule SAP Hana metadata and profiler workflows from the OpenMet
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/sap-hana/yaml"} /%}
@ -57,7 +57,7 @@ The same applies to the `_SYS_REPO` schema, required for lineage extraction.
### Profiler & Data Quality
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. The user should also be allowed to view information in `tables` for all objects in the database. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. The user should also be allowed to view information in `tables` for all objects in the database. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
## Metadata Ingestion

View File

@ -17,8 +17,8 @@ Configure and schedule Singlestore metadata and profiler workflows from the Open
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/singlestore/yaml"} /%}
@ -44,7 +44,7 @@ GRANT SELECT ON world.hello TO '<username>';
```
### Profiler & Data Quality
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
## Metadata Ingestion

View File

@ -20,8 +20,8 @@ Configure and schedule Snowflake metadata and profiler workflows from the OpenMe
- [Metadata Ingestion](#metadata-ingestion)
- [Incremental Extraction](/connectors/ingestion/workflows/metadata/incremental-extraction/snowflake)
- [Query Usage](/connectors/ingestion/workflows/usage)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [Lineage](/connectors/ingestion/lineage)
- [dbt Integration](/connectors/ingestion/workflows/dbt)

View File

@ -17,8 +17,8 @@ Configure and schedule Presto metadata and profiler workflows from the OpenMetad
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/sqlite/yaml"} /%}

View File

@ -0,0 +1,75 @@
---
title: Teradata
slug: /connectors/database/teradata
---
{% connectorDetailsHeader
name="Teradata"
stage="BETA"
platform="OpenMetadata"
availableFeatures=["Metadata", "Data Profiler"]
unavailableFeatures=["Query Usage", "Data Quality", "Owners", "Tags", "Stored Procedures", "Lineage", "Column-level Lineage", "dbt"]
/ %}
In this section, we provide guides and references to use the Teradata connector.
Configure and schedule Teradata metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality/configure)
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/teradata/yaml"} /%}
## Requirements
{%inlineCallout icon="description" bold="OpenMetadata 1.5 or later" href="/deployment"%}
To deploy OpenMetadata, check the Deployment guides.
{%/inlineCallout%}
The connector was tested on Teradata DBS version 17.20. Since there are no significant changes in metadata objects, it should also work with versions 15.x and 16.x.
## Metadata Ingestion
By default, all valid users in a Teradata database have full access to metadata objects, so there are no specific user privilege requirements.
{% partial
file="/v1.5/connectors/metadata-ingestion-ui.md"
variables={
connector: "Teradata",
selectServicePath: "/images/v1.5/connectors/teradata/select-service.png",
addNewServicePath: "/images/v1.5/connectors/teradata/add-new-service.png",
serviceConnectionPath: "/images/v1.5/connectors/teradata/service-connection.png",
}
/%}
{% stepsContainer %}
{% extraContent parentTagName="stepsContainer" %}
#### Connection Details
- **Username**: Specify the User to connect to Teradata.
- **Password**: Password to connect to Teradata
- **Logmech**: Specifies the logon authentication method. Possible values are TD2 (the default), JWT, LDAP, KRB5 for Kerberos, or TDNEGO.
- **LOGDATA**: Specifies additional data needed by a logon mechanism, such as a secure token, Distinguished Name, or a domain/realm name. LOGDATA values are specific to each logon mechanism.
- **Host and Port**: Enter the fully qualified hostname and port number (default port for Teradata is 1025) for your Teradata deployment in the Host and Port field.
- **Transaction Mode**: Specifies the transaction mode for the connection. Possible values are DEFAULT (the default), ANSI, or TERA.
- **Teradata Database Account**: Specifies an account string to override the default account string defined for the database user. Accounts are used by the database for workload management and resource usage monitoring.
- **Connection Options** and **Connection Arguments**: additional connection parameters. For more information, see the teradatasql [docs](https://pypi.org/project/teradatasql/); the short sketch below shows how these fields map onto the driver.
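If you want to sanity-check these values before creating the service, you can open a session directly with the `teradatasql` driver referenced above. This is a minimal sketch only: the host, user, password, and the `SELECT SESSION` probe are placeholder assumptions, not values taken from this guide.
```python
# Minimal connectivity check with the teradatasql driver; the host and credentials
# below are placeholders, and the keyword names mirror the fields described above.
import teradatasql

with teradatasql.connect(
    host="teradata.example.com",  # Host (the default Teradata port is 1025)
    user="openmetadata_user",     # Username
    password="secret",            # Password
    logmech="TD2",                # Logmech: TD2, JWT, LDAP, KRB5, or TDNEGO
    tmode="DEFAULT",              # Transaction Mode: DEFAULT, ANSI, or TERA
) as connection:
    with connection.cursor() as cursor:
        cursor.execute("SELECT SESSION")  # trivial query to confirm the session works
        print(cursor.fetchone())
```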
{% partial file="/v1.5/connectors/database/advanced-configuration.md" /%}
{% /extraContent %}
{% partial file="/v1.5/connectors/test-connection.md" /%}
{% partial file="/v1.5/connectors/database/configure-ingestion.md" /%}
{% partial file="/v1.5/connectors/ingestion-schedule-and-deploy.md" /%}
{% /stepsContainer %}
{% partial file="/v1.5/connectors/troubleshooting.md" /%}
{% partial file="/v1.5/connectors/database/related.md" /%}

View File

@ -0,0 +1,117 @@
---
title: Run the Teradata Connector Externally
slug: /connectors/database/teradata/yaml
---
{% connectorDetailsHeader
name="Teradata"
stage="BETA"
platform="OpenMetadata"
availableFeatures=["Metadata", "Data Profiler", "Data Quality"]
unavailableFeatures=["Query Usage", "Owners", "Tags", "Stored Procedures", "Lineage", "Column-level Lineage", "dbt"]
/ %}
In this section, we provide guides and references to use the Teradata connector.
Configure and schedule Teradata metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](#data-profiler)
- [Data Quality](#data-quality)
{% partial file="/v1.5/connectors/external-ingestion-deployment.md" /%}
## Requirements
### Python Requirements
{% partial file="/v1.5/connectors/python-requirements.md" /%}
To run the Teradata ingestion, you will need to install:
```bash
pip3 install "openmetadata-ingestion[teradata]"
```
## Metadata Ingestion
All connectors are defined as JSON Schemas.
[Here](https://github.com/open-metadata/OpenMetadata/blob/main/openmetadata-spec/src/main/resources/json/schema/entity/services/connections/database/teradataConnection.json)
you can find the structure to create a connection to Teradata.
In order to create and run a Metadata Ingestion workflow, we will follow
the steps to create a YAML configuration able to connect to the source,
process the Entities if needed, and reach the OpenMetadata server.
The workflow is modeled around the following
[JSON Schema](https://github.com/open-metadata/OpenMetadata/blob/main/openmetadata-spec/src/main/resources/json/schema/metadataIngestion/workflow.json)
### 1. Define the YAML Config
This is a sample config for Teradata:
{% codePreview %}
{% codeInfoContainer %}
#### Source Configuration - Service Connection
{% codeInfo srNumber=1 %}
**username**: Specify the User to connect to Teradata.
{% /codeInfo %}
{% codeInfo srNumber=2 %}
**password**: User password to connect to Teradata
{% /codeInfo %}
{% codeInfo srNumber=3 %}
**hostPort**: Enter the fully qualified hostname and port number for your Teradata deployment in the Host and Port field.
{% /codeInfo %}
{% /codeInfoContainer %}
{% codeBlock fileName="filename.yaml" %}
```yaml {% isCodeBlock=true %}
source:
  type: teradata
  serviceName: example_teradata
  serviceConnection:
    config:
      type: Teradata
```
```yaml {% srNumber=1 %}
      username: username
```
```yaml {% srNumber=2 %}
      password: <password>
```
```yaml {% srNumber=3 %}
      hostPort: teradata:1025
```
{% partial file="/v1.5/connectors/yaml/database/source-config.md" /%}
{% partial file="/v1.5/connectors/yaml/ingestion-sink.md" /%}
{% partial file="/v1.5/connectors/yaml/workflow-config.md" /%}
{% /codeBlock %}
{% /codePreview %}
{% partial file="/v1.5/connectors/yaml/ingestion-cli.md" /%}
{% partial file="/v1.5/connectors/yaml/data-profiler.md" variables={connector: "teradata"} /%}
{% partial file="/v1.5/connectors/yaml/data-quality.md" /%}
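If you would rather trigger this ingestion from your own Python process instead of the `metadata` CLI, a minimal sketch could look like the following. It assumes the YAML above has been saved as `teradata.yaml` (a hypothetical filename) and reuses the `MetadataWorkflow` runner from the `openmetadata-ingestion` package installed earlier.
```python
# Run the Teradata ingestion programmatically; assumes the YAML above is saved
# as teradata.yaml next to this script.
import yaml

from metadata.workflow.metadata import MetadataWorkflow
from metadata.workflow.workflow_output_handler import print_status

with open("teradata.yaml") as config_file:
    workflow_config = yaml.safe_load(config_file)

workflow = MetadataWorkflow.create(workflow_config)  # build the workflow from the config
workflow.execute()                                   # run the metadata ingestion
workflow.raise_from_status()                         # raise if any step failed
print_status(workflow)                               # print a summary of the run
workflow.stop()
```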

View File

@ -17,8 +17,8 @@ Configure and schedule Trino metadata and profiler workflows from the OpenMetada
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/trino/yaml"} /%}
@ -33,7 +33,7 @@ Access to resources will be based on the user access permission to access specif
### Profiler & Data Quality
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
## Metadata Ingestion
{% partial

View File

@ -18,7 +18,7 @@ Configure and schedule Unity Catalog metadata workflow from the OpenMetadata UI:
- [Metadata Ingestion](#metadata-ingestion)
- [Query Usage](/connectors/ingestion/workflows/usage)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [Lineage](/connectors/ingestion/lineage)
- [dbt Integration](/connectors/ingestion/workflows/dbt)

View File

@ -18,8 +18,8 @@ Configure and schedule Vertica metadata and profiler workflows from the OpenMeta
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/vertica/yaml"} /%}

View File

@ -117,6 +117,6 @@ alt="Run Great Expectations checkpoint"
/%}
### List of Great Expectations Supported Tests
We currently only support a certain number of Great Expectations tests. The full list can be found in the [Tests](/connectors/ingestion/workflows/data-quality/tests) section.
We currently only support a certain number of Great Expectations tests. The full list can be found in the [Tests](/how-to-guides/data-quality-observability/quality/tests) section.
If a test is not supported, there is no need to worry about the execution of your Great Expectations test. We will simply skip the tests that are not supported and continue the execution of your test suite.

View File

@ -30,14 +30,14 @@ Learn more about how to ingest metadata from dozens of connectors.
{%inlineCallout
bold="Metadata Profiler"
icon="cable"
href="/connectors/ingestion/workflows/profiler"%}
href="/how-to-guides/data-quality-observability/profiler/workflow"%}
To get metrics from your Tables.
{%/inlineCallout%}
{%inlineCallout
bold="Metadata Data Quality Tests"
icon="cable"
href="/connectors/ingestion/workflows/data-quality"%}
href="/how-to-guides/data-quality-observability/quality"%}
To run automated Quality Tests on your Tables.
{%/inlineCallout%}

View File

@ -151,7 +151,7 @@ Refer to the code [here](https://github.com/open-metadata/OpenMetadata/blob/main
The fields for `Dbt Cloud Account Id`, `Dbt Cloud Project Id` and `Dbt Cloud Job Id` should be numeric values.
To learn how to get the values for the `Dbt Cloud Account Id`, `Dbt Cloud Project Id`, and `Dbt Cloud Job Id` fields, check [here](/connectors/ingestion/workflows/dbt/ingest-dbt-yaml).
To learn how to get the values for the `Dbt Cloud Account Id`, `Dbt Cloud Project Id`, and `Dbt Cloud Job Id` fields, check [here](/connectors/ingestion/workflows/dbt/run-dbt-workflow-externally).
{% /note %}

View File

@ -40,14 +40,14 @@ Configure dbt metadata
{%inlineCallout
icon="fit_screen"
bold="Data Profiler"
href="/connectors/ingestion/workflows/profiler"%}
href="/how-to-guides/data-quality-observability/profiler/workflow"%}
Compute metrics and ingest sample data.
{%/inlineCallout%}
{%inlineCallout
icon="fit_screen"
bold="Data Quality"
href="/connectors/ingestion/workflows/data-quality"%}
href="/how-to-guides/data-quality-observability/quality"%}
Monitor your data and avoid surprises.
{%/inlineCallout%}

View File

@ -304,6 +304,6 @@ processor:
- Bumped up ElasticSearch version for Docker and Kubernetes OpenMetadata Dependencies Helm Chart to `7.16.3`
### Data Quality Migration
With the 1.1.0 version we are migrating existing test cases defined in a test suite to the corresponding table. With this change, you might need to recreate the pipelines for the test suites, since the existing ones are removed from Test Suites due to this restructuring. More details about the new data quality can be found [here](/connectors/ingestion/workflows/data-quality).
With the 1.1.0 version we are migrating existing test cases defined in a test suite to the corresponding table. With this change, you might need to recreate the pipelines for the test suites, since the existing ones are removed from Test Suites due to this restructuring. More details about the new data quality can be found [here](/how-to-guides/data-quality-observability/quality).
As a user you will need to redeploy data quality workflows. You can go to `Quality > By Tables` to view the tables with test cases that need a workflow to be set up.

View File

@ -93,7 +93,7 @@ Then, you can prepare `Run Configurations` to execute the ingestion as you would
{% image src="/images/v1.5/developers/contribute/build-code-and-run-tests/pycharm-run-config.png" alt="PyCharm run config" caption=" " /%}
Note that in the example we are preparing a configuration to run and test Superset. In order to understand how to run
ingestions via the CLI, you can refer to each connector's [docs](/connectors/dashboard/superset/cli).
ingestions via the CLI, you can refer to each connector's [docs](/connectors/dashboard/superset/yaml).
The important part is that we are not running a script, but rather a `module`: `metadata`. Based on this, we can work as
we would usually do with the CLI for any ingestion, profiler, or test workflow.

View File

@ -147,7 +147,7 @@ OpenMetadata supports MySQL version `8.0.0` and up.
$$
### Profiler & Data Quality
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. The user should also be allowed to view information in `tables` for all objects in the database. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. The user should also be allowed to view information in `tables` for all objects in the database. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
You can find further information on the MySQL connector in the [docs](/connectors/database/mysql).

View File

@ -153,7 +153,7 @@ By connecting to a database service, you can ingest the databases, schemas, tabl
/%}
{% note %}
**Note:** Once you've run a metadata ingestion pipeline, you can create separate pipelines to bring in [**Usage**](/connectors/ingestion/workflows/usage), [**Lineage**](/connectors/ingestion/workflows/lineage), [**dbt**](/connectors/ingestion/workflows/dbt), or to run [**Profiler**](/connectors/ingestion/workflows/profiler). To add ingestion pipelines, select the required type of ingestion and enter the required details.
**Note:** Once you've run a metadata ingestion pipeline, you can create separate pipelines to bring in [**Usage**](/connectors/ingestion/workflows/usage), [**Lineage**](/connectors/ingestion/workflows/lineage), [**dbt**](/connectors/ingestion/workflows/dbt), or to run [**Profiler**](/how-to-guides/data-quality-observability/profiler/workflow). To add ingestion pipelines, select the required type of ingestion and enter the required details.
{% /note %}
{% image

View File

@ -31,7 +31,7 @@ alt="Column Data provides information"
caption="Column Data provides information"
/%}
You can read more about [Auto PII Tagging](/connectors/ingestion/auto_tagging) here.
You can read more about [Auto PII Tagging](/how-to-guides/data-quality-observability/profiler/auto-pii-tagging) here.
## Tag Mapping

View File

@ -0,0 +1,117 @@
---
title: Run Data Insights using Airflow SDK
slug: /how-to-guides/data-insights/airflow-sdk
---
# Run Data Insights using Airflow SDK
## 1. Define the YAML Config
This is a sample config for Data Insights:
```yaml
source:
  type: dataInsight
  serviceName: OpenMetadata
  sourceConfig:
    config:
      type: MetadataToElasticSearch
processor:
  type: data-insight-processor
  config: {}
sink:
  type: elasticsearch
  config:
    es_host: localhost
    es_port: 9200
    recreate_indexes: false
workflowConfig:
  loggerLevel: DEBUG
  openMetadataServerConfig:
    hostPort: '<OpenMetadata host and port>'
    authProvider: openmetadata
    securityConfig:
      jwtToken: '{bot_jwt_token}'
```
### Source Configuration - Source Config
- To send the metadata to OpenMetadata, it needs to be specified as `type: MetadataToElasticSearch`.
### Processor Configuration
- To send the metadata to OpenMetadata, it needs to be specified as `type: data-insight-processor`.
### Workflow Configuration
The main property here is the `openMetadataServerConfig`, where you can define the host and security provider of your OpenMetadata installation.
For a simple, local installation using our docker containers, this looks like:
```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: openmetadata
    securityConfig:
      jwtToken: '{bot_jwt_token}'
```
We support different security providers. You can find their definitions [here](https://github.com/open-metadata/OpenMetadata/tree/main/openmetadata-spec/src/main/resources/json/schema/security/client).
You can find the different implementations of the ingestion below.
## 2. Prepare the Data Insights DAG
Create a Python file in your Airflow DAGs directory with the following contents:
```python
import pathlib
import yaml
from datetime import timedelta
from airflow import DAG
from metadata.workflow.data_insight import DataInsightWorkflow
from metadata.workflow.workflow_output_handler import print_status

try:
    from airflow.operators.python import PythonOperator
except ModuleNotFoundError:
    from airflow.operators.python_operator import PythonOperator

from metadata.config.common import load_config_file
from airflow.utils.dates import days_ago

default_args = {
    "owner": "user_name",
    "email": ["username@org.com"],
    "email_on_failure": False,
    "retries": 3,
    "retry_delay": timedelta(minutes=5),
    "execution_timeout": timedelta(minutes=60)
}

config = """
<your YAML configuration>
"""

def metadata_ingestion_workflow():
    workflow_config = yaml.safe_load(config)
    workflow = DataInsightWorkflow.create(workflow_config)
    workflow.execute()
    workflow.raise_from_status()
    print_status(workflow)
    workflow.stop()

with DAG(
    "sample_data",
    default_args=default_args,
    description="An example DAG which runs a OpenMetadata ingestion workflow",
    start_date=days_ago(1),
    is_paused_upon_creation=False,
    schedule_interval='*/5 * * * *',
    catchup=False,
) as dag:
    ingest_task = PythonOperator(
        task_id="ingest_using_recipe",
        python_callable=metadata_ingestion_workflow,
    )
```

View File

@ -28,7 +28,7 @@ To have cost analysis data available you will need to execute the below workflow
2. **Profiler Workflow**:
- Purpose: Gather size information (in bytes) for data assets.
- Description: The Profiler Workflow is responsible for obtaining the size of data assets in bytes. This information is vital for generating the size-related data used in the Cost Analysis charts. It helps in assessing the resource consumption and cost implications of each asset.
- Click [here](/connectors/ingestion/workflows/profiler) for documentation on the profiler workflow.
- Click [here](/how-to-guides/data-quality-observability/profiler/workflow) for documentation on the profiler workflow.
3. **Data Insights Workflow**:
- Purpose: Aggregate information from Usage Workflow and Profiler Workflow.

View File

@ -0,0 +1,90 @@
---
title: Run Elasticsearch Reindex using Airflow SDK
slug: /how-to-guides/data-insights/elasticsearch-reindex
---
# Run Elasticsearch Reindex using Airflow SDK
## 1. Define the YAML Config
This is a sample config for Elasticsearch Reindex:
```yaml
source:
  type: metadata_elasticsearch
  serviceName: openMetadata
  serviceConnection:
    config:
      type: MetadataES
  sourceConfig:
    config: {}
sink:
  type: elasticsearch
  config:
    es_host: localhost
    es_port: 9200
    recreate_indexes: true
workflowConfig:
  openMetadataServerConfig:
    hostPort: http://localhost:8585/api
    authProvider: openmetadata
    securityConfig:
      jwtToken: "eyJraWQiOiJHYjM4OWEtOWY3Ni1nZGpzLWE5MmotMDI0MmJrOTQzNTYiLCJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJzdWIiOiJhZG1pbiIsImlzQm90IjpmYWxzZSwiaXNzIjoib3Blbi1tZXRhZGF0YS5vcmciLCJpYXQiOjE2NjM5Mzg0NjIsImVtYWlsIjoiYWRtaW5Ab3Blbm1ldGFkYXRhLm9yZyJ9.tS8um_5DKu7HgzGBzS1VTA5uUjKWOCU0B_j08WXBiEC0mr0zNREkqVfwFDD-d24HlNEbrqioLsBuFRiwIWKc1m_ZlVQbG7P36RUxhuv2vbSp80FKyNM-Tj93FDzq91jsyNmsQhyNv_fNr3TXfzzSPjHt8Go0FMMP66weoKMgW2PbXlhVKwEuXUHyakLLzewm9UMeQaEiRzhiTMU3UkLXcKbYEJJvfNFcLwSl9W8JCO_l0Yj3ud-qt_nQYEZwqW6u5nfdQllN133iikV4fM5QZsMCnm8Rq1mvLR0y9bmJiD7fwM1tmJ791TUWqmKaTnP49U493VanKpUAfzIiOiIbhg"
```
### 2. Prepare the Ingestion DAG
Create a Python file in your Airflow DAGs directory with the following contents:
```python
import pathlib
import yaml
from datetime import timedelta
from airflow import DAG

try:
    from airflow.operators.python import PythonOperator
except ModuleNotFoundError:
    from airflow.operators.python_operator import PythonOperator

from metadata.config.common import load_config_file
from metadata.workflow.metadata import MetadataWorkflow
from metadata.workflow.workflow_output_handler import print_status
from airflow.utils.dates import days_ago

default_args = {
    "owner": "user_name",
    "email": ["username@org.com"],
    "email_on_failure": False,
    "retries": 3,
    "retry_delay": timedelta(minutes=5),
    "execution_timeout": timedelta(minutes=60)
}

config = """
<your YAML configuration>
"""

def metadata_ingestion_workflow():
    workflow_config = yaml.safe_load(config)
    workflow = MetadataWorkflow.create(workflow_config)
    workflow.execute()
    workflow.raise_from_status()
    print_status(workflow)
    workflow.stop()

with DAG(
    "sample_data",
    default_args=default_args,
    description="An example DAG which runs a OpenMetadata ingestion workflow",
    start_date=days_ago(1),
    is_paused_upon_creation=False,
    schedule_interval='*/5 * * * *',
    catchup=False,
) as dag:
    ingest_task = PythonOperator(
        task_id="ingest_using_recipe",
        python_callable=metadata_ingestion_workflow,
    )
```

View File

@ -0,0 +1,69 @@
---
title: Run Data Insights using Metadata CLI
slug: /how-to-guides/data-insights/metadata-cli
---
# Run Data Insights using Metadata CLI
## 1. Define the YAML Config
This is a sample config for Data Insights:
```yaml
source:
  type: dataInsight
  serviceName: OpenMetadata
  sourceConfig:
    config:
      type: MetadataToElasticSearch
processor:
  type: data-insight-processor
  config: {}
sink:
  type: elasticsearch
  config:
    es_host: localhost
    es_port: 9200
    recreate_indexes: false
workflowConfig:
  loggerLevel: DEBUG
  openMetadataServerConfig:
    hostPort: '<OpenMetadata host and port>'
    authProvider: openmetadata
    securityConfig:
      jwtToken: '{bot_jwt_token}'
```
### Source Configuration - Source Config
- To send the metadata to OpenMetadata, it needs to be specified as `type: MetadataToElasticSearch`.
### Processor Configuration
- To send the metadata to OpenMetadata, it needs to be specified as `type: data-insight-processor`.
### Workflow Configuration
The main property here is the `openMetadataServerConfig`, where you can define the host and security provider of your OpenMetadata installation.
For a simple, local installation using our docker containers, this looks like:
```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: openmetadata
    securityConfig:
      jwtToken: '{bot_jwt_token}'
```
We support different security providers. You can find their definitions [here](https://github.com/open-metadata/OpenMetadata/tree/main/openmetadata-spec/src/main/resources/json/schema/security/client).
You can find the different implementations of the ingestion below.
## 2. Run with the CLI
First, we will need to save the YAML file. Afterward, and with all requirements installed, we can run:
```bash
metadata insight -c <path-to-yaml>
```

View File

@ -72,7 +72,7 @@ After clicking Next, you will be redirected to the Scheduling form. This will be
## dbt Ingestion
We can also generate lineage through [dbt ingestion](/connectors/ingestion/workflows/dbt/ingest-dbt-ui). The dbt workflow can fetch queries that carry lineage information. For a dbt ingestion pipeline, the path to the Catalog and Manifest files must be specified. We also fetch the column level lineage through dbt.
We can also generate lineage through [dbt ingestion](/connectors/ingestion/workflows/dbt/configure-dbt-workflow-from-ui). The dbt workflow can fetch queries that carry lineage information. For a dbt ingestion pipeline, the path to the Catalog and Manifest files must be specified. We also fetch the column level lineage through dbt.
You can learn more about [lineage ingestion here](/connectors/ingestion/lineage).

View File

@ -66,7 +66,7 @@ alt="Column Data provides information"
caption="Column Data provides information"
/%}
You can read more about [Auto PII Tagging](/connectors/ingestion/auto_tagging) here.
You can read more about [Auto PII Tagging](/how-to-guides/data-quality-observability/profiler/auto-pii-tagging) here.
{%inlineCallout
color="violet-70"

View File

@ -380,6 +380,10 @@ site_menu:
url: /connectors/database/synapse/yaml
- category: Connectors / Database / Synapse / Troubleshooting
url: /connectors/database/synapse/troubleshooting
- category: Connectors / Database / Teradata
url: /connectors/database/teradata
- category: Connectors / Database / Teradata / Run Externally
url: /connectors/database/teradata/yaml
- category: Connectors / Database / Trino
url: /connectors/database/trino
- category: Connectors / Database / Trino / Run Externally
@ -657,8 +661,6 @@ site_menu:
url: /connectors/ingestion/lineage/spark-lineage
- category: Connectors / Ingestion / Versioning
url: /connectors/ingestion/versioning
- category: Connectors / Ingestion / Auto Tagging
url: /connectors/ingestion/auto_tagging
- category: Connectors / Ingestion / Versioning / Change Feeds
url: /connectors/ingestion/versioning/change-feeds
- category: Connectors / Ingestion / Versioning / Change Events
@ -839,6 +841,12 @@ site_menu:
url: /how-to-guides/data-insights/ingestion
- category: How-to Guides / Data Insights / Key Performance Indicators (KPI)
url: /how-to-guides/data-insights/kpi
- category: How-to Guides / Data Insights / Run Data Insights using Airflow SDK
url: /how-to-guides/data-insights/airflow-sdk
- category: How-to Guides / Data Insights / Run Data Insights using Metadata CLI
url: /how-to-guides/data-insights/metadata-cli
- category: How-to Guides / Data Insights / Run Elasticsearch Reindex using Airflow SDK
url: /how-to-guides/data-insights/elasticsearch-reindex
- category: How-to Guides / Data Insights / Data Insights Report
url: /how-to-guides/data-insights/report
- category: How-to Guides / Data Insights / Cost Analysis

View File

@ -51,7 +51,7 @@ GRANT SELECT ON <schema_name>.* to <username>;
```
### Profiler & Data Quality
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
Executing the profiler workflow or data quality tests will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
### Usage & Lineage
For the usage and lineage workflow, the user will need `SELECT` privilege. You can find more information on the usage workflow [here](/connectors/ingestion/workflows/usage) and the lineage workflow [here](/connectors/ingestion/workflows/lineage).

View File

@ -48,7 +48,7 @@ GRANT SELECT ON <schema_name>.* to <username>;
```
### Profiler & Data Quality
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
### Usage & Lineage
For the usage and lineage workflow, the user will need `SELECT` privilege. You can find more information on the usage workflow [here](/connectors/ingestion/workflows/usage) and the lineage workflow [here](/connectors/ingestion/workflows/lineage).

View File

@ -23,8 +23,8 @@ Configure and schedule Databricks metadata and profiler workflows from the OpenM
- [Unity Catalog](#unity-catalog)
- [Metadata Ingestion](#metadata-ingestion)
- [Query Usage](/connectors/ingestion/workflows/usage)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [Lineage](/connectors/ingestion/lineage)
- [dbt Integration](/connectors/ingestion/workflows/dbt)

View File

@ -16,8 +16,8 @@ In this section, we provide guides and references to use the Datalake connector.
Configure and schedule Datalake metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/datalake/yaml"} /%}

View File

@ -33,8 +33,8 @@ Configure and schedule DB2 metadata and profiler workflows from the OpenMetadata
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/db2/yaml"} /%}
@ -65,7 +65,7 @@ GRANT SELECT ON SYSCAT.VIEWS TO USER_NAME;
### Profiler & Data Quality
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
## Metadata Ingestion
{% partial

View File

@ -48,7 +48,7 @@ GRANT SELECT ON SYSCAT.VIEWS TO USER_NAME;
```
### Profiler & Data Quality
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
### Python Requirements

View File

@ -104,7 +104,7 @@ If instead we use a local file path that contains the metastore information (e.g
To update the `Derby` information. More information about this in a great [SO thread](https://stackoverflow.com/questions/38377188/how-to-get-rid-of-derby-log-metastore-db-from-spark-shell).
- You can find all supported configurations [here](https://spark.apache.org/docs/latest/configuration.html)
- If you need further information regarding the Hive metastore, you can find it [here](https://spark.apache.org/docs/3.0.0-preview/sql-data-sources-hive-tables.html),
- If you need further information regarding the Hive metastore, you can find it [here](https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html),
and in The Internals of Spark SQL [book](https://jaceklaskowski.gitbooks.io/mastering-spark-sql/content/spark-sql-hive-metastore.html).
**Metastore Database**

View File

@ -109,7 +109,7 @@ To update the `Derby` information. More information about this in a great [SO th
- You can find all supported configurations [here](https://spark.apache.org/docs/latest/configuration.html)
- If you need further information regarding the Hive metastore, you can find
it [here](https://spark.apache.org/docs/3.0.0-preview/sql-data-sources-hive-tables.html), and in The Internals of
it [here](https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html), and in The Internals of
Spark SQL [book](https://jaceklaskowski.gitbooks.io/mastering-spark-sql/content/spark-sql-hive-metastore.html).

View File

@ -17,7 +17,7 @@ Configure and schedule DomoDatabase metadata and profiler workflows from the Ope
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/domo-database/yaml"} /%}

View File

@ -17,8 +17,8 @@ Configure and schedule Doris metadata and profiler workflows from the OpenMetada
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
- [Enable Security](#securing-doris-connection-with-ssl-in-openmetadata)

View File

@ -16,8 +16,8 @@ In this section, we provide guides and references to use the Druid connector.
Configure and schedule Druid metadata and profiler workflows from the OpenMetadata UI:
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/athena/yaml"} /%}

View File

@ -18,8 +18,8 @@ Configure and schedule Greenplum metadata and profiler workflows from the OpenMe
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Query Usage](/connectors/ingestion/workflows/usage)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [Lineage](/connectors/ingestion/lineage)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
- [Enable Security](#securing-greenplum-connection-with-ssl-in-openmetadata)

View File

@ -17,8 +17,8 @@ In this section, we provide guides and references to use the Hive connector.
Configure and schedule Hive metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
- [Enable Security](#securing-hive-connection-with-ssl-in-openmetadata)
@ -31,7 +31,7 @@ Configure and schedule Hive metadata and profiler workflows from the OpenMetadat
To extract metadata, the user used in the connection needs to be able to perform `SELECT`, `SHOW`, and `DESCRIBE` operations in the database/schema where the metadata needs to be extracted from.
### Profiler & Data Quality
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
## Metadata Ingestion

View File

@ -15,8 +15,8 @@ In this section, we provide guides and references to use the Impala connector.
Configure and schedule Impala metadata and profiler workflows from the OpenMetadata UI:
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
- [Enable Security](#securing-impala-connection-with-ssl-in-openmetadata)

View File

@ -17,8 +17,8 @@ Configure and schedule MariaDB metadata and profiler workflows from the OpenMeta
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/mariadb/yaml"} /%}
@ -43,7 +43,7 @@ GRANT SELECT ON world.hello TO '<username>';
```
### Profiler & Data Quality
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
## Metadata Ingestion

View File

@ -70,7 +70,7 @@ To fetch the metadata from MongoDB to OpenMetadata, the MongoDB user must have a
To deploy OpenMetadata, check the Deployment guides.
{%/inlineCallout%}
[Profiler deployment](/connectors/ingestion/workflows/profiler)
[Profiler deployment](/how-to-guides/data-quality-observability/profiler/workflow)
### Limitations

View File

@ -292,7 +292,7 @@ workflowConfig:
{% /codePreview %}
- You can learn more about how to configure and run the Profiler Workflow to extract Profiler data and execute the Data Quality from [here](/connectors/ingestion/workflows/profiler)
- You can learn more about how to configure and run the Profiler Workflow to extract Profiler data and execute the Data Quality from [here](/how-to-guides/data-quality-observability/profiler/workflow)

View File

@ -19,8 +19,8 @@ Configure and schedule MSSQL metadata and profiler workflows from the OpenMetada
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Query Usage](/connectors/ingestion/workflows/usage)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [Lineage](/connectors/ingestion/lineage)
- [dbt Integration](/connectors/ingestion/workflows/dbt)

View File

@ -17,8 +17,8 @@ Configure and schedule MySQL metadata and profiler workflows from the OpenMetada
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
- [Enable Security](#securing-mysql-connection-with-ssl-in-openmetadata)
@ -45,7 +45,7 @@ GRANT SELECT ON world.hello TO '<username>';
```
### Profiler & Data Quality
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
## Metadata Ingestion

View File

@ -17,8 +17,8 @@ Configure and schedule Oracle metadata and profiler workflows from the OpenMetad
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [Lineage](/connectors/ingestion/lineage)
- [dbt Integration](/connectors/ingestion/workflows/dbt)

View File

@ -16,8 +16,8 @@ In this section, we provide guides and references to use the PinotDB connector.
Configure and schedule PinotDB metadata and profiler workflows from the OpenMetadata UI:
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/pinotdb/yaml"} /%}

View File

@ -18,8 +18,8 @@ Configure and schedule PostgreSQL metadata and profiler workflows from the OpenM
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Query Usage](/connectors/ingestion/workflows/usage)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [Lineage](/connectors/ingestion/lineage)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
- [Enable Security](#securing-postgres-connection-with-ssl-in-openmetadata)

View File

@ -17,8 +17,8 @@ Configure and schedule Presto metadata and profiler workflows from the OpenMetad
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/presto/yaml"} /%}
@ -30,7 +30,7 @@ Configure and schedule Presto metadata and profiler workflows from the OpenMetad
To extract metadata, the user needs to be able to perform `SHOW CATALOGS`, `SHOW TABLES`, and `SHOW COLUMNS FROM` on the catalogs/tables you wish to extract metadata from and have `SELECT` permission on the `INFORMATION_SCHEMA`. Access to resources will be different based on the connector used. You can find more details in the Presto documentation website [here](https://prestodb.io/docs/current/connector.html). You can also get more information regarding system access control in Presto [here](https://prestodb.io/docs/current/security/built-in-system-access-control.html).
### Profiler & Data Quality
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
## Metadata Ingestion

View File

@ -19,8 +19,8 @@ Configure and schedule Redshift metadata and profiler workflows from the OpenMet
- [Metadata Ingestion](#metadata-ingestion)
- [Incremental Extraction](/connectors/ingestion/workflows/metadata/incremental-extraction/redshift)
- [Query Usage](/connectors/ingestion/workflows/usage)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [Lineage](/connectors/ingestion/lineage)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
- [Enable Security](#securing-redshift-connection-with-ssl-in-openmetadata)
@ -42,7 +42,7 @@ GRANT SELECT ON TABLE svv_table_info to test_user;
```
### Profiler & Data Quality
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
### Usage & Lineage
For the usage and lineage workflow, the user will need `SELECT` privilege on `STL_QUERY` table. You can find more information on the usage workflow [here](/connectors/ingestion/workflows/usage) and the lineage workflow [here](/connectors/ingestion/workflows/lineage).

View File

@ -18,8 +18,8 @@ Configure and schedule SAP Hana metadata and profiler workflows from the OpenMet
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/sap-hana/yaml"} /%}
@ -57,7 +57,7 @@ The same applies to the `_SYS_REPO` schema, required for lineage extraction.
### Profiler & Data Quality
Executing the profiler Workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. The user should also be allowed to view information in `tables` for all objects in the database. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
Executing the profiler Workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. The user should also be allowed to view information in `tables` for all objects in the database. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
## Metadata Ingestion

View File

@ -17,8 +17,8 @@ Configure and schedule Singlestore metadata and profiler workflows from the Open
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/singlestore/yaml"} /%}
@ -44,7 +44,7 @@ GRANT SELECT ON world.hello TO '<username>';
```
### Profiler & Data Quality
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
## Metadata Ingestion

View File

@ -20,8 +20,8 @@ Configure and schedule Snowflake metadata and profiler workflows from the OpenMe
- [Metadata Ingestion](#metadata-ingestion)
- [Incremental Extraction](/connectors/ingestion/workflows/metadata/incremental-extraction/snowflake)
- [Query Usage](/connectors/ingestion/workflows/usage)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [Lineage](/connectors/ingestion/lineage)
- [dbt Integration](/connectors/ingestion/workflows/dbt)

View File

@ -17,8 +17,8 @@ Configure and schedule Presto metadata and profiler workflows from the OpenMetad
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/sqlite/yaml"} /%}

View File

@ -0,0 +1,75 @@
---
title: Teradata
slug: /connectors/database/teradata
---
{% connectorDetailsHeader
name="Teradata"
stage="BETA"
platform="OpenMetadata"
availableFeatures=["Metadata", "Data Profiler"]
unavailableFeatures=["Query Usage", "Data Quality", "Owners", "Tags", "Stored Procedures", "Lineage", "Column-level Lineage", "dbt"]
/%}
In this section, we provide guides and references to use the Teradata connector.
Configure and schedule Teradata metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality/configure)
{% partial file="/v1.6/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/greenplum/yaml"} /%}
## Requirements
{%inlineCallout icon="description" bold="OpenMetadata 1.6 or later" href="/deployment"%}
To deploy OpenMetadata, check the Deployment guides.
{%/inlineCallout%}
The connector was tested on Teradata DBS version 17.20. Since there are no significant changes in metadata objects, it should also work with versions 15.x and 16.x.
## Metadata Ingestion
By default, all valid users in a Teradata DB have full access to metadata objects, so there are no specific requirements for user privileges.
{% partial
file="/v1.6/connectors/metadata-ingestion-ui.md"
variables={
connector: "Teradata",
selectServicePath: "/images/v1.6/connectors/teradata/select-service.png",
addNewServicePath: "/images/v1.6/connectors/teradata/add-new-service.png",
serviceConnectionPath: "/images/v1.6/connectors/teradata/service-connection.png",
}
/%}
{% stepsContainer %}
{% extraContent parentTagName="stepsContainer" %}
#### Connection Details
- **Username**: Specify the User to connect to Teradata.
- **Password**: Password to connect to Teradata
- **Logmech**: Specifies the logon authentication method. Possible values are TD2 (the default), JWT, LDAP, KRB5 for Kerberos, or TDNEGO.
- **LOGDATA**: Specifies additional data needed by a logon mechanism, such as a secure token, Distinguished Name, or a domain/realm name. LOGDATA values are specific to each logon mechanism.
- **Host and Port**: Enter the fully qualified hostname and port number (default port for Teradata is 1025) for your Teradata deployment in the Host and Port field.
- **Transaction Mode**: Specifies the transaction mode for the connection. Possible values are DEFAULT (the default), ANSI, or TERA.
- **Teradata Database Account**: Specifies an account string to override the default account string defined for the database user. Accounts are used by the database for workload management and resource usage monitoring.
- **Connection Options** and **Connection Arguments**: additional connection parameters. For more information, see the teradatasql [docs](https://pypi.org/project/teradatasql/). A quick connectivity check using these parameters is sketched below.
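If you want to validate these connection details before creating the service, a minimal connectivity check with the `teradatasql` driver might look like the sketch below. The host, user, and password are placeholders, and the parameter names (`logmech`, `tmode`, `dbs_port`) follow the teradatasql documentation linked above.
```python
# Minimal sketch: verify Teradata connectivity with the same parameters described above.
# All values are placeholders; adjust them to your environment.
import teradatasql

with teradatasql.connect(
    host="teradata.example.com",   # hypothetical Host
    user="openmetadata_user",      # hypothetical Username
    password="<password>",
    logmech="TD2",                 # default logon mechanism
    tmode="DEFAULT",               # transaction mode
    dbs_port="1025",               # default Teradata port
) as connection:
    with connection.cursor() as cursor:
        cursor.execute("SELECT SESSION")  # simple query returning the session number
        print(cursor.fetchone())
```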
{% partial file="/v1.6/connectors/database/advanced-configuration.md" /%}
{% /extraContent %}
{% partial file="/v1.6/connectors/test-connection.md" /%}
{% partial file="/v1.6/connectors/database/configure-ingestion.md" /%}
{% partial file="/v1.6/connectors/ingestion-schedule-and-deploy.md" /%}
{% /stepsContainer %}
{% partial file="/v1.6/connectors/troubleshooting.md" /%}
{% partial file="/v1.6/connectors/database/related.md" /%}

View File

@ -0,0 +1,117 @@
---
title: Run the Teradata Connector Externally
slug: /connectors/database/teradata/yaml
---
{% connectorDetailsHeader
name="Teradata"
stage="BETA"
platform="OpenMetadata"
availableFeatures=["Metadata", "Data Profiler", "Data Quality"]
unavailableFeatures=["Query Usage", "Owners", "Tags", "Stored Procedures", "Lineage", "Column-level Lineage", "dbt"]
/%}
In this section, we provide guides and references to use the Teradata connector.
Configure and schedule Teradata metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](#data-profiler)
- [Data Quality](#data-quality)
{% partial file="/v1.6/connectors/external-ingestion-deployment.md" /%}
## Requirements
### Python Requirements
{% partial file="/v1.6/connectors/python-requirements.md" /%}
To run the Teradata ingestion, you will need to install:
```bash
pip3 install "openmetadata-ingestion[teradata]"
```
## Metadata Ingestion
All connectors are defined as JSON Schemas.
[Here](https://github.com/open-metadata/OpenMetadata/blob/main/openmetadata-spec/src/main/resources/json/schema/entity/services/connections/database/teradataConnection.json)
you can find the structure to create a connection to Teradata.
In order to create and run a Metadata Ingestion workflow, we will follow
the steps to create a YAML configuration able to connect to the source,
process the Entities if needed, and reach the OpenMetadata server.
The workflow is modeled around the following
[JSON Schema](https://github.com/open-metadata/OpenMetadata/blob/main/openmetadata-spec/src/main/resources/json/schema/metadataIngestion/workflow.json)
### 1. Define the YAML Config
This is a sample config for Teradata:
{% codePreview %}
{% codeInfoContainer %}
#### Source Configuration - Service Connection
{% codeInfo srNumber=1 %}
**username**: Specify the User to connect to Teradata.
{% /codeInfo %}
{% codeInfo srNumber=2 %}
**password**: User password to connect to Teradata
{% /codeInfo %}
{% codeInfo srNumber=3 %}
**hostPort**: Enter the fully qualified hostname and port number for your Teradata deployment in the Host and Port field.
{% /codeInfo %}
{% /codeInfoContainer %}
{% codeBlock fileName="filename.yaml" %}
```yaml {% isCodeBlock=true %}
source:
  type: teradata
  serviceName: example_teradata
  serviceConnection:
    config:
      type: Teradata
```
```yaml {% srNumber=1 %}
      username: username
```
```yaml {% srNumber=2 %}
      password: <password>
```
```yaml {% srNumber=3 %}
      hostPort: teradata:1025
```
{% partial file="/v1.6/connectors/yaml/database/source-config.md" /%}
{% partial file="/v1.6/connectors/yaml/ingestion-sink.md" /%}
{% partial file="/v1.6/connectors/yaml/workflow-config.md" /%}
{% /codeBlock %}
{% /codePreview %}
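For reference, once the snippets above are combined with the `sourceConfig`, `sink`, and `workflowConfig` sections described in the partials below, a minimal assembled file could look like the following sketch. All values, including the JWT token, are placeholders.
```yaml
source:
  type: teradata
  serviceName: example_teradata
  serviceConnection:
    config:
      type: Teradata
      username: username
      password: <password>
      hostPort: teradata:1025
  sourceConfig:
    config:
      type: DatabaseMetadata
sink:
  type: metadata-rest
  config: {}
workflowConfig:
  openMetadataServerConfig:
    hostPort: http://localhost:8585/api
    authProvider: openmetadata
    securityConfig:
      jwtToken: <bot-jwt-token>
```
You can then run it with `metadata ingest -c <path-to-yaml>`, as covered in the CLI section below.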
{% partial file="/v1.6/connectors/yaml/ingestion-cli.md" /%}
{% partial file="/v1.6/connectors/yaml/data-profiler.md" variables={connector: "teradata"} /%}
{% partial file="/v1.6/connectors/yaml/data-quality.md" /%}

View File

@ -17,8 +17,8 @@ Configure and schedule Trino metadata and profiler workflows from the OpenMetada
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/trino/yaml"} /%}
@ -33,7 +33,7 @@ Access to resources will be based on the user access permission to access specif
### Profiler & Data Quality
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
Executing the profiler workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
## Metadata Ingestion
{% partial

View File

@ -18,7 +18,7 @@ Configure and schedule Unity Catalog metadata workflow from the OpenMetadata UI:
- [Metadata Ingestion](#metadata-ingestion)
- [Query Usage](/connectors/ingestion/workflows/usage)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [Lineage](/connectors/ingestion/lineage)
- [dbt Integration](/connectors/ingestion/workflows/dbt)

View File

@ -18,8 +18,8 @@ Configure and schedule Vertica metadata and profiler workflows from the OpenMeta
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
- [Data Profiler](/connectors/ingestion/workflows/profiler)
- [Data Quality](/connectors/ingestion/workflows/data-quality)
- [Data Profiler](/how-to-guides/data-quality-observability/profiler/workflow)
- [Data Quality](/how-to-guides/data-quality-observability/quality)
- [dbt Integration](/connectors/ingestion/workflows/dbt)
{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/database/vertica/yaml"} /%}

View File

@ -117,6 +117,6 @@ alt="Run Great Expectations checkpoint"
/%}
### List of Great Expectations Supported Test
We currently only support a certain number of Great Expectations tests. The full list can be found in the [Tests](/connectors/ingestion/workflows/data-quality/tests) section.
We currently only support a certain number of Great Expectations tests. The full list can be found in the [Tests](/how-to-guides/data-quality-observability/quality/tests) section.
If a test is not supported, there is no need to worry about the execution of your Great Expectations test. We will simply skip the tests that are not supported and continue the execution of your test suite.

View File

@ -30,14 +30,14 @@ Learn more about how to ingest metadata from dozens of connectors.
{%inlineCallout
bold="Metadata Profiler"
icon="cable"
href="/connectors/ingestion/workflows/profiler"%}
href="/how-to-guides/data-quality-observability/profiler/workflow"%}
To get metrics from your Tables.
{%/inlineCallout%}
{%inlineCallout
bold="Metadata Data Quality Tests"
icon="cable"
href="/connectors/ingestion/workflows/data-quality"%}
href="/how-to-guides/data-quality-observability/quality"%}
To run automated Quality Tests on your Tables.
{%/inlineCallout%}

View File

@ -151,7 +151,7 @@ Refer to the code [here](https://github.com/open-metadata/OpenMetadata/blob/main
The fields for `Dbt Cloud Account Id`, `Dbt Cloud Project Id` and `Dbt Cloud Job Id` should be numeric values (see the illustrative YAML fragment below).
To know how to get the values for `Dbt Cloud Account Id`, `Dbt Cloud Project Id` and `Dbt Cloud Job Id` fields check [here](/connectors/ingestion/workflows/dbt/ingest-dbt-yaml).
To know how to get the values for `Dbt Cloud Account Id`, `Dbt Cloud Project Id` and `Dbt Cloud Job Id` fields check [here](/connectors/ingestion/workflows/dbt/run-dbt-workflow-externally).
{% /note %}
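If you are defining the same source through YAML rather than the UI form, the equivalent fragment of the dbt configuration source might look roughly like the sketch below. The field names (`dbtCloudAuthToken`, `dbtCloudAccountId`, `dbtCloudProjectId`, `dbtCloudJobId`) are taken from the dbt YAML configuration, and all values are placeholders.
```yaml
# Illustrative fragment only; the account, project, and job IDs must be numeric values.
dbtConfigSource:
  dbtCloudAuthToken: "<dbt cloud auth token>"
  dbtCloudAccountId: "70403101"
  dbtCloudProjectId: "70403102"
  dbtCloudJobId: "70403103"
```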

View File

@ -40,14 +40,14 @@ Configure dbt metadata
{%inlineCallout
icon="fit_screen"
bold="Data Profiler"
href="/connectors/ingestion/workflows/profiler"%}
href="/how-to-guides/data-quality-observability/profiler/workflow"%}
Compute metrics and ingest sample data.
{%/inlineCallout%}
{%inlineCallout
icon="fit_screen"
bold="Data Quality"
href="/connectors/ingestion/workflows/data-quality"%}
href="/how-to-guides/data-quality-observability/quality"%}
Monitor your data and avoid surprises.
{%/inlineCallout%}

View File

@ -304,6 +304,6 @@ processor:
- Bumped up ElasticSearch version for Docker and Kubernetes OpenMetadata Dependencies Helm Chart to `7.16.3`
### Data Quality Migration
With version 1.1.0 we are migrating existing test cases defined in a test suite to the corresponding table. With this change you might need to recreate the pipelines for the test suites, since the existing ones are removed from Test Suites due to this restructuring - more details about the new data quality can be found [here](/connectors/ingestion/workflows/data-quality).
With version 1.1.0 we are migrating existing test cases defined in a test suite to the corresponding table. With this change you might need to recreate the pipelines for the test suites, since the existing ones are removed from Test Suites due to this restructuring - more details about the new data quality can be found [here](/how-to-guides/data-quality-observability/quality).
As a user you will need to redeploy data quality workflows. You can go to `Quality > By Tables` to view the tables with test cases that need a workflow to be set up.

View File

@ -93,7 +93,7 @@ Then, you can prepare `Run Configurations` to execute the ingestion as you would
{% image src="/images/v1.5/developers/contribute/build-code-and-run-tests/pycharm-run-config.png" alt="PyCharm run config" caption=" " /%}
Note that in the example we are preparing a configuration to run and test Superset. In order to understand how to run
ingestions via the CLI, you can refer to each connector's [docs](/connectors/dashboard/superset/cli).
ingestions via the CLI, you can refer to each connector's [docs](/connectors/dashboard/superset/yaml).
The important part is that we are not running a script, but rather a `module`: `metadata`. Based on this, we can work as
we would usually do with the CLI for any ingestion, profiler, or test workflow.
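For comparison, the equivalent invocation from a terminal, assuming a hypothetical local YAML for the Superset ingestion, would simply be:
```bash
metadata ingest -c /path/to/superset.yaml
```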

View File

@ -147,7 +147,7 @@ OpenMetadata supports MySQL version `8.0.0` and up.
$$
### Profiler & Data Quality
Executing the profiler Workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. The user should also be allowed to view information in `tables` for all objects in the database. More information on the profiler workflow setup can be found [here](/connectors/ingestion/workflows/profiler) and data quality tests [here](/connectors/ingestion/workflows/data-quality).
Executing the profiler Workflow or data quality tests, will require the user to have `SELECT` permission on the tables/schemas where the profiler/tests will be executed. The user should also be allowed to view information in `tables` for all objects in the database. More information on the profiler workflow setup can be found [here](/how-to-guides/data-quality-observability/profiler/workflow) and data quality tests [here](/how-to-guides/data-quality-observability/quality).
You can find further information on the MySQL connector in the [docs](/connectors/database/mysql).

View File

@ -153,7 +153,7 @@ By connecting to a database service, you can ingest the databases, schemas, tabl
/%}
{% note %}
**Note:** Once you've run a metadata ingestion pipeline, you can create separate pipelines to bring in [**Usage**](/connectors/ingestion/workflows/usage), [**Lineage**](/connectors/ingestion/workflows/lineage), [**dbt**](/connectors/ingestion/workflows/dbt), or to run [**Profiler**](/connectors/ingestion/workflows/profiler). To add ingestion pipelines, select the required type of ingestion and enter the required details.
**Note:** Once you've run a metadata ingestion pipeline, you can create separate pipelines to bring in [**Usage**](/connectors/ingestion/workflows/usage), [**Lineage**](/connectors/ingestion/workflows/lineage), [**dbt**](/connectors/ingestion/workflows/dbt), or to run [**Profiler**](/how-to-guides/data-quality-observability/profiler/workflow). To add ingestion pipelines, select the required type of ingestion and enter the required details.
{% /note %}
{% image

View File

@ -31,7 +31,7 @@ alt="Column Data provides information"
caption="Column Data provides information"
/%}
You can read more about [Auto PII Tagging](/connectors/ingestion/auto_tagging) here.
You can read more about [Auto PII Tagging](/how-to-guides/data-quality-observability/profiler/auto-pii-tagging) here.
## Tag Mapping

View File

@ -0,0 +1,117 @@
---
title: Run Data Insights using Airflow SDK
slug: /how-to-guides/data-insights/airflow-sdk
---
# Run Data Insights using Airflow SDK
## 1. Define the YAML Config
This is a sample config for Data Insights:
```yaml
source:
  type: dataInsight
  serviceName: OpenMetadata
  sourceConfig:
    config:
      type: MetadataToElasticSearch
processor:
  type: data-insight-processor
  config: {}
sink:
  type: elasticsearch
  config:
    es_host: localhost
    es_port: 9200
    recreate_indexes: false
workflowConfig:
  loggerLevel: DEBUG
  openMetadataServerConfig:
    hostPort: '<OpenMetadata host and port>'
    authProvider: openmetadata
    securityConfig:
      jwtToken: '{bot_jwt_token}'
```
### Source Configuration - Source Config
- To send the metadata to OpenMetadata, it needs to be specified as `type: MetadataToElasticSearch`.
### Processor Configuration
- To send the metadata to OpenMetadata, it needs to be specified as `type: data-insight-processor`.
### Workflow Configuration
The main property here is the `openMetadataServerConfig`, where you can define the host and security provider of your OpenMetadata installation.
For a simple, local installation using our docker containers, this looks like:
```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: openmetadata
    securityConfig:
      jwtToken: '{bot_jwt_token}'
```
We support different security providers. You can find their definitions [here](https://github.com/open-metadata/OpenMetadata/tree/main/openmetadata-spec/src/main/resources/json/schema/security/client).
You can find the different implementations of the ingestion below.
## 2. Prepare the Data Insights DAG
Create a Python file in your Airflow DAGs directory with the following contents:
```python
import pathlib
import yaml
from datetime import timedelta
from airflow import DAG
from metadata.workflow.data_insight import DataInsightWorkflow
from metadata.workflow.workflow_output_handler import print_status

try:
    from airflow.operators.python import PythonOperator
except ModuleNotFoundError:
    from airflow.operators.python_operator import PythonOperator

from metadata.config.common import load_config_file
from airflow.utils.dates import days_ago

default_args = {
    "owner": "user_name",
    "email": ["username@org.com"],
    "email_on_failure": False,
    "retries": 3,
    "retry_delay": timedelta(minutes=5),
    "execution_timeout": timedelta(minutes=60)
}

# Paste the YAML configuration from step 1 here
config = """
<your YAML configuration>
"""


def metadata_ingestion_workflow():
    # Load the YAML, build the Data Insights workflow, and fail the task on errors
    workflow_config = yaml.safe_load(config)
    workflow = DataInsightWorkflow.create(workflow_config)
    workflow.execute()
    workflow.raise_from_status()
    print_status(workflow)
    workflow.stop()


with DAG(
    "sample_data",
    default_args=default_args,
    description="An example DAG which runs an OpenMetadata ingestion workflow",
    start_date=days_ago(1),
    is_paused_upon_creation=False,
    schedule_interval='*/5 * * * *',
    catchup=False,
) as dag:
    ingest_task = PythonOperator(
        task_id="ingest_using_recipe",
        python_callable=metadata_ingestion_workflow,
    )
```

View File

@ -28,7 +28,7 @@ To have cost analysis data available you will need to execute the below workflow
2. **Profiler Workflow**:
- Purpose: Gather size information (in bytes) for data assets.
- Description: The Profiler Workflow is responsible for obtaining the size of data assets in bytes. This information is vital for generating the size-related data used in the Cost Analysis charts. It helps in assessing the resource consumption and cost implications of each asset.
- Click [here](/connectors/ingestion/workflows/profiler) for documentation on the profiler workflow.
- Click [here](/how-to-guides/data-quality-observability/profiler/workflow) for documentation on the profiler workflow.
3. **Data Insights Workflow**:
- Purpose: Aggregate information from Usage Workflow and Profiler Workflow.

View File

@ -0,0 +1,90 @@
---
title: Run Elasticsearch Reindex using Airflow SDK
slug: /how-to-guides/data-insights/elasticsearch-reindex
---
# Run Elasticsearch Reindex using Airflow SDK
## 1. Define the YAML Config
This is a sample config for Elasticsearch Reindex:
```yaml
source:
  type: metadata_elasticsearch
  serviceName: openMetadata
  serviceConnection:
    config:
      type: MetadataES
  sourceConfig:
    config: {}
sink:
  type: elasticsearch
  config:
    es_host: localhost
    es_port: 9200
    recreate_indexes: true
workflowConfig:
  openMetadataServerConfig:
    hostPort: http://localhost:8585/api
    authProvider: openmetadata
    securityConfig:
      jwtToken: "eyJraWQiOiJHYjM4OWEtOWY3Ni1nZGpzLWE5MmotMDI0MmJrOTQzNTYiLCJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJzdWIiOiJhZG1pbiIsImlzQm90IjpmYWxzZSwiaXNzIjoib3Blbi1tZXRhZGF0YS5vcmciLCJpYXQiOjE2NjM5Mzg0NjIsImVtYWlsIjoiYWRtaW5Ab3Blbm1ldGFkYXRhLm9yZyJ9.tS8um_5DKu7HgzGBzS1VTA5uUjKWOCU0B_j08WXBiEC0mr0zNREkqVfwFDD-d24HlNEbrqioLsBuFRiwIWKc1m_ZlVQbG7P36RUxhuv2vbSp80FKyNM-Tj93FDzq91jsyNmsQhyNv_fNr3TXfzzSPjHt8Go0FMMP66weoKMgW2PbXlhVKwEuXUHyakLLzewm9UMeQaEiRzhiTMU3UkLXcKbYEJJvfNFcLwSl9W8JCO_l0Yj3ud-qt_nQYEZwqW6u5nfdQllN133iikV4fM5QZsMCnm8Rq1mvLR0y9bmJiD7fwM1tmJ791TUWqmKaTnP49U493VanKpUAfzIiOiIbhg"
```
## 2. Prepare the Ingestion DAG
Create a Python file in your Airflow DAGs directory with the following contents:
```python
import pathlib
import yaml
from datetime import timedelta
from airflow import DAG

try:
    from airflow.operators.python import PythonOperator
except ModuleNotFoundError:
    from airflow.operators.python_operator import PythonOperator

from metadata.config.common import load_config_file
from metadata.workflow.metadata import MetadataWorkflow
from metadata.workflow.workflow_output_handler import print_status
from airflow.utils.dates import days_ago

default_args = {
    "owner": "user_name",
    "email": ["username@org.com"],
    "email_on_failure": False,
    "retries": 3,
    "retry_delay": timedelta(minutes=5),
    "execution_timeout": timedelta(minutes=60)
}

# Paste the YAML configuration from step 1 here
config = """
<your YAML configuration>
"""


def metadata_ingestion_workflow():
    # Load the YAML, build the reindexing workflow, and fail the task on errors
    workflow_config = yaml.safe_load(config)
    workflow = MetadataWorkflow.create(workflow_config)
    workflow.execute()
    workflow.raise_from_status()
    print_status(workflow)
    workflow.stop()


with DAG(
    "sample_data",
    default_args=default_args,
    description="An example DAG which runs an OpenMetadata ingestion workflow",
    start_date=days_ago(1),
    is_paused_upon_creation=False,
    schedule_interval='*/5 * * * *',
    catchup=False,
) as dag:
    ingest_task = PythonOperator(
        task_id="ingest_using_recipe",
        python_callable=metadata_ingestion_workflow,
    )
```

View File

@ -0,0 +1,69 @@
---
title: Run Data Insights using Metadata CLI
slug: /how-to-guides/data-insights/metadata-cli
---
# Run Data Insights using Metadata CLI
## 1. Define the YAML Config
This is a sample config for Data Insights:
```yaml
source:
  type: dataInsight
  serviceName: OpenMetadata
  sourceConfig:
    config:
      type: MetadataToElasticSearch
processor:
  type: data-insight-processor
  config: {}
sink:
  type: elasticsearch
  config:
    es_host: localhost
    es_port: 9200
    recreate_indexes: false
workflowConfig:
  loggerLevel: DEBUG
  openMetadataServerConfig:
    hostPort: '<OpenMetadata host and port>'
    authProvider: openmetadata
    securityConfig:
      jwtToken: '{bot_jwt_token}'
```
### Source Configuration - Source Config
- To send the metadata to OpenMetadata, it needs to be specified as `type: MetadataToElasticSearch`.
### Processor Configuration
- To send the metadata to OpenMetadata, it needs to be specified as `type: data-insight-processor`.
### Workflow Configuration
The main property here is the `openMetadataServerConfig`, where you can define the host and security provider of your OpenMetadata installation.
For a simple, local installation using our docker containers, this looks like:
```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: openmetadata
    securityConfig:
      jwtToken: '{bot_jwt_token}'
```
We support different security providers. You can find their definitions [here](https://github.com/open-metadata/OpenMetadata/tree/main/openmetadata-spec/src/main/resources/json/schema/security/client).
You can find the different implementations of the ingestion below.
## 2. Run with the CLI
First, we will need to save the YAML file. Afterward, and with all requirements installed, we can run:
```bash
metadata insight -c <path-to-yaml>
```

View File

@ -72,7 +72,7 @@ After clicking Next, you will be redirected to the Scheduling form. This will be
## dbt Ingestion
We can also generate lineage through [dbt ingestion](/connectors/ingestion/workflows/dbt/ingest-dbt-ui). The dbt workflow can fetch queries that carry lineage information. For a dbt ingestion pipeline, the path to the Catalog and Manifest files must be specified. We also fetch the column level lineage through dbt.
We can also generate lineage through [dbt ingestion](/connectors/ingestion/workflows/dbt/configure-dbt-workflow-from-ui). The dbt workflow can fetch queries that carry lineage information. For a dbt ingestion pipeline, the path to the Catalog and Manifest files must be specified. We also fetch the column level lineage through dbt.
You can learn more about [lineage ingestion here](/connectors/ingestion/lineage).

View File

@ -66,7 +66,7 @@ alt="Column Data provides information"
caption="Column Data provides information"
/%}
You can read more about [Auto PII Tagging](/connectors/ingestion/auto_tagging) here.
You can read more about [Auto PII Tagging](/how-to-guides/data-quality-observability/profiler/auto-pii-tagging) here.
{%inlineCallout
color="violet-70"

Some files were not shown because too many files have changed in this diff.