Docs: Auto classification YAML and Pointers Updation (#19739)

* Docs: Auto classification YAML and Pointers Updation

---------

Co-authored-by: Rounak Dhillon <rounakdhillon@Rounaks-MacBook-Air.local>
Rounak Dhillon 2025-02-11 14:33:37 +05:30 committed by GitHub
parent c0eb7d08de
commit fbd47e4ed8
4 changed files with 52 additions and 2 deletions


@ -151,4 +151,8 @@ After saving the YAML config, we will run the command the same way we did for th
metadata classify -c <path-to-yaml>
```
Note now instead of running `ingest`, we are using the `classify` command to select the Auto Classification workflow.
{% note %}
Instead of `ingest`, we now run the `classify` command, which selects the Auto Classification workflow.
{% /note %}


@ -151,4 +151,8 @@ After saving the YAML config, we will run the command the same way we did for th
metadata classify -c <path-to-yaml>
```
Note now instead of running `ingest`, we are using the `classify` command to select the Auto Classification workflow.
{% note %}
Instead of `ingest`, we now run the `classify` command, which selects the Auto Classification workflow.
{% /note %}


@ -42,6 +42,19 @@ The Auto Classification Workflow enables automatic tagging of sensitive informat
- When set to `true`, filtering patterns will be applied to the Fully Qualified Name of a table (e.g., `service_name.db_name.schema_name.table_name`).
- When set to `false`, filtering applies only to raw table names.
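The difference between the two settings can be sketched in a few lines of Python. This is only an illustration, assuming the filter patterns are regular expressions applied with `re.match`; the function `table_matches` and its `use_fqn_for_filtering` parameter are hypothetical names, not part of the OpenMetadata API.

```python
import re


def table_matches(pattern, service, database, schema, table, use_fqn_for_filtering):
    """Return True when the filter pattern matches the name chosen by the flag."""
    fqn = f"{service}.{database}.{schema}.{table}"
    # With the flag on, the pattern is tested against the full dotted name;
    # with it off, only the raw table name is tested.
    candidate = fqn if use_fqn_for_filtering else table
    return re.match(pattern, candidate) is not None


# A pattern that targets a schema only makes sense against the FQN.
schema_pattern = r".*\.sales\..*"
print(table_matches(schema_pattern, "snowflake_prod", "analytics", "sales", "orders", True))
print(table_matches(schema_pattern, "snowflake_prod", "analytics", "sales", "orders", False))
```

With the flag enabled the schema-level pattern matches `snowflake_prod.analytics.sales.orders`; with it disabled the same pattern is compared against `orders` alone and does not match.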
## Auto Classification Workflow Execution
To execute the **Auto Classification Workflow**, follow the steps below:
### 1. Install the Required Python Package
Ensure you have the correct OpenMetadata ingestion package installed, including the **PII Processor** module:
```bash
pip install "openmetadata-ingestion[pii-processor]"
```
### 2. Define and Execute the Python Workflow
Instead of running a YAML configuration through the CLI, use the `AutoClassificationWorkflow` class from OpenMetadata to trigger the workflow programmatically.
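A programmatic run might look roughly like the sketch below. Everything in it should be treated as an assumption to verify against the installed version: the import path `metadata.workflow.classification`, the `create` / `execute` / `raise_from_status` / `stop` call sequence, and the configuration keys are modelled on the usual ingestion-framework pattern rather than taken from this page. `build_config` and `run` are hypothetical helper names.

```python
def build_config(service_name, host_port, jwt_token):
    """Assemble a workflow configuration as a plain dict (keys are assumptions)."""
    return {
        "source": {
            "type": "snowflake",
            "serviceName": service_name,
            "sourceConfig": {
                "config": {"type": "AutoClassification", "confidence": 80}
            },
        },
        "processor": {"type": "orm-profiler", "config": {}},
        "sink": {"type": "metadata-rest", "config": {}},
        "workflowConfig": {
            "openMetadataServerConfig": {
                "hostPort": host_port,
                "authProvider": "openmetadata",
                "securityConfig": {"jwtToken": jwt_token},
            }
        },
    }


def run(config):
    """Create and execute the workflow; requires openmetadata-ingestion."""
    # Imported lazily so this module loads even when the package is absent.
    # The import path and method names are assumed, not confirmed by this page.
    from metadata.workflow.classification import AutoClassificationWorkflow

    workflow = AutoClassificationWorkflow.create(config)
    workflow.execute()
    workflow.raise_from_status()
    workflow.stop()


if __name__ == "__main__":
    # Placeholder values; replace with a real service name, host, and token.
    run(build_config("my_snowflake_service", "http://localhost:8585/api", "<jwt-token>"))
```

Keeping the configuration in a separate function makes it easy to inspect or unit-test without contacting an OpenMetadata server.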
## Sample Auto Classification Workflow YAML
```yaml
@ -103,6 +116,14 @@ workflowConfig:
jwtToken: "eyJraWQiOiJHYjM4OWEtOWY3Ni1nZGpzLWE5MmotMDI0MmJrOTQzNTYiLCJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJzdWIiOiJhZG1pbiIsImlzQm90IjpmYWxzZSwiaXNzIjoib3Blbi1tZXRhZGF0YS5vcmciLCJpYXQiOjE2NjM5Mzg0NjIsImVtYWlsIjoiYWRtaW5Ab3Blbm1ldGFkYXRhLm9yZyJ9.tS8um_5DKu7HgzGBzS1VTA5uUjKWOCU0B_j08WXBiEC0mr0zNREkqVfwFDD-d24HlNEbrqioLsBuFRiwIWKc1m_ZlVQbG7P36RUxhuv2vbSp80FKyNM-Tj93FDzq91jsyNmsQhyNv_fNr3TXfzzSPjHt8Go0FMMP66weoKMgW2PbXlhVKwEuXUHyakLLzewm9UMeQaEiRzhiTMU3UkLXcKbYEJJvfNFcLwSl9W8JCO_l0Yj3ud-qt_nQYEZwqW6u5nfdQllN133iikV4fM5QZsMCnm8Rq1mvLR0y9bmJiD7fwM1tmJ791TUWqmKaTnP49U493VanKpUAfzIiOiIbhg"
```
### 3. Expected Outcome
- Automatically classifies and tags sensitive data based on predefined patterns and confidence levels.
- Improves metadata enrichment and enhances data governance practices.
- Provides visibility into sensitive data across databases.
This approach ensures that the Auto Classification Workflow is executed correctly using the appropriate OpenMetadata ingestion framework.
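To make "predefined patterns and confidence levels" concrete, here is a toy sketch of threshold-based tagging. It is not OpenMetadata's implementation: the `PATTERNS` table, its scores, the tag labels, and `classify_column` are all hypothetical and exist only to show how a confidence threshold decides whether a tag is applied.

```python
import re

# Hypothetical rules: tag label -> list of (regex, confidence score out of 100).
PATTERNS = {
    "PII.Sensitive": [(r"ssn|social_security", 95), (r"password|passwd", 90)],
    "PII.NonSensitive": [(r"email", 85), (r"phone", 70)],
}


def classify_column(column_name, confidence_threshold=80):
    """Return the best-scoring tag for a column, or None if below the threshold."""
    best_tag, best_score = None, 0
    for tag, rules in PATTERNS.items():
        for pattern, score in rules:
            # Keep the highest-scoring rule whose regex appears in the name.
            if re.search(pattern, column_name, re.IGNORECASE) and score > best_score:
                best_tag, best_score = tag, score
    # A tag is applied only when the match is confident enough.
    return best_tag if best_score >= confidence_threshold else None


for name in ["user_ssn", "customer_email", "phone_number", "order_id"]:
    print(name, "->", classify_column(name))
```

Raising the threshold reduces false positives at the cost of leaving weaker matches, such as `phone_number` here, untagged.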
{% partial file="/v1.6/connectors/yaml/auto-classification.md" variables={connector: "snowflake"} /%}
## Workflow Execution
### To Execute the Auto Classification Workflow:


@ -42,6 +42,19 @@ The Auto Classification Workflow enables automatic tagging of sensitive informat
- When set to `true`, filtering patterns will be applied to the Fully Qualified Name of a table (e.g., `service_name.db_name.schema_name.table_name`).
- When set to `false`, filtering applies only to raw table names.
## Auto Classification Workflow Execution
To execute the **Auto Classification Workflow**, follow the steps below:
### 1. Install the Required Python Package
Ensure you have the correct OpenMetadata ingestion package installed, including the **PII Processor** module:
```bash
pip install "openmetadata-ingestion[pii-processor]"
```
### 2. Define and Execute the Python Workflow
Instead of running a YAML configuration through the CLI, use the `AutoClassificationWorkflow` class from OpenMetadata to trigger the workflow programmatically.
## Sample Auto Classification Workflow YAML
```yaml
@ -103,6 +116,14 @@ workflowConfig:
jwtToken: "eyJraWQiOiJHYjM4OWEtOWY3Ni1nZGpzLWE5MmotMDI0MmJrOTQzNTYiLCJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJzdWIiOiJhZG1pbiIsImlzQm90IjpmYWxzZSwiaXNzIjoib3Blbi1tZXRhZGF0YS5vcmciLCJpYXQiOjE2NjM5Mzg0NjIsImVtYWlsIjoiYWRtaW5Ab3Blbm1ldGFkYXRhLm9yZyJ9.tS8um_5DKu7HgzGBzS1VTA5uUjKWOCU0B_j08WXBiEC0mr0zNREkqVfwFDD-d24HlNEbrqioLsBuFRiwIWKc1m_ZlVQbG7P36RUxhuv2vbSp80FKyNM-Tj93FDzq91jsyNmsQhyNv_fNr3TXfzzSPjHt8Go0FMMP66weoKMgW2PbXlhVKwEuXUHyakLLzewm9UMeQaEiRzhiTMU3UkLXcKbYEJJvfNFcLwSl9W8JCO_l0Yj3ud-qt_nQYEZwqW6u5nfdQllN133iikV4fM5QZsMCnm8Rq1mvLR0y9bmJiD7fwM1tmJ791TUWqmKaTnP49U493VanKpUAfzIiOiIbhg"
```
### 3. Expected Outcome
- Automatically classifies and tags sensitive data based on predefined patterns and confidence levels.
- Improves metadata enrichment and enhances data governance practices.
- Provides visibility into sensitive data across databases.
This approach ensures that the Auto Classification Workflow is executed correctly using the appropriate OpenMetadata ingestion framework.
{% partial file="/v1.7/connectors/yaml/auto-classification.md" variables={connector: "snowflake"} /%}
## Workflow Execution
### To Execute the Auto Classification Workflow: