Data Governance Docs (#13231)

Shilpa Vernekar 2023-09-18 10:54:45 +05:30 committed by GitHub
parent cc47f5618f
commit 520f8c34ea
53 changed files with 880 additions and 0 deletions


@ -0,0 +1,56 @@
---
title: How to Add Assets to Glossary Terms
slug: /how-to-guides/openmetadata/data-governance/glossary-classification/assets
---
# How to Add Assets to Glossary Terms
After creating a glossary term, data assets can be associated with the term. In the **Glossary Term > Assets Tab** all the assets associated with the glossary term are displayed. These data assets are further subgrouped as Tables, Topics, Dashboards, etc.
{% image
src="/images/v1.1/how-to-guides/governance/term3.png"
alt="Assets Tab"
caption="Assets Tab"
/%}
You can add more assets by clicking on **Add > Assets**.
{% image
src="/images/v1.1/how-to-guides/governance/asset.png"
alt="Add Asset"
caption="Add Asset"
/%}
You can further search and filter assets by type. Simply select the relevant assets and click **Save**.
{% image
src="/images/v1.1/how-to-guides/governance/asset1.png"
alt="Assets Related to the Glossary Term"
caption="Assets Related to the Glossary Term"
/%}
The glossary term lists the Assets, which makes it easy to discover all the data assets related to the term.
## Glossary Terms and Tags
If **Tags** are associated with a **Glossary Term**, applying that glossary term to a data asset automatically applies the associated tags to that data asset as well. For example, the glossary term Account has a `PII.Sensitive` tag associated with it. When you add the Account term to a data asset, the `PII.Sensitive` tag is added along with it.
{% image
src="/images/v1.1/how-to-guides/governance/tag5.png"
alt="Glossary Term and Associated Tags"
caption="Glossary Term and Associated Tags"
/%}
{% image
src="/images/v1.1/how-to-guides/governance/tag6.png"
alt="Glossary Term and Tag gets Added to the Data Asset"
caption="Glossary Term and Tag gets Added to the Data Asset"
/%}
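
The propagation described above can be sketched in a few lines of Python. This is only an illustrative model, not OpenMetadata's implementation: the `GlossaryTerm` and `DataAsset` classes, the `apply_term` helper, and the `customers_table` asset are hypothetical names introduced for the example.

```python
from dataclasses import dataclass, field


@dataclass
class GlossaryTerm:
    # Term name, e.g. "Account" (from the example above).
    name: str
    # Classification tags attached to the term, e.g. "PII.Sensitive".
    tags: set[str] = field(default_factory=set)


@dataclass
class DataAsset:
    # Hypothetical asset model: a name plus the terms and tags applied to it.
    name: str
    glossary_terms: set[str] = field(default_factory=set)
    tags: set[str] = field(default_factory=set)


def apply_term(asset: DataAsset, term: GlossaryTerm) -> None:
    """Attach the term to the asset and copy the term's tags onto it."""
    asset.glossary_terms.add(term.name)
    asset.tags |= term.tags


account = GlossaryTerm("Account", {"PII.Sensitive"})
customers = DataAsset("customers_table")
apply_term(customers, account)
print(sorted(customers.tags))
```

After `apply_term` runs, the asset carries both the Account term and the `PII.Sensitive` tag, which mirrors what the two screenshots show.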
{%inlineCallout
color="violet-70"
bold="How to Classify Data Assets"
icon="MdArrowForward"
href="/how-to-guides/openmetadata/data-governance/glossary-classification/classify-assets"%}
Add tags to data assets, or request them and discuss them, all within OpenMetadata.
{%/inlineCallout%}


@ -0,0 +1,78 @@
---
title: Best Practices for Glossary and Classification
slug: /how-to-guides/openmetadata/data-governance/glossary-classification/best-practices
---
# Best Practices for Glossary and Classification
A controlled vocabulary is an organized arrangement of words and phrases to define terminology to organize and retrieve information. **Glossary** and **Classification** are both controlled vocabulary.
Here are the **Top 8 Best Practices** around Terminologies:
## 1. Use Hierarchical Relationships
A hierarchical structure helps in grouping similar concepts and helps in better understanding. Instead of using a flat list of glossary terms, add a hierarchical (Parent-Child) relationship. This provides more context to a glossary term. The additional context helps in classification and policy enforcement.
When using hierarchy, it is better to limit the hierarchy to three levels.
{% image
src="/images/v1.1/how-to-guides/governance/glossary7.png"
alt="Phone Number in the Context of a User and Business"
caption="Phone Number in the Context of a User and Business"
/%}
In a flat list, the term Phone Number lacks context and it would be difficult to ascertain the sensitivity of data. A User Phone Number is PII-Sensitive, whereas a Business Phone Number is not PII-Sensitive. This can be best represented with hierarchical relationships and by grouping concepts.
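
As a rough aid for reviewing an existing glossary, the three-level guideline can be checked with a small script. This is a sketch under assumptions: each term is represented as a tuple of names from the top-level term downwards (the glossary itself is not counted as a level), and the `too_deep` helper and the sample paths are hypothetical.

```python
MAX_LEVELS = 3  # recommended limit from the guidance above


def too_deep(paths, max_levels=MAX_LEVELS):
    """Return the term paths that nest deeper than max_levels."""
    return [path for path in paths if len(path) > max_levels]


terms = [
    ("User", "Phone Number"),
    ("Business", "Phone Number"),
    ("User", "Contact", "Mobile", "Phone Number"),  # four levels
]

for path in too_deep(terms):
    print("Consider flattening:", " > ".join(path))
```

Note that the two `Phone Number` entries stay distinct because the parent is part of the path, which is exactly the context a flat list loses.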
## 2. Add Classification Tags to Glossary Terms
Classification tags can be added to a glossary term. This helps to define both the semantic meaning and type of data in a single step. Instead of adding classification tags manually, a glossary term can be added to define the **meaning** of the data, and classification tags like PII-sensitive can be added to the term to define the **type** of data. This helps to auto-assign PII tags.
Organizations have data producers who create tables and build data models. Team members who understand regulatory compliance requirements are good at classifying data. Those who understand both the data and the regulatory requirements can help organizations scale by adding glossary terms along with the classification and tags.
{% image
src="/images/v1.1/how-to-guides/governance/glossary4.png"
alt="Add Classification Tags to Glossary Terms"
caption="Add Classification Tags to Glossary Terms"
/%}
## 3. Make Use of Tier Classification
Tiering helps define the importance of data to an organization. By focusing on Tier 1 data, organizations can create the highest impact. Identifying Tier 5 can help declutter the existing data. Learn more about [Tiers](/how-to-guides/openmetadata/data-governance/glossary-classification/tiers).
## 4. Use Classifications to Simplify Policies
Along with ownership and team membership, tags are a powerful way to group data assets. A single policy can be created at the Resource level instead of managing multiple policies for various resources.
Resources can be grouped using classification tags like sensitive data, restrictive data, external data, raw data, public data, internal data, etc. Further, Policies can be created based on Tags to simplify data governance.
Instead of creating policies for separate tables with sensitive data, the Sensitive tag can be attached to various data assets; and a policy can be created to match based on the Sensitive tag, which will take care of all the resources marked accordingly.
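
To make the idea concrete, here is a minimal sketch of tag-based matching in Python. It does not use OpenMetadata's policy engine or rule syntax; the `TagPolicy` class, the `governed_assets` helper, and the sample asset names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TagPolicy:
    # Hypothetical policy shape: one tag to match and one effect.
    name: str
    match_tag: str
    effect: str  # "allow" or "deny"


def governed_assets(policy, asset_tags):
    """Return the names of assets whose tags include the policy's tag."""
    return sorted(
        name for name, tags in asset_tags.items() if policy.match_tag in tags
    )


asset_tags = {
    "orders": {"Sensitive", "Tier.Tier1"},
    "web_logs": {"Raw"},
    "customers": {"Sensitive"},
}

policy = TagPolicy("deny-sensitive-read", "Sensitive", "deny")
print(governed_assets(policy, asset_tags))
```

One policy covers every asset carrying the tag, so tagging a new table as Sensitive brings it under the policy without editing the policy itself.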
## 5. Use Display Name to Improve Names
When classifications and glossaries are inherited from source systems, the names may not communicate the concept well; for example, `dep-prod` instead of Product Department. Users are more likely to search using common terms like Product or Department, so adding a display name helps in better discovery.
{% image
src="/images/v1.1/how-to-guides/governance/glossary5.png"
alt="Add Display Names for Better Discovery"
caption="Add Display Names for Better Discovery"
/%}
In cases where abbreviations or acronyms are used, a better display name helps in data discovery. For example, `c_id` can be changed to `Customer ID`, and `CAC` can be changed to `Customer Acquisition Cost`.
## 6. Use Glossary Import Export
For glossary bulk edits to update descriptions, ownership, reviewer, and status, export the Glossary, make the edits in a CSV file, and import it. Learn more about [Glossary Bulk Import](/how-to-guides/openmetadata/data-governance/glossary-classification/import-glossary).
## 7. Don't Delete Classification & Glossary Terms; Rename Them
When glossary terms or classifications have typos, users tend to delete the term. All the effort spent in tagging the data assets is lost when terms are deleted. OpenMetadata supports renaming Glossary and Classification terms, so simply rename the terms instead.
## 8. Group Similar Concepts Together
When adding terms, building a semantic relationship helps to understand data through concepts. For example, grouping related terms helps in understanding the various terms and their overall relationship.
{% image
src="/images/v1.1/how-to-guides/governance/glossary6.png"
alt="Group Similar Concepts Together"
caption="Group Similar Concepts Together"
/%}


@ -0,0 +1,71 @@
---
title: What is Classification
slug: /how-to-guides/openmetadata/data-governance/glossary-classification/classification
---
# What is Classification
**Classification** is a tag or annotation that categorizes or classifies a data asset. Classification does not define the semantics or meaning of data, but it helps define the type of data. For example, data can be:
- Sensitive or Non-sensitive,
- PII or Non-PII in terms of privacy,
- Verified or Unverified in terms of readiness for data consumption.
Classification is used for policy enforcement purposes. Classification helps in browsing, searching, grouping, and managing data. It also helps in Security, Data Privacy, and Data Protection use cases. All of this is done by defining Policies, like Access Control policies, Retention policies, and Data Management policies.
For Classification in OpenMetadata, we use a flat list of terms from knowledge organization systems. Classification groups together a set of similar terms called **Tags**, which can be accessed from **Govern > Classification**.
In the below example, PersonalData is a Classification and it further has Tags under it. `PersonalData` is also a **System** Classification. System classifications are an important part of OpenMetadata and therefore cannot be deleted. The descriptions for the System tags can be modified. They can also be disabled. `PII` and `Tiers` are the other important system classifications in OpenMetadata.
{% image
src="/images/v1.1/how-to-guides/governance/tag4.png"
alt="Classification: Groups together Tags"
caption="Classification: Groups together Tags"
/%}
## Classification and Categorization Tags
OpenMetadata supports both Classification and Categorization tags.
- **Classification tags** are **mutually exclusive**. A data asset can be in only one class in a hierarchy. Data can either be Public or Private, Sensitive or Non-sensitive. It cannot be both.
- **Categorization tags** are **not mutually exclusive**. A data asset can belong to multiple categories. The same table can have Usage, Financial, Reporting and Compliance tags.
## Mutually Exclusive Tags
There are cases where only one tag from a particular classification is relevant for a data asset. For example, an asset can either be PII Sensitive or PII Non-Sensitive. It cannot be both. For such cases, a Classification can be created where the tags are mutually exclusive. If this configuration is enabled, you won't be able to assign multiple tags from the same Classification to the same data asset.
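
The rule can be expressed as a small validation check. The sketch below is an assumption-laden illustration rather than OpenMetadata's own validation: `can_assign` is a hypothetical helper, tags are written as `classification.tag` strings, and `PII.NonSensitive` is an illustrative tag name.

```python
def can_assign(new_tag, existing_tags, exclusive_classifications):
    """Return True if new_tag may be added alongside existing_tags.

    Tags are "classification.tag" strings. For a classification marked
    mutually exclusive, only one of its tags may sit on an asset.
    """
    classification = new_tag.split(".", 1)[0]
    if classification not in exclusive_classifications:
        return True
    return not any(
        tag.split(".", 1)[0] == classification and tag != new_tag
        for tag in existing_tags
    )


existing = {"PII.Sensitive"}
exclusive = {"PII"}

print(can_assign("PII.NonSensitive", existing, exclusive))
print(can_assign("PersonalData.Personal", existing, exclusive))
```

The first call is rejected because the asset already holds a tag from the exclusive `PII` classification; the second is accepted because `PersonalData` is not marked exclusive in this example.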
{% note %}
**Pro Tip:** The Global Search in OpenMetadata also helps discover related Glossary Terms and Tags.
{% image
src="/images/v1.1/how-to-guides/governance/tag1.png"
alt="Search for Glossary Terms and Tags"
caption="Search for Glossary Terms and Tags"
/%}
{% /note %}
## How Classification Helps?
- You can discover the data assets in the Tags page.
- You can also search for data assets and filter them by tags.
- Tags can be used for authoring Policies.
## Classification APIs
OpenMetadata has extensive classification APIs to automate tagging. These APIs support two kinds of entities: Classification and Tags. These entities are identified by a unique ID. Tags have a fully qualified name in the form of `classification.tagTerm`.
Refer to the **[API Documentation on Classification](https://sandbox.open-metadata.org/docs#tag/Classifications)**.
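
When working with tag names in scripts, it can help to split the fully qualified name back into its two parts. The following sketch assumes simple names without embedded dots; `split_tag_fqn` and `group_by_classification` are hypothetical helpers and are not part of any OpenMetadata client library.

```python
from collections import defaultdict


def split_tag_fqn(fqn):
    """Split "classification.tagTerm" into (classification, tag)."""
    classification, sep, tag = fqn.partition(".")
    if not sep or not classification or not tag:
        raise ValueError(f"expected 'classification.tagTerm', got {fqn!r}")
    return classification, tag


def group_by_classification(fqns):
    """Group tag names under their classification, keeping input order."""
    groups = defaultdict(list)
    for fqn in fqns:
        classification, tag = split_tag_fqn(fqn)
        groups[classification].append(tag)
    return dict(groups)


print(group_by_classification(["PII.Sensitive", "PersonalData.Personal", "PII.None"]))
```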
{%inlineCallout
color="violet-70"
bold="What are Tiers"
icon="MdArrowForward"
href="/how-to-guides/openmetadata/data-governance/glossary-classification/tiers"%}
Tiers help define the importance of data to an organization.
{%/inlineCallout%}
{%inlineCallout
color="violet-70"
bold="How to Classify Data Assets"
icon="MdArrowForward"
href="/how-to-guides/openmetadata/data-governance/glossary-classification/classify-assets"%}
Add tags to data assets, or request them and discuss them, all within OpenMetadata.
{%/inlineCallout%}


@ -0,0 +1,145 @@
---
title: How to Classify Data Assets
slug: /how-to-guides/openmetadata/data-governance/glossary-classification/classify-assets
---
# How to Classify Data Assets
## How to Add Classification Tags
- From the Explore page, select a data asset and click on the edit icon or + Add for Tags.
- Search for the relevant tags. You can either type and search, or scroll to select from the options provided.
- Click on the checkmark to save the changes.
{% image
src="/images/v1.1/how-to-guides/governance/tag7.png"
alt="Add Tags to Classify Data Assets"
caption="Add Tags to Classify Data Assets"
/%}
The tagged data assets can be discovered right from the Classification page.
- Navigate to **Govern >> Classification**.
- The list of tags is displayed along with the details of Usage in various data assets.
- Click on the Usage number to view the tagged assets.
{% image
src="/images/v1.1/how-to-guides/governance/tag2.png"
alt="Usage: Number of Assets Tagged"
caption="Usage: Number of Assets Tagged"
/%}
{% image
src="/images/v1.1/how-to-guides/governance/tag3.png"
alt="Discover the Tagged Data Assets"
caption="Discover the Tagged Data Assets"
/%}
You can view all the tags in the right panel.
Data assets can also be classified using Tiers. Learn more about [Tiers](/how-to-guides/openmetadata/data-governance/glossary-classification/tiers).
Among the Classification Tags, OpenMetadata has some System Classification. Learn more about the [System Tags](/how-to-guides/openmetadata/data-governance/glossary-classification/classification).
## Task: Request to Update Tags
Apart from adding the tags directly, users can also request to update tags. This is typically done when the user wants another opinion on the tag being added, or if the user does not have access to add tags directly.
- Click on the **?** icon next to tags
{% image
src="/images/v1.1/how-to-guides/governance/tag8.png"
alt="Request to Update Tags"
caption="Request to Update Tags"
/%}
- A Task will be created with some pre-populated details. Fill in the other important information:
- **Title** - This is auto-populated
- **Assignees** - Multiple users can be added
- **Update Tags** - It displays 3 tabs.
- You can view the **Current** tags.
- You can add the **New** tags.
- It will display the **Difference** as well.
- Click on **Submit** to create the task.
{% image
src="/images/v1.1/how-to-guides/governance/task1.png"
alt="Add a Task: Request to Update Tags"
caption="Add a Task: Request to Update Tags"
/%}
Once a task has been created, it is displayed in the **Activity Feeds & Tasks** tab for that Data Asset. The assignees can either `Accept the Suggestion` or `Edit and Accept the Suggestion`. Assignees can also add a **Comment**. They can also add other users as **Assignees**.
{% image
src="/images/v1.1/how-to-guides/governance/task2.png"
alt="Task: Accept Suggestion and Comment"
caption="Task: Accept Suggestion and Comment"
/%}
## Conversations around Classification
Apart from requesting tags, users can also create a **Conversation** around the tags assigned to a data asset.
- Click on the **Conversation** icon next to the tag.
{% image
src="/images/v1.1/how-to-guides/governance/ct1.png"
alt="Conversations around Tags"
caption="Conversations around Tags"
/%}
- Start a conversation right within the data asset page. Add **@mention** to tag a user or team. Add a **#mention** to tag a data asset.
{% image
src="/images/v1.1/how-to-guides/governance/ct2.png"
alt="Start a Conversation"
caption="Start a Conversation"
/%}
- Further in the conversation, users can **Reply** to discuss further as well as add **Reactions**, **Edit**, or **Delete**.
{% image
src="/images/v1.1/how-to-guides/governance/ct3.png"
alt="Conversation: Reply, React, Edit or Delete"
caption="Conversation: Reply, React, Edit or Delete"
/%}
## Auto-Classification in OpenMetadata
OpenMetadata identifies PII data and auto tags or suggests the tags. The data profiler automatically tags the PII-Sensitive data. The addition of tags about PII data helps consumers and governance teams identify data that needs to be treated carefully.
In the example below, the columns user_name and social security number are auto-tagged as PII-sensitive. This works using NLP as part of the profiler during ingestion.
{% image
src="/images/v1.1/how-to-guides/governance/auto1.png"
alt="User_name and Social Security Number are Auto-Classified as PII Sensitive"
caption="User_name and Social Security Number are Auto-Classified as PII Sensitive"
/%}
In the below example, the column dwh_x10 is also auto-tagged as PII Sensitive, even though the column name does not provide much information.
{% image
src="/images/v1.1/how-to-guides/governance/auto2.png"
alt="Column Name does not provide much information"
caption="Column Name does not provide much information"
/%}
When we look at the content of the column dwh_x10 in the Sample Data tab, it becomes clear that the auto-classification is based on the data in the column.
{% image
src="/images/v1.1/how-to-guides/governance/auto3.png"
alt="Column Data provides information"
caption="Column Data provides information"
/%}
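
To illustrate why classifying on column content works even when the column name is uninformative, here is a deliberately simplified sketch. It is not the profiler's method: the text above says OpenMetadata uses NLP, whereas this stand-in uses a single regular expression for US Social Security number formatting. The `looks_sensitive` helper, the `threshold` parameter, and the sample values are assumptions.

```python
import re

# Simplified stand-in for content-based detection: values shaped like
# 123-45-6789. A real classifier would use far richer signals.
SSN_PATTERN = re.compile(r"^\d{3}-\d{2}-\d{4}$")


def looks_sensitive(sample_values, threshold=0.8):
    """Return True if most non-empty sample values match the pattern."""
    values = [value for value in sample_values if value]
    if not values:
        return False
    hits = sum(1 for value in values if SSN_PATTERN.match(value))
    return hits / len(values) >= threshold


# The column name (e.g. dwh_x10) plays no part in the decision.
print(looks_sensitive(["123-45-6789", "987-65-4321", "000-12-3456"]))
print(looks_sensitive(["blue", "green", "red"]))
```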
You can read more about [Auto PII Tagging](https://docs.open-metadata.org/v1.1.x/connectors/ingestion/auto_tagging) here.
## Tag Mapping
Tag mapping is supported in the backend and not in the OpenMetadata UI. When two related tags are associated with each other, applying one tag automatically applies the other. For example, when the tag `Personal Data.Personal` is applied, it automatically applies another tag `Data Classification.Confidential`. That way, applying the tag `Personal` automatically applies the tag `Confidential`.
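
Conceptually, tag mapping behaves like a lookup that expands the set of applied tags. The sketch below models that idea only; it does not show the backend's actual configuration or data model. `TAG_MAPPING` and `expand_tags` are hypothetical, and following chains of mappings is an assumption of this example.

```python
def expand_tags(applied, mapping):
    """Return the applied tags plus every tag they map to.

    Chains are followed (A -> B -> C), and the visited set prevents
    loops if two tags map to each other.
    """
    result = set()
    pending = list(applied)
    while pending:
        tag = pending.pop()
        if tag in result:
            continue
        result.add(tag)
        pending.extend(mapping.get(tag, ()))
    return result


# Mapping taken from the example in the text above.
TAG_MAPPING = {
    "Personal Data.Personal": ["Data Classification.Confidential"],
}

print(sorted(expand_tags({"Personal Data.Personal"}, TAG_MAPPING)))
```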
{%inlineCallout
color="violet-70"
bold="Best Practices for Glossary and Classification"
icon="MdArrowForward"
href="/how-to-guides/openmetadata/data-governance/glossary-classification/best-practices"%}
Here are the Top 8 Best Practices around Terminologies.
{%/inlineCallout%}


@ -0,0 +1,48 @@
---
title: How to Create Glossary Terms
slug: /how-to-guides/openmetadata/data-governance/glossary-classification/glossary-terms
---
# How to Create Glossary Terms
Once a glossary has been created, you can add multiple **Glossary Terms** and **Child Terms** in it.
- Once in the Glossary, click on **Add Term**.
{% image
src="/images/v1.1/how-to-guides/governance/glossary-term.png"
alt="Add Glossary Term"
caption="Add Glossary Term"
/%}
- Enter the required information:
- **Name*** - This contains the name of the glossary term, and is a required field.
- **Display Name** - This contains the Display name of the glossary term.
- **Description*** - A unique and clear definition to establish consistent usage and understanding of the term. This is a required field.
- **Tags** - Classification tags can be added to glossary terms. When adding a glossary term to assets, it will also add the associated tags to that asset. This helps to further describe and categorize the data assets.
- **Synonyms** - Other terms that are used for the same concept. For e.g., for a term Customer, the synonyms can be Client, Shopper, Purchaser.
- **Related Terms** - These terms can build a network of concepts to capture an associative relationship. For e.g., for a term Customer, the related terms can be Customer LTV (LifeTime Value), Customer Acquisition Cost (CAC).
- **Mutually Exclusive** - There are cases where only one term from a particular glossary is relevant for a data asset. For example, an asset can be either PII-Sensitive or PII-NonSensitive. It cannot be both. For such cases, a Glossary Term can be created where the child terms are mutually exclusive. If this configuration is enabled, you won't be able to assign multiple terms from the same Glossary Term to the same data asset.
- **References** - Add links from the internet from where you inherited the term.
- **Owner** - Either a Team or a User can be the Owner of a Glossary term.
- **Reviewers** - Multiple reviewers can be added.
Once a glossary term has been added, you can create **Child Terms** under it. The child terms help to build a conceptual hierarchy (Parent-Child relationship) to go from generic to specific concepts. For e.g., for a term Customer, the child terms can be Loyal Customer, New Customer, Online Customer.
Instead of creating a glossary manually, you can **[bulk upload glossary terms](/how-to-guides/openmetadata/data-governance/glossary-classification/import-glossary)** using a CSV file.
{%inlineCallout
color="violet-70"
bold="How to Bulk Import a Glossary"
icon="MdArrowForward"
href="/how-to-guides/openmetadata/data-governance/glossary-classification/import-glossary"%}
Save time and effort by bulk uploading glossary terms using a CSV file.
{%/inlineCallout%}
{%inlineCallout
color="violet-70"
bold="How to Add Assets to Glossary Terms"
icon="MdArrowForward"
href="/how-to-guides/openmetadata/data-governance/glossary-classification/assets"%}
Associate glossary terms with data assets to make data discovery easier.
{%/inlineCallout%}


@ -0,0 +1,120 @@
---
title: What is a Glossary
slug: /how-to-guides/openmetadata/data-governance/glossary-classification/glossary
---
# What is a Glossary
A Glossary is a Controlled Vocabulary to describe important concepts and terminologies within your organization to foster a common and consistent understanding of data. It defines concepts related to a specific domain. For example, Business Glossary or Bank Glossary. A well-defined business glossary helps foster team collaboration with the use of standard terms. Glossaries are important for data discovery, retrieval, and exploration through conceptual terms, and facilitate **Data Governance**.
Glossary adds semantics or meaning to data. OpenMetadata models a Glossary as a Thesaurus that organizes terms with **hierarchical**, equivalent, and associative relationships within a domain.
The Glossary in OpenMetadata can be accessed from **Govern >> Glossary**. All the Glossaries are displayed in the left nav bar. Clicking on a specific glossary will display the expanded view to show the entire hierarchy of the glossary terms (parent-child terms).
{% image
src="/images/v1.1/how-to-guides/governance/glossary0.png"
alt="Banking Glossary"
caption="Banking Glossary"
/%}
{% note %}
**Tip:** A well-defined and centralized glossary makes it easy to **onboard new team members** and help them get familiar with the **organizational terminology**.
{% /note %}
## Glossary Term
A Glossary Term is a preferred terminology for a concept. In a Glossary term, you can add tags, synonyms, related terms to build a conceptual semantic graph, and also add reference links.
The glossary term can include additional information as follows:
- **Description** - A unique and clear definition to establish consistent usage and understanding of the term. This is a mandatory requirement.
- **Tags** - Classification tags can be added to glossary terms. When adding a glossary term to assets, it will also add the associated tags to that asset. This helps to further describe and categorize the data assets.
- **Synonyms** - Other terms that are used for the same concept. For e.g., for a term Customer, the synonyms can be Client, Shopper, Purchaser.
- **Child Terms** - Child terms help to build a conceptual hierarchy (Parent-Child relationship) to go from generic to specific concepts. For e.g., for a term Customer, the child terms can be Loyal Customer, New Customer, Online Customer.
- **Related Terms** - These terms can build a network of concepts to capture an associative relationship. For e.g., for a term Customer, the related terms can be Customer LTV (LifeTime Value), Customer Acquisition Cost (CAC).
- **References** - Add links from the internet from where you inherited the term.
- **Mutually Exclusive** - There are cases where only one term from a particular glossary is relevant for a data asset. For example, an asset can be either PII-Sensitive or PII-NonSensitive. It cannot be both. For such cases, a Glossary or a Glossary Term can be created where the child terms are mutually exclusive. If this configuration is enabled, you won't be able to assign multiple terms from the same Glossary/Term to the same data asset.
- **Reviewers** - Multiple reviewers can be added.
- **Assets** - After creating a glossary term, data assets can be associated with the term.
{% image
src="/images/v1.1/how-to-guides/governance/glossary-term.png"
alt="Glossary Term Requirements"
caption="Glossary Term Requirements"
/%}
The details of a Glossary Term in OpenMetadata are displayed in three tabs: Overview, Glossary Terms, and Assets. The **Overview tab** displays the details of the term, along with the synonyms, related terms, references, and tags. It also displays the Owner and the Reviewers for the Glossary Term.
{% image
src="/images/v1.1/how-to-guides/governance/term1.png"
alt="Overview of a Glossary Term"
caption="Overview of a Glossary Term"
/%}
The **Glossary Term Tab** displays all the child terms associated with the parent term. You can also add more child terms from this tab.
{% image
src="/images/v1.1/how-to-guides/governance/term2.png"
alt="Glossary Terms Tab"
caption="Glossary Terms Tab"
/%}
{% note %}
**Tip:** Glossary terms help to organize as well as discover data assets.
{% /note %}
The **Assets Tab** displays all the assets that are associated with the glossary term. These data assets are further subgrouped as Tables, Topics, Dashboards. The right side panel shows a preview of the data assets selected.
{% image
src="/images/v1.1/how-to-guides/governance/term3.png"
alt="Assets Tab"
caption="Assets Tab"
/%}
You can add more assets by clicking on **Add > Assets**. You can further search and filter assets by type. Simply select the relevant assets and click Save. The glossary term lists the Assets, which makes it easy to discover all the data assets related to the term.
{% note %}
**Pro Tip:** The Global Search in OpenMetadata also helps discover related Glossary Terms and Tags.
{% image
src="/images/v1.1/how-to-guides/governance/tag1.png"
alt="Search for Glossary Terms and Tags"
caption="Search for Glossary Terms and Tags"
/%}
{% /note %}
## Glossary and Glossary Term Version History
The glossary as well as the terms maintain a version history, which can be viewed on the top right. Clicking on the number will display the details of the **Version History**.
{% image
src="/images/v1.1/how-to-guides/governance/version.png"
alt="Glossary Term Version History"
caption="Glossary Term Version History"
/%}
The Backward compatible changes result in a **Minor** version change. A change in the description, tags, or ownership will increase the version of the entity metadata by **0.1** (e.g., from 0.1 to 0.2).
The Backward incompatible changes result in a **Major** version change. For example, when a term is deleted, the version increases by **1.0** (e.g., from 0.2 to 1.2).
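
The version arithmetic described above can be reproduced with a short helper. This is a sketch of the stated rule only, not the server's code: `next_version` is a hypothetical function, and `Decimal` is used to avoid floating-point drift when adding 0.1.

```python
from decimal import Decimal


def next_version(current, backward_compatible):
    """Apply the rule above: +0.1 for a minor change, +1.0 for a major one."""
    step = Decimal("0.1") if backward_compatible else Decimal("1.0")
    return str(Decimal(current) + step)


# A description, tag, or ownership change is a minor change.
print(next_version("0.1", backward_compatible=True))
# Deleting a term is a major change.
print(next_version("0.2", backward_compatible=False))
```

The two calls reproduce the examples in the text: 0.1 becomes 0.2 for a minor change, and 0.2 becomes 1.2 for a major change.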
## Glossary APIs
OpenMetadata has extensive Glossary APIs. The main entities are **Glossary** and **Glossary Term**. These entities are identified by a unique ID. Glossary terms have a fully qualified name in the form of `glossary.parentTerm.childTerm`.
You can create, delete, modify, and update using APIs. Refer to the **[Glossary API documentation](https://sandbox.open-metadata.org/docs#tag/Glossaries)**.
You can also [export or bulk import the glossary terms](/how-to-guides/openmetadata/data-governance/glossary-classification/import-glossary) using a CSV file.
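
As a small illustration of the `glossary.parentTerm.childTerm` naming, the sketch below derives fully qualified names from a nested hierarchy. It assumes term names contain no dots and does not attempt any quoting or escaping; `term_fqns` and the sample `banking` tree are hypothetical.

```python
def term_fqns(glossary, tree):
    """List fully qualified names for every term in a nested dict.

    `tree` maps a term name to a dict of its child terms.
    """
    def walk(prefix, children):
        for name, grandchildren in children.items():
            fqn = f"{prefix}.{name}"
            yield fqn
            yield from walk(fqn, grandchildren)

    return list(walk(glossary, tree))


banking = {
    "Account": {"Savings Account": {}},
    "Debit card": {},
}

for fqn in term_fqns("Banking", banking):
    print(fqn)
```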
{%inlineCallout
color="violet-70"
bold="What is Classification"
icon="MdArrowForward"
href="/how-to-guides/openmetadata/data-governance/glossary-classification/classification"%}
Learn about the classification tags, system tags, and mutually exclusive tags.
{%/inlineCallout%}


@ -0,0 +1,113 @@
---
title: How to Bulk Import a Glossary
slug: /how-to-guides/openmetadata/data-governance/glossary-classification/import-glossary
---
# How to Bulk Import a Glossary
OpenMetadata supports **Glossary Bulk Upload** to save time and effort by uploading a CSV with thousands of terms in one go. You can create or update multiple glossary terms simultaneously. When bulk uploading, Owners and Reviewers can be defined, which are then propagated to every glossary term.
To import a glossary into OpenMetadata:
- Navigate to **Govern > Glossary**
- Click on the **⋮** icon and **Export** the glossary file. If you have glossary terms in your Glossary, they will be exported as a CSV file. If there are no terms in the Glossary, a blank CSV template will be downloaded.
{% image
src="/images/v1.1/how-to-guides/governance/glossary8.png"
alt="Export Glossary File"
caption="Export Glossary File"
/%}
- Once you have the template, you can fill in the following details:
- **parent** - The parent column helps to define the hierarchy of the glossary terms. If you leave this field blank, the Term will be created at the root level. If you want to create a hierarchy of Glossary Terms, the parent details must be entered as per the hierarchy. For example, from the Glossary level, `Banking.Account.Savings Account`
{% image
src="/images/v1.1/how-to-guides/governance/glossary9.png"
alt="Hierarchy can be defined in the Parent Column"
caption="Hierarchy can be defined in the Parent Column"
/%}
- **name*** - This contains the name of the glossary term, and is a required field.
- **displayName** - This contains the Display name of the glossary term.
- **description*** - This contains the description or details of the glossary term and is a required field.
- **synonyms** - Include words that have the same meaning as the glossary term. For e.g., for a term Customer, the synonyms can be Client, Shopper, Purchaser. In the CSV file, the synonyms must be separated by a semicolon (;) as in `Client;Shopper;Purchaser`
- **relatedTerms** - A term which has a related concept as the glossary term. This term must be available in OpenMetadata. For e.g., for a term Customer, the related terms can be Customer LTV (LifeTime Value), Customer Acquisition Cost (CAC). In the CSV file, the relatedTerms must contain the hierarchy, which is separated by a full stop (.). Multiple terms must be separated by a semicolon (;) as in `Banking.Account.Savings account;Banking.Debit card`
- **references** - Add links from the internet from where you inherited the term. In the CSV file, the references must be in the format (name;url;name;url) `IBM;https://www.ibm.com/;World Bank;https://www.worldbank.org/`
- **tags** - Add the tags which are already existing in OpenMetadata. In the CSV file, the tags must be in the format `PII.Sensitive;PersonalData.Personal`
The * marked fields are required fields.
- To create a new glossary, navigate to **Govern > Glossary** and first **Add** a new glossary. You can also bulk upload terms to an existing glossary.
{% image
src="/images/v1.1/how-to-guides/governance/glossary1.png"
alt="Add a New Glossary"
caption="Add a New Glossary"
/%}
- Add the Name*, Display Name, Description*, Tags, Owner, and Reviewer details for the glossary.
{% image
src="/images/v1.1/how-to-guides/governance/glossary2.png"
alt="Configure the Glossary"
caption="Configure the Glossary"
/%}
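
If the terms already live in another system, the CSV can be generated with a script instead of filled in by hand. The sketch below follows the column descriptions given earlier in this guide (semicolon-separated lists, references as `name;url` pairs). Treat the header row of your exported template as the source of truth: the column order used here, the `make_row` and `to_csv` helpers, and the sample term are assumptions.

```python
import csv
import io

# Column names as described in this guide; compare with your exported template.
COLUMNS = ["parent", "name", "displayName", "description",
           "synonyms", "relatedTerms", "references", "tags"]


def make_row(name, description, parent="", display_name="",
             synonyms=(), related_terms=(), references=(), tags=()):
    """Build one CSV row; name and description are the required fields."""
    if not name or not description:
        raise ValueError("name and description are required")
    # References are flattened to name;url;name;url.
    flat_refs = [part for ref_name, url in references for part in (ref_name, url)]
    return {
        "parent": parent,
        "name": name,
        "displayName": display_name,
        "description": description,
        "synonyms": ";".join(synonyms),
        "relatedTerms": ";".join(related_terms),
        "references": ";".join(flat_refs),
        "tags": ";".join(tags),
    }


def to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


row = make_row(
    "Savings Account",
    "An interest-bearing deposit account",
    parent="Banking.Account",
    synonyms=["Deposit Account"],
    related_terms=["Banking.Debit card"],
    references=[("IBM", "https://www.ibm.com/")],
    tags=["PII.Sensitive", "PersonalData.Personal"],
)
csv_text = to_csv([row])
print(csv_text)
```

Using the `csv` module rather than string concatenation means descriptions containing commas or quotes are escaped correctly.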
## Mutually Exclusive
You can also mark the Glossary as Mutually Exclusive if you want only one of the terms from the glossary to be applicable to the data assets. There are cases where only one glossary term from a Glossary is relevant for a data asset. For example, an asset can either be PII Sensitive or PII Non-Sensitive. It cannot be both. For such cases, a Glossary can be created where the terms are mutually exclusive. If this configuration is enabled, you won't be able to assign multiple terms from the same Glossary to the same data asset.
## Add Owners and Reviewers to a Glossary
If the Owner details are added while creating the glossary, the same will be inherited for the glossary terms. Either a Team or a User can be the **Owner** of a Glossary. Multiple users can be **Reviewers**. These can be changed later. The glossary **Owner and Reviewers** are inherited for all the glossary terms.
- Once the CSV file is ready, click on the ⋮ icon and select the **Import** button.
- Drag and drop the CSV file, or upload it by clicking on the Browse button.
{% image
src="/images/v1.1/how-to-guides/governance/import0.png"
alt="Import the Glossary CSV File"
caption="Import the Glossary CSV File"
/%}
- The import utility will validate the file and a **Preview** of the elements that will be imported to OpenMetadata is displayed.
- After previewing the uploaded terms, click on **Import**.
{% image
src="/images/v1.1/how-to-guides/governance/import1.png"
alt="Preview of the Glossary"
caption="Preview of the Glossary"
/%}
- The glossary terms are scanned and imported, after which a Success or Failure message is displayed.
{% image
src="/images/v1.1/how-to-guides/governance/import2.png"
alt="Glossary Imported Successfully"
caption="Glossary Imported Successfully"
/%}
- Once some or all of the terms are created successfully, the **Import** button is displayed. Click on **Import** to create the glossary terms from the CSV file in OpenMetadata.
- Next, you can **View** the imported glossary. You can **Expand All** the terms to view the nested terms. Glossary terms can be **dragged and dropped** as required to rearrange the glossary.
- The glossary **Owner** is inherited by all the glossary terms.
{% image
src="/images/v1.1/how-to-guides/governance/import3.png"
alt="Drag and Drop Glossary Terms to Rearrange the Hierarchy"
caption="Drag and Drop Glossary Terms to Rearrange the Hierarchy"
/%}
Both importing and exporting a Glossary in OpenMetadata are quick and easy!
{%inlineCallout
color="violet-70"
bold="How to Add Assets to Glossary Terms"
icon="MdArrowForward"
href="/how-to-guides/openmetadata/data-governance/glossary-classification/assets"%}
Associate glossary terms with data assets to make data discovery easier.
{%/inlineCallout%}

@ -0,0 +1,78 @@
---
title: Glossary and Classification
slug: /how-to-guides/openmetadata/data-governance/glossary-classification
---
# Glossary and Classification
**Glossary** and **Classification** are both controlled vocabularies that can be used for labeling data. A controlled vocabulary is an organized arrangement of words and phrases that defines terminology for organizing and retrieving information. Glossary adds meaning to data by defining the business terminology, whereas Classification helps in defining the type of data.
Watch the [Webinar on Glossaries and Classifications in OpenMetadata](https://www.youtube.com/watch?v=LII_5CDo_0s)
[![Watch the video](/images/v1.1/how-to-guides/governance/glossary-webinar.png)](https://www.youtube.com/watch?v=LII_5CDo_0s)
{%inlineCalloutContainer%}
{%inlineCallout
color="violet-70"
bold="What is a Glossary"
icon="MdMenuBook"
href="/how-to-guides/openmetadata/data-governance/glossary-classification/glossary"%}
Create glossaries in OpenMetadata with hierarchically arranged glossary terms.
{%/inlineCallout%}
{%inlineCallout
color="violet-70"
bold="What is Classification"
icon="MdDiscount"
href="/how-to-guides/openmetadata/data-governance/glossary-classification/classification"%}
Learn about the classification tags, system tags, and mutually exclusive tags.
{%/inlineCallout%}
{%inlineCallout
color="violet-70"
bold="What are Tiers"
icon="MdDiscount"
href="/how-to-guides/openmetadata/data-governance/glossary-classification/tiers"%}
Tiers help define the importance of data to an organization.
{%/inlineCallout%}
{%inlineCallout
color="violet-70"
bold="How to Setup a Glossary"
icon="MdMenuBook"
href="/how-to-guides/openmetadata/data-governance/glossary-classification/setup-glossary"%}
Learn how to set up a glossary manually in OpenMetadata.
{%/inlineCallout%}
{%inlineCallout
color="violet-70"
bold="How to Create Glossary Terms"
icon="MdMenuBook"
href="/how-to-guides/openmetadata/data-governance/glossary-classification/glossary-terms"%}
Set up glossary terms to define the terminology. Add tags, synonyms, related terms, links, etc.
{%/inlineCallout%}
{%inlineCallout
color="violet-70"
bold="How to Bulk Import a Glossary"
icon="MdUpload"
href="/how-to-guides/openmetadata/data-governance/glossary-classification/import-glossary"%}
Save time and effort by bulk uploading glossary terms using a CSV file.
{%/inlineCallout%}
{%inlineCallout
color="violet-70"
bold="How to Add Assets to Glossary Terms"
icon="MdPushPin"
href="/how-to-guides/openmetadata/data-governance/glossary-classification/assets"%}
Associate glossary terms with data assets to make data discovery easier.
{%/inlineCallout%}
{%inlineCallout
color="violet-70"
bold="How to Classify Data Assets"
icon="MdDiscount"
href="/how-to-guides/openmetadata/data-governance/glossary-classification/classify-assets"%}
Add tags to data assets, or request tags and discuss the requests, all within OpenMetadata.
{%/inlineCallout%}
{%inlineCallout
color="violet-70"
bold="Best Practices for Glossary and Classification"
icon="MdThumbUp"
href="/how-to-guides/openmetadata/data-governance/glossary-classification/best-practices"%}
Here are the Top 8 Best Practices around Terminologies.
{%/inlineCallout%}
{%/inlineCalloutContainer%}

@ -0,0 +1,75 @@
---
title: How to Setup a Glossary
slug: /how-to-guides/openmetadata/data-governance/glossary-classification/setup-glossary
---
# How to Setup a Glossary
To create a glossary manually in OpenMetadata:
- Navigate to **Govern > Glossary**
- Click on **+ Add** to add a new glossary
{% image
src="/images/v1.1/how-to-guides/governance/glossary1.png"
alt="Add a New Glossary"
caption="Add a New Glossary"
/%}
- Enter the details to configure the glossary.
- **Name*** - This is a required field.
- **Display Name**
- **Description*** - Describe the context or domain of the glossary. This is a required field.
- **Tags** - Classification tags can be added to a glossary.
- **Mutually Exclusive** - There are cases where only one term from a particular glossary is relevant for a data asset. For example, an asset can be either PII-Sensitive or PII-NonSensitive, but not both. For such cases, a Glossary can be created where the glossary terms are mutually exclusive. If this configuration is enabled, you won't be able to assign multiple terms from the same Glossary to the same data asset.
- **Owner** - Either a Team or a User can be the Owner of a Glossary.
- **Reviewers** - Multiple reviewers can be added.
{% image
src="/images/v1.1/how-to-guides/governance/glossary2.png"
alt="Configure the Glossary"
caption="Configure the Glossary"
/%}
## Add an Owner and Reviewers to a Glossary
When creating a glossary, you can add the glossary owner. Either a Team or a User can be the Owner of the Glossary. Simply click on the option for **Owner** to select the user or team.
Multiple users can be added as Reviewers by clicking on the pencil icon. If the **Reviewer** details exist for a glossary, then the same details are reflected when adding a new term manually as well.
{% image
src="/images/v1.1/how-to-guides/governance/owner.png"
alt="Add Owner and Reviewers"
caption="Add Owner and Reviewers"
/%}
If the Owner and Reviewer details are added while creating the glossary, and the glossary terms are **[bulk uploaded using a CSV file](/how-to-guides/openmetadata/data-governance/glossary-classification/import-glossary)**, then the glossary Owner and Reviewers are inherited for all the glossary terms. These details can be changed later.
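The inheritance described above can be modeled as a simple fallback: a term uses its own Owner and Reviewers if they have been set, and otherwise takes them from its glossary. This is a minimal sketch under that assumption. The `Glossary` and `GlossaryTerm` classes and their field names are hypothetical and do not mirror the OpenMetadata schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Glossary:
    name: str
    owner: Optional[str] = None  # a team or a user
    reviewers: list[str] = field(default_factory=list)


@dataclass
class GlossaryTerm:
    name: str
    glossary: Glossary
    owner: Optional[str] = None            # None means "inherit from the glossary"
    reviewers: Optional[list[str]] = None  # None means "inherit from the glossary"

    def effective_owner(self) -> Optional[str]:
        return self.owner if self.owner is not None else self.glossary.owner

    def effective_reviewers(self) -> list[str]:
        source = self.reviewers if self.reviewers is not None else self.glossary.reviewers
        return list(source)


business = Glossary("Business", owner="data-governance-team", reviewers=["alice", "bob"])
account = GlossaryTerm("Account", business)
customer = GlossaryTerm("Customer", business, owner="carol")  # changed later

print(account.effective_owner(), account.effective_reviewers())
# data-governance-team ['alice', 'bob']
print(customer.effective_owner())
# carol
```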
{%inlineCallout
color="violet-70"
bold="How to Create Glossary Terms"
icon="MdArrowForward"
href="/how-to-guides/openmetadata/data-governance/glossary-classification/glossary-terms"%}
Set up Glossary Terms to define the terminology. Add tags, synonyms, related terms, links, etc.
{%/inlineCallout%}
{%inlineCallout
color="violet-70"
bold="How to Bulk Import a Glossary"
icon="MdArrowForward"
href="/how-to-guides/openmetadata/data-governance/glossary-classification/import-glossary"%}
Save time and effort by bulk uploading glossary terms using a CSV file.
{%/inlineCallout%}
{%inlineCallout
color="violet-70"
bold="How to Add Assets to Glossary Terms"
icon="MdArrowForward"
href="/how-to-guides/openmetadata/data-governance/glossary-classification/assets"%}
Associate glossary terms with data assets to make data discovery easier.
{%/inlineCallout%}

@ -0,0 +1,36 @@
---
title: What are Tiers
slug: /how-to-guides/openmetadata/data-governance/glossary-classification/tiers
---
# What are Tiers
Tiering is an important concept of data classification in OpenMetadata. Tiers should be based on the importance of data. Using Tiers, data producers or owners can define the importance of data to an organization.
In OpenMetadata, Tiers are System Classification tags and can be accessed from **Govern > Classification > Tier**.
{% image
src="/images/v1.1/how-to-guides/governance/tier1.png"
alt="Classification Tags: Tiers"
caption="Classification Tags: Tiers"
/%}
When tiering, it is easiest to start with the most important (Tier 1) and the least important (Tier 5) data. Once the **Tier 1** or most important data is identified, organizations can focus on improving its descriptions and data quality. Data Insights in OpenMetadata helps identify unused datasets as **Tier 5**. The Tier 5 datasets can be deleted periodically to declutter. Other tiers can be added as per your organizational needs. **Tags** can be added to further mark the data assets.
| **Tier** | **Impact** | **Used for** | **Type of Impact** | **Usage** |
|--- | --- | --- | --- | --- |
| **Tier 1** | High | External & Internal Decisions | Revenue, Regulatory, & Reputational | Highly used |
| **Tier 2** | Moderate | Some External & Mostly Internal Decisions | Some Regulatory | Highly used |
| **Tier 3** | Low | Internal Decisions | - | Highly used (Top N percentile) |
| **Tier 4** | Low | Internal Team Decisions | - | - |
| **Tier 5** | Individual owned | Unused Datasets | - | - |
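As an illustration of how the table might be used, the sketch below encodes it as a lookup and lists the Tier 5 assets that are candidates for periodic cleanup. The dictionary keys, the `cleanup_candidates` helper, and the asset names are hypothetical; they are not OpenMetadata identifiers or API calls.

```python
# The tier table above, encoded as a lookup (values copied from the table).
TIERS = {
    "Tier 1": {"impact": "High", "used_for": "External & Internal Decisions"},
    "Tier 2": {"impact": "Moderate", "used_for": "Some External & Mostly Internal Decisions"},
    "Tier 3": {"impact": "Low", "used_for": "Internal Decisions"},
    "Tier 4": {"impact": "Low", "used_for": "Internal Team Decisions"},
    "Tier 5": {"impact": "Individual owned", "used_for": "Unused Datasets"},
}


def cleanup_candidates(asset_tiers: dict[str, str]) -> list[str]:
    """Return the names of assets marked Tier 5, sorted alphabetically."""
    return sorted(name for name, tier in asset_tiers.items() if tier == "Tier 5")


# Hypothetical assets and their assigned tiers.
assets = {
    "sales.orders": "Tier 1",
    "scratch.tmp_export": "Tier 5",
    "marketing.campaigns": "Tier 3",
    "scratch.old_backup": "Tier 5",
}

print(cleanup_candidates(assets))
# ['scratch.old_backup', 'scratch.tmp_export']
```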
## How to Add Tiers
From the **Explore** page, select a data asset and click on the edit icon for **Tier**. Select the appropriate tier. Clicking on the arrow next to the tier will provide a description of the tier.
{% image
src="/images/v1.1/how-to-guides/governance/tier2.png"
alt="Add a Tier to Data Asset"
caption="Add a Tier to Data Asset"
/%}

@ -0,0 +1,18 @@
---
title: Data Governance
slug: /how-to-guides/openmetadata/data-governance
---
# Data Governance
OpenMetadata is a rich collaborative platform for data teams. Data producers and data consumers can access all their organizational metadata from OpenMetadata. Users can mutually benefit from the team's collaborative expertise around data. With several teams and users having access to the organizational data assets in OpenMetadata, it is crucial to have some form of governance in place. OpenMetadata supports [fine-grained Access Control Roles and Policies](https://docs.open-metadata.org/v1.1.x/how-to-guides/admin-guide-roles-policies) to ensure data security.
Apart from well-defined access control roles and policies, a common vocabulary within the organization fosters effective collaboration and helps in data governance. A **Business Glossary** plays an important role in defining the common terminology in the organization. Data also needs to be classified and tagged for policy enforcement purposes such as privacy policy, data management policy, data retention policy, and so on. Using **Classification**, you can manage access to the PII sensitive data in OpenMetadata.
{%inlineCallout
color="violet-70"
bold="Glossary and Classification"
icon="MdMenuBook"
href="/how-to-guides/openmetadata/data-governance/glossary-classification"%}
Learn more about the Glossaries and Classification Tags in OpenMetadata.
{%/inlineCallout%}

@ -0,0 +1,18 @@
---
title: The Pillars of OpenMetadata
slug: /how-to-guides/openmetadata
---
# The Six Pillars of OpenMetadata
OpenMetadata is an all-in-one platform for data discovery, lineage, data quality, observability, governance, and team collaboration. Powered by a centralized metadata store based on Open Metadata Standards/APIs, supporting connectors to a wide range of data services, OpenMetadata enables end-to-end metadata management, giving you the freedom to unlock the value of your data assets.
OpenMetadata is a complete package for data teams to break down team silos, share data assets from multiple sources securely, collaborate around data, and build a documentation-first data culture in the organization.
Let us learn more about the six pillars of OpenMetadata that help it stand out in effective metadata management:
1. Data Discovery,
2. Data Collaboration,
3. Data Quality and Profiler,
4. Data Lineage,
5. Data Insights, and
6. [Data Governance](/how-to-guides/openmetadata/data-governance).

@ -609,6 +609,30 @@ site_menu:
url: /how-to-guides/user-guide-for-data-stewards/overview-data-assets/announcements
- category: How to Guides / User Guide for Data Stewards / Overview of Data Assets / How to Create an Announcement
url: /how-to-guides/user-guide-for-data-stewards/overview-data-assets/add-announcement
- category: How to Guides / The Pillars of OpenMetadata
url: /how-to-guides/openmetadata
- category: How to Guides / The Pillars of OpenMetadata / Data Governance
url: /how-to-guides/openmetadata/data-governance
- category: How to Guides / The Pillars of OpenMetadata / Data Governance / Glossary and Classification
url: /how-to-guides/openmetadata/data-governance/glossary-classification
- category: How to Guides / The Pillars of OpenMetadata / Data Governance / Glossary and Classification / What is a Glossary
url: /how-to-guides/openmetadata/data-governance/glossary-classification/glossary
- category: How to Guides / The Pillars of OpenMetadata / Data Governance / Glossary and Classification / What is Classification
url: /how-to-guides/openmetadata/data-governance/glossary-classification/classification
- category: How to Guides / The Pillars of OpenMetadata / Data Governance / Glossary and Classification / What are Tiers
url: /how-to-guides/openmetadata/data-governance/glossary-classification/tiers
- category: How to Guides / The Pillars of OpenMetadata / Data Governance / Glossary and Classification / How to Setup a Glossary
url: /how-to-guides/openmetadata/data-governance/glossary-classification/setup-glossary
- category: How to Guides / The Pillars of OpenMetadata / Data Governance / Glossary and Classification / How to Create Glossary Terms
url: /how-to-guides/openmetadata/data-governance/glossary-classification/glossary-terms
- category: How to Guides / The Pillars of OpenMetadata / Data Governance / Glossary and Classification / How to Bulk Import a Glossary
url: /how-to-guides/openmetadata/data-governance/glossary-classification/import-glossary
- category: How to Guides / The Pillars of OpenMetadata / Data Governance / Glossary and Classification / How to Add Assets to Glossary Terms
url: /how-to-guides/openmetadata/data-governance/glossary-classification/assets
- category: How to Guides / The Pillars of OpenMetadata / Data Governance / Glossary and Classification / How to Classify Data Assets
url: /how-to-guides/openmetadata/data-governance/glossary-classification/classify-assets
- category: How to Guides / The Pillars of OpenMetadata / Data Governance / Glossary and Classification / Best Practices for Glossary and Classification
url: /how-to-guides/openmetadata/data-governance/glossary-classification/best-practices
- category: How to Guides / CLI Ingestion with basic auth
url: /how-to-guides/cli-ingestion-with-basic-auth
- category: How to Guides / Feature configurations
