mirror of
https://github.com/datahub-project/datahub.git
synced 2025-09-02 13:53:06 +00:00
docs(business glossary) Update business glossary docs (#8287)
This commit is contained in:
parent
1343082535
commit
8880b47ca1
@ -1,46 +1,251 @@
|
||||
### Business Glossary File Format
|
||||
|
||||
The business glossary source file should be a `.yml` file with the following top-level keys:
|
||||
The business glossary source file should be a .yml file with the following top-level keys:
|
||||
|
||||
**Glossary**: the top level keys of the business glossary file
|
||||
- **version**: the version of business glossary file config the config conforms to. Currently the only version released is `1`.
|
||||
- **source**: the source format of the terms. Currently only supports `DataHub`
|
||||
- **owners**: owners contains two nested fields
|
||||
- **users**: (optional) a list of user ids
|
||||
- **groups**: (optional) a list of group ids
|
||||
- **url**: (optional) external url pointing to where the glossary is defined externally, if applicable.
|
||||
- **nodes**: (optional) list of child **GlossaryNode** objects
|
||||
- **terms**: (optional) list of child **GlossaryTerm** objects
|
||||
|
||||
Example **Glossary**:
|
||||
|
||||
```yaml
|
||||
version: 1 # the version of business glossary file config the config conforms to. Currently the only version released is `1`.
|
||||
source: DataHub # the source format of the terms. Currently only supports `DataHub`
|
||||
owners: # owners contains two nested fields
|
||||
users: # (optional) a list of user IDs
|
||||
- njones
|
||||
groups: # (optional) a list of group IDs
|
||||
- logistics
|
||||
url: "https://github.com/datahub-project/datahub/" # (optional) external url pointing to where the glossary is defined externally, if applicable
|
||||
nodes: # list of child **GlossaryNode** objects. See **GlossaryNode** section below
|
||||
...
|
||||
```
|
||||
|
||||
**GlossaryNode**: a container of **GlossaryNode** and **GlossaryTerm** objects
|
||||
- **name**: name of the node
|
||||
- **description**: description of the node
|
||||
- **id**: (optional) identifier of the node (normally inferred from the name, see `enable_auto_id` config. Use this if you need a stable identifier)
|
||||
- **owners**: (optional) owners contains two nested fields
|
||||
- **users**: (optional) a list of user ids
|
||||
- **groups**: (optional) a list of group ids
|
||||
- **terms**: (optional) list of child **GlossaryTerm** objects
|
||||
- **nodes**: (optional) list of child **GlossaryNode** objects
|
||||
|
||||
Example **GlossaryNode**:
|
||||
|
||||
```yaml
|
||||
- name: Shipping # name of the node
|
||||
description: Provides terms related to the shipping domain # description of the node
|
||||
owners: # (optional) owners contains 2 nested fields
|
||||
users: # (optional) a list of user IDs
|
||||
- njones
|
||||
groups: # (optional) a list of group IDs
|
||||
- logistics
|
||||
nodes: # list of child **GlossaryNode** objects
|
||||
...
|
||||
knowledge_links: # (optional) list of **KnowledgeCard** objects
|
||||
- label: Wiki link for shipping
|
||||
url: "https://en.wikipedia.org/wiki/Freight_transport"
|
||||
```
|
||||
|
||||
**GlossaryTerm**: a term in your business glossary
|
||||
- **name**: name of the term
|
||||
- **description**: description of the term
|
||||
- **id**: (optional) identifier of the term (normally inferred from the name, see `enable_auto_id` config. Use this if you need a stable identifier)
|
||||
- **owners**: (optional) owners contains two nested fields
|
||||
- **users**: (optional) a list of user ids
|
||||
- **groups**: (optional) a list of group ids
|
||||
- **term_source**: One of `EXTERNAL` or `INTERNAL`. Whether the term is coming from an external glossary or one defined in your organization.
|
||||
- **source_ref**: (optional) If external, what is the name of the source the glossary term is coming from?
|
||||
- **source_url**: (optional) If external, what is the url of the source definition?
|
||||
- **inherits**: (optional) List of **GlossaryTerm** that this term inherits from
|
||||
- **contains**: (optional) List of **GlossaryTerm** that this term contains
|
||||
- **custom_properties**: A map of key/value pairs of arbitrary custom properties
|
||||
- **domain**: (optional) domain name or domain urn
|
||||
|
||||
You can also view an example business glossary file checked in [here](https://github.com/datahub-project/datahub/blob/master/metadata-ingestion/examples/bootstrap_data/business_glossary.yml)
|
||||
Example **GlossaryTerm**:
|
||||
|
||||
```yaml
|
||||
- name: FullAddress # name of the term
|
||||
description: A collection of information to give the location of a building or plot of land. # description of the term
|
||||
owners: # (optional) owners contains 2 nested fields
|
||||
users: # (optional) a list of user IDs
|
||||
- njones
|
||||
groups: # (optional) a list of group IDs
|
||||
- logistics
|
||||
term_source: "EXTERNAL" # one of `EXTERNAL` or `INTERNAL`. Whether the term is coming from an external glossary or one defined in your organization.
|
||||
source_ref: FIBO # (optional) if external, what is the name of the source the glossary term is coming from?
|
||||
source_url: "https://www.google.com" # (optional) if external, what is the url of the source definition?
|
||||
inherits: # (optional) list of **GlossaryTerm** that this term inherits from
|
||||
- Privacy.PII
|
||||
contains: # (optional) a list of **GlossaryTerm** that this term contains
|
||||
- Shipping.ZipCode
|
||||
- Shipping.CountryCode
|
||||
- Shipping.StreetAddress
|
||||
custom_properties: # (optional) a map of key/value pairs of arbitrary custom properties
|
||||
- is_used_for_compliance_tracking: true
|
||||
knowledge_links: # (optional) a list of **KnowledgeCard** related to this term. These appear as links on the glossary node's page
|
||||
- url: "https://en.wikipedia.org/wiki/Address"
|
||||
label: Wiki link
|
||||
domain: "urn:li:domain:Logistics" # (optional) domain name or domain urn
|
||||
```
|
||||
|
||||
To see how these all work together, check out this comprehensive example business glossary file below:
|
||||
|
||||
<details>
|
||||
<summary>Example business glossary file</summary>
|
||||
|
||||
```yaml
|
||||
version: 1
|
||||
source: DataHub
|
||||
owners:
|
||||
users:
|
||||
- mjames
|
||||
url: "https://github.com/datahub-project/datahub/"
|
||||
nodes:
|
||||
- name: Classification
|
||||
description: A set of terms related to Data Classification
|
||||
knowledge_links:
|
||||
- label: Wiki link for classification
|
||||
url: "https://en.wikipedia.org/wiki/Classification"
|
||||
terms:
|
||||
- name: Sensitive
|
||||
description: Sensitive Data
|
||||
custom_properties:
|
||||
is_confidential: false
|
||||
- name: Confidential
|
||||
description: Confidential Data
|
||||
custom_properties:
|
||||
is_confidential: true
|
||||
- name: HighlyConfidential
|
||||
description: Highly Confidential Data
|
||||
custom_properties:
|
||||
is_confidential: true
|
||||
domain: Marketing
|
||||
- name: PersonalInformation
|
||||
description: All terms related to personal information
|
||||
owners:
|
||||
users:
|
||||
- mjames
|
||||
terms:
|
||||
- name: Email
|
||||
## An example of using an id to pin a term to a specific guid
|
||||
## See "how to generate custom IDs for your terms" section below
|
||||
# id: "urn:li:glossaryTerm:41516e310acbfd9076fffc2c98d2d1a3"
|
||||
description: An individual's email address
|
||||
inherits:
|
||||
- Classification.Confidential
|
||||
owners:
|
||||
groups:
|
||||
- Trust and Safety
|
||||
- name: Address
|
||||
description: A physical address
|
||||
- name: Gender
|
||||
description: The gender identity of the individual
|
||||
inherits:
|
||||
- Classification.Sensitive
|
||||
- name: Shipping
|
||||
description: Provides terms related to the shipping domain
|
||||
owners:
|
||||
users:
|
||||
- njones
|
||||
groups:
|
||||
- logistics
|
||||
terms:
|
||||
- name: FullAddress
|
||||
description: A collection of information to give the location of a building or plot of land.
|
||||
owners:
|
||||
users:
|
||||
- njones
|
||||
groups:
|
||||
- logistics
|
||||
term_source: "EXTERNAL"
|
||||
source_ref: FIBO
|
||||
source_url: "https://www.google.com"
|
||||
inherits:
|
||||
- Privacy.PII
|
||||
contains:
|
||||
- Shipping.ZipCode
|
||||
- Shipping.CountryCode
|
||||
- Shipping.StreetAddress
|
||||
related_terms:
|
||||
- Housing.Kitchen.Cutlery
|
||||
custom_properties:
|
||||
- is_used_for_compliance_tracking: true
|
||||
knowledge_links:
|
||||
- url: "https://en.wikipedia.org/wiki/Address"
|
||||
label: Wiki link
|
||||
domain: "urn:li:domain:Logistics"
|
||||
knowledge_links:
|
||||
- label: Wiki link for shipping
|
||||
url: "https://en.wikipedia.org/wiki/Freight_transport"
|
||||
- name: ClientsAndAccounts
|
||||
description: Provides basic concepts such as account, account holder, account provider, relationship manager that are commonly used by financial services providers to describe customers and to determine counterparty identities
|
||||
owners:
|
||||
groups:
|
||||
- finance
|
||||
terms:
|
||||
- name: Account
|
||||
description: Container for records associated with a business arrangement for regular transactions and services
|
||||
term_source: "EXTERNAL"
|
||||
source_ref: FIBO
|
||||
source_url: "https://spec.edmcouncil.org/fibo/ontology/FBC/ProductsAndServices/ClientsAndAccounts/Account"
|
||||
inherits:
|
||||
- Classification.HighlyConfidential
|
||||
contains:
|
||||
- ClientsAndAccounts.Balance
|
||||
- name: Balance
|
||||
description: Amount of money available or owed
|
||||
term_source: "EXTERNAL"
|
||||
source_ref: FIBO
|
||||
source_url: "https://spec.edmcouncil.org/fibo/ontology/FBC/ProductsAndServices/ClientsAndAccounts/Balance"
|
||||
- name: Housing
|
||||
description: Provides terms related to the housing domain
|
||||
owners:
|
||||
users:
|
||||
- mjames
|
||||
groups:
|
||||
- interior
|
||||
nodes:
|
||||
- name: Colors
|
||||
description: "Colors that are used in Housing construction"
|
||||
terms:
|
||||
- name: Red
|
||||
description: "red color"
|
||||
term_source: "EXTERNAL"
|
||||
source_ref: FIBO
|
||||
source_url: "https://spec.edmcouncil.org/fibo/ontology/FBC/ProductsAndServices/ClientsAndAccounts/Account"
|
||||
|
||||
- name: Green
|
||||
description: "green color"
|
||||
term_source: "EXTERNAL"
|
||||
source_ref: FIBO
|
||||
source_url: "https://spec.edmcouncil.org/fibo/ontology/FBC/ProductsAndServices/ClientsAndAccounts/Account"
|
||||
|
||||
- name: Pink
|
||||
description: pink color
|
||||
term_source: "EXTERNAL"
|
||||
source_ref: FIBO
|
||||
source_url: "https://spec.edmcouncil.org/fibo/ontology/FBC/ProductsAndServices/ClientsAndAccounts/Account"
|
||||
terms:
|
||||
- name: WindowColor
|
||||
description: Supported window colors
|
||||
term_source: "EXTERNAL"
|
||||
source_ref: FIBO
|
||||
source_url: "https://spec.edmcouncil.org/fibo/ontology/FBC/ProductsAndServices/ClientsAndAccounts/Account"
|
||||
values:
|
||||
- Housing.Colors.Red
|
||||
- Housing.Colors.Pink
|
||||
|
||||
- name: Kitchen
|
||||
description: a room or area where food is prepared and cooked.
|
||||
term_source: "EXTERNAL"
|
||||
source_ref: FIBO
|
||||
source_url: "https://spec.edmcouncil.org/fibo/ontology/FBC/ProductsAndServices/ClientsAndAccounts/Account"
|
||||
|
||||
- name: Spoon
|
||||
description: an implement consisting of a small, shallow oval or round bowl on a long handle, used for eating, stirring, and serving food.
|
||||
term_source: "EXTERNAL"
|
||||
source_ref: FIBO
|
||||
source_url: "https://spec.edmcouncil.org/fibo/ontology/FBC/ProductsAndServices/ClientsAndAccounts/Account"
|
||||
related_terms:
|
||||
- Housing.Kitchen
|
||||
knowledge_links:
|
||||
- url: "https://en.wikipedia.org/wiki/Spoon"
|
||||
label: Wiki link
|
||||
```
|
||||
</details>
|
||||
|
||||
Source file linked [here](https://github.com/datahub-project/datahub/blob/master/metadata-ingestion/examples/bootstrap_data/business_glossary.yml).
|
||||
|
||||
## Generating custom IDs for your terms
|
||||
|
||||
IDs are normally inferred from the glossary term/node's name, see the `enable_auto_id` config. But, if you need a stable
|
||||
identifier, you can generate a custom ID for your term. It should be unique across the entire Glossary.
|
||||
|
||||
Here's an example ID:
|
||||
`id: "urn:li:glossaryTerm:41516e310acbfd9076fffc2c98d2d1a3"`
|
||||
|
||||
A note of caution: once you select a custom ID, it cannot be easily changed.
|
||||
|
||||
## Compatibility
|
||||
|
||||
Compatible with version 1 of business glossary format.
|
||||
The source will be evolved as we publish newer versions of this format.
|
||||
The source will be evolved as we publish newer versions of this format.
|
@ -45,6 +45,39 @@ nodes:
|
||||
description: The gender identity of the individual
|
||||
inherits:
|
||||
- Classification.Sensitive
|
||||
- name: Shipping
|
||||
description: Provides terms related to the shipping domain
|
||||
owners:
|
||||
users:
|
||||
- njones
|
||||
groups:
|
||||
- logistics
|
||||
terms:
|
||||
- name: FullAddress
|
||||
description: A collection of information to give the location of a building or plot of land.
|
||||
owners:
|
||||
users:
|
||||
- njones
|
||||
groups:
|
||||
- logistics
|
||||
term_source: "EXTERNAL"
|
||||
source_ref: FIBO
|
||||
source_url: "https://www.google.com"
|
||||
inherits:
|
||||
- Privacy.PII
|
||||
contains:
|
||||
- Shipping.ZipCode
|
||||
- Shipping.CountryCode
|
||||
- Shipping.StreetAddress
|
||||
custom_properties:
|
||||
- is_used_for_compliance_tracking: true
|
||||
knowledge_links:
|
||||
- url: "https://en.wikipedia.org/wiki/Address"
|
||||
label: Wiki link
|
||||
domain: "urn:li:domain:Logistics"
|
||||
knowledge_links:
|
||||
- label: Wiki link for shipping
|
||||
url: "https://en.wikipedia.org/wiki/Freight_transport"
|
||||
- name: ClientsAndAccounts
|
||||
description: Provides basic concepts such as account, account holder, account provider, relationship manager that are commonly used by financial services providers to describe customers and to determine counterparty identities
|
||||
owners:
|
||||
@ -68,6 +101,8 @@ nodes:
|
||||
- name: Housing
|
||||
description: Provides terms related to the housing domain
|
||||
owners:
|
||||
users:
|
||||
- mjames
|
||||
groups:
|
||||
- interior
|
||||
nodes:
|
||||
|
Loading…
x
Reference in New Issue
Block a user