mirror of
https://github.com/datahub-project/datahub.git
synced 2025-12-24 16:38:19 +00:00
feat(ingestion/business-glossary): Automatically generate predictable glossary term and node URNs when incompatible URL characters are specified in term and node names. (#12673)
parent
4714f46f11
commit
a700448bad
@ -20,6 +20,8 @@ This file documents any backwards-incompatible changes in DataHub and assists pe

### Breaking Changes

- #12673: Business Glossary ID generation has been modified to handle special characters and URL cleaning. When `enable_auto_id` is false (default), IDs are now generated by cleaning the name (converting spaces to hyphens, removing special characters except periods, which are used as path separators) while preserving case. This may result in different IDs being generated for terms with special characters.

- #12580: The OpenAPI source handled nesting incorrectly. #12580 fixes this by creating proper nested field paths; however, it will rewrite the incorrect schemas of existing OpenAPI runs.

- #12408: The `platform` field in the DataPlatformInstance GraphQL type is removed. Clients need to retrieve the platform via the optional `dataPlatformInstance` field.

@ -24,7 +24,8 @@ nodes: # list of child **Glossa
|
||||
Example **GlossaryNode**:
|
||||
|
||||
```yaml
|
||||
- name: Shipping # name of the node
|
||||
- name: "Shipping" # name of the node
|
||||
id: "Shipping-Logistics" # (optional) custom identifier for the node
|
||||
description: Provides terms related to the shipping domain # description of the node
|
||||
owners: # (optional) owners contains 2 nested fields
|
||||
users: # (optional) a list of user IDs
|
||||
@ -43,7 +44,8 @@ Example **GlossaryNode**:
|
||||
Example **GlossaryTerm**:
|
||||
|
||||
```yaml
|
||||
- name: FullAddress # name of the term
|
||||
- name: "Full Address" # name of the term
|
||||
id: "Full-Address-Details" # (optional) custom identifier for the term
|
||||
description: A collection of information to give the location of a building or plot of land. # description of the term
|
||||
owners: # (optional) owners contains 2 nested fields
|
||||
users: # (optional) a list of user IDs
|
||||
@ -67,10 +69,86 @@ Example **GlossaryTerm**:
|
||||
domain: "urn:li:domain:Logistics" # (optional) domain name or domain urn
|
||||
```
|
||||
|
||||
To see how these all work together, check out this comprehensive example business glossary file below:
|
||||
## ID Management and URL Generation
|
||||
|
||||
<details>
|
||||
<summary>Example business glossary file</summary>
|
||||
The business glossary provides two primary ways to manage term and node identifiers:
|
||||
|
||||
1. **Custom IDs**: You can explicitly specify an ID for any term or node using the `id` field. This is recommended for terms that need stable, predictable identifiers:
|
||||
```yaml
|
||||
terms:
|
||||
- name: "Response Time"
|
||||
id: "support-response-time" # Explicit ID
|
||||
description: "Target time to respond to customer inquiries"
|
||||
```
|
||||
|
||||
2. **Automatic ID Generation**: When no ID is specified, the system generates one based on the `enable_auto_id` setting:
   - With `enable_auto_id: false` (default):
     - Node and term names are converted to a URL-friendly format
     - Spaces within names are replaced with hyphens
     - Special characters are removed (hyphens and periods are kept)
     - Case is preserved
     - Multiple hyphens are collapsed to single ones
     - Path components (node/term hierarchy) are joined with periods
     - Example: Node "Customer Support" with term "Response Time" → "Customer-Support.Response-Time"

   - With `enable_auto_id: true`:
     - Generates GUID-based IDs
     - Recommended for guaranteed uniqueness
     - Applied automatically to names containing non-ASCII characters, even when the setting is false

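The cleaning rules for `enable_auto_id: false` can be previewed locally before ingesting a glossary. The following is a minimal sketch that mirrors the `clean_url` function added in this commit; the helper names `clean_path` and `preview_id` are hypothetical and exist only for illustration.

```python
import re
from typing import List


def clean_path(text: str) -> str:
    """Apply the name-cleaning rules to a period-joined path."""
    text = text.replace(" ", "-")                # spaces become hyphens
    text = re.sub(r"[^a-zA-Z0-9\-.]", "", text)  # keep letters, digits, '-' and '.'
    text = re.sub(r"-+", "-", text)              # collapse repeated hyphens
    text = re.sub(r"\.+", ".", text)             # collapse repeated periods
    return text.strip("-.")                      # trim leading/trailing '-' and '.'


def preview_id(path: List[str]) -> str:
    """Join node/term names with periods, then clean the result."""
    return clean_path(".".join(path))


print(preview_id(["Customer Support", "Response Time"]))  # Customer-Support.Response-Time
print(preview_id(["Data Services", "API @ Response"]))    # Data-Services.API-Response
```

Running a sketch like this over the names in a glossary file is one way to spot IDs that would change under the new rules, or two names that would collapse to the same ID.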
Here's how path-based ID generation works:
|
||||
```yaml
|
||||
nodes:
|
||||
- name: "Customer Support" # Node ID: Customer-Support
|
||||
terms:
|
||||
- name: "Response Time" # Term ID: Customer-Support.Response-Time
|
||||
description: "Response SLA"
|
||||
|
||||
- name: "First Reply" # Term ID: Customer-Support.First-Reply
|
||||
description: "Initial response"
|
||||
|
||||
- name: "Product Feedback" # Node ID: Product-Feedback
|
||||
terms:
|
||||
- name: "Response Time" # Term ID: Product-Feedback.Response-Time
|
||||
description: "Feedback response"
|
||||
```
|
||||
|
||||
**Important Notes**:
- Periods (.) are used as path separators between nodes and terms
- Periods inside term or node names are kept by the cleaning step (repeated periods are collapsed), so they are indistinguishable from path separators in the generated ID; avoid them in names or set an explicit `id`
- Each component of the path (node names, term names) is cleaned independently:
  - Spaces to hyphens
  - Special characters removed
  - Case preserved
- The cleaned components are then joined with periods to form the full path
- Non-ASCII characters in any component trigger automatic GUID generation
- Once an ID is created (either manually or automatically), it cannot be easily changed
- All references to a term (in `inherits`, `contains`, etc.) must use its correct ID
- Moving terms in the hierarchy does NOT update their IDs:
  - The ID retains its original path components even after moving
  - This can lead to IDs that don't match the current location
  - Consider using `enable_auto_id: true` if you plan to reorganize your glossary
- For terms that other terms will reference, consider using explicit IDs or enabling `enable_auto_id`

Example of how different names are handled:
|
||||
```yaml
|
||||
nodes:
|
||||
- name: "Data Services" # Node ID: Data-Services
|
||||
terms:
|
||||
# Basic term name
|
||||
- name: "Response Time" # Term ID: Data-Services.Response-Time
|
||||
description: "SLA metrics"
|
||||
|
||||
# Term name with special characters
|
||||
- name: "API @ Response" # Term ID: Data-Services.API-Response
|
||||
description: "API metrics"
|
||||
|
||||
# Term with non-ASCII (triggers GUID)
|
||||
- name: "パフォーマンス" # Term ID will be a 32-character GUID
|
||||
description: "Performance"
|
||||
```
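The decision order behind these three cases (explicit ID first, then the non-ASCII check, then GUID or cleaned path) can be sketched as follows. This is an illustration under stated assumptions, not the ingestion source itself: `stand_in_guid` is a hypothetical substitute for DataHub's GUID helper (assumed here to be an MD5 hex digest of canonical JSON), so the values it prints should not be expected to match the GUIDs DataHub actually emits.

```python
import hashlib
import json
import re
from typing import List, Optional


def stand_in_guid(obj: dict) -> str:
    # Hypothetical stand-in: MD5 over sorted, compact JSON gives 32 hex characters.
    payload = json.dumps(obj, separators=(",", ":"), sort_keys=True)
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


def clean_path(text: str) -> str:
    text = text.replace(" ", "-")
    text = re.sub(r"[^a-zA-Z0-9\-.]", "", text)
    text = re.sub(r"-+", "-", text)
    text = re.sub(r"\.+", ".", text)
    return text.strip("-.")


def preview_id(
    path: List[str], explicit_id: Optional[str] = None, enable_auto_id: bool = False
) -> str:
    if explicit_id is not None:
        return explicit_id  # an explicit `id` always wins
    joined = ".".join(path)
    if enable_auto_id or any(ord(c) > 127 for c in joined):
        return stand_in_guid({"path": joined})  # GUID mode, or non-ASCUII fallback
    return clean_path(joined)  # readable, path-based ID


print(preview_id(["Data Services", "Response Time"]))
print(len(preview_id(["Data Services", "パフォーマンス"])))
```

Because the GUID in this sketch is derived from the joined path, the same path always yields the same value, which is what makes re-ingestion idempotent.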
|
||||
|
||||
To see how these all work together, check out this comprehensive example business glossary file below:
|
||||
|
||||
```yaml
|
||||
version: "1"
|
||||
@ -80,172 +158,108 @@ owners:
|
||||
- mjames
|
||||
url: "https://github.com/datahub-project/datahub/"
|
||||
nodes:
|
||||
- name: Classification
|
||||
- name: "Data Classification"
|
||||
id: "Data-Classification" # Custom ID for stable references
|
||||
description: A set of terms related to Data Classification
|
||||
knowledge_links:
|
||||
- label: Wiki link for classification
|
||||
url: "https://en.wikipedia.org/wiki/Classification"
|
||||
terms:
|
||||
- name: Sensitive
|
||||
- name: "Sensitive Data" # Will generate: Data-Classification.Sensitive-Data
|
||||
description: Sensitive Data
|
||||
custom_properties:
|
||||
is_confidential: "false"
|
||||
- name: Confidential
|
||||
- name: "Confidential Information" # Will generate: Data-Classification.Confidential-Information
|
||||
description: Confidential Data
|
||||
custom_properties:
|
||||
is_confidential: "true"
|
||||
- name: HighlyConfidential
|
||||
- name: "Highly Confidential" # Will generate: Data-Classification.Highly-Confidential
|
||||
description: Highly Confidential Data
|
||||
custom_properties:
|
||||
is_confidential: "true"
|
||||
domain: Marketing
|
||||
- name: PersonalInformation
|
||||
|
||||
- name: "Personal Information"
|
||||
description: All terms related to personal information
|
||||
owners:
|
||||
users:
|
||||
- mjames
|
||||
terms:
|
||||
- name: Email
|
||||
## An example of using an id to pin a term to a specific guid
|
||||
## See "how to generate custom IDs for your terms" section below
|
||||
# id: "urn:li:glossaryTerm:41516e310acbfd9076fffc2c98d2d1a3"
|
||||
- name: "Email" # Will generate: Personal-Information.Email
|
||||
description: An individual's email address
|
||||
inherits:
|
||||
- Classification.Confidential
|
||||
- Data-Classification.Confidential # References parent node path
|
||||
owners:
|
||||
groups:
|
||||
- Trust and Safety
|
||||
- name: Address
|
||||
- name: "Address" # Will generate: Personal-Information.Address
|
||||
description: A physical address
|
||||
- name: Gender
|
||||
- name: "Gender" # Will generate: Personal-Information.Gender
|
||||
description: The gender identity of the individual
|
||||
inherits:
|
||||
- Classification.Sensitive
|
||||
- name: Shipping
|
||||
description: Provides terms related to the shipping domain
|
||||
owners:
|
||||
users:
|
||||
- njones
|
||||
groups:
|
||||
- logistics
|
||||
terms:
|
||||
- name: FullAddress
|
||||
description: A collection of information to give the location of a building or plot of land.
|
||||
owners:
|
||||
users:
|
||||
- njones
|
||||
groups:
|
||||
- logistics
|
||||
term_source: "EXTERNAL"
|
||||
source_ref: FIBO
|
||||
source_url: "https://www.google.com"
|
||||
inherits:
|
||||
- Privacy.PII
|
||||
contains:
|
||||
- Shipping.ZipCode
|
||||
- Shipping.CountryCode
|
||||
- Shipping.StreetAddress
|
||||
related_terms:
|
||||
- Housing.Kitchen.Cutlery
|
||||
custom_properties:
|
||||
- is_used_for_compliance_tracking: "true"
|
||||
knowledge_links:
|
||||
- url: "https://en.wikipedia.org/wiki/Address"
|
||||
label: Wiki link
|
||||
domain: "urn:li:domain:Logistics"
|
||||
knowledge_links:
|
||||
- label: Wiki link for shipping
|
||||
url: "https://en.wikipedia.org/wiki/Freight_transport"
|
||||
- name: ClientsAndAccounts
|
||||
- Data-Classification.Sensitive # References parent node path
|
||||
|
||||
- name: "Clients And Accounts"
|
||||
description: Provides basic concepts such as account, account holder, account provider, relationship manager that are commonly used by financial services providers to describe customers and to determine counterparty identities
|
||||
owners:
|
||||
groups:
|
||||
- finance
|
||||
type: DATAOWNER
|
||||
terms:
|
||||
- name: Account
|
||||
- name: "Account" # Will generate: Clients-And-Accounts.Account
|
||||
description: Container for records associated with a business arrangement for regular transactions and services
|
||||
term_source: "EXTERNAL"
|
||||
source_ref: FIBO
|
||||
source_url: "https://spec.edmcouncil.org/fibo/ontology/FBC/ProductsAndServices/ClientsAndAccounts/Account"
|
||||
inherits:
|
||||
- Classification.HighlyConfidential
|
||||
- Data-Classification.Highly-Confidential # References parent node path
|
||||
contains:
|
||||
- ClientsAndAccounts.Balance
|
||||
- name: Balance
|
||||
- Clients-And-Accounts.Balance # References term in same node
|
||||
- name: "Balance" # Will generate: Clients-And-Accounts.Balance
|
||||
description: Amount of money available or owed
|
||||
term_source: "EXTERNAL"
|
||||
source_ref: FIBO
|
||||
source_url: "https://spec.edmcouncil.org/fibo/ontology/FBC/ProductsAndServices/ClientsAndAccounts/Balance"
|
||||
- name: Housing
|
||||
description: Provides terms related to the housing domain
|
||||
owners:
|
||||
users:
|
||||
- mjames
|
||||
groups:
|
||||
- interior
|
||||
nodes:
|
||||
- name: Colors
|
||||
description: "Colors that are used in Housing construction"
|
||||
terms:
|
||||
- name: Red
|
||||
description: "red color"
|
||||
term_source: "EXTERNAL"
|
||||
source_ref: FIBO
|
||||
source_url: "https://spec.edmcouncil.org/fibo/ontology/FBC/ProductsAndServices/ClientsAndAccounts/Account"
|
||||
|
||||
- name: Green
|
||||
description: "green color"
|
||||
term_source: "EXTERNAL"
|
||||
source_ref: FIBO
|
||||
source_url: "https://spec.edmcouncil.org/fibo/ontology/FBC/ProductsAndServices/ClientsAndAccounts/Account"
|
||||
|
||||
- name: Pink
|
||||
description: pink color
|
||||
term_source: "EXTERNAL"
|
||||
source_ref: FIBO
|
||||
source_url: "https://spec.edmcouncil.org/fibo/ontology/FBC/ProductsAndServices/ClientsAndAccounts/Account"
|
||||
- name: "KPIs"
|
||||
description: Common Business KPIs
|
||||
terms:
|
||||
- name: WindowColor
|
||||
description: Supported window colors
|
||||
term_source: "EXTERNAL"
|
||||
source_ref: FIBO
|
||||
source_url: "https://spec.edmcouncil.org/fibo/ontology/FBC/ProductsAndServices/ClientsAndAccounts/Account"
|
||||
values:
|
||||
- Housing.Colors.Red
|
||||
- Housing.Colors.Pink
|
||||
|
||||
- name: Kitchen
|
||||
description: a room or area where food is prepared and cooked.
|
||||
term_source: "EXTERNAL"
|
||||
source_ref: FIBO
|
||||
source_url: "https://spec.edmcouncil.org/fibo/ontology/FBC/ProductsAndServices/ClientsAndAccounts/Account"
|
||||
|
||||
- name: Spoon
|
||||
description: an implement consisting of a small, shallow oval or round bowl on a long handle, used for eating, stirring, and serving food.
|
||||
term_source: "EXTERNAL"
|
||||
source_ref: FIBO
|
||||
source_url: "https://spec.edmcouncil.org/fibo/ontology/FBC/ProductsAndServices/ClientsAndAccounts/Account"
|
||||
related_terms:
|
||||
- Housing.Kitchen
|
||||
knowledge_links:
|
||||
- url: "https://en.wikipedia.org/wiki/Spoon"
|
||||
label: Wiki link
|
||||
- name: "CSAT %" # Will generate: KPIs.CSAT
|
||||
description: Customer Satisfaction Score
|
||||
```
|
||||
</details>
|
||||
|
||||
Source file linked [here](https://github.com/datahub-project/datahub/blob/master/metadata-ingestion/examples/bootstrap_data/business_glossary.yml).
|
||||
## Custom ID Specification
|
||||
|
||||
## Generating custom IDs for your terms
|
||||
Custom IDs can be specified in two ways, both of which are fully supported and acceptable:
|
||||
|
||||
IDs are normally inferred from the glossary term/node's name, see the `enable_auto_id` config. But, if you need a stable
|
||||
identifier, you can generate a custom ID for your term. It should be unique across the entire Glossary.
|
||||
1. Just the ID portion (simpler approach):
|
||||
```yaml
|
||||
terms:
|
||||
- name: "Email"
|
||||
id: "company-email" # Will become urn:li:glossaryTerm:company-email
|
||||
description: "Company email address"
|
||||
```
|
||||
|
||||
Here's an example ID:
|
||||
`id: "urn:li:glossaryTerm:41516e310acbfd9076fffc2c98d2d1a3"`
|
||||
2. Full URN format:
|
||||
```yaml
|
||||
terms:
|
||||
- name: "Email"
|
||||
id: "urn:li:glossaryTerm:company-email"
|
||||
description: "Company email address"
|
||||
```
|
||||
|
||||
A note of caution: once you select a custom ID, it cannot be easily changed.
|
||||
Both methods are valid and will work correctly. The system will automatically handle the URN prefix if you specify just the ID portion.
|
||||
|
||||
The same applies for nodes:
|
||||
```yaml
|
||||
nodes:
|
||||
- name: "Communications"
|
||||
id: "internal-comms" # Will become urn:li:glossaryNode:internal-comms
|
||||
description: "Internal communication methods"
|
||||
```
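The equivalence of the two forms can be pictured as a simple prefix check. The sketch below is an assumption about the behaviour described above, not the source's actual code; `to_urn` is a hypothetical helper name.

```python
def to_urn(entity_type: str, id_or_urn: str) -> str:
    """Return a full URN, adding the prefix only when it is missing."""
    prefix = f"urn:li:{entity_type}:"
    if id_or_urn.startswith(prefix):
        return id_or_urn          # already a full URN, keep as-is
    return prefix + id_or_urn     # bare ID, prepend the prefix


print(to_urn("glossaryTerm", "company-email"))
print(to_urn("glossaryTerm", "urn:li:glossaryTerm:company-email"))
print(to_urn("glossaryNode", "internal-comms"))
```

Both term forms resolve to `urn:li:glossaryTerm:company-email`, and the node ID resolves to `urn:li:glossaryNode:internal-comms`.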
|
||||
|
||||
Note: Once you select a custom ID, it cannot be easily changed.
|
||||
|
||||
## Compatibility
|
||||
|
||||
Compatible with version 1 of business glossary format.
|
||||
The source will be evolved as we publish newer versions of this format.
|
||||
Compatible with version 1 of business glossary format. The source will be evolved as newer versions of this format are published.
|
||||
@ -1,5 +1,6 @@
|
||||
import logging
|
||||
import pathlib
|
||||
import re
|
||||
import time
|
||||
from dataclasses import dataclass, field
|
||||
from typing import Any, Dict, Iterable, List, Optional, TypeVar, Union
|
||||
@ -118,17 +119,58 @@ class BusinessGlossaryConfig(DefaultConfig):
|
||||
return v
|
||||
|
||||
|
||||
def clean_url(text: str) -> str:
|
||||
"""
|
||||
Clean text for use in URLs by:
|
||||
1. Replacing spaces with hyphens
|
||||
2. Removing special characters (preserving hyphens and periods)
|
||||
3. Collapsing multiple hyphens and periods into single ones
|
||||
"""
|
||||
# Replace spaces with hyphens
|
||||
text = text.replace(" ", "-")
|
||||
# Remove special characters except hyphens and periods
|
||||
text = re.sub(r"[^a-zA-Z0-9\-.]", "", text)
|
||||
# Collapse multiple hyphens into one
|
||||
text = re.sub(r"-+", "-", text)
|
||||
# Collapse multiple periods into one
|
||||
text = re.sub(r"\.+", ".", text)
|
||||
# Remove leading/trailing hyphens and periods
|
||||
text = text.strip("-.")
|
||||
return text
|
||||
|
||||
|
||||
def create_id(path: List[str], default_id: Optional[str], enable_auto_id: bool) -> str:
|
||||
"""
|
||||
Create an ID for a glossary node or term.
|
||||
|
||||
Args:
|
||||
path: List of path components leading to this node/term
|
||||
default_id: Optional manually specified ID
|
||||
enable_auto_id: Whether to generate GUIDs
|
||||
"""
|
||||
if default_id is not None:
|
||||
return default_id # No need to create id from path as default_id is provided
|
||||
return default_id # Use explicitly provided ID
|
||||
|
||||
id_: str = ".".join(path)
|
||||
|
||||
if UrnEncoder.contains_extended_reserved_char(id_):
|
||||
enable_auto_id = True
|
||||
# Check for non-ASCII characters before cleaning
|
||||
if any(ord(c) > 127 for c in id_):
|
||||
return datahub_guid({"path": id_})
|
||||
|
||||
if enable_auto_id:
|
||||
# Generate GUID for auto_id mode
|
||||
id_ = datahub_guid({"path": id_})
|
||||
else:
|
||||
# Clean the URL for better readability when not using auto_id
|
||||
id_ = clean_url(id_)
|
||||
|
||||
# Force auto_id if the cleaned URL still contains problematic characters
|
||||
if UrnEncoder.contains_extended_reserved_char(id_):
|
||||
logger.warning(
|
||||
f"ID '{id_}' contains problematic characters after URL cleaning. Falling back to GUID generation for stability."
|
||||
)
|
||||
id_ = datahub_guid({"path": id_})
|
||||
|
||||
return id_
|
||||
|
||||
|
||||
|
||||
@ -2,7 +2,7 @@
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryNodeSnapshot": {
|
||||
"urn": "urn:li:glossaryNode:Custom URN Types",
|
||||
"urn": "urn:li:glossaryNode:Custom-URN-Types",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryNodeInfo": {
|
||||
@ -42,21 +42,21 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-dlsmlo",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-ugsgt3",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryTermSnapshot": {
|
||||
"urn": "urn:li:glossaryTerm:Custom URN Types.Mixed URN Types",
|
||||
"urn": "urn:li:glossaryTerm:Custom-URN-Types.Mixed-URN-Types",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryTermInfo": {
|
||||
"customProperties": {},
|
||||
"name": "Mixed URN Types",
|
||||
"definition": "Term with custom URN types",
|
||||
"parentNode": "urn:li:glossaryNode:Custom URN Types",
|
||||
"parentNode": "urn:li:glossaryNode:Custom-URN-Types",
|
||||
"termSource": "INTERNAL",
|
||||
"sourceRef": "DataHub",
|
||||
"sourceUrl": "https://github.com/datahub-project/datahub/"
|
||||
@ -88,21 +88,21 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-dlsmlo",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-ugsgt3",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryTermSnapshot": {
|
||||
"urn": "urn:li:glossaryTerm:Custom URN Types.Mixed Standard and URN",
|
||||
"urn": "urn:li:glossaryTerm:Custom-URN-Types.Mixed-Standard-and-URN",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryTermInfo": {
|
||||
"customProperties": {},
|
||||
"name": "Mixed Standard and URN",
|
||||
"definition": "Term with both standard and URN types",
|
||||
"parentNode": "urn:li:glossaryNode:Custom URN Types",
|
||||
"parentNode": "urn:li:glossaryNode:Custom-URN-Types",
|
||||
"termSource": "INTERNAL",
|
||||
"sourceRef": "DataHub",
|
||||
"sourceUrl": "https://github.com/datahub-project/datahub/"
|
||||
@ -133,13 +133,13 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-dlsmlo",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-ugsgt3",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryNode",
|
||||
"entityUrn": "urn:li:glossaryNode:Custom URN Types",
|
||||
"entityUrn": "urn:li:glossaryNode:Custom-URN-Types",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
@ -149,13 +149,13 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-dlsmlo",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-ugsgt3",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryTerm",
|
||||
"entityUrn": "urn:li:glossaryTerm:Custom URN Types.Mixed Standard and URN",
|
||||
"entityUrn": "urn:li:glossaryTerm:Custom-URN-Types.Mixed-Standard-and-URN",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
@ -165,13 +165,13 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-dlsmlo",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-ugsgt3",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryTerm",
|
||||
"entityUrn": "urn:li:glossaryTerm:Custom URN Types.Mixed URN Types",
|
||||
"entityUrn": "urn:li:glossaryTerm:Custom-URN-Types.Mixed-URN-Types",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
@ -181,7 +181,7 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-dlsmlo",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-ugsgt3",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
}
|
||||
|
||||
@ -21,6 +21,7 @@
|
||||
"type": "DEVELOPER"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
@ -32,7 +33,7 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-h7iopd",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
@ -58,7 +59,7 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-h7iopd",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
@ -88,6 +89,7 @@
|
||||
"type": "DEVELOPER"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
@ -99,7 +101,7 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-h7iopd",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
@ -125,7 +127,7 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-h7iopd",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
@ -155,6 +157,7 @@
|
||||
"type": "DEVELOPER"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
@ -166,13 +169,13 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-h7iopd",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryTerm",
|
||||
"entityUrn": "urn:li:glossaryTerm:Classification.Highly Confidential",
|
||||
"entityUrn": "urn:li:glossaryTerm:Classification.Highly-Confidential",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "domains",
|
||||
"aspect": {
|
||||
@ -184,14 +187,14 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-h7iopd",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryTermSnapshot": {
|
||||
"urn": "urn:li:glossaryTerm:Classification.Highly Confidential",
|
||||
"urn": "urn:li:glossaryTerm:Classification.Highly-Confidential",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryTermInfo": {
|
||||
@ -214,6 +217,7 @@
|
||||
"type": "DEVELOPER"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
@ -225,14 +229,14 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-h7iopd",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryNodeSnapshot": {
|
||||
"urn": "urn:li:glossaryNode:Personal Information",
|
||||
"urn": "urn:li:glossaryNode:Personal-Information",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryNodeInfo": {
|
||||
@ -249,6 +253,7 @@
|
||||
"type": "DATAOWNER"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
@ -260,21 +265,21 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-h7iopd",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryTermSnapshot": {
|
||||
"urn": "urn:li:glossaryTerm:Personal Information.Email",
|
||||
"urn": "urn:li:glossaryTerm:Personal-Information.Email",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryTermInfo": {
|
||||
"customProperties": {},
|
||||
"name": "Email",
|
||||
"definition": "An individual's email address",
|
||||
"parentNode": "urn:li:glossaryNode:Personal Information",
|
||||
"parentNode": "urn:li:glossaryNode:Personal-Information",
|
||||
"termSource": "INTERNAL",
|
||||
"sourceRef": "DataHub",
|
||||
"sourceUrl": "https://github.com/datahub-project/datahub/"
|
||||
@ -295,6 +300,7 @@
|
||||
"type": "DEVELOPER"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
@ -306,21 +312,21 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-h7iopd",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryTermSnapshot": {
|
||||
"urn": "urn:li:glossaryTerm:Personal Information.Address",
|
||||
"urn": "urn:li:glossaryTerm:Personal-Information.Address",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryTermInfo": {
|
||||
"customProperties": {},
|
||||
"name": "Address",
|
||||
"definition": "A physical address",
|
||||
"parentNode": "urn:li:glossaryNode:Personal Information",
|
||||
"parentNode": "urn:li:glossaryNode:Personal-Information",
|
||||
"termSource": "INTERNAL",
|
||||
"sourceRef": "DataHub",
|
||||
"sourceUrl": "https://github.com/datahub-project/datahub/"
|
||||
@ -334,6 +340,7 @@
|
||||
"type": "DATAOWNER"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
@ -345,21 +352,21 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-h7iopd",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryTermSnapshot": {
|
||||
"urn": "urn:li:glossaryTerm:Personal Information.Gender",
|
||||
"urn": "urn:li:glossaryTerm:Personal-Information.Gender",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryTermInfo": {
|
||||
"customProperties": {},
|
||||
"name": "Gender",
|
||||
"definition": "The gender identity of the individual",
|
||||
"parentNode": "urn:li:glossaryNode:Personal Information",
|
||||
"parentNode": "urn:li:glossaryNode:Personal-Information",
|
||||
"termSource": "INTERNAL",
|
||||
"sourceRef": "DataHub",
|
||||
"sourceUrl": "https://github.com/datahub-project/datahub/"
|
||||
@ -380,6 +387,7 @@
|
||||
"type": "DATAOWNER"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
@ -391,14 +399,14 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-h7iopd",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryNodeSnapshot": {
|
||||
"urn": "urn:li:glossaryNode:Clients And Accounts",
|
||||
"urn": "urn:li:glossaryNode:Clients-And-Accounts",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryNodeInfo": {
|
||||
@ -416,6 +424,7 @@
|
||||
"typeUrn": "urn:li:ownershipType:my_cutom_type"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
@ -427,21 +436,21 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-h7iopd",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryTermSnapshot": {
|
||||
"urn": "urn:li:glossaryTerm:Clients And Accounts.Account",
|
||||
"urn": "urn:li:glossaryTerm:Clients-And-Accounts.Account",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryTermInfo": {
|
||||
"customProperties": {},
|
||||
"name": "Account",
|
||||
"definition": "Container for records associated with a business arrangement for regular transactions and services",
|
||||
"parentNode": "urn:li:glossaryNode:Clients And Accounts",
|
||||
"parentNode": "urn:li:glossaryNode:Clients-And-Accounts",
|
||||
"termSource": "EXTERNAL",
|
||||
"sourceRef": "FIBO",
|
||||
"sourceUrl": "https://spec.edmcouncil.org/fibo/ontology/FBC/ProductsAndServices/ClientsAndAccounts/Account"
|
||||
@ -450,10 +459,10 @@
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryRelatedTerms": {
|
||||
"isRelatedTerms": [
|
||||
"urn:li:glossaryTerm:Classification.Highly Confidential"
|
||||
"urn:li:glossaryTerm:Classification.Highly-Confidential"
|
||||
],
|
||||
"hasRelatedTerms": [
|
||||
"urn:li:glossaryTerm:Clients And Accounts.Balance"
|
||||
"urn:li:glossaryTerm:Clients-And-Accounts.Balance"
|
||||
]
|
||||
}
|
||||
},
|
||||
@ -466,6 +475,7 @@
|
||||
"typeUrn": "urn:li:ownershipType:my_cutom_type"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
@ -477,21 +487,21 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-h7iopd",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryTermSnapshot": {
|
||||
"urn": "urn:li:glossaryTerm:Clients And Accounts.Balance",
|
||||
"urn": "urn:li:glossaryTerm:Clients-And-Accounts.Balance",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryTermInfo": {
|
||||
"customProperties": {},
|
||||
"name": "Balance",
|
||||
"definition": "Amount of money available or owed",
|
||||
"parentNode": "urn:li:glossaryNode:Clients And Accounts",
|
||||
"parentNode": "urn:li:glossaryNode:Clients-And-Accounts",
|
||||
"termSource": "EXTERNAL",
|
||||
"sourceRef": "FIBO",
|
||||
"sourceUrl": "https://spec.edmcouncil.org/fibo/ontology/FBC/ProductsAndServices/ClientsAndAccounts/Balance"
|
||||
@ -506,6 +516,7 @@
|
||||
"typeUrn": "urn:li:ownershipType:my_cutom_type"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
@ -517,7 +528,7 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-h7iopd",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
@ -541,6 +552,7 @@
|
||||
"type": "DEVELOPER"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
@ -552,14 +564,14 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-h7iopd",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryTermSnapshot": {
|
||||
"urn": "urn:li:glossaryTerm:4faf1eed790370f65942f2998a7993d6",
|
||||
"urn": "urn:li:glossaryTerm:KPIs.CSAT",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryTermInfo": {
|
||||
@ -580,6 +592,7 @@
|
||||
"type": "DEVELOPER"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
@ -591,7 +604,7 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-h7iopd",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
@ -607,13 +620,13 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-h7iopd",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryNode",
|
||||
"entityUrn": "urn:li:glossaryNode:Clients And Accounts",
|
||||
"entityUrn": "urn:li:glossaryNode:Clients-And-Accounts",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
@ -623,7 +636,7 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-h7iopd",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
@ -639,13 +652,13 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-h7iopd",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryNode",
|
||||
"entityUrn": "urn:li:glossaryNode:Personal Information",
|
||||
"entityUrn": "urn:li:glossaryNode:Personal-Information",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
@ -655,23 +668,7 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryTerm",
|
||||
"entityUrn": "urn:li:glossaryTerm:4faf1eed790370f65942f2998a7993d6",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
"json": {
|
||||
"removed": false
|
||||
}
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-h7iopd",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
@ -687,13 +684,13 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-h7iopd",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryTerm",
|
||||
"entityUrn": "urn:li:glossaryTerm:Classification.Highly Confidential",
|
||||
"entityUrn": "urn:li:glossaryTerm:Classification.Highly-Confidential",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
@ -703,7 +700,7 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-h7iopd",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
@ -719,13 +716,13 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-h7iopd",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryTerm",
|
||||
"entityUrn": "urn:li:glossaryTerm:Clients And Accounts.Account",
|
||||
"entityUrn": "urn:li:glossaryTerm:Clients-And-Accounts.Account",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
@ -735,13 +732,13 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-h7iopd",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryTerm",
|
||||
"entityUrn": "urn:li:glossaryTerm:Clients And Accounts.Balance",
|
||||
"entityUrn": "urn:li:glossaryTerm:Clients-And-Accounts.Balance",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
@ -751,13 +748,13 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-h7iopd",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryTerm",
|
||||
"entityUrn": "urn:li:glossaryTerm:Personal Information.Address",
|
||||
"entityUrn": "urn:li:glossaryTerm:KPIs.CSAT",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
@ -767,13 +764,13 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-h7iopd",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryTerm",
|
||||
"entityUrn": "urn:li:glossaryTerm:Personal Information.Email",
|
||||
"entityUrn": "urn:li:glossaryTerm:Personal-Information.Address",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
@ -783,13 +780,13 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-h7iopd",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryTerm",
|
||||
"entityUrn": "urn:li:glossaryTerm:Personal Information.Gender",
|
||||
"entityUrn": "urn:li:glossaryTerm:Personal-Information.Email",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
@ -799,7 +796,23 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-h7iopd",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryTerm",
|
||||
"entityUrn": "urn:li:glossaryTerm:Personal-Information.Gender",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
"json": {
|
||||
"removed": false
|
||||
}
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-h7iopd",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
}
|
||||
|
||||
@ -2,7 +2,7 @@
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryNodeSnapshot": {
|
||||
"urn": "urn:li:glossaryNode:Different Owner Types",
|
||||
"urn": "urn:li:glossaryNode:Different-Owner-Types",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryNodeInfo": {
|
||||
@ -47,21 +47,21 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-2te9j9",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-8vduoq",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryTermSnapshot": {
|
||||
"urn": "urn:li:glossaryTerm:Different Owner Types.Mixed Ownership",
|
||||
"urn": "urn:li:glossaryTerm:Different-Owner-Types.Mixed-Ownership",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryTermInfo": {
|
||||
"customProperties": {},
|
||||
"name": "Mixed Ownership",
|
||||
"definition": "Term with different owner types",
|
||||
"parentNode": "urn:li:glossaryNode:Different Owner Types",
|
||||
"parentNode": "urn:li:glossaryNode:Different-Owner-Types",
|
||||
"termSource": "INTERNAL",
|
||||
"sourceRef": "DataHub",
|
||||
"sourceUrl": "https://github.com/datahub-project/datahub/"
|
||||
@ -99,13 +99,13 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-2te9j9",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-8vduoq",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryNode",
|
||||
"entityUrn": "urn:li:glossaryNode:Different Owner Types",
|
||||
"entityUrn": "urn:li:glossaryNode:Different-Owner-Types",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
@ -115,13 +115,13 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-2te9j9",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-8vduoq",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryTerm",
|
||||
"entityUrn": "urn:li:glossaryTerm:Different Owner Types.Mixed Ownership",
|
||||
"entityUrn": "urn:li:glossaryTerm:Different-Owner-Types.Mixed-Ownership",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
@ -131,7 +131,7 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-2te9j9",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-8vduoq",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
}
|
||||
|
||||
@ -2,7 +2,7 @@
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryNodeSnapshot": {
|
||||
"urn": "urn:li:glossaryNode:Multiple Owners",
|
||||
"urn": "urn:li:glossaryNode:Multiple-Owners",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryNodeInfo": {
|
||||
@ -47,21 +47,21 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-0l66l7",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-iuvo6j",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryTermSnapshot": {
|
||||
"urn": "urn:li:glossaryTerm:Multiple Owners.Multiple Dev Owners",
|
||||
"urn": "urn:li:glossaryTerm:Multiple-Owners.Multiple-Dev-Owners",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryTermInfo": {
|
||||
"customProperties": {},
|
||||
"name": "Multiple Dev Owners",
|
||||
"definition": "Term owned by multiple developers",
|
||||
"parentNode": "urn:li:glossaryNode:Multiple Owners",
|
||||
"parentNode": "urn:li:glossaryNode:Multiple-Owners",
|
||||
"termSource": "INTERNAL",
|
||||
"sourceRef": "DataHub",
|
||||
"sourceUrl": "https://github.com/datahub-project/datahub/"
|
||||
@ -103,13 +103,13 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-0l66l7",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-iuvo6j",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryNode",
|
||||
"entityUrn": "urn:li:glossaryNode:Multiple Owners",
|
||||
"entityUrn": "urn:li:glossaryNode:Multiple-Owners",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
@ -119,13 +119,13 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-0l66l7",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-iuvo6j",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryTerm",
|
||||
"entityUrn": "urn:li:glossaryTerm:Multiple Owners.Multiple Dev Owners",
|
||||
"entityUrn": "urn:li:glossaryTerm:Multiple-Owners.Multiple-Dev-Owners",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
@ -135,7 +135,7 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-0l66l7",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-iuvo6j",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
}
|
||||
|
||||
@ -2,7 +2,7 @@
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryNodeSnapshot": {
|
||||
"urn": "urn:li:glossaryNode:Single Owner Types",
|
||||
"urn": "urn:li:glossaryNode:Single-Owner-Types",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryNodeInfo": {
|
||||
@ -31,21 +31,21 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-ruwyic",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-bx72oe",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryTermSnapshot": {
|
||||
"urn": "urn:li:glossaryTerm:Single Owner Types.Developer Owned",
|
||||
"urn": "urn:li:glossaryTerm:Single-Owner-Types.Developer-Owned",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryTermInfo": {
|
||||
"customProperties": {},
|
||||
"name": "Developer Owned",
|
||||
"definition": "Term owned by developer",
|
||||
"parentNode": "urn:li:glossaryNode:Single Owner Types",
|
||||
"parentNode": "urn:li:glossaryNode:Single-Owner-Types",
|
||||
"termSource": "INTERNAL",
|
||||
"sourceRef": "DataHub",
|
||||
"sourceUrl": "https://github.com/datahub-project/datahub/"
|
||||
@ -71,21 +71,21 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-ruwyic",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-bx72oe",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryTermSnapshot": {
|
||||
"urn": "urn:li:glossaryTerm:Single Owner Types.Data Owner Owned",
|
||||
"urn": "urn:li:glossaryTerm:Single-Owner-Types.Data-Owner-Owned",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryTermInfo": {
|
||||
"customProperties": {},
|
||||
"name": "Data Owner Owned",
|
||||
"definition": "Term owned by data owner",
|
||||
"parentNode": "urn:li:glossaryNode:Single Owner Types",
|
||||
"parentNode": "urn:li:glossaryNode:Single-Owner-Types",
|
||||
"termSource": "INTERNAL",
|
||||
"sourceRef": "DataHub",
|
||||
"sourceUrl": "https://github.com/datahub-project/datahub/"
|
||||
@ -111,21 +111,21 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-ruwyic",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-bx72oe",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryTermSnapshot": {
|
||||
"urn": "urn:li:glossaryTerm:Single Owner Types.Producer Owned",
|
||||
"urn": "urn:li:glossaryTerm:Single-Owner-Types.Producer-Owned",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryTermInfo": {
|
||||
"customProperties": {},
|
||||
"name": "Producer Owned",
|
||||
"definition": "Term owned by producer",
|
||||
"parentNode": "urn:li:glossaryNode:Single Owner Types",
|
||||
"parentNode": "urn:li:glossaryNode:Single-Owner-Types",
|
||||
"termSource": "INTERNAL",
|
||||
"sourceRef": "DataHub",
|
||||
"sourceUrl": "https://github.com/datahub-project/datahub/"
|
||||
@ -151,21 +151,21 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-ruwyic",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-bx72oe",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryTermSnapshot": {
|
||||
"urn": "urn:li:glossaryTerm:Single Owner Types.Stakeholder Owned",
|
||||
"urn": "urn:li:glossaryTerm:Single-Owner-Types.Stakeholder-Owned",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryTermInfo": {
|
||||
"customProperties": {},
|
||||
"name": "Stakeholder Owned",
|
||||
"definition": "Term owned by stakeholder",
|
||||
"parentNode": "urn:li:glossaryNode:Single Owner Types",
|
||||
"parentNode": "urn:li:glossaryNode:Single-Owner-Types",
|
||||
"termSource": "INTERNAL",
|
||||
"sourceRef": "DataHub",
|
||||
"sourceUrl": "https://github.com/datahub-project/datahub/"
|
||||
@ -191,13 +191,13 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-ruwyic",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-bx72oe",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryNode",
|
||||
"entityUrn": "urn:li:glossaryNode:Single Owner Types",
|
||||
"entityUrn": "urn:li:glossaryNode:Single-Owner-Types",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
@ -207,13 +207,13 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-ruwyic",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-bx72oe",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryTerm",
|
||||
"entityUrn": "urn:li:glossaryTerm:Single Owner Types.Data Owner Owned",
|
||||
"entityUrn": "urn:li:glossaryTerm:Single-Owner-Types.Data-Owner-Owned",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
@ -223,13 +223,13 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-ruwyic",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-bx72oe",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryTerm",
|
||||
"entityUrn": "urn:li:glossaryTerm:Single Owner Types.Developer Owned",
|
||||
"entityUrn": "urn:li:glossaryTerm:Single-Owner-Types.Developer-Owned",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
@ -239,13 +239,13 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-ruwyic",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-bx72oe",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryTerm",
|
||||
"entityUrn": "urn:li:glossaryTerm:Single Owner Types.Producer Owned",
|
||||
"entityUrn": "urn:li:glossaryTerm:Single-Owner-Types.Producer-Owned",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
@ -255,13 +255,13 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-ruwyic",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-bx72oe",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryTerm",
|
||||
"entityUrn": "urn:li:glossaryTerm:Single Owner Types.Stakeholder Owned",
|
||||
"entityUrn": "urn:li:glossaryTerm:Single-Owner-Types.Stakeholder-Owned",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
@ -271,7 +271,7 @@
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-ruwyic",
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-bx72oe",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
}
|
||||
|
||||
@ -4,7 +4,6 @@ import pytest
 from freezegun import freeze_time
 
 from datahub.ingestion.run.pipeline import Pipeline
-from datahub.ingestion.source.metadata import business_glossary
 from tests.test_helpers import mce_helpers
 
 FROZEN_TIME = "2020-04-14 07:00:00"
@ -200,6 +199,31 @@ def test_custom_ownership_urns(
 
 
 @freeze_time(FROZEN_TIME)
-def test_auto_id_creation_on_reserved_char():
-    id_: str = business_glossary.create_id(["pii", "secure % password"], None, False)
-    assert id_ == "24baf9389cc05c162c7148c96314d733"
+@pytest.mark.integration
+def test_url_cleaning(
+    mock_datahub_graph_instance,
+    pytestconfig,
+    tmp_path,
+    mock_time,
+):
+    """Test URL cleaning functionality when auto_id is disabled"""
+    test_resources_dir = pytestconfig.rootpath / "tests/integration/business-glossary"
+    output_mces_path: str = f"{tmp_path}/url_cleaning_events.json"
+    golden_mces_path: str = f"{test_resources_dir}/url_cleaning_events_golden.json"
+
+    pipeline = Pipeline.create(
+        get_default_recipe(
+            glossary_yml_file_path=f"{test_resources_dir}/url_cleaning_glossary.yml",
+            event_output_file_path=output_mces_path,
+            enable_auto_id=False,
+        )
+    )
+    pipeline.ctx.graph = mock_datahub_graph_instance
+    pipeline.run()
+    pipeline.raise_from_status()
+
+    mce_helpers.check_golden_file(
+        pytestconfig,
+        output_path=output_mces_path,
+        golden_path=golden_mces_path,
+    )
|
||||
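The `test_url_cleaning` test above compares pipeline output against golden events in which each term name maps to a cleaned URN (for example, `Special@#$Characters!` under `URL Testing` becomes `urn:li:glossaryTerm:URL-Testing.SpecialCharacters`). The sketch below shows one way such cleaning could work. It is an illustration only, not the DataHub implementation: the helper names `clean_id_part` and `make_term_urn` are hypothetical, and the exact set of retained characters is an assumption inferred from the golden values.

```python
import re
from typing import List


def clean_id_part(name: str) -> str:
    """Hypothetical helper: turn one display name into a URL-safe ID segment."""
    # Replace each run of whitespace with a single hyphen.
    cleaned = re.sub(r"\s+", "-", name.strip())
    # Assumption: keep only ASCII letters, digits, hyphens, and periods.
    cleaned = re.sub(r"[^A-Za-z0-9.-]", "", cleaned)
    # Collapse repeated hyphens and trim any left at the edges.
    return re.sub(r"-{2,}", "-", cleaned).strip("-")


def make_term_urn(path: List[str]) -> str:
    """Hypothetical helper: join cleaned segments with periods into a term URN."""
    return "urn:li:glossaryTerm:" + ".".join(clean_id_part(p) for p in path)


if __name__ == "__main__":
    for name in ["Basic Term With Spaces", "Special@#$Characters!", "Numbers 123"]:
        print(name, "->", make_term_urn(["URL Testing", name]))
```

Case is preserved and periods are kept because they act as path separators, so a name such as `Term.With.Special-Chars` passes through unchanged in this sketch.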
@ -0,0 +1,446 @@
|
||||
[
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryNodeSnapshot": {
|
||||
"urn": "urn:li:glossaryNode:URL-Testing",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryNodeInfo": {
|
||||
"customProperties": {},
|
||||
"definition": "Testing URL cleaning functionality",
|
||||
"name": "URL Testing"
|
||||
}
|
||||
},
|
||||
{
|
||||
"com.linkedin.pegasus2avro.common.Ownership": {
|
||||
"owners": [
|
||||
{
|
||||
"owner": "urn:li:corpuser:mjames",
|
||||
"type": "DEVELOPER"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
}
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-4alqef",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryTermSnapshot": {
|
||||
"urn": "urn:li:glossaryTerm:URL-Testing.Basic-Term-With-Spaces",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryTermInfo": {
|
||||
"customProperties": {},
|
||||
"name": "Basic Term With Spaces",
|
||||
"definition": "Testing basic space replacement",
|
||||
"parentNode": "urn:li:glossaryNode:URL-Testing",
|
||||
"termSource": "INTERNAL",
|
||||
"sourceRef": "DataHub",
|
||||
"sourceUrl": "https://github.com/datahub-project/datahub/"
|
||||
}
|
||||
},
|
||||
{
|
||||
"com.linkedin.pegasus2avro.common.Ownership": {
|
||||
"owners": [
|
||||
{
|
||||
"owner": "urn:li:corpuser:mjames",
|
||||
"type": "DEVELOPER"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
}
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-4alqef",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryTermSnapshot": {
|
||||
"urn": "urn:li:glossaryTerm:URL-Testing.SpecialCharacters",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryTermInfo": {
|
||||
"customProperties": {},
|
||||
"name": "Special@#$Characters!",
|
||||
"definition": "Testing special character removal",
|
||||
"parentNode": "urn:li:glossaryNode:URL-Testing",
|
||||
"termSource": "INTERNAL",
|
||||
"sourceRef": "DataHub",
|
||||
"sourceUrl": "https://github.com/datahub-project/datahub/"
|
||||
}
|
||||
},
|
||||
{
|
||||
"com.linkedin.pegasus2avro.common.Ownership": {
|
||||
"owners": [
|
||||
{
|
||||
"owner": "urn:li:corpuser:mjames",
|
||||
"type": "DEVELOPER"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
}
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-4alqef",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryTermSnapshot": {
|
||||
"urn": "urn:li:glossaryTerm:URL-Testing.MixedCase-Term",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryTermInfo": {
|
||||
"customProperties": {},
|
||||
"name": "MixedCase Term",
|
||||
"definition": "Testing case preservation",
|
||||
"parentNode": "urn:li:glossaryNode:URL-Testing",
|
||||
"termSource": "INTERNAL",
|
||||
"sourceRef": "DataHub",
|
||||
"sourceUrl": "https://github.com/datahub-project/datahub/"
|
||||
}
|
||||
},
|
||||
{
|
||||
"com.linkedin.pegasus2avro.common.Ownership": {
|
||||
"owners": [
|
||||
{
|
||||
"owner": "urn:li:corpuser:mjames",
|
||||
"type": "DEVELOPER"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
}
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-4alqef",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryTermSnapshot": {
|
||||
"urn": "urn:li:glossaryTerm:URL-Testing.Multiple-Spaces",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryTermInfo": {
|
||||
"customProperties": {},
|
||||
"name": "Multiple Spaces",
|
||||
"definition": "Testing multiple space handling",
|
||||
"parentNode": "urn:li:glossaryNode:URL-Testing",
|
||||
"termSource": "INTERNAL",
|
||||
"sourceRef": "DataHub",
|
||||
"sourceUrl": "https://github.com/datahub-project/datahub/"
|
||||
}
|
||||
},
|
||||
{
|
||||
"com.linkedin.pegasus2avro.common.Ownership": {
|
||||
"owners": [
|
||||
{
|
||||
"owner": "urn:li:corpuser:mjames",
|
||||
"type": "DEVELOPER"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
}
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-4alqef",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryTermSnapshot": {
|
||||
"urn": "urn:li:glossaryTerm:URL-Testing.Term.With.Special-Chars",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryTermInfo": {
|
||||
"customProperties": {},
|
||||
"name": "Term.With.Special-Chars",
|
||||
"definition": "Testing mixed special characters",
|
||||
"parentNode": "urn:li:glossaryNode:URL-Testing",
|
||||
"termSource": "INTERNAL",
|
||||
"sourceRef": "DataHub",
|
||||
"sourceUrl": "https://github.com/datahub-project/datahub/"
|
||||
}
|
||||
},
|
||||
{
|
||||
"com.linkedin.pegasus2avro.common.Ownership": {
|
||||
"owners": [
|
||||
{
|
||||
"owner": "urn:li:corpuser:mjames",
|
||||
"type": "DEVELOPER"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
}
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-4alqef",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryTermSnapshot": {
|
||||
"urn": "urn:li:glossaryTerm:URL-Testing.Special-At-Start",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryTermInfo": {
|
||||
"customProperties": {},
|
||||
"name": "@#$Special At Start",
|
||||
"definition": "Testing leading special characters",
|
||||
"parentNode": "urn:li:glossaryNode:URL-Testing",
|
||||
"termSource": "INTERNAL",
|
||||
"sourceRef": "DataHub",
|
||||
"sourceUrl": "https://github.com/datahub-project/datahub/"
|
||||
}
|
||||
},
|
||||
{
|
||||
"com.linkedin.pegasus2avro.common.Ownership": {
|
||||
"owners": [
|
||||
{
|
||||
"owner": "urn:li:corpuser:mjames",
|
||||
"type": "DEVELOPER"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
}
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-4alqef",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryTermSnapshot": {
|
||||
"urn": "urn:li:glossaryTerm:URL-Testing.Numbers-123",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryTermInfo": {
|
||||
"customProperties": {},
|
||||
"name": "Numbers 123",
|
||||
"definition": "Testing numbers in term names",
|
||||
"parentNode": "urn:li:glossaryNode:URL-Testing",
|
||||
"termSource": "INTERNAL",
|
||||
"sourceRef": "DataHub",
|
||||
"sourceUrl": "https://github.com/datahub-project/datahub/"
|
||||
}
|
||||
},
|
||||
{
|
||||
"com.linkedin.pegasus2avro.common.Ownership": {
|
||||
"owners": [
|
||||
{
|
||||
"owner": "urn:li:corpuser:mjames",
|
||||
"type": "DEVELOPER"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
}
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-4alqef",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryNode",
|
||||
"entityUrn": "urn:li:glossaryNode:URL-Testing",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
"json": {
|
||||
"removed": false
|
||||
}
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-4alqef",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryTerm",
|
||||
"entityUrn": "urn:li:glossaryTerm:URL-Testing.Basic-Term-With-Spaces",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
"json": {
|
||||
"removed": false
|
||||
}
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-4alqef",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryTerm",
|
||||
"entityUrn": "urn:li:glossaryTerm:URL-Testing.MixedCase-Term",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
"json": {
|
||||
"removed": false
|
||||
}
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-4alqef",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryTerm",
|
||||
"entityUrn": "urn:li:glossaryTerm:URL-Testing.Multiple-Spaces",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
"json": {
|
||||
"removed": false
|
||||
}
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-4alqef",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryTerm",
|
||||
"entityUrn": "urn:li:glossaryTerm:URL-Testing.Numbers-123",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
"json": {
|
||||
"removed": false
|
||||
}
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-4alqef",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryTerm",
|
||||
"entityUrn": "urn:li:glossaryTerm:URL-Testing.Special-At-Start",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
"json": {
|
||||
"removed": false
|
||||
}
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-4alqef",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryTerm",
|
||||
"entityUrn": "urn:li:glossaryTerm:URL-Testing.SpecialCharacters",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
"json": {
|
||||
"removed": false
|
||||
}
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-4alqef",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryTerm",
|
||||
"entityUrn": "urn:li:glossaryTerm:URL-Testing.Term.With.Special-Chars",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
"json": {
|
||||
"removed": false
|
||||
}
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1586847600000,
|
||||
"runId": "datahub-business-glossary-2020_04_14-07_00_00-4alqef",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
}
|
||||
]
|
||||
@ -0,0 +1,31 @@
# tests/integration/business-glossary/url_cleaning_glossary.yml
version: "1"
source: DataHub
owners:
  users:
    - mjames
url: "https://github.com/datahub-project/datahub/"
nodes:
  - name: "URL Testing"
    description: "Testing URL cleaning functionality"
    terms:
      - name: "Basic Term With Spaces"
        description: "Testing basic space replacement"

      - name: "Special@#$Characters!"
        description: "Testing special character removal"

      - name: "MixedCase Term"
        description: "Testing case preservation"

      - name: "Multiple Spaces"
        description: "Testing multiple space handling"

      - name: "Term.With.Special-Chars"
        description: "Testing mixed special characters"

      - name: "@#$Special At Start"
        description: "Testing leading special characters"

      - name: "Numbers 123"
        description: "Testing numbers in term names"
@ -21,6 +21,7 @@
|
||||
"type": "DEVELOPER"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
@ -88,6 +89,7 @@
|
||||
"type": "DEVELOPER"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
@ -155,6 +157,7 @@
|
||||
"type": "DEVELOPER"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
@ -172,7 +175,7 @@
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryTerm",
|
||||
"entityUrn": "urn:li:glossaryTerm:Classification.Highly Confidential",
|
||||
"entityUrn": "urn:li:glossaryTerm:Classification.Highly-Confidential",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "domains",
|
||||
"aspect": {
|
||||
@ -191,7 +194,7 @@
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryTermSnapshot": {
|
||||
"urn": "urn:li:glossaryTerm:Classification.Highly Confidential",
|
||||
"urn": "urn:li:glossaryTerm:Classification.Highly-Confidential",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryTermInfo": {
|
||||
@ -214,6 +217,7 @@
|
||||
"type": "DEVELOPER"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
@ -232,7 +236,7 @@
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryNodeSnapshot": {
|
||||
"urn": "urn:li:glossaryNode:Personal Information",
|
||||
"urn": "urn:li:glossaryNode:Personal-Information",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryNodeInfo": {
|
||||
@ -249,6 +253,7 @@
|
||||
"type": "DEVELOPER"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
@ -267,14 +272,14 @@
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryTermSnapshot": {
|
||||
"urn": "urn:li:glossaryTerm:Personal Information.Email",
|
||||
"urn": "urn:li:glossaryTerm:Personal-Information.Email",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryTermInfo": {
|
||||
"customProperties": {},
|
||||
"name": "Email",
|
||||
"definition": "An individual's email address",
|
||||
"parentNode": "urn:li:glossaryNode:Personal Information",
|
||||
"parentNode": "urn:li:glossaryNode:Personal-Information",
|
||||
"termSource": "INTERNAL",
|
||||
"sourceRef": "DataHub",
|
||||
"sourceUrl": "https://github.com/datahub-project/datahub/"
|
||||
@ -295,6 +300,7 @@
|
||||
"type": "DEVELOPER"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
@ -313,14 +319,14 @@
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryTermSnapshot": {
|
||||
"urn": "urn:li:glossaryTerm:Personal Information.Address",
|
||||
"urn": "urn:li:glossaryTerm:Personal-Information.Address",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryTermInfo": {
|
||||
"customProperties": {},
|
||||
"name": "Address",
|
||||
"definition": "A physical address",
|
||||
"parentNode": "urn:li:glossaryNode:Personal Information",
|
||||
"parentNode": "urn:li:glossaryNode:Personal-Information",
|
||||
"termSource": "INTERNAL",
|
||||
"sourceRef": "DataHub",
|
||||
"sourceUrl": "https://github.com/datahub-project/datahub/"
|
||||
@ -334,6 +340,7 @@
|
||||
"type": "DEVELOPER"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
@ -352,14 +359,14 @@
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryTermSnapshot": {
|
||||
"urn": "urn:li:glossaryTerm:Personal Information.Gender",
|
||||
"urn": "urn:li:glossaryTerm:Personal-Information.Gender",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryTermInfo": {
|
||||
"customProperties": {},
|
||||
"name": "Gender",
|
||||
"definition": "The gender identity of the individual",
|
||||
"parentNode": "urn:li:glossaryNode:Personal Information",
|
||||
"parentNode": "urn:li:glossaryNode:Personal-Information",
|
||||
"termSource": "INTERNAL",
|
||||
"sourceRef": "DataHub",
|
||||
"sourceUrl": "https://github.com/datahub-project/datahub/"
|
||||
@ -380,6 +387,7 @@
|
||||
"type": "DEVELOPER"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
@ -398,7 +406,7 @@
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryNodeSnapshot": {
|
||||
"urn": "urn:li:glossaryNode:Clients And Accounts",
|
||||
"urn": "urn:li:glossaryNode:Clients-And-Accounts",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryNodeInfo": {
|
||||
@ -415,6 +423,7 @@
|
||||
"type": "DEVELOPER"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
@ -433,14 +442,14 @@
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryTermSnapshot": {
|
||||
"urn": "urn:li:glossaryTerm:Clients And Accounts.Account",
|
||||
"urn": "urn:li:glossaryTerm:Clients-And-Accounts.Account",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryTermInfo": {
|
||||
"customProperties": {},
|
||||
"name": "Account",
|
||||
"definition": "Container for records associated with a business arrangement for regular transactions and services",
|
||||
"parentNode": "urn:li:glossaryNode:Clients And Accounts",
|
||||
"parentNode": "urn:li:glossaryNode:Clients-And-Accounts",
|
||||
"termSource": "EXTERNAL",
|
||||
"sourceRef": "FIBO",
|
||||
"sourceUrl": "https://spec.edmcouncil.org/fibo/ontology/FBC/ProductsAndServices/ClientsAndAccounts/Account"
|
||||
@ -449,10 +458,10 @@
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryRelatedTerms": {
|
||||
"isRelatedTerms": [
|
||||
"urn:li:glossaryTerm:Classification.Highly Confidential"
|
||||
"urn:li:glossaryTerm:Classification.Highly-Confidential"
|
||||
],
|
||||
"hasRelatedTerms": [
|
||||
"urn:li:glossaryTerm:Clients And Accounts.Balance"
|
||||
"urn:li:glossaryTerm:Clients-And-Accounts.Balance"
|
||||
]
|
||||
}
|
||||
},
|
||||
@ -464,6 +473,7 @@
|
||||
"type": "DEVELOPER"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
@ -482,14 +492,14 @@
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryTermSnapshot": {
|
||||
"urn": "urn:li:glossaryTerm:Clients And Accounts.Balance",
|
||||
"urn": "urn:li:glossaryTerm:Clients-And-Accounts.Balance",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryTermInfo": {
|
||||
"customProperties": {},
|
||||
"name": "Balance",
|
||||
"definition": "Amount of money available or owed",
|
||||
"parentNode": "urn:li:glossaryNode:Clients And Accounts",
|
||||
"parentNode": "urn:li:glossaryNode:Clients-And-Accounts",
|
||||
"termSource": "EXTERNAL",
|
||||
"sourceRef": "FIBO",
|
||||
"sourceUrl": "https://spec.edmcouncil.org/fibo/ontology/FBC/ProductsAndServices/ClientsAndAccounts/Balance"
|
||||
@ -503,6 +513,7 @@
|
||||
"type": "DEVELOPER"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
@ -538,6 +549,7 @@
|
||||
"type": "DEVELOPER"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
@ -556,7 +568,7 @@
|
||||
{
|
||||
"proposedSnapshot": {
|
||||
"com.linkedin.pegasus2avro.metadata.snapshot.GlossaryTermSnapshot": {
|
||||
"urn": "urn:li:glossaryTerm:4faf1eed790370f65942f2998a7993d6",
|
||||
"urn": "urn:li:glossaryTerm:KPIs.CSAT",
|
||||
"aspects": [
|
||||
{
|
||||
"com.linkedin.pegasus2avro.glossary.GlossaryTermInfo": {
|
||||
@ -577,6 +589,7 @@
|
||||
"type": "DEVELOPER"
|
||||
}
|
||||
],
|
||||
"ownerTypes": {},
|
||||
"lastModified": {
|
||||
"time": 0,
|
||||
"actor": "urn:li:corpuser:unknown"
|
||||
@ -610,7 +623,7 @@
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryNode",
|
||||
"entityUrn": "urn:li:glossaryNode:Clients And Accounts",
|
||||
"entityUrn": "urn:li:glossaryNode:Clients-And-Accounts",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
@ -642,23 +655,7 @@
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryNode",
|
||||
"entityUrn": "urn:li:glossaryNode:Personal Information",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
"json": {
|
||||
"removed": false
|
||||
}
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1629795600000,
|
||||
"runId": "remote-4",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryTerm",
|
||||
"entityUrn": "urn:li:glossaryTerm:4faf1eed790370f65942f2998a7993d6",
|
||||
"entityUrn": "urn:li:glossaryNode:Personal-Information",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
@ -690,7 +687,7 @@
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryTerm",
|
||||
"entityUrn": "urn:li:glossaryTerm:Classification.Highly Confidential",
|
||||
"entityUrn": "urn:li:glossaryTerm:Classification.Highly-Confidential",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
@ -722,7 +719,7 @@
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryTerm",
|
||||
"entityUrn": "urn:li:glossaryTerm:Clients And Accounts.Account",
|
||||
"entityUrn": "urn:li:glossaryTerm:Clients-And-Accounts.Account",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
@ -738,7 +735,7 @@
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryTerm",
|
||||
"entityUrn": "urn:li:glossaryTerm:Clients And Accounts.Balance",
|
||||
"entityUrn": "urn:li:glossaryTerm:Clients-And-Accounts.Balance",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
@ -754,7 +751,7 @@
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryTerm",
|
||||
"entityUrn": "urn:li:glossaryTerm:Personal Information.Address",
|
||||
"entityUrn": "urn:li:glossaryTerm:KPIs.CSAT",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
@ -770,7 +767,7 @@
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryTerm",
|
||||
"entityUrn": "urn:li:glossaryTerm:Personal Information.Email",
|
||||
"entityUrn": "urn:li:glossaryTerm:Personal-Information.Address",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
@ -786,7 +783,23 @@
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryTerm",
|
||||
"entityUrn": "urn:li:glossaryTerm:Personal Information.Gender",
|
||||
"entityUrn": "urn:li:glossaryTerm:Personal-Information.Email",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
"json": {
|
||||
"removed": false
|
||||
}
|
||||
},
|
||||
"systemMetadata": {
|
||||
"lastObserved": 1629795600000,
|
||||
"runId": "remote-4",
|
||||
"lastRunId": "no-run-id-provided"
|
||||
}
|
||||
},
|
||||
{
|
||||
"entityType": "glossaryTerm",
|
||||
"entityUrn": "urn:li:glossaryTerm:Personal-Information.Gender",
|
||||
"changeType": "UPSERT",
|
||||
"aspectName": "status",
|
||||
"aspect": {
|
||||
|
||||
85
metadata-ingestion/tests/unit/test_business_glossary.py
Normal file
@ -0,0 +1,85 @@
from datahub.ingestion.source.metadata.business_glossary import clean_url, create_id


def test_clean_url():
    """Test the clean_url function with various input cases"""
    test_cases = [
        ("Basic Term", "Basic-Term"),
        ("Term With Spaces", "Term-With-Spaces"),
        ("Special@#$Characters!", "SpecialCharacters"),
        ("MixedCase Term", "MixedCase-Term"),
        ("Multiple Spaces", "Multiple-Spaces"),
        ("Term-With-Hyphens", "Term-With-Hyphens"),
        ("Term.With.Dots", "Term.With.Dots"),  # Preserve periods
        ("Term_With_Underscores", "TermWithUnderscores"),
        ("123 Numeric Term", "123-Numeric-Term"),
        ("@#$Special At Start", "Special-At-Start"),
        ("-Leading-Trailing-", "Leading-Trailing"),
        ("Multiple...Periods", "Multiple.Periods"),  # Test multiple periods
        ("Mixed-Hyphens.Periods", "Mixed-Hyphens.Periods"),  # Test mixed separators
    ]

    for input_str, expected in test_cases:
        result = clean_url(input_str)
        assert result == expected, (
            f"Expected '{expected}' for input '{input_str}', got '{result}'"
        )


def test_clean_url_edge_cases():
    """Test clean_url function with edge cases"""
    test_cases = [
        ("", ""),  # Empty string
        (" ", ""),  # Single space
        ("   ", ""),  # Multiple spaces
        ("@#$%", ""),  # Only special characters
        ("A", "A"),  # Single character
        ("A B", "A-B"),  # Two characters with space
        ("A.B", "A.B"),  # Period separator
        ("...", ""),  # Only periods
        (".Leading.Trailing.", "Leading.Trailing"),  # Leading/trailing periods
    ]

    for input_str, expected in test_cases:
        result = clean_url(input_str)
        assert result == expected, (
            f"Expected '{expected}' for input '{input_str}', got '{result}'"
        )


def test_create_id_url_cleaning():
    """Test create_id function's URL cleaning behavior"""
    # Test basic URL cleaning
    id_ = create_id(["pii", "secure % password"], None, False)
    assert id_ == "pii.secure-password"

    # Test with multiple path components
    id_ = create_id(["Term One", "Term Two", "Term Three"], None, False)
    assert id_ == "Term-One.Term-Two.Term-Three"

    # Test with path components containing periods
    id_ = create_id(["Term.One", "Term.Two"], None, False)
    assert id_ == "Term.One.Term.Two"


def test_create_id_with_special_chars():
    """Test create_id function's handling of special characters"""
    # Test with non-ASCII characters (should trigger auto_id)
    id_ = create_id(["pii", "secure パスワード"], None, False)
    assert len(id_) == 32  # GUID length
    assert id_.isalnum()  # Should only contain alphanumeric characters

    # Test with characters that aren't periods or hyphens
    id_ = create_id(["test", "special@#$chars"], None, False)
    assert id_ == "test.specialchars"


def test_create_id_with_default():
    """Test create_id function with default_id parameter"""
    # Test that default_id is respected
    id_ = create_id(["any", "path"], "custom-id", False)
    assert id_ == "custom-id"

    # Test with URN as default_id
    id_ = create_id(["any", "path"], "urn:li:glossaryTerm:custom-id", False)
    assert id_ == "urn:li:glossaryTerm:custom-id"
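The cases in `test_clean_url` and `test_clean_url_edge_cases` pin down the cleaning rules fairly tightly: spaces become hyphens, anything other than ASCII letters, digits, hyphens, and periods is dropped, repeated separators collapse, and leading or trailing separators are trimmed. The following is a minimal sketch that satisfies those cases. It is not taken from `business_glossary.py`; the name `clean_url_sketch` is hypothetical, and the real `clean_url` may be written differently.

```python
import re


def clean_url_sketch(text: str) -> str:
    """Hypothetical stand-in for clean_url; NOT the DataHub implementation."""
    # Spaces become hyphens so word boundaries survive in the ID.
    text = text.replace(" ", "-")
    # Drop everything except ASCII letters, digits, hyphens, and periods.
    text = re.sub(r"[^A-Za-z0-9.-]", "", text)
    # Collapse runs of hyphens or periods left behind by removed characters.
    text = re.sub(r"-+", "-", text)
    text = re.sub(r"\.+", ".", text)
    # Trim separators at either end.
    return text.strip("-.")
```

Under these assumptions, `clean_url_sketch("secure % password")` yields `"secure-password"`, matching the path component expected in `test_create_id_url_cleaning`. The sketch covers only the cleaning step: the non-ASCII case in `test_create_id_with_special_chars` expects `create_id` to fall back to a 32-character GUID, which is outside what this function models.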