mirror of
				https://github.com/datahub-project/datahub.git
				synced 2025-11-03 20:27:50 +00:00 
			
		
		
		
	
		
			
				
	
	
		
			47 lines
		
	
	
		
			2.7 KiB
		
	
	
	
		
			Markdown
		
	
	
	
	
	
			
		
		
	
	
			47 lines
		
	
	
		
			2.7 KiB
		
	
	
	
		
			Markdown
		
	
	
	
	
	
# What is a metadata aspect?
 | 
						||
 | 
						||
A metadata aspect is a structured document, or more precisely a `record` in [PDL](https://linkedin.github.io/rest.li/pdl_schema),
 | 
						||
that represents a specific kind of metadata (e.g. ownership, schema, statistics, upstreams).
 | 
						||
A metadata aspect on its own has no meaning (e.g. ownership for what?) and must be associated with a particular entity (e.g. ownership for PageViewEvent).
 | 
						||
We purposely not to impose any model requirement on metadata aspects, as each aspect is expected to differ significantly.
 | 
						||
 | 
						||
Metadata aspects are immutable by design, i.e. every change to a particular aspect results in a [new version](../advanced/aspect-versioning.md) created.
 | 
						||
An optional retention policy can be applied such that X number of most recent versions will be retained after each update.
 | 
						||
Setting X to 1 effectively means the metadata aspect is non-versioned.
 | 
						||
It is also possible to apply the retention based on time, e.g. only keeps the metadata changes from the past 30 days.
 | 
						||
 | 
						||
While a metadata aspect can be arbitrary complex document with multiple levels of nesting, it is sometimes desirable to break a monolithic aspect into smaller independent aspects.
 | 
						||
This will provide the benefits of:
 | 
						||
 | 
						||
1. **Faster read/write**: As metadata aspects are immutable, every "update" will lead to the writing the entire large aspect back to the underlying data store.
 | 
						||
   Likewise, readers will need to retrieve the entire aspect even if it’s only interested in a small part of it.
 | 
						||
2. **Ability to independently version different aspects**: For example, one may like to get the change history of all the "ownership metadata" independent of the changes made to "schema metadata" for a dataset.
 | 
						||
3. **Help with rest.li endpoint modeling**: While it’s not required to have 1:1 mapping between rest.li endpoints and metadata aspects,
 | 
						||
   it’d follow this pattern naturally, which means one will end up with smaller, more modular, endpoints instead of giant ones.
 | 
						||
 | 
						||
Here’s an example metadata aspect. Note that the `admin` and `members` fields are implicitly conveying a relationship between `Group` entity & `User` entity.
 | 
						||
It’s very natural to save such relationships as URNs in a metadata aspect.
 | 
						||
The [relationship](relationship.md) section explains how this relationship can be explicitly extracted and modelled.
 | 
						||
 | 
						||
```
 | 
						||
namespace com.linkedin.group
 | 
						||
 | 
						||
import com.linkedin.common.AuditStamp
 | 
						||
import com.linkedin.common.CorpuserUrn
 | 
						||
 | 
						||
/**
 | 
						||
 * The membership metadata for a group
 | 
						||
 */
 | 
						||
record Membership {
 | 
						||
 | 
						||
  /** Audit stamp for the last change */
 | 
						||
  auditStamp: AuditStamp
 | 
						||
 | 
						||
  /** Admin of the group */
 | 
						||
  admin: CorpuserUrn
 | 
						||
 | 
						||
  /** Members of the group, ordered in descending importance */
 | 
						||
  members: array[CorpuserUrn]
 | 
						||
}
 | 
						||
```
 |