Update relationship.md

Mars Lan 2020-01-15 23:22:28 -08:00 committed by GitHub
parent 26ae37c26f
commit 189b8c71a0
GPG Key ID: 4AEE18F83AFDEB23

@@ -1,22 +1,20 @@
# What is a relationship?
A relationship is a named associate between exactly two entities, a source and a destination.
A relationship is a named associate between exactly two [entities](entity.md), a source and a destination.
![metadata-modeling](../imgs/metadata-modeling.png)
From the above graph, a `Group` entity can be linked to a `User` entity via a `HasMember` relationship.
Note that the name of the relationship reflects the direction, i.e pointing from `Group` to `User`.
Note that the name of the relationship reflects the direction, i.e. pointing from `Group` to `User`.
This is due to the fact that the actual metadata aspect holding this information is associated with `Group`, rather than User.
Had the direction been reversed, the relationship would have been named `IsMemberOf` instead.
See [Direction of Relationships](#direction-of-relationships) for more discussions on relationship directionality.
A specific instance of a relationship, e.g. `urn:li:corpgroup:group1` has a member `urn:li:corpuser:user1`,
corresponds to an edge in the metadata graph.
Similar to an entity, a relationship can also be associated with optional attributes that are derived from metadata.
For example, from the `Membership` metadata aspect shown below, we're able to derive the `HasMember` relationship that links a specific `Group` to a specific `User`.
We can also include additional attribute to the relationship, e.g. importance, which corresponds to the position of the specific member in the original membership array.
This allows complex graph query that travel only relationships that match certain criteria, e.g. `returns only the top-5 most important members of this group.`
Once again, attributes should only be added based on query patterns.
Similar to an entity, a relationship can also be associated with optional attributes that are derived from the metadata.
For example, from the `Membership` metadata aspect shown below, we're able to derive the `HasMember` relationship that links a specific `Group` to a specific `User`. We can also include additional attribute to the relationship, e.g. importance, which corresponds to the position of the specific member in the original membership array. This allows complex graph query that travel only relationships that match certain criteria, e.g. "returns only the top-5 most important members of this group."
Similar to the entity attributes, relationship attributes should only be added based on the expected query patterns to reduce the indexing cost.
```json
{
@@ -47,18 +45,15 @@ Once again, attributes should only be added based on query patterns.
}
```
Relationships are meant to be `entity-neutral`. In other words, one would expect to use the same `OwnedBy` relationship to link a `Dataset` to a `User` and to link a `Dashboard` to a `User`.
As Pegasus doesn't allow typing a field using multiple URNs (because they're all essentially strings), we resort to using generic URN type for the source and destination.
Relationships are meant to be "entity-neutral". In other words, one would expect to use the same `OwnedBy` relationship to link a `Dataset` to a `User` and to link a `Dashboard` to a `User`. As Pegasus doesn't allow typing a field using multiple URNs (because they're all essentially strings), we resort to using generic URN type for the source and destination.
We also introduce a non-standard property pairings to limit the allowed source and destination URN types.
While it's possible to model relationships in rest.li as [association resources](https://linkedin.github.io/rest.li/modeling/modeling#association),
which often get stored as mapping tables, it is far more common to model them as "foreign keys" field in a metadata aspect.
For instance, the `Ownership` aspect is likely to contain an array of owner's corpuser URNs.
While it's possible to model relationships in rest.li as [association resources](https://linkedin.github.io/rest.li/modeling/modeling#association), which often get stored as mapping tables, it is far more common to model them as "foreign keys" field in a metadata aspect. For instance, the `Ownership` aspect is likely to contain an array of owner's corpuser URNs.
Below is an example of how a relationship is modeled in PDSC. Note that:
1. As the `source` and `destination` are of generic URN type, we're able to factor them out to a common `BaseRelationship` model.
2. Each model is expected to have a pairings property that is an array of all allowed source-destination URNs.
3. Unlike entities, there's no requirement on making all attributes optional since relationships do not support partial updates.
2. Each model is expected to have a pairings property that is an array of all allowed source-destination URN pairs.
3. Unlike entity attributes, there's no requirement on making all relationship attributes optional since relationships do not support partial updates.
```json
{
@@ -109,11 +104,11 @@ Below is an example of how a relationship is modeled in PDSC. Note that:
## Direction of Relationships
As relationships are modeled as directed edges between nodes, it's natural to ask which way should it be pointing,
or should there be edges going both ways? The answer is, "it kind of doesn't matter." It's rather an aesthetic choice than technical one.
or should there be edges going both ways? The answer is, "doesn't really matter." It's rather an aesthetic choice than technical one.
For one, the actual direction doesn't really matter when it comes to constructing graph queries. Most graph DBs are fully capable of traversing edges in reverse direction efficiently.
For one, the actual direction doesn't really impact the execution of graph queries. Most graph DBs are fully capable of traversing edges in reverse direction efficiently.
That being said, generally there's a more "natural way" to specify the direction of a relationship, which is closely related to how metadata is stored. For example, the membership information for an LDAP group is generally stored as a list in group's metadata. As a result, it's more natural to model a `HasAMember` relationship that points from a group to a member, instead of a `IsMemberOf` relationship pointing from member to group.
That being said, generally there's a more "natural way" to specify the direction of a relationship, which closely relate to how the metadata is stored. For example, the membership information for an LDAP group is generally stored as a list in group's metadata. As a result, it's more natural to model a `HasMember` relationship that points from a group to a member, instead of a `IsMemberOf` relationship pointing from member to group.
Since all relationships are explicitly declared, it's fairly easy for a user to discover what relationships are available and their directionality by inspecting
the [relationships directory](../../metadata-models/src/main/pegasus/com/linkedin/metadata/relationship). It's also possible to provide a UI for the catalog of entities and relationships for analysts who are interested in building complex graph queries to gain insights into metadata.
the [relationships directory](../../metadata-models/src/main/pegasus/com/linkedin/metadata/relationship). It's also possible to provide a UI for the catalog of entities and relationships for analysts who are interested in building complex graph queries to gain insights into the metadata.
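
The ideas in the updated document (edges derived from a metadata aspect, a pairings restriction on URN types, an `importance` attribute, and direction-agnostic queries) can be illustrated with a small in-memory sketch. This is not the project's implementation and is not part of the commit: the Python class, the `PAIRINGS` set, and the helper functions are hypothetical names chosen for illustration, and treating a lower array position as "more important" is an assumption.

```python
from dataclasses import dataclass
from typing import List


def entity_type(urn: str) -> str:
    """Return the entity-type segment of a URN such as urn:li:corpuser:user1."""
    parts = urn.split(":")
    if len(parts) < 4 or parts[0] != "urn":
        raise ValueError(f"not a valid URN: {urn}")
    return parts[2]


@dataclass(frozen=True)
class HasMember:
    """Hypothetical stand-in for a HasMember relationship (a directed edge)."""
    source: str       # generic URN, expected to be a group
    destination: str  # generic URN, expected to be a user
    importance: int   # position of the member in the membership array

    # Hypothetical equivalent of the "pairings" property:
    # the allowed (source type, destination type) combinations.
    PAIRINGS = {("corpgroup", "corpuser")}

    def __post_init__(self) -> None:
        pair = (entity_type(self.source), entity_type(self.destination))
        if pair not in self.PAIRINGS:
            raise ValueError(f"pairing not allowed: {pair}")


def derive_has_member(group_urn: str, member_urns: List[str]) -> List[HasMember]:
    """Derive one edge per member from a membership-style list of URNs."""
    return [HasMember(group_urn, m, i) for i, m in enumerate(member_urns)]


def top_members(edges: List[HasMember], group_urn: str, limit: int) -> List[str]:
    """Forward traversal, filtered on the importance attribute."""
    matching = [e for e in edges if e.source == group_urn and e.importance < limit]
    return [e.destination for e in sorted(matching, key=lambda e: e.importance)]


def groups_of(edges: List[HasMember], user_urn: str) -> List[str]:
    """Reverse traversal over the same edges; no IsMemberOf edge is needed."""
    return [e.source for e in edges if e.destination == user_urn]


edges = derive_has_member(
    "urn:li:corpgroup:group1",
    ["urn:li:corpuser:user1", "urn:li:corpuser:user2", "urn:li:corpuser:user3"],
)
print(top_members(edges, "urn:li:corpgroup:group1", 2))
print(groups_of(edges, "urn:li:corpuser:user3"))
```

The same edge list answers both "who are the members of this group" and "which groups is this user in", which is the sense in which the direction is a naming choice rather than a query constraint. A real graph store would index both ends of each edge instead of scanning a list.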