How to onboard an entity?
Refer to this doc if you're only interested in adding a new aspect to an existing entity.
Currently, DataHub only supports 3 entity types: datasets, users, and groups.
If you want to extend DataHub for your own use cases, such as metrics, charts, or dashboards, follow the steps below in order.
The following diagram helps visualize the process.
1. Define URN
Refer to here for URN definition.
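As a rough illustration of what a URN for a new entity could look like, here is a minimal Java sketch. It assumes the `urn:li:<entityType>:<id>` layout and a hypothetical `metric` entity type; the `MetricUrn` class and its methods are invented for this example and are not part of DataHub. Follow the URN definition doc for the actual conventions.

```java
import java.util.Objects;

// Hypothetical URN for a "metric" entity. Neither this class nor the
// "metric" entity type is part of DataHub; both are assumptions.
final class MetricUrn {
    // Assumed layout: urn:li:<entityType>:<id>
    private static final String PREFIX = "urn:li:metric:";

    private final String name;

    MetricUrn(String name) {
        if (name == null || name.isEmpty()) {
            throw new IllegalArgumentException("metric name must be non-empty");
        }
        this.name = name;
    }

    // Parses the string form back into a MetricUrn and rejects
    // strings that belong to any other entity type.
    static MetricUrn parse(String raw) {
        if (raw == null || !raw.startsWith(PREFIX)) {
            throw new IllegalArgumentException("not a metric URN: " + raw);
        }
        return new MetricUrn(raw.substring(PREFIX.length()));
    }

    String getName() {
        return name;
    }

    @Override
    public String toString() {
        return PREFIX + name;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof MetricUrn && ((MetricUrn) o).name.equals(name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name);
    }
}
```

With this sketch, `MetricUrn.parse("urn:li:metric:daily_active_users").getName()` returns `daily_active_users`, while a string with a different entity type is rejected with an `IllegalArgumentException`.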
2. Model your metadata
Refer to metadata modelling section. Make sure to do the following:
- Define Aspect models.
- Define aspect union model. Refer to DatasetAspect as an example.
- Define Snapshot model. Refer to DatasetSnapshot as an example.
- Add your newly defined snapshot to Snapshot Union model.
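To show how these four pieces relate, here is a small Java sketch for a hypothetical `metric` entity. In DataHub the models are written as schema files rather than hand-written Java, so this only mirrors the structure: the aspects, an aspect union, a snapshot that pairs a URN with a list of aspects, and a snapshot union. Every type name below (`MetricAspect`, `MetricInfo`, `MetricOwnership`, `MetricSnapshot`, `Snapshot`) is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Stand-in for the aspect union: every aspect that may be attached
// to the hypothetical metric entity implements this interface.
interface MetricAspect {}

// Two hypothetical aspect models.
final class MetricInfo implements MetricAspect {
    final String description;

    MetricInfo(String description) {
        this.description = description;
    }
}

final class MetricOwnership implements MetricAspect {
    final List<String> ownerUrns;

    MetricOwnership(List<String> ownerUrns) {
        this.ownerUrns = Collections.unmodifiableList(new ArrayList<>(ownerUrns));
    }
}

// Stand-in for the Snapshot Union: each entity's snapshot type is a member.
interface Snapshot {}

// Snapshot model: the entity's URN plus the aspects captured for it.
final class MetricSnapshot implements Snapshot {
    final String urn;
    final List<MetricAspect> aspects;

    MetricSnapshot(String urn, List<MetricAspect> aspects) {
        this.urn = urn;
        this.aspects = Collections.unmodifiableList(new ArrayList<>(aspects));
    }

    // Returns the first aspect of the requested type, or null if absent.
    <T extends MetricAspect> T getAspect(Class<T> type) {
        for (MetricAspect aspect : aspects) {
            if (type.isInstance(aspect)) {
                return type.cast(aspect);
            }
        }
        return null;
    }
}
```

Modelling the union as a shared interface keeps the snapshot's aspect list restricted to aspects that belong to this entity, which is the role the aspect union plays in the real models.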
3. GMA search onboarding
Refer to search onboarding if you need to search the entity.
4. GMA graph onboarding
Refer to graph onboarding if you need to perform graph queries against the entity.
5. Add rest.li resource endpoints
See CorpUsers for an example of a top-level resource endpoint. Optionally add an aspect-specific sub-resource endpoint such as CorpUsersEditableInfoResource.
If you want to use this new entity type from the ingestion framework's REST-based sink, you'll need to add the new endpoint to the resource list.
6. Configure dependency injection
GMS uses Spring Framework for dependency injection. You'll need to add various factories to create any custom DAOs used by the rest.li endpoint. You'll also need to add any custom package to the base-package attribute of the <context:component-scan> tag in beans.xml.