mirror of
https://github.com/datahub-project/datahub.git
synced 2025-11-10 16:32:26 +00:00
Update features.md
parent
7a0443cc4d
commit
990b3453c1
@ -1,35 +1,45 @@
|
|||||||
# Features of DataHub
|
# Features of DataHub
|
||||||
|
|
||||||
DataHub is composed of a [generic backend infra](what/gma.md) and an [Ember-based UI](../datahub-web). The original DataHub
|
DataHub is made up of a [generic backend](what/gma.md) and an [Ember-based UI](../datahub-web). The original DataHub
|
||||||
[blog post](https://engineering.linkedin.com/blog/2019/data-hub) extensively talks about the design and mentions some of
|
[blog post](https://engineering.linkedin.com/blog/2019/data-hub) talks about the design extensively and mentions some of
|
||||||
the features of DataHub. Our open sourcing [blog post](https://engineering.linkedin.com/blog/2020/open-sourcing-datahub--linkedins-metadata-search-and-discovery-p)
|
the features of DataHub. Our open sourcing [blog post](https://engineering.linkedin.com/blog/2020/open-sourcing-datahub--linkedins-metadata-search-and-discovery-p)
|
||||||
also provides a comparison of some features between LinkedIn production DataHub vs open source DataHub. Although, these
|
also provides a comparison of some features between LinkedIn's production DataHub and open source DataHub. Below is a list of the latest features available in DataHub, as well as features that will soon become available.
|
||||||
are good references, we'll list down all available (also WIP) features of DataHub.
|
|
||||||
|
|
||||||
## Data Constructs (Entities)
|
## Data Constructs (Entities)
|
||||||
Currently, open source DataHub only supports datasets, users and groups data constructs.
|
|
||||||
|
|
||||||
### Datasets
|
### Datasets
|
||||||
- **Search**: full-text & advanced search, search ranking
|
- **Search**: full-text & advanced search, search ranking
|
||||||
- **Browse**: browsing through a fixed hierarchy
|
- **Browse**: browsing through a configurable hierarchy
|
||||||
- **Schema**: table & document schema in tabular and JSON format
|
- **Schema**: table & document schema in tabular and JSON format
|
||||||
- **Coarse grain lineage**: support for lineage at the dataset level, tabular & graphical visualization of downstreams/upstreams
|
- **Coarse grain lineage**: support for lineage at the dataset level, tabular & graphical visualization of downstreams/upstreams
|
||||||
- **Ownership**: surfacing owners of a dataset, viewing datasets you own
|
- **Ownership**: surfacing owners of a dataset, viewing datasets you own
|
||||||
- **Dataset life-cycle management**: deprecate/undeprecate, surface removed datasets and tag them with "removed"
|
- **Dataset life-cycle management**: deprecate/undeprecate, surface removed datasets and tag them with "removed"
|
||||||
- **Institutional knowledge**: support for adding free form doc to any dataset
|
- **Institutional knowledge**: support for adding free form doc to any dataset
|
||||||
- **Fine grain lineage**: support for lineage at the field level [*Not available yet*]
|
- **Fine grain lineage**: support for lineage at the field level [*available soon*]
|
||||||
- **Social actions**: likes, follows, bookmarks [*Not available yet*]
|
- **Social actions**: likes, follows, bookmarks [*available soon*]
|
||||||
- **Compliance management**: field level tag based compliance editing [*Not available yet*]
|
- **Compliance management**: field level tag based compliance editing [*available soon*]
|
||||||
- **Top users**: frequent users of a dataset [*Not available yet*]
|
- **Top users**: frequent users of a dataset [*available soon*]
|
||||||
|
|
||||||
### Users
|
### Users
|
||||||
- **Search**: full-text & advanced search, search ranking
|
- **Search**: full-text & advanced search, search ranking
|
||||||
|
- **Browse**: browsing through a configurable hierarchy [*available soon*]
|
||||||
- **Profile editing**: LinkedIn style professional profile editing such as summary, skills
|
- **Profile editing**: LinkedIn style professional profile editing such as summary, skills
|
||||||
|
|
||||||
|
### Metrics [*available soon*]
|
||||||
|
- **Search**: full-text & advanced search, search ranking
|
||||||
|
- **Browse**: browsing through a configurable hierarchy
|
||||||
|
- **Basic information**: ownership, dimensions, formula, input & output datasets, dashboards
|
||||||
|
- **Institutional knowledge**: support for adding free form doc to any metric
|
||||||
|
|
||||||
|
### Dashboards [*available soon*]
|
||||||
|
- **Search**: full-text & advanced search, search ranking
|
||||||
|
- **Basic information**: ownership, location
|
||||||
|
- **Institutional knowledge**: support for adding free form doc to any dashboard
|
||||||
|
|
||||||
## Metadata Sources
|
## Metadata Sources
|
||||||
You can integrate any data platform to DataHub easily. As long as you have a way of *E*xtracting metadata from the platform and
|
You can easily integrate any data platform with DataHub. As long as you have a way of *Extracting* metadata from the platform and *Transforming* it into our standard [MCE](what/mxe.md) format, you're free to *Load*/ingest that metadata into DataHub from any available platform.
|
||||||
*T*ransform that into our standard [MCE](what/mxe.md) format, you're free to *L*oad/ingest metadata to DataHub from any available platform.
|
|
||||||
We have provided [ETL ingestion](architecture/metadata-ingestion.md) pipelines for:
|
We have provided example [ETL ingestion](architecture/metadata-ingestion.md) scripts for:
|
||||||
- Hive
|
- Hive
|
||||||
- Kafka
|
- Kafka
|
||||||
- RDBMS
|
- RDBMS
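
The Extract/Transform/Load flow described above can be sketched roughly as follows. This is a minimal illustration, not one of the provided ingestion scripts: the `RAW_TABLES` sample data, the function names, and the simplified event dictionary are all assumptions made for the example, and the dictionary is only MCE-like, not the actual MCE schema. The `emit` callback stands in for whatever transport you use to send events to DataHub.

```python
from typing import Callable, Dict, Iterable

# Hypothetical raw metadata, standing in for what a platform client would return.
RAW_TABLES = [
    {"platform": "hive", "name": "tracking.page_views", "columns": ["member_id", "url"]},
]


def extract() -> Iterable[Dict]:
    """Extract: yield raw metadata records from the source platform."""
    yield from RAW_TABLES


def transform(record: Dict) -> Dict:
    """Transform: map a raw record to a simplified, MCE-like event.

    The shape below is illustrative only; the real MCE format is defined
    by DataHub's metadata models.
    """
    urn = (
        f"urn:li:dataset:(urn:li:dataPlatform:{record['platform']},"
        f"{record['name']},PROD)"
    )
    return {"urn": urn, "fields": list(record["columns"])}


def load(events: Iterable[Dict], emit: Callable[[Dict], None]) -> int:
    """Load: hand each event to an emitter and return how many were sent."""
    count = 0
    for event in events:
        emit(event)
        count += 1
    return count


def run_pipeline(emit: Callable[[Dict], None]) -> int:
    """Chain the three stages: extract, then transform, then load."""
    return load((transform(record) for record in extract()), emit)
```

For a dry run, you could pass `print` or a list's `append` method as `emit` to inspect the events before wiring in a real producer.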
|
||||||
|
|||||||