# DataHub
[Build Status](https://travis-ci.org/linkedin/datahub)
[Gitter chat](https://gitter.im/linkedin/datahub)

## Introduction
DataHub is LinkedIn's generalized metadata search & discovery tool. To learn more about DataHub, check out our
[LinkedIn blog post](https://engineering.linkedin.com/blog/2019/data-hub) and [Strata presentation](https://speakerdeck.com/shirshanka/the-evolution-of-metadata-linkedins-journey-strata-nyc-2019).
You should also read [DataHub Architecture](docs/architecture/architecture.md) to get a better understanding of how DataHub is implemented, and the
[DataHub Onboarding Guide](docs/how/entity-onboarding.md) to understand how to extend DataHub for your own use case.
This repository contains the complete source code needed to build DataHub's frontend & backend services.
## Quickstart
1. Install [docker](https://docs.docker.com/install/) and [docker-compose](https://docs.docker.com/compose/install/).
2. Clone this repo and make sure you are on the `datahub` branch.
3. Run the command below to download and start all Docker containers on your local machine:
```
cd docker/quickstart && docker-compose pull && docker-compose up --build
```
4. Once all Docker containers are running on your machine, run the command below to ingest the provided sample data into DataHub:
```
./gradlew :metadata-events:mxe-schemas:build && cd metadata-ingestion/mce-cli && pip install --user -r requirements.txt && python mce_cli.py produce -d bootstrap_mce.dat
```
Note: Make sure that you're using Java 8; the build has a strict dependency on Java 8.
5. Finally, open `DataHub` by navigating to `http://localhost:9001` in your browser. You can sign in with `datahub`
as both the username and the password.
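
The prerequisites named in the steps above (docker, docker-compose, and Java 8) can be checked before starting. The following is a minimal sketch, assuming a POSIX shell; `java_major` is a hypothetical helper written for this example and is not part of the repository.

```shell
#!/bin/sh
# Hypothetical helper (not part of DataHub): print the major Java version
# for a version string such as "1.8.0_232" (-> 8) or "11.0.5" (-> 11).
java_major() {
  case "$1" in
    1.*) echo "$1" | cut -d. -f2 ;;  # legacy scheme: 1.<major>.<minor>
    *)   echo "$1" | cut -d. -f1 ;;  # modern scheme: <major>.<minor>.<patch>
  esac
}

# Report which of the required tools are available on PATH.
for tool in docker docker-compose java; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "found: $tool"
  else
    echo "missing: $tool"
  fi
done

# `java -version` writes to stderr; the version is the first quoted string.
if command -v java >/dev/null 2>&1; then
  version=$(java -version 2>&1 | grep -i 'version' | head -n 1 | cut -d'"' -f2)
  if [ "$(java_major "$version")" = "8" ]; then
    echo "Java 8 detected ($version)"
  else
    echo "warning: Java $version detected; the build expects Java 8"
  fi
fi
```

The script only reports what it finds and does not change anything, so it can be run safely before step 3.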
## Quicklinks
* [DataHub Architecture](docs/architecture/architecture.md)
* [DataHub Onboarding Guide](docs/how/entity-onboarding.md)
* [Docker Images](docker)
* [Frontend App](datahub-frontend)
* [Generalized Metadata Service](gms)
* [Metadata Consumer Jobs](metadata-jobs)
* [Metadata Ingestion](metadata-ingestion)
## Releases
* 2019/09/21: [v0.1.0-alpha](https://github.com/linkedin/datahub/releases/tag/datahub-v0.1.0-alpha)
* 2019/12/05: [v0.2.0-alpha](https://github.com/linkedin/datahub/releases/tag/datahub-v0.2.0-alpha)
## Roadmap
1. Add user profile page
2. Deploy DataHub to [Azure Cloud](https://azure.microsoft.com/en-us/)