datahub/README.md

37 lines
1.9 KiB
Markdown
Raw Normal View History

2019-12-18 18:57:18 -08:00
# DataHub
[![Build Status](https://travis-ci.org/linkedin/WhereHows.svg?branch=datahub)](https://travis-ci.org/linkedin/WhereHows)
[![Gitter](https://img.shields.io/gitter/room/nwjs/nw.js.svg)](https://gitter.im/linkedin/datahub)
2019-09-01 16:03:45 -07:00
2019-12-18 18:57:18 -08:00
![DataHub](docs/imgs/datahub-logo.png)
2015-11-19 14:39:21 -08:00
2019-09-08 20:25:58 -07:00
## Introduction
2019-12-18 18:57:18 -08:00
DataHub is Linkedin's generalized metadata search & discovery tool. To learn more about DataHub, check out our
[Linkedin blog post](https://engineering.linkedin.com/blog/2019/data-hub) and [Strata presentation](https://speakerdeck.com/shirshanka/the-evolution-of-metadata-linkedins-journey-strata-nyc-2019). This repository contains the complete source code to be able to build DataHub's frontend & backend services.
2016-02-09 12:23:00 -08:00
2019-08-31 20:51:14 -07:00
## Quickstart
2019-11-11 12:38:05 -08:00
1. Install [docker](https://docs.docker.com/install/) and [docker-compose](https://docs.docker.com/compose/install/).
2. Clone this repo and make sure you are at the `datahub` branch.
3. Run below command to download and run all Docker containers in your local:
```
cd docker/quickstart && docker-compose pull && docker-compose up --build
```
2019-12-18 18:57:18 -08:00
4. After you have all Docker containers running in your machine, run below command to ingest provided sample data to DataHub:
2019-11-11 12:38:05 -08:00
```
./gradlew :metadata-events:mxe-schemas:build && cd metadata-ingestion/mce-cli && pip install --user -r requirements.txt && python mce_cli.py produce -d bootstrap_mce.dat
2019-11-11 12:38:05 -08:00
```
2019-12-11 19:53:34 -08:00
Note: Make sure that you're using Java 8, we have a strict dependency to Java 8 for build.
2019-12-18 18:57:18 -08:00
5. Finally, you can start `DataHub` by typing `http://localhost:9001` in your browser. You can sign in with `datahub`
2019-09-08 20:25:58 -07:00
as username and password.
## Quicklinks
* [Docker Images](docker)
* [Frontend App](datahub-frontend)
* [Generalized Metadata Store](gms)
* [Metadata Consumer Jobs](metadata-jobs)
* [Metadata Ingestion](metadata-ingestion)
## Roadmap
1. Add user profile page
2019-12-18 18:57:18 -08:00
2. Deploy DataHub to [Azure Cloud](https://azure.microsoft.com/en-us/)