Data Hub

Introduction

Data Hub is LinkedIn's generalized metadata search & discovery tool. To learn more about Data Hub, check out our LinkedIn blog post and Strata presentation. This repository contains the complete source code needed to build Data Hub's frontend & backend services.

Quickstart

  1. Install Docker and Docker Compose.
  2. Clone this repo and make sure you are on the datahub branch.
  3. Run the command below to download and start all Docker containers on your local machine:
     cd docker/quickstart && docker-compose pull && docker-compose up --build
  4. Once all Docker containers are running, run the command below from the repository root to ingest the provided sample data into Data Hub:
     ./gradlew :metadata-events:mxe-schemas:build && cd metadata-ingestion/mce-cli && sudo pip install -r requirements.txt && python mce_cli.py produce -d bootstrap_mce.dat
  5. Finally, open Data Hub by going to http://localhost:9001 in your browser. Sign in with datahub as both the username and the password.
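
The containers can take a while to come up, so it may help to wait until the frontend responds before opening the browser as described in the final step above. The following is a minimal sketch, not part of this repository: the helper name wait_for_url and its parameters are hypothetical, and it assumes the frontend is served at http://localhost:9001 as stated above. It uses only the Python standard library.

```python
import time
import urllib.error
import urllib.request


def wait_for_url(url, timeout_s=300.0, interval_s=5.0,
                 opener=urllib.request.urlopen, sleep=time.sleep):
    """Poll `url` until the server answers, or give up after `timeout_s` seconds.

    Hypothetical helper for illustration. Returns True as soon as the server
    replies with a status below 500, and False if the deadline passes first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with opener(url, timeout=interval_s) as response:
                if response.status < 500:
                    return True
        except urllib.error.HTTPError as err:
            # A 4xx reply still means the server is up and answering requests.
            if err.code < 500:
                return True
        except (urllib.error.URLError, OSError):
            # Connection refused or timed out: the service is not ready yet.
            pass
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

For example, wait_for_url("http://localhost:9001") returns True once the frontend starts answering, and False if nothing responds within the default five minutes.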

Roadmap

  1. Add Neo4j graph query support
  2. Add user profile page
  3. Deploy Data Hub to Azure Cloud