
DataHub

Introduction

DataHub is LinkedIn's generalized metadata search & discovery tool. To learn more about DataHub, check out our LinkedIn blog post and Strata presentation. You should also visit DataHub Architecture to get a better understanding of how DataHub is implemented, and DataHub Onboarding Guide to understand how to extend DataHub for your own use case. This repository contains the complete source code needed to build DataHub's frontend & backend services.

Quickstart

  1. Install docker and docker-compose.
  2. Clone this repo.
  3. Open Docker, either from the command line or the Desktop app, and ensure it is up and running. Then cd into the cloned datahub repo.
  4. Run the command below to download and run all Docker containers on your local machine:
       cd docker/quickstart && docker-compose pull && docker-compose up --build
  5. Once all Docker containers are running on your machine, run the command below from the root of the repo to ingest the provided sample data into DataHub:
       docker build -t ingestion -f docker/ingestion/Dockerfile . && cd docker/ingestion && docker-compose up
  6. Finally, you can start using DataHub by opening http://localhost:9001 in your browser. You can sign in using datahub as both username and password.

Refer to the debugging guide if you run into issues with any of the above steps.
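
The prerequisite check in step 1 and the final "is the UI up?" check can be sketched as a small shell helper. This is an illustration only, not a script shipped in this repo: the function names (have_cmd, check_prereqs, wait_for_url), the --check flag, and the default of 30 attempts at 10 second intervals are all assumptions made for the example. The only values taken from the steps above are the docker and docker-compose prerequisites and the http://localhost:9001 URL.

```shell
#!/bin/sh
# Hypothetical helper (not part of the DataHub repo): preflight checks for
# the Quickstart, plus a poll loop for the frontend URL.

# Return 0 if the named command is on PATH, non-zero otherwise.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Report every listed command that is missing; return 1 if any are missing.
check_prereqs() {
  prereq_missing=0
  for prereq_cmd in "$@"; do
    if ! have_cmd "$prereq_cmd"; then
      echo "missing: $prereq_cmd" >&2
      prereq_missing=1
    fi
  done
  return "$prereq_missing"
}

# Poll a URL until it answers or the attempts run out.
# Usage: wait_for_url URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_url() {
  wait_url=$1
  wait_attempts=${2:-30}   # assumed default, tune for your machine
  wait_delay=${3:-10}      # assumed default, in seconds
  wait_i=0
  while [ "$wait_i" -lt "$wait_attempts" ]; do
    # --fail makes curl exit non-zero on HTTP errors (status >= 400).
    if curl --silent --fail --output /dev/null "$wait_url"; then
      echo "up: $wait_url"
      return 0
    fi
    wait_i=$((wait_i + 1))
    sleep "$wait_delay"
  done
  echo "timed out waiting for $wait_url" >&2
  return 1
}

# Only act when asked to, so the file can also be sourced for its functions.
if [ "${1:-}" = "--check" ]; then
  # "docker info" fails when the Docker daemon is not running (step 3).
  check_prereqs docker docker-compose curl \
    && docker info >/dev/null \
    && wait_for_url "http://localhost:9001"
fi
```

Saved under a name of your choice (for example quickstart_check.sh), it could be run as `sh quickstart_check.sh --check` in a second terminal while the containers from step 4 are starting. It only reports status and does not start or stop any containers.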

Releases

See the Releases page for more details.

Roadmap

  1. Kubernetes for container orchestration
  2. Deploy DataHub to Azure Cloud