yujunjun/datahub

mirror of https://github.com/datahub-project/datahub.git synced 2025-12-11 18:16:58 +00:00

Shriram Anbalagan 1a7ef9f51b

Update README.md

2020-02-06 11:28:30 -08:00

.github/ISSUE_TEMPLATE

Add issue templates

2020-01-15 10:32:50 -08:00

Add contrib directory for community contirbutions

2020-01-01 15:17:26 -08:00

Add URNs for results to search response

2020-01-29 10:36:04 -08:00

datahub-frontend

Add forward slash escape for Elasticsearch queries

2020-02-05 19:05:49 -08:00

Update README.md

2020-02-06 09:55:20 -08:00

Fix lowercase_keyword analyzer settings for people entity

2020-02-06 01:39:05 -08:00

Update debugging.md

2020-02-06 10:53:37 -08:00

Add some debuggin help & update default profile image link

2020-01-27 15:59:34 -08:00

Initial commit for Data Hub

2019-08-31 20:51:14 -07:00

Remove dataset groups entity

2019-12-13 15:12:50 -08:00

metadata-builders

Remove dataset groups entity

2019-12-13 15:12:50 -08:00

metadata-models 50.0.6 -> 54.0.1:

2019-12-13 11:46:49 -08:00

metadata-dao-impl

Remove dataset groups entity

2019-12-13 15:12:50 -08:00

metadata-events

Removing unnecessary classes for mxe-registration

2019-12-04 17:53:19 -08:00

metadata-ingestion

Documentation update part-1

2019-12-18 18:57:18 -08:00

Rename elasticsearch-index to mae-consumer in MaeStreamTask

2019-12-19 17:46:19 -08:00

metadata-models

Add some debuggin help & update default profile image link

2020-01-27 15:59:34 -08:00

metadata-restli-resource

metadata-models 50.0.6 -> 54.0.1:

2019-12-13 11:46:49 -08:00

metadata-testing

Remove dataset groups entity

2019-12-13 15:12:50 -08:00

Enable datahub-mae-consumer job to build graph as well

2019-11-26 22:19:46 -08:00

metadata-validators

corp-identity-gms 1.0.26 -> 1.0.40:

2019-11-19 02:27:28 -08:00

.dockerignore

Add docker ignore file

2019-09-02 18:36:18 -07:00

.gitignore

Add missing MXE models and fix .gitignore

2019-09-01 15:23:39 -07:00

.travis.yml

Update travis configuration to optimize build time

2019-10-07 15:38:12 -07:00

build.gradle

Fix some changes which came with automatic commit

2019-11-19 03:08:00 -08:00

CONTRIBUTING.md

Fix doc

2020-01-24 17:38:14 -08:00

gradlew

Update gradle version to 4.0.2 (#627)

2017-07-30 11:07:14 -07:00

gradlew.bat

Update gradle version to 4.0.2 (#627)

2017-07-30 11:07:14 -07:00

LICENSE

Initial commit

2015-11-19 14:39:21 -08:00

README.md

Update README.md

2020-02-06 11:28:30 -08:00

repositories.gradle

Initial commit for Data Hub

2019-08-31 20:51:14 -07:00

settings.gradle

Rename elasticsearch-index-job to mae-consumer-job

2019-11-20 18:19:31 -08:00

README.md

DataHub

Introduction

DataHub is LinkedIn's generalized metadata search & discovery tool. To learn more about DataHub, check out our LinkedIn blog post and Strata presentation. You should also read DataHub Architecture to understand how DataHub is implemented, and the DataHub Onboarding Guide to learn how to extend DataHub for your own use case. This repository contains the complete source code needed to build DataHub's frontend and backend services.

Quickstart

Install docker and docker-compose.
Clone this repo.
Start Docker, either from the command line or the Desktop app, and make sure it is running. Then cd into the cloned datahub repo.
Run the command below to download and start all Docker containers on your local machine:

cd docker/quickstart && docker-compose pull && docker-compose up --build
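
The command above depends on both tools from the first step being available on your PATH. As a minimal pre-flight sketch (not part of the DataHub repo; the function name `check_prereqs` is a hypothetical helper introduced here for illustration), you could verify that before starting:

```shell
# Hypothetical helper: report which of the given commands are on PATH.
# Usage: check_prereqs CMD...
# Returns non-zero if at least one command is missing.
check_prereqs() {
  missing=0
  for cmd in "$@"; do
    if command -v "$cmd" >/dev/null 2>&1; then
      echo "found: $cmd"
    else
      echo "missing: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_prereqs docker docker-compose && docker info >/dev/null` should succeed only when both commands are installed and the Docker daemon is reachable.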

After all Docker containers are running on your machine:

Switch to a new terminal, cd into the cloned repo, and run the command below to ingest the provided sample data into DataHub:

docker build -t ingestion -f docker/ingestion/Dockerfile . && cd docker/ingestion && docker-compose up

Note: If the ingestion command is not run, you may not have enough sample data to explore the application and its features.

Finally, open http://localhost:9001 in your browser to start using DataHub. Sign in with datahub as both the username and the password.

Refer to the debugging guide if you run into issues with any of the steps above.
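
If the containers take a while to start, it may help to wait until the frontend answers before opening the browser. The sketch below is an illustration only, not a script shipped with DataHub: `wait_for_url` is a hypothetical helper, the defaults of 30 attempts and a 10-second delay are arbitrary assumptions, and it assumes that curl is installed and that http://localhost:9001 returns a non-error HTTP status once the frontend is ready.

```shell
# Hypothetical helper: poll a URL until it answers or the attempts run out.
# Usage: wait_for_url URL [ATTEMPTS] [DELAY_SECONDS]
# Defaults (30 attempts, 10 s apart) are assumptions, not DataHub settings.
wait_for_url() {
  url="$1"
  attempts="${2:-30}"
  delay="${3:-10}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # --fail makes curl exit non-zero on HTTP status codes of 400 and above.
    if curl --silent --fail --output /dev/null "$url"; then
      echo "ready: $url"
      return 0
    fi
    echo "waiting for $url (attempt $i/$attempts)"
    sleep "$delay"
    i=$((i + 1))
  done
  echo "gave up on $url" >&2
  return 1
}
```

Calling `wait_for_url http://localhost:9001` after starting the containers blocks until the page responds or the attempts are exhausted. A non-zero exit status is a reasonable cue to consult the debugging guide.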

Quicklinks

Releases

See the Releases page for more details.

Roadmap

Kubernetes for container orchestration
Deploy DataHub to Azure Cloud

Languages

Java 42.2%

Python 28.7%

TypeScript 27.3%

JavaScript 1.1%

Shell 0.2%

Other 0.1%