DataHub
Introduction
DataHub is LinkedIn's generalized metadata search & discovery tool. To learn more about DataHub, check out our LinkedIn blog post and Strata presentation. This repository contains the complete source code for building DataHub's frontend & backend services.
Quickstart
- Install Docker and docker-compose.
- Clone this repo and make sure you are on the datahub branch.
- Run the command below to download and run all Docker containers on your local machine:
cd docker/quickstart && docker-compose pull && docker-compose up --build
- Once all Docker containers are running on your machine, run the command below to ingest the provided sample data into DataHub:
./gradlew :metadata-events:mxe-schemas:build && cd metadata-ingestion/mce-cli && pip install --user -r requirements.txt && python mce_cli.py produce -d bootstrap_mce.dat
Note: Make sure that you're using Java 8; the build has a strict dependency on Java 8.
- Finally, you can start DataHub by opening http://localhost:9001 in your browser. You can sign in with datahub as both username and password.
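The two checks implied by the steps above (a Java 8 toolchain for the Gradle build, and a frontend that answers on http://localhost:9001) can be sketched as a small Python helper. This is only an illustrative sketch and not part of DataHub: the function names is_java_8 and wait_for_http are hypothetical, and the timeout and polling interval are arbitrary assumptions.

```python
import re
import time
import urllib.error
import urllib.request


def is_java_8(version_output):
    """Return True if `java -version` output reports a Java 8 runtime.

    Java 8 identifies itself as version "1.8.x" (for example "1.8.0_292").
    Hypothetical helper for illustration; not part of DataHub.
    """
    match = re.search(r'version "([^"]+)"', version_output)
    return bool(match) and match.group(1).startswith("1.8.")


def wait_for_http(url, timeout=300.0, interval=5.0, fetch=None, sleep=time.sleep):
    """Poll `url` until it returns any HTTP response or `timeout` seconds pass.

    `fetch` and `sleep` are injectable so the loop can be exercised
    without a running server. Hypothetical helper; not part of DataHub.
    """
    if fetch is None:
        def fetch(u):
            with urllib.request.urlopen(u, timeout=10) as resp:
                return resp.status

    deadline = time.monotonic() + timeout
    while True:
        try:
            fetch(url)
            return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status.
            return True
        except (urllib.error.URLError, OSError):
            # Connection refused or not yet listening; retry below.
            pass
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

To use it, pass the text that java -version prints (it writes to stderr) to is_java_8 before running the Gradle step, and call wait_for_http("http://localhost:9001") after docker-compose up before opening the browser.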
Quicklinks
Roadmap
- Add user profile page
- Deploy DataHub to Azure Cloud