mirror of
https://github.com/datahub-project/datahub.git
synced 2025-10-20 13:35:12 +00:00
DataHub
Introduction
DataHub is LinkedIn's generalized metadata search and discovery tool. To learn more about DataHub, check out our LinkedIn blog post and Strata presentation. This repository contains the complete source code needed to build DataHub's frontend and backend services.
Quickstart
- Install Docker and docker-compose.
- Clone this repo and make sure you are on the datahub branch.
- Run the command below to download and start all Docker containers on your local machine:
cd docker/quickstart && docker-compose pull && docker-compose up --build
- Once all Docker containers are running on your machine, run the command below to ingest the provided sample data into DataHub:
./gradlew :metadata-events:mxe-schemas:build && cd metadata-ingestion/mce-cli && pip install --user -r requirements.txt && python mce_cli.py produce -d bootstrap_mce.dat
Note: Make sure that you are using Java 8; the build has a strict dependency on Java 8.
- Finally, you can open DataHub by typing http://localhost:9001 in your browser. You can sign in with datahub as both the username and the password.
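Two of the steps above are easy to get wrong: building with a Java version other than 8, and opening the browser before the containers have finished starting. The following is a minimal preflight sketch in Python, not part of this repository. The helper names (java_major_version, http_probe, wait_until_reachable) are hypothetical, and the retry count and delay are arbitrary assumptions.

```python
import http.client
import re
import time
import urllib.error
import urllib.request


def java_major_version(version_output):
    """Extract the major Java version from the text printed by `java -version`.

    Legacy releases report "1.8.0_252" (major 8); newer ones report "11.0.2".
    Returns None when no version string is found.
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    second = match.group(2)
    # Under the old "1.x" scheme, x is the major version (1.8 means Java 8).
    if first == 1 and second is not None:
        return int(second)
    return first


def http_probe(url, timeout=5.0):
    """Return True if the URL answers an HTTP request at all."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, even if with an error status.
        return True
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return False


def wait_until_reachable(url, attempts=30, delay=2.0,
                         probe=http_probe, sleep=time.sleep):
    """Poll `url` until `probe` succeeds or the attempts run out."""
    for attempt in range(attempts):
        if probe(url):
            return True
        # Do not sleep after the final failed attempt.
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

To use it, pass the text printed by java -version (which is written to standard error) to java_major_version and check that the result is 8 before running the Gradle build. After starting the containers, call wait_until_reachable("http://localhost:9001") before opening the browser. The probe and sleep parameters exist only so the polling loop can be exercised without a running server.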
Quicklinks
Roadmap
- Add user profile page
- Deploy DataHub to Azure Cloud
