Mirror of https://github.com/datahub-project/datahub.git (synced 2025-11-04 12:51:23 +00:00)

	DataHub
Introduction
DataHub is LinkedIn's generalized metadata search & discovery tool. To learn more about DataHub, check out our LinkedIn blog post and Strata presentation. This repository contains the complete source code needed to build DataHub's frontend & backend services.
Quickstart
- Install docker and docker-compose.
- Clone this repo and make sure you are on the datahub branch.
- Run the command below to download and run all Docker containers on your local machine:

    cd docker/quickstart && docker-compose pull && docker-compose up --build

- After all Docker containers are running on your machine, run the command below to ingest the provided sample data into DataHub:

    ./gradlew :metadata-events:mxe-schemas:build && cd metadata-ingestion/mce-cli && pip install --user -r requirements.txt && python mce_cli.py produce -d bootstrap_mce.dat

  Note: Make sure that you are using Java 8; the build has a strict dependency on Java 8.
- Finally, you can open DataHub by typing http://localhost:9001 in your browser. You can sign in with datahub as both the username and the password.
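
Because the build depends strictly on Java 8 and the quickstart needs docker and docker-compose, it can help to check the environment before running the commands above. The following is a minimal sketch, not part of this repository: the function names `is_java8`, `java_version_string`, and `preflight` are hypothetical, and the check assumes that a Java 8 runtime reports a version string beginning with `1.8`.

```shell
# Hypothetical helper (not part of the DataHub repo):
# succeeds only if the given version string looks like Java 8 ("1.8" or "1.8.x").
is_java8() {
  case "$1" in
    1.8|1.8.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the quoted version string from `java -version`, which writes to stderr.
java_version_string() {
  java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}'
}

# Check that the tools named in the quickstart are on PATH and that Java is 8.
preflight() {
  for tool in docker docker-compose java; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing required tool: $tool" >&2
      return 1
    fi
  done
  if ! is_java8 "$(java_version_string)"; then
    echo "Java 8 is required for the build" >&2
    return 1
  fi
  echo "environment looks ready"
}
```

After sourcing these functions, running `preflight` before the quickstart commands would report a missing tool or a non-8 Java version on stderr and return a non-zero status.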
Quicklinks
Roadmap
- Add user profile page
- Deploy DataHub to Azure Cloud