WhereHows 
WhereHows is a data discovery and lineage tool built at LinkedIn. It integrates with all the major data processing systems and collects both catalog and operational metadata from them.
Within the central metadata repository, WhereHows curates, associates, and surfaces the metadata information through two interfaces:
- a web application that enables data and lineage discovery and community collaboration
- an API endpoint that empowers automation of data processes/applications
WhereHows serves as the single platform that:
- links data objects with people and processes
- enables crowdsourcing for data knowledge
- provides data governance and provenance based on ownership and lineage
Documentation
Detailed information can be found in the Wiki.
Examples in VM
There is a pre-built VMware image (about 11 GB) that quickly demonstrates the functionality of WhereHows. Check out the VM Guide.
Getting Started
New to WhereHows? Check out the Getting Started Guide.
Preparation
First, please get Play Framework in place.
wget http://downloads.typesafe.com/play/2.2.4/play-2.2.4.zip
# Unzip, remove the zip archive, and move the play folder to $HOME
unzip play-2.2.4.zip && rm play-2.2.4.zip && mv play-2.2.4 $HOME/
# Add PLAY_HOME to ~/.bashrc and reload it
echo 'export PLAY_HOME="$HOME/play-2.2.4"' >> ~/.bashrc
source ~/.bashrc
You need to update the file $PLAY_HOME/framework/build to increase the JVM stack size (-Xss1M) to 2M or more.
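The stack-size change can be scripted rather than made by hand. The following is a minimal sketch, assuming the build script contains the literal flag -Xss1M as noted above; the function name bump_play_stack_size is made up for illustration and is not part of Play or WhereHows.

```shell
# Raise the JVM thread stack size in a Play launcher script from 1M to 2M.
# Assumption: the script contains the literal flag -Xss1M.
bump_play_stack_size() {
  build_script="$1"
  # Keep a backup copy next to the original.
  cp "$build_script" "$build_script.bak"
  # Write the result back into the original file so it keeps its executable bit.
  sed 's/-Xss1M/-Xss2M/g' "$build_script.bak" > "$build_script"
}
```

It would be called as bump_play_stack_size "$PLAY_HOME/framework/build". If the flag is written differently in your copy of the file, adjust the sed pattern or edit the file manually.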
Second, please set up the metadata repository in MySQL.
CREATE DATABASE wherehows
DEFAULT CHARACTER SET utf8
DEFAULT COLLATE utf8_general_ci;
CREATE USER 'wherehows';
SET PASSWORD FOR 'wherehows' = PASSWORD('wherehows');
GRANT ALL ON wherehows.* TO 'wherehows';
Execute the DDL files to create the required repository tables in the wherehows database.
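One way to run the DDL files is to pipe them through the mysql command-line client. This is a rough sketch under several assumptions: the directory passed in is a placeholder (the location of the DDL files is not stated here), the files are plain SQL that can be applied in alphabetical order, and load_ddl_files and MYSQL_CLIENT are names invented for this example.

```shell
# Feed every .sql file in a directory to the wherehows database in one client session.
# The directory argument is a placeholder; point it at the DDL files in your checkout.
load_ddl_files() {
  ddl_dir="$1"
  # MYSQL_CLIENT may be overridden (e.g. for a dry run); it defaults to the mysql CLI.
  # -p prompts for the password; the trailing "wherehows" selects the database.
  cat "$ddl_dir"/*.sql | ${MYSQL_CLIENT:-mysql} -u wherehows -p wherehows
}
```

If the DDL files depend on a particular order, or include each other, run them individually in that order instead.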
Build
- Get the source code:
git clone https://github.com/linkedin/WhereHows.git
- Put a few third-party jar files in the metadata-etl/extralibs directory. Some of these jar files may not be available in Maven Central or Artifactory. See the download instructions for more detail.
cd WhereHows/metadata-etl/extralibs
- Go back to the WhereHows root directory and build all the modules:
./gradlew build
- Go back to the WhereHows root directory and start the metadata ETL and API service:
cd backend-service ; $PLAY_HOME/play run
- Go back to the WhereHows root directory and start the web front-end:
cd web ; $PLAY_HOME/play run
The WhereHows UI is then available at http://localhost:9000 by default. To serve it on a different port, pass -Dhttp.port; for example, play run -Dhttp.port=19001 serves the UI on port 19001.
Contribute
Want to contribute? Check out the Contributors Guide.