mirror of
https://github.com/datahub-project/datahub.git
synced 2025-12-26 17:37:33 +00:00
Updated a readme instruction step and synced up all the readmes (#635)
This commit is contained in:
parent
2300737d86
commit
830ad7d537
14
README.md
@ -13,7 +13,6 @@ WhereHows serves as the single platform that:
|
||||
|
||||
|
||||
## Who Uses WhereHows?
|
||||
|
||||
Here is a list of companies known to use WhereHows. Let us know if we missed your company!
|
||||
|
||||
* [LinkedIn](http://www.linkedin.com)
|
||||
@ -23,27 +22,24 @@ Here is a list of companies known to use WhereHows. Let us know if we missed you
|
||||
|
||||
|
||||
## How Is WhereHows Used?
|
||||
|
||||
How WhereHows is used inside of LinkedIn and other potential [use cases][USE].
|
||||
|
||||
|
||||
## Documentation
|
||||
|
||||
The detailed information can be found in the [Wiki][wiki]
|
||||
|
||||
|
||||
## Examples in VM
|
||||
|
||||
## Examples in VM (Deprecated)
|
||||
There is a pre-built VMware image (about 11 GB) that quickly demonstrates the functionality of WhereHows. Check out the [VM Guide][VM].
|
||||
|
||||
## WhereHows Docker
|
||||
Docker provides a quick, configuration-free dev/production setup. Check out the [Docker Getting Start Guide](https://github.com/linkedin/WhereHows/tree/master/wherehows-docker/README.md).
|
||||
|
||||
## Getting Started
|
||||
|
||||
New to WhereHows? Check out the [Getting Started Guide][GS].
|
||||
|
||||
|
||||
### Preparation
|
||||
|
||||
First, [set up the metadata repository][DB] in MySQL.
|
||||
```
|
||||
CREATE DATABASE wherehows
|
||||
@ -58,7 +54,6 @@ GRANT ALL ON wherehows.* TO 'wherehows'
|
||||
Execute the [DDL files][DDL] to create the required repository tables in the **wherehows** database.
|
||||
|
||||
### Build
|
||||
|
||||
1. Get the source code: ```git clone https://github.com/linkedin/WhereHows.git```
|
||||
2. Put a few 3rd-party jar files into the **wherehows-etl/extralibs** directory. Some of these jar files may not be available in Maven Central or Artifactory. See [the download instructions][EXJAR] for more detail. ```cd WhereHows/wherehows-etl/extralibs```
|
||||
3. From the **WhereHows** root directory, build all the modules: ```./gradlew build```
|
||||
@ -67,17 +62,14 @@ Execute the [DDL files][DDL] to create the required repository tables in **where
|
||||
|
||||
|
||||
## Roadmap
|
||||
|
||||
Check out the current [roadmap][RM] for WhereHows.
|
||||
|
||||
|
||||
## Contribute
|
||||
|
||||
Want to contribute? Check out the [Contributors Guide][CON]
|
||||
|
||||
|
||||
## Community
|
||||
|
||||
Want help? Check out the [Gitter chat room][GITTER] and [Google Groups][LIST]
|
||||
|
||||
|
||||
|
||||
@ -3,16 +3,13 @@
|
||||
WhereHows works by sending out ‘crawlers’ to capture metadata from databases, HDFS, directory services, schedulers, and data integration tools. The collected metadata is loaded into an integrated data warehouse. WhereHows provides a web UI service and a backend service.
|
||||
|
||||
WhereHows comes in three operational components:
|
||||
- A web-ui service
|
||||
- Backend service
|
||||
- **Backend service**
|
||||
- [A web-ui service](../wherehows-frontend/README.md)
|
||||
- Database schema for MySQL
|
||||
|
||||
The backend service provides the RESTful API but, more importantly, runs the ETL jobs that gather the metadata. The backend service relies heavily on the MySQL wherehows database instance, both for configuration information and as the place where the metadata lands.
|
||||
|
||||
The Web UI provides navigation between the bits of information and the ability to annotate the collected data with comments, ownership and more. The example below collects Hive metadata from the Cloudera Hadoop VM.
|
||||
|
||||
|
||||
Configuration notes:
|
||||
Configuration notes
|
||||
MySQL database for the Wherehows metadata database
|
||||
```
|
||||
host: <mysqlhost>
|
||||
@ -26,19 +23,18 @@ Host: <edge node>
|
||||
Folder: /opt/wherehows
|
||||
```
|
||||
|
||||
# Key notes:
|
||||
|
||||
## Key notes
|
||||
Please become familiar with these pages:
|
||||
- https://github.com/linkedin/WhereHows/wiki/Architecture (Nice tech overview)
|
||||
- https://github.com/linkedin/WhereHows
|
||||
- https://github.com/linkedin/WhereHows/blob/master/wherehows-docs/getting-started.md
|
||||
|
||||
### Build:
|
||||
### Build
|
||||
```
|
||||
./gradlew build dist
|
||||
$ ./gradlew build dist
|
||||
```
|
||||
|
||||
### Install:
|
||||
### Install (In Production)
|
||||
Download or upload the distribution binaries, then unzip them to
|
||||
```
|
||||
/opt/wherehows/wherehows-backend
|
||||
@ -46,13 +42,13 @@ Download/upload the distribution binaries, unzip to
|
||||
|
||||
Create temp space for wherehows
|
||||
```
|
||||
sudo mkdir /var/tmp/wherehows
|
||||
sudo chmod a+rw /var/tmp/wherehows
|
||||
sudo mkdir /var/tmp/wherehows/resource
|
||||
$ sudo mkdir /var/tmp/wherehows
|
||||
$ sudo chmod a+rw /var/tmp/wherehows
|
||||
$ sudo mkdir /var/tmp/wherehows/resource
|
||||
```
|
||||
|
||||
```
|
||||
cd /opt/wherehows/wherehows-backend
|
||||
$ cd /opt/wherehows/wherehows-backend
|
||||
```
|
||||
|
||||
The hive metastore (as MySQL database) properties need to match the hadoop cluster:
|
||||
@ -66,23 +62,18 @@ URL jdbc:mysql://<metastore host>:3306/metastore
|
||||
Set the hive metastore driver class to ```com.mysql.jdbc.Driver```
|
||||
Set the other properties per your configuration.
|
||||
|
||||
Ensure these JAR files are present
|
||||
```
|
||||
lib/jython-standalone-2.7.0.jar
|
||||
lib/mysql-connector-java-5.1.36.jar
|
||||
```
|
||||
|
||||
### Run
|
||||
To run the backend service:
|
||||
|
||||
Set the variables in application.env to configure the application.
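The `runBackend` script in this commit loads the file with `set -a`, `source application.env`, `set +a`, so every assignment in the file is exported to the Play process. A minimal sketch of that pattern follows; the file name `demo.env`, the variable `DEMO_DB_NAME`, and the helper `load_env` are made up for illustration and are not part of WhereHows.

```shell
#!/bin/sh
# Sketch only: demo.env, DEMO_DB_NAME and load_env are hypothetical names.
workdir="$(mktemp -d)"
printf 'DEMO_DB_NAME="wherehows"\n' > "$workdir/demo.env"

load_env() {
  set -a        # export every variable assigned from here on
  . "$1"        # POSIX spelling of `source`
  set +a        # stop auto-exporting
}

load_env "$workdir/demo.env"

# A child process inherits the variable because it was exported.
sh -c 'echo "child sees DEMO_DB_NAME=$DEMO_DB_NAME"'
```

Without `set -a`, the assignments would stay local to the script's shell and the launched binary would not see them.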
|
||||
|
||||
To run the backend service application on port 19001 (from the wherehows-backend folder):
|
||||
To run the backend service application on port 9001 (from the wherehows-backend folder):
|
||||
```
|
||||
./runBackend
|
||||
$ ./runBackend
|
||||
```
|
||||
|
||||
Open browser to ```http://<edge node>:19001/```
|
||||
Open browser to ```http://<edge node>:9001/```
|
||||
This will show ‘TEST’. This is the RESTful API endpoint.
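The same check can be done from a terminal instead of a browser. The sketch below assumes `curl` is installed and that the backend listens on port 9001, as configured in this commit; the helper names `backend_url` and `check_backend` are hypothetical.

```shell
# Hypothetical helpers; only curl and the port number come from outside this sketch.
backend_url() {
  # $1 = host (defaults to localhost), $2 = port (defaults to 9001)
  printf 'http://%s:%s/\n' "${1:-localhost}" "${2:-9001}"
}

check_backend() {
  # --fail makes curl return non-zero on HTTP errors; --max-time bounds the wait.
  curl --silent --fail --max-time 5 "$(backend_url "$@")"
}
```

Calling `check_backend <edge node>` should print the response body if the service is up, and return a non-zero status otherwise.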
|
||||
|
||||
|
||||
@ -91,7 +82,30 @@ Once the Hive ETL is fully flushed out, look at the HDFS metadata ETL
|
||||
Configure multiple Hive & HDFS jobs to gather data from all Hadoop clusters
|
||||
Add additional crawlers, for Oracle, Teradata, ETL and schedulers
|
||||
|
||||
### Troubleshooting
|
||||
To log in the first time to the web UI:
|
||||
## Troubleshooting
|
||||
|
||||
You have to create an account. In the upper right corner there is a "Not a member yet? Join Now" link. Click on that and get a form to fill out.
|
||||
- Compile error with the messages below:
|
||||
```
|
||||
TAliasClause aliasClouse = tablelist.getTable(i).getAliasClause();
|
||||
^
|
||||
symbol: class TAliasClause
|
||||
location: class UpdateStmt
|
||||
...
|
||||
* What went wrong:
|
||||
Execution failed for task ':wherehows-etl:compileJava'.
|
||||
> Compilation failed; see the compiler error output for details.
|
||||
|
||||
```
|
||||
Install the extra libs first: [Install extra libs](https://github.com/linkedin/WhereHows/tree/master/wherehows-etl/extralibs)
|
||||
|
||||
- Other runtime library failures:
|
||||
Ensure these JAR files are present in **wherehows-backend/build/stage/wherehows-backend/lib**
|
||||
```
|
||||
...
|
||||
gsp.jar
|
||||
hsqldb-hsqldb-1.8.0.10.jar
|
||||
mysql-mysql-connector-java-5.1.40.jar
|
||||
ojdbc7.jar
|
||||
terajdbc4.jar
|
||||
...
|
||||
```
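A quick way to spot a missing JAR is to loop over the expected names. This is a sketch under the assumption that the five names shown above are the ones you care about; the list above is elided with `...`, so it may be incomplete, and the function name `missing_jars` is hypothetical.

```shell
# Prints the name of each listed JAR that is absent from the given directory.
# Only the five names visible in the list above are checked.
missing_jars() {
  lib_dir="$1"
  for jar in \
    gsp.jar \
    hsqldb-hsqldb-1.8.0.10.jar \
    mysql-mysql-connector-java-5.1.40.jar \
    ojdbc7.jar \
    terajdbc4.jar
  do
    [ -f "$lib_dir/$jar" ] || echo "$jar"
  done
}
```

For example, `missing_jars wherehows-backend/build/stage/wherehows-backend/lib` prints nothing when all five files are present.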
|
||||
@ -3,6 +3,6 @@ set -a
|
||||
source application.env
|
||||
set +a
|
||||
|
||||
export JAVA_OPTS="-Xms512m -Xmx2048m -Dhttp.port=19001"
|
||||
export JAVA_OPTS="-Xms512m -Xmx2048m -Dhttp.port=9001"
|
||||
|
||||
build/stage/wherehows-backend/bin/playBinary
|
||||
|
||||
@ -1,11 +1,103 @@
|
||||
#WhereHows
|
||||
Enterprise Metadata Platform for Big Data
|
||||
# WhereHows Frontend UI
|
||||
The Web UI provides navigation between the bits of information and the ability to annotate the collected data with comments, ownership and more. The example below collects Hive metadata from the Cloudera Hadoop VM.
|
||||
|
||||
##Dependencies
|
||||
* Java
|
||||
* Scala
|
||||
* Play
|
||||
* Bower
|
||||
* NPM
|
||||
* Ember
|
||||
WhereHows comes in three operational components:
|
||||
- [Backend service](../wherehows-backend/README.md)
|
||||
- **A web-ui service**
|
||||
- Database schema for MySQL
|
||||
|
||||
## Key notes
|
||||
Please become familiar with these pages:
|
||||
- https://github.com/linkedin/WhereHows/wiki/Architecture (Nice tech overview)
|
||||
- https://github.com/linkedin/WhereHows
|
||||
- https://github.com/linkedin/WhereHows/blob/master/wherehows-docs/getting-started.md
|
||||
|
||||
|
||||
## Build
|
||||
```
|
||||
$ ../gradlew build dist
|
||||
|
||||
Starting a Gradle Daemon (subsequent builds will be faster)
|
||||
|
||||
> Task :wherehows-web:bowerInstall
|
||||
bower ember-cli-shims extra-resolution Unnecessary resolution: ember-cli-shims#0.1.3
|
||||
bower bootstrap extra-resolution Unnecessary resolution: bootstrap#3.3.7
|
||||
|
||||
> Task :wherehows-web:emberBuild
|
||||
...
|
||||
- dist/legacy-app/vendors/toastr/toastr.min-c4d50504a82305d607ae5ff7b33e0c39.css: 5.85 KB (2.68 KB gzipped)
|
||||
- dist/legacy-app/vendors/toastr/toastr.min-d59436971aa13b0e0c24d4332543fbef.js: 4.87 KB (1.91 KB gzipped)
|
||||
|
||||
|
||||
BUILD SUCCESSFUL in 56s
|
||||
21 actionable tasks: 9 executed, 12 up-to-date
|
||||
```
|
||||
|
||||
## Install (In Production)
|
||||
Download or upload the distribution binaries, then unzip them to
|
||||
**/opt/wherehows/wherehows-frontend**
|
||||
|
||||
|
||||
Create temp space for wherehows
|
||||
```
|
||||
$ sudo mkdir /var/tmp/wherehows
|
||||
$ sudo chmod a+rw /var/tmp/wherehows
|
||||
$ sudo mkdir /var/tmp/wherehows/resource
|
||||
```
|
||||
|
||||
```
|
||||
$ cd /opt/wherehows/wherehows-frontend
|
||||
```
|
||||
|
||||
## Configuration
|
||||
The frontend has a separate configuration file in **wherehows-frontend/application.env**
|
||||
```
|
||||
# Secret Key
|
||||
WHZ_SECRET="change_me"
|
||||
|
||||
# Database Connection
|
||||
WHZ_DB_NAME="wherehows"
|
||||
WHZ_DB_USERNAME="wherehows"
|
||||
WHZ_DB_PASSWORD="wherehows"
|
||||
|
||||
# Fully qualified jdbc url
|
||||
WHZ_DB_URL="jdbc:mysql://localhost/wherehows"
|
||||
|
||||
# Search Engine
|
||||
WHZ_SEARCH_ENGINE=elasticsearch
|
||||
|
||||
# Elasticsearch (change "localhost" to your ES host)
|
||||
WHZ_ES_DATASET_URL="http://localhost:9200/wherehows/dataset/_search"
|
||||
WHZ_ES_METRIC_URL="http://localhost:9200/wherehows/metric/_search"
|
||||
WHZ_ES_FLOW_URL="http://localhost:9200/wherehows/flow_jobs/_search"
|
||||
|
||||
# LDAP
|
||||
WHZ_LDAP_URL=your_ldap_url
|
||||
WHZ_LDAP_PRINCIPAL_DOMAIN=your_ldap_principal_domain
|
||||
WHZ_LDAP_SEARCH_BASE=your_ldap_search_base
|
||||
|
||||
# Piwik tracking configuration
|
||||
PIWIK_SITE_ID="0000" # change_to_your_piwik_id
|
||||
PIWIK_URL="change_to_your_piwik_url"
|
||||
|
||||
```
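Several values in the sample file are placeholders (`change_me`, `0000`, `change_to_your_piwik_url`, `your_ldap_url`). The sketch below lists settings that still hold those sample values after the file has been loaded. The function name `placeholder_settings` is hypothetical, and whether a given setting must be changed depends on your deployment.

```shell
# Prints the names of settings that are unset or still equal to the sample values.
# Variable names and placeholder strings are copied from the sample file above.
placeholder_settings() {
  if [ "${WHZ_SECRET:-change_me}" = "change_me" ]; then
    echo "WHZ_SECRET"
  fi
  if [ "${PIWIK_SITE_ID:-0000}" = "0000" ]; then
    echo "PIWIK_SITE_ID"
  fi
  if [ "${PIWIK_URL:-change_to_your_piwik_url}" = "change_to_your_piwik_url" ]; then
    echo "PIWIK_URL"
  fi
  if [ "${WHZ_LDAP_URL:-your_ldap_url}" = "your_ldap_url" ]; then
    echo "WHZ_LDAP_URL"
  fi
}
```

Run it after sourcing `application.env`; an empty result means none of the four checked settings still carries its sample value.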
|
||||
|
||||
|
||||
## Run
|
||||
To run the frontend app, go to **wherehows-frontend**:
|
||||
```
|
||||
$ ./runFrontend
|
||||
|
||||
NettyServer.main is deprecated. Please start your Play server with the
|
||||
2017-08-02 14:19:58 INFO p.a.Play:97 - Application started (Prod)
|
||||
2017-08-02 14:19:58 INFO p.c.s.NettyServer:165 - Listening for HTTP on /0:0:0:0:0:0:0:0:9001
|
||||
|
||||
```
|
||||
|
||||
Open a browser to ```http://<edge node>:9000/```
|
||||
This will show the WhereHows login page.
|
||||
|
||||
## Troubleshooting
|
||||
- To log in the first time to the web UI:
|
||||
You have to create an account. In the upper right corner there is a "Not a member yet? Join Now" link. Click on that and get a form to fill out.
|
||||
|
||||
|
||||
@ -3,6 +3,6 @@ set -a
|
||||
source ./application.env
|
||||
set +a
|
||||
|
||||
export JAVA_OPTS="-Xms512m -Xmx2048m -Dhttp.port=9001"
|
||||
export JAVA_OPTS="-Xms512m -Xmx2048m -Dhttp.port=9000"
|
||||
|
||||
build/stage/wherehows-frontend/bin/playBinary
|
||||