# LinkedIn WhereHows - a metadata data warehouse
WhereHows works by sending out 'crawlers' to capture metadata from databases, HDFS, directory services, schedulers, and data integration tools. The collected metadata is loaded into an integrated data warehouse. WhereHows provides a web UI service and a backend service.
Wherehows comes in three operational components:
- **Backend service**
- [A web-ui service](../wherehows-frontend/README.md)
- Database schema for MySQL

The backend service provides the RESTful API and, more importantly, runs the ETL jobs that gather the metadata. It relies heavily on the MySQL wherehows database instance, both for its configuration information and as the place where the collected metadata lands.
Configuration notes

MySQL instance that holds the WhereHows metadata database:
```
host: <mysqlhost>
db: wherehows
user: wherehows
pass: wherehows
```
Wherehows application directory (in test):
```
Host: <edge node>
Folder: /opt/wherehows
```
## Key notes
Please become familiar with these pages:
- https://github.com/linkedin/WhereHows/wiki/Architecture (Nice tech overview)
- https://github.com/linkedin/WhereHows
- https://github.com/linkedin/WhereHows/blob/master/wherehows-docs/getting-started.md
### Build
```
$ ./gradlew build dist
```
### Install (In Production)
Download or upload the distribution binaries and unzip them to:
```
/opt/wherehows/wherehows-backend
```
Create temporary space for WhereHows:
```
$ sudo mkdir /var/tmp/wherehows
$ sudo chmod a+rw /var/tmp/wherehows
$ sudo mkdir /var/tmp/wherehows/resource
```
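Before starting the service, it can help to confirm that the temporary space is actually usable. The following is a minimal sketch, not part of WhereHows: the function name `check_tmp_space` is hypothetical, and it only assumes the two directories created above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with WhereHows): verify that the temp
# directory and its "resource" subdirectory exist and are writable by the
# current user.
# Usage: check_tmp_space [base-dir]   (defaults to /var/tmp/wherehows)
check_tmp_space() {
    base="${1:-/var/tmp/wherehows}"
    for dir in "$base" "$base/resource"; do
        if [ ! -d "$dir" ]; then
            echo "not a directory: $dir"
            return 1
        fi
        if [ ! -w "$dir" ]; then
            echo "not writable: $dir"
            return 1
        fi
    done
    echo "ok: $base"
}
```

Run `check_tmp_space` as the user that will start the backend service; a non-zero exit status means one of the directories is missing or not writable for that user.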
```
$ cd /opt/wherehows/wherehows-backend
```
The Hive metastore properties (the metastore is a MySQL database) need to match the Hadoop cluster:
```
Host <metastore host>
Port 3306
Username hive
Password hive
URL jdbc:mysql://<metastore host>:3306/metastore
```
Set the Hive metastore driver class to `com.mysql.jdbc.Driver` and set the other properties according to your configuration.
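Stray spaces in the JDBC URL are an easy mistake to make, so it may be convenient to assemble the URL from its parts. The sketch below is illustrative only: `metastore_url` is a hypothetical helper, and the defaults (port 3306, database `metastore`) are taken from the example values above.

```shell
#!/bin/sh
# Hypothetical helper: build the Hive metastore JDBC URL from its parts.
# Usage: metastore_url <host> [port] [database]
# Port defaults to 3306 and database to "metastore", as in the example above.
metastore_url() {
    host="$1"
    port="${2:-3306}"
    db="${3:-metastore}"
    printf 'jdbc:mysql://%s:%s/%s\n' "$host" "$port" "$db"
}
```

For example, `metastore_url "<metastore host>"` prints the URL in the form shown in the property list, with your host substituted for the placeholder.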
### Run
To run the backend service, first set the variables in application.env to configure the application.
Then start the backend service application on port 9001 (from the wherehows-backend folder):
```
$ ./runBackend
```
Open a browser to `http://<edge node>:9001/`.
The page shows 'TEST'. This is the RESTful API endpoint.
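The same check can be done from a terminal. The sketch below assumes `curl` is installed; `check_backend` is a hypothetical helper name, and the only thing it relies on from this document is that the root URL returns a body containing 'TEST'.

```shell
#!/bin/sh
# Hypothetical helper: fetch the backend root URL and confirm that the
# response body contains "TEST".
# Usage: check_backend "http://<edge node>:9001/"
check_backend() {
    url="$1"
    # -f: fail on HTTP errors, -sS: quiet but still report errors,
    # --max-time: give up after 10 seconds.
    body=$(curl -fsS --max-time 10 "$url") || {
        echo "request failed: $url"
        return 1
    }
    case "$body" in
        *TEST*)
            echo "backend is up: $url"
            ;;
        *)
            echo "unexpected response from $url"
            return 1
            ;;
    esac
}
```

A non-zero exit status means either that the request failed (service not running, wrong host or port) or that something other than the backend answered on that port.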
## Next steps
- Once the Hive ETL is fully fleshed out, look at the HDFS metadata ETL
- Configure multiple Hive and HDFS jobs to gather data from all Hadoop clusters
- Add additional crawlers for Oracle, Teradata, ETL tools, and schedulers
## Troubleshooting
- Compile error with the following messages:
```
TAliasClause aliasClouse = tablelist.getTable(i).getAliasClause();
^
symbol: class TAliasClause
location: class UpdateStmt
...
* What went wrong:
Execution failed for task ':wherehows-etl:compileJava'.
> Compilation failed; see the compiler error output for details.
```
To fix this, install the extra libraries: [Install extra libs](https://github.com/linkedin/WhereHows/tree/master/wherehows-etl/extralibs)
- Other runtime library failures:
Ensure these JAR files are present in **wherehows-backend/build/stage/wherehows-backend/lib**
```
...
gsp.jar
hsqldb-hsqldb-1.8.0.10.jar
mysql-mysql-connector-java-5.1.40.jar
ojdbc7.jar
terajdbc4.jar
...
```
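This check can be scripted. The sketch below is a hypothetical helper (`check_jars` is not part of WhereHows) that looks only for the JAR files named explicitly above; the entries elided with `...` are not covered.

```shell
#!/bin/sh
# Hypothetical helper: report which of the JAR files named in this README
# are missing from a lib directory. Returns non-zero if any is missing.
# Usage: check_jars <lib-dir>
check_jars() {
    lib_dir="$1"
    missing=0
    for jar in \
        gsp.jar \
        hsqldb-hsqldb-1.8.0.10.jar \
        mysql-mysql-connector-java-5.1.40.jar \
        ojdbc7.jar \
        terajdbc4.jar
    do
        if [ ! -f "$lib_dir/$jar" ]; then
            echo "missing: $jar"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, run `check_jars wherehows-backend/build/stage/wherehows-backend/lib` from the repository root after the build; it prints one line per missing file and nothing when all five are present.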