* Allow the logback directory for ETL jobs to be overridden using a system property.
See https://logback.qos.ch/manual/configuration.html#variableSubstitution for more details.
* Add WHZ_ETL_TEMP_DIR env var and Play config to control where ETL job logs and temp files are saved.
This enables us to move away from the default /var/tmp/wherehows directory.
- Move logback.xml in metadata-etl to etl_logback.xml under backend/conf to avoid multiple logback configs on the classpath. ETL jobs are able to write to their own log file again.
- Replace the generated single-string command with a String[] and invoke Runtime.getRuntime().exec(String[])
- When loading dataset fields into the staging table, populate the dataset_id field first, then use it in later JOINs.
- When joining two big tables such as dict_field_detail, use a pre-select to reduce the JOIN size and DB resource usage.
- Refactor some SQL code.
- Modify logback setting to better capture log time.
- Remove unnecessary config in backend application.conf
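
The switch from a single command string to `String[]` noted above can be sketched as follows. This is a minimal illustration, not the project's actual launcher: the class name, the method names, and the `etl.temp.dir` property name are hypothetical. The point it shows is that `Runtime.exec(String[])` passes each array element as one argument, so a value containing spaces (such as a directory path) is not re-split the way a single command string would be.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class EtlCommandSketch {

    // Builds the argument vector with one element per argument.
    // All parameter names and the "etl.temp.dir" property are illustrative assumptions.
    static String[] buildCommand(String javaBin, String tempDir, String jarPath, String jobId) {
        List<String> cmd = new ArrayList<>();
        cmd.add(javaBin);
        cmd.add("-Detl.temp.dir=" + tempDir); // stays one argument even if tempDir has spaces
        cmd.add("-jar");
        cmd.add(jarPath);
        cmd.add(jobId);
        return cmd.toArray(new String[0]);
    }

    // Launches the command and returns its exit code.
    // Note: stdout/stderr are not drained here; a real launcher should
    // consume or redirect them so the child cannot block on a full pipe.
    static int run(String[] command) throws IOException, InterruptedException {
        Process process = Runtime.getRuntime().exec(command);
        return process.waitFor();
    }
}
```

With the single-string overload, `Runtime.exec(String)` tokenizes the command on whitespace, which is why arguments with embedded spaces tend to break there.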
* Use ProcessBuilder and redirected log file for HDFS Extract
* Relax URN validation rule
* Continue processing if the Hive SQL parser encounters an error
* Reformat ETL job log messages
* Add API to find dataset dependents, such as which Hive tables are based on an HDFS path
* Add multiproduct and git repo metadata etl job
* Extract the commit hash and use it when querying ACL
* Use FileWriter to write records into CSV file
* Remove unnecessary log entries from kafka processor
* Fix the incompatibility between integer repo_id in db and string field in record
* Merge API tables into the existing dataset owner and schema field tables
* Add confidential and recursive columns to dict_dataset_field
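
For the "Use ProcessBuilder and redirected log file for HDFS Extract" item above, a rough sketch of the general technique is shown here. The class and method names are hypothetical, and the actual HDFS Extract command and log file path are not given in these notes. The sketch only shows how `ProcessBuilder` can send a child process's stdout and stderr to a log file instead of leaving them in pipes the parent must read.

```java
import java.io.File;
import java.io.IOException;
import java.util.List;

class RedirectedProcessSketch {

    // Configures a child process whose stdout and stderr are both
    // appended to logFile. The command and file are caller-supplied
    // placeholders, not the real extract invocation.
    static ProcessBuilder configure(List<String> command, File logFile) {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout
        builder.redirectOutput(ProcessBuilder.Redirect.appendTo(logFile));
        return builder;
    }

    // Starts the process and blocks until it exits, returning the exit code.
    static int runAndWait(List<String> command, File logFile)
            throws IOException, InterruptedException {
        Process process = configure(command, logFile).start();
        return process.waitFor();
    }
}
```

Because the output goes straight to the file, the parent does not need reader threads to keep the child from blocking on a full pipe buffer.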