* Allow the logback log directory for ETL jobs to be overridden via a system property.
See https://logback.qos.ch/manual/configuration.html#variableSubstitution for more details.
* Add the WHZ_ETL_TEMP_DIR env var and a Play config entry to control where ETL job logs & temp files are saved.
This enables us to move away from the default /var/tmp/wherehows directory.
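  A minimal sketch of how the lookup could work on the Java side — the env var name and the default directory come from the notes above, but the helper itself is illustrative, not the actual WhereHows code:

  ```java
  // Illustrative helper: resolve the ETL temp/log directory from the
  // WHZ_ETL_TEMP_DIR environment variable, falling back to the old default.
  public class EtlTempDir {
    static final String DEFAULT_DIR = "/var/tmp/wherehows";

    static String resolve() {
      String fromEnv = System.getenv("WHZ_ETL_TEMP_DIR");
      return (fromEnv == null || fromEnv.isEmpty()) ? DEFAULT_DIR : fromEnv;
    }

    public static void main(String[] args) {
      System.out.println("ETL temp dir: " + resolve());
    }
  }
  ```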
* Read the master key from an environment variable instead of from a local file. This would ultimately allow us to pass it in via cfg2.
* Move the env var name to Constant.java
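  A minimal sketch of the env-var lookup, assuming a hypothetical variable name for the constant kept in Constant.java:

  ```java
  // "WHZ_MASTER_KEY" is a hypothetical placeholder for the name that
  // Constant.java actually defines.
  public class MasterKeyExample {
    static final String MASTER_KEY_ENV_VAR = "WHZ_MASTER_KEY"; // hypothetical name

    static String readMasterKey() {
      String key = System.getenv(MASTER_KEY_ENV_VAR);
      if (key == null || key.isEmpty()) {
        throw new IllegalStateException(MASTER_KEY_ENV_VAR + " is not set");
      }
      return key;
    }

    public static void main(String[] args) {
      System.out.println("Master key loaded, length: " + readMasterKey().length());
    }
  }
  ```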
- Move logback.xml in metadata-etl to etl_logback.xml under backend/conf to avoid having multiple logback configs in the classpath. ETL jobs are able to write to their own log files again.
- Replace the generated single-string command with a String[] and invoke Runtime.getRuntime().exec(String[]); see the sketch below.
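  The motivation, in a minimal sketch: the single-string form of exec naively splits the command on whitespace, so arguments containing spaces get mangled, while the String[] form passes each argument through intact. The command below is only an example:

  ```java
  import java.io.BufferedReader;
  import java.io.InputStreamReader;

  public class ExecArrayExample {
    public static void main(String[] args) throws Exception {
      // Each element is one argument; no re-tokenization happens.
      String[] cmd = {"ls", "-l", "/var/tmp/wherehows"};
      Process p = Runtime.getRuntime().exec(cmd);
      try (BufferedReader out =
               new BufferedReader(new InputStreamReader(p.getInputStream()))) {
        String line;
        while ((line = out.readLine()) != null) {
          System.out.println(line);
        }
      }
      System.out.println("exit code: " + p.waitFor());
    }
  }
  ```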
Benefits
1. Simpler setup - no need to download Activator to build & run
2. Faster build - See https://engineering.linkedin.com/play/developing-play-applications-using-gradle
3. Streamlined dependency management - Everything defined in build.gradle, instead of build.gradle + build.sbt
4. Better integration with gradle lifecycle tasks - build, test, dist, clean all work as expected
Changes
1. Location of staging & distribution files moved from target to build
2. Use ./gradlew -t runPlayBinary to run the app with hot-reload support
3. The generated start scripts are quite different from those generated by sbt
* Add a playCoverage task to run code coverage with JaCoCo for backend and web.
* Add a jacocoTestReport task to run code coverage for the TestNG-based tests in wherehows-common & metadata-etl.
- When loading dataset fields into the staging table, populate the dataset_id field first, then use it in the later JOIN.
- When JOINing two big tables such as dict_field_detail, use a pre-select to reduce the JOIN size and DB resource usage (see the sketch after this list).
- Refactor some SQL code.
- Modify the logback settings to better capture log timestamps.
- Remove unnecessary config in backend application.conf
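  A minimal sketch of the pre-select pattern via JDBC. Only the dict_field_detail table name comes from the notes above; the staging table and all column names are hypothetical. Filtering each large table into a derived table first means the database joins two small row sets instead of two full tables, and the JOIN keys off the dataset_id populated in the earlier step:

  ```java
  import java.sql.Connection;
  import java.sql.PreparedStatement;
  import java.sql.ResultSet;
  import java.sql.SQLException;

  public class PreSelectJoinSketch {
    // Pre-select both sides into derived tables before the JOIN.
    // stg_dict_field_detail and the column names are hypothetical.
    private static final String SQL =
        "SELECT s.field_id, d.field_id AS existing_field_id "
      + "FROM (SELECT field_id, dataset_id, field_name "
      + "        FROM stg_dict_field_detail WHERE db_id = ?) s "
      + "JOIN (SELECT field_id, dataset_id, field_name "
      + "        FROM dict_field_detail WHERE dataset_id IS NOT NULL) d "
      + "  ON s.dataset_id = d.dataset_id AND s.field_name = d.field_name";

    static void matchStagingFields(Connection conn, int dbId) throws SQLException {
      try (PreparedStatement ps = conn.prepareStatement(SQL)) {
        ps.setInt(1, dbId);
        try (ResultSet rs = ps.executeQuery()) {
          while (rs.next()) {
            long stagingFieldId = rs.getLong("field_id");
            long existingFieldId = rs.getLong("existing_field_id");
            // ... update the staging row with the matched field id
          }
        }
      }
    }
  }
  ```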
* Update MetadataChangeEvent APIs according to schema change
* Update MultiproductLoad to reflect new Owner types
* Add comments for Owner_type precedence (priority) and compliance
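  To illustrate what a precedence comment can encode, a hypothetical enum — the type names and their ordering here are invented for the sketch, not WhereHows's actual owner types:

  ```java
  // Smaller priority value wins when multiple owner records conflict.
  public enum OwnerType {
    OWNER(1),       // highest precedence (hypothetical ordering)
    PRODUCER(2),
    DELEGATE(3),
    STAKEHOLDER(4); // lowest precedence

    private final int priority;

    OwnerType(int priority) {
      this.priority = priority;
    }

    public int getPriority() {
      return priority;
    }
  }
  ```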
* Use ProcessBuilder and a redirected log file for the HDFS extract job; see the sketch below.
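  A minimal sketch of the ProcessBuilder approach — the command line and log path are placeholders, not the real extract invocation:

  ```java
  import java.io.File;

  public class HdfsExtractLauncher {
    public static void main(String[] args) throws Exception {
      File log = new File("/var/tmp/wherehows/hdfs_extract.log");
      // Placeholder command for the real HDFS extract job.
      ProcessBuilder pb = new ProcessBuilder("hadoop", "jar", "hdfs-extract.jar");
      pb.redirectErrorStream(true);                             // fold stderr into stdout
      pb.redirectOutput(ProcessBuilder.Redirect.appendTo(log)); // stream output to the log file
      int exitCode = pb.start().waitFor();
      System.out.println("HDFS extract exited with code " + exitCode);
    }
  }
  ```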
* Relax the URN validation rule
* Continue processing if the Hive SQL parser encounters an error; see the sketch below
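  A minimal sketch of the keep-going behavior — the parse step here is a stand-in for the real Hive SQL parser: catch the per-statement failure, log it, and move on to the rest of the batch instead of aborting the whole ETL job.

  ```java
  import java.util.Arrays;
  import java.util.List;

  public class TolerantHiveParsing {
    // Stand-in for the real parser; rejects a marker string for demo purposes.
    static void parse(String sql) {
      if (sql.contains("???")) {
        throw new IllegalArgumentException("unparseable: " + sql);
      }
    }

    public static void main(String[] args) {
      List<String> statements = Arrays.asList("SELECT 1", "??? bad ???", "SELECT 2");
      for (String sql : statements) {
        try {
          parse(sql);
          System.out.println("parsed: " + sql);
        } catch (Exception e) {
          // Log and skip the offending statement; keep processing the rest.
          System.err.println("Hive SQL parse failed, skipping: " + sql
              + " (" + e.getMessage() + ")");
        }
      }
    }
  }
  ```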
* Reformat ETL job log messages
* Add an API to find dataset dependents, such as which Hive tables are based on an HDFS path