Mars Lan
e36a40cd65
Generate code coverage reports ( #334 )
...
* Add playCoverage task to run code coverage using JaCoCo for backend and web.
* Add jacocoTestReport task to run code coverage for TestNG-based tests in wherehows-common & metadata-etl.
2017-07-10 09:53:28 -07:00
Mars Lan
bcc3cd9f76
Make unit tests buildable again for backend and web ( #325 )
...
* Make unit tests buildable again for backend and web.
* Add back fest dependency so the tests can stay more or less the same as before.
2017-07-10 09:53:28 -07:00
Naga Srinivas Vemuri
803e3added
Modify /dataset POST method to perform INSERT or UPDATE of the DatasetRecord
2017-07-10 09:53:25 -07:00
Christopher Chiche
d064e7bc47
Fix title in backend's README ( #552 )
2017-06-08 08:09:47 -07:00
Shuya Tsukamoto
33e04a585a
Make it possible to change settings via environment variables ( #533 )
2017-05-26 10:28:05 -07:00
Shuya Tsukamoto
53fe63680f
Add a mkdir command for the TreeBuilder output. ( #499 )
2017-05-18 13:54:56 -07:00
Yi (Alan) Wang
b6e644fbb1
Optimize dataset load scripts, improve speed ( #350 )
...
- When loading dataset fields in staging table, populate the dataset_id field first, then use it in later JOINs.
- When JOINing two big tables such as dict_field_detail, use pre-select to reduce table JOIN size and DB resource usage.
- Refactor some SQL code.
- Modify logback setting to better capture log time.
- Remove unnecessary config in backend application.conf
2017-03-22 10:23:30 -07:00
Yi (Alan) Wang
66a8eea21b
Fix issues from Oracle MetadataChangeEvent integration ( #336 )
...
* Fix issues from Oracle MetadataChangeEvent integration
2017-03-14 17:19:30 -07:00
Yi (Alan) Wang
4f873a919a
Fix bugs found by AppCheck in issue #328 ( #335 )
2017-02-24 14:20:56 -08:00
Yi Wang
14824c06bb
Change sleep to 10s after etl job init error
2017-01-30 09:27:42 -08:00
Yi (Alan) Wang
665a5dbded
Add retry for ETL jobs failed at initialization ( #308 )
2017-01-27 11:17:38 -08:00
Yi Wang
ea8f6e8551
Add retry for ETL jobs failed at initialization
2017-01-20 14:11:45 -08:00
Yi (Alan) Wang
e07306b51e
Update MetadataChangeEvent, separate privacy compliance from security ( #275 )
2016-11-11 17:25:41 -08:00
Yi Wang
b4f5e438e2
Add JobExecutionLineageEvent and kafka processor
2016-11-08 19:11:37 -08:00
Yi (Alan) Wang
e34bbcc629
Update README.md ( #264 )
2016-11-02 13:48:22 -07:00
Yi (Alan) Wang
dca47a3b75
Merge pull request #254 from alyiwang/master
...
Upgrade to play 2.4.8
2016-10-20 13:18:58 -07:00
Douglas Moore
53f6622ed8
Update README.md ( #252 )
...
Remove backlink to my github account.
2016-10-19 18:37:40 -07:00
Yi Wang
664e4072bb
Upgrade to play 2.4.8
2016-10-19 17:42:28 -07:00
Yi Wang
3227412339
Login authentication support multiple LDAP servers, add login history
2016-10-13 14:30:43 -07:00
Yi Wang
fcd6cf149e
Update MetastoreAuditProcessor to reduce storage, also refactor some code
2016-10-11 11:26:36 -07:00
Yi Wang
5049c847fa
Update Kafka consumer actors to reduce memory usage
2016-10-10 14:49:14 -07:00
Yi (Alan) Wang
c9dfb637af
Update MetadataChangeEvent APIs according to schema change ( #243 )
...
* Update MetadataChangeEvent APIs according to schema change
* Update MultiproductLoad to reflect new Owner types
* Add comments for Owner_type precedence (priority) and compliance
2016-10-06 13:33:45 -07:00
jbai
a11e4908dc
Track the GobblinTrackingEvent_audit to get owner information
2016-09-29 15:01:32 -07:00
Yi Wang
ac34eb683f
Update Kafka processor casting Object to String, also add debug info if can't fetch schema from Registry
2016-09-26 15:06:33 -07:00
Yi Wang
1ad2b1528e
logback redirect ETL job logs into corresponding files
2016-09-23 16:54:52 -07:00
Yi (Alan) Wang
753de7de7c
Merge pull request #233 from alyiwang/master
...
Update backend APIs to cast SQL results back to Java record then to Json
2016-09-21 08:59:04 -07:00
Eric Sun
89ff794ddf
Add api to get dependents of a dataset ( #232 )
...
* Use ProcessBuilder and redirected log file for HDFS Extract
* relax urn validation rule
* continue process if hive sql parser encounters error
* reformat etl job log message
* add API to find dataset dependents, such as which hive tables are based on an hdfs path
2016-09-21 08:55:44 -07:00
Yi Wang
be65efb0cc
Update backend APIs to cast SQL results back to Java record then serialize to Json reply
2016-09-20 18:56:49 -07:00
Yi Wang(Data Infrastructure)
1171e00097
Add REST proxy for Security API from backend to web
2016-09-19 18:14:10 -07:00
Yi Wang
b136fc6c37
Add MetadataInventoryEvent processor and API
2016-09-15 09:22:42 -07:00
Eric Sun
86bf71499f
Reformat the ETL job info message in log. ( #222 )
...
* Use ProcessBuilder and redirected log file for HDFS Extract
* relax urn validation rule
* continue process if hive sql parser encounters error
* reformat etl job log message
2016-09-13 14:01:14 -07:00
Yi Wang
5ce5a1425e
Add hostname and process_id to wh_etl_job_execution
2016-09-12 16:09:33 -07:00
Yi Wang
5515cbdde9
Add MetadataChangeEvent processor to call separate APIs
2016-09-06 16:41:50 -07:00
Eric Sun
0ac00e1af3
Update README.md
2016-09-02 09:35:13 -07:00
Douglas Moore
d44c194529
Backend service readme ( #215 )
...
* Update README.md
* Update README
* Rename README to README.me
* Rename README.me to README.md
* Update README.md
2016-09-02 09:31:52 -07:00
Yi (Alan) Wang
579b8fc9d7
Add metadataChangeEvent APIs to backend-service ( #205 )
...
* Add multiproduct and git repo metadata etl job
* Extract commit hash and use it when querying ACL
* Use FileWriter to write records into CSV file
* Remove unnecessary log entries from kafka processor
* Fix the incompatibility between integer repo_id in db and string field in record
* merge API tables to existing dataset owner and schema field table
* Add confidential and recursive column to dict_dataset_field
2016-08-24 09:10:35 -07:00
Eric Sun
cd4853d0a5
Use ProcessBuilder and redirected log file for HDFS Extract ( #198 )
...
* Use ProcessBuilder and redirected log file for HDFS Extract
* relax urn validation rule
2016-08-08 14:02:34 -07:00
Yi Wang
c0cfe1f5ca
Modify KafkaConsumerMaster to handle more than one kafka config, add error handling
2016-08-04 13:07:19 -07:00
Yi Wang
3d3b2a8075
Get kafka job id from application.conf and then get ref_id and configs from DB
2016-08-03 18:55:07 -07:00
Eric Sun
ca7542ca40
remove duplicate section
2016-08-03 17:52:49 -07:00
Eric Sun
1cd5872369
temp fix for hdfs_schema_crawler getRuntime().exec() hang problem; exclude log4j
2016-08-03 15:50:00 -07:00
Eric Sun
8c9cb99ba4
primary_dataset_type for cfg_database
2016-08-01 13:20:04 -07:00
Eric Sun
9d2c803f0c
Merge pull request #187 from ericsun2/master
...
Add datacenter, deploymenttier, cluster info to better describe dataset instance
2016-07-28 17:22:32 -07:00
Eric Sun
f745642212
add datacenter, deploymenttier, cluster to describe dataset instance
2016-07-28 16:38:03 -07:00
Yi Wang
74ed769bab
add Oracle dataset metadata ETL job
2016-07-28 14:07:07 -07:00
Yi Wang
7edacc9a9f
get kafka config from wh_etl_job_property
2016-07-26 12:16:34 -07:00
Yi Wang
6d4706bc62
Ingest Gobblin tracking events into wherehows using Kafka consumer client
2016-07-25 15:03:29 -07:00
jbai
7a77aba4b7
merge the pull request 165 to master branch
2016-07-21 10:38:36 -07:00
Naga Srinivas Vemuri
97370ed2e1
Query Dataset properties to retrieve datasetUrns
2016-07-21 11:54:47 +05:30
jbai
6af54658d6
merge Fetching dataset watchers via get /dataset/watchers to main branch
2016-06-30 10:20:54 -07:00