176 Commits

Author SHA1 Message Date
jbai
85bc2db85c add try catch to catch the exception when reading the config properties 2016-07-26 16:53:30 -07:00
Yi Wang
7edacc9a9f get kafka config from wh_etl_job_property 2016-07-26 12:16:34 -07:00
Yi Wang
6d4706bc62 Ingest Gobblin tracking events into wherehows using Kafka consumer client 2016-07-25 15:03:29 -07:00
jbai
9fb5b09bd2 update dependency property name and fix the duplicated key issue when update cfg_object_name_map table 2016-07-20 19:07:16 -07:00
jbai
f3c299480f update the column names from schema to schema_text and view_expanded_text to ddl_text 2016-07-20 18:01:25 -07:00
jbai
33b05cde4b tracking the dalids schema and expanded text by versions 2016-07-20 15:59:11 -07:00
jbai
9166db7563 update the dict_dataset_instance data loading sql since table key changed 2016-06-29 18:00:10 -07:00
Eric Sun
1573fdb212 rename hive dependency to hive_exec; reuse metadata-etl/extralibs; test travis ci; 2016-06-28 18:03:02 -07:00
Eric Sun
5348d44a77 Force object (db.table) names extracted by the getViewDependency() API to lower cases
Object (db.table) names extracted by the getViewDependency() API can contain camel-case strings, which can cause mismatches in the underlying RDBMS.
2016-06-27 16:43:04 -07:00
Eric Sun
3af1ce4202 Ignore non-JSON schema_literal (which is often the error message) 2016-06-27 16:07:33 -07:00
Rafal Kluszczynski
cc13379075 fix: use logback logging provided by play framework (exclude log4j binding) 2016-06-27 11:01:37 +02:00
Eric Sun
56cebd0d9c add dict_dataset_instance and cfg_object_name_map to track dataset on multiple clusters and their replication dependency and view dependency 2016-06-22 19:40:05 -07:00
jbai
0c68d9c4fb fix the dataset field has duplicated records issue 2016-06-16 16:33:37 -07:00
jbai
38fdf1c132 fix the dalids dependency issue and add more log info in elasticsearch 2016-06-16 14:38:01 -07:00
jbai
451054b87b run the dataset tree builder as independent job 2016-06-14 10:32:44 -07:00
jbai
5062fa4ecc load dali depends on and instance into final table 2016-06-09 18:51:05 -07:00
jbai
f344f1cf2e update code to follow the code review 2016-06-07 11:39:07 -07:00
jbai
e5880cf81a fix the merge conflict 2016-06-07 11:31:00 -07:00
jbai
07bb6c98ab merge the latest commit 2016-06-07 11:26:27 -07:00
SunZhaonan
310e6e9f06 Catch hive table even though its schema is None or exception message 2016-06-03 11:30:08 -07:00
SunZhaonan
6d5b4ace5d small bug: set whExecId 2016-06-03 11:05:41 -07:00
jbai
af976c3d5e Dali Metadata integration - combine dali versions into one node 2016-06-02 18:29:44 -07:00
Zhaonan Sun
391d0d27c6 Merge pull request #137 from jerrybai2009/master
add the copy_dict_cursor function since it is used by elastic search refresh
2016-06-01 11:01:31 -07:00
jbai
87367b8404 add the copy_dict_cursor function since it is used by elastic search refresh 2016-06-01 10:59:06 -07:00
Zhaonan Sun
81c0fe226e Merge pull request #133 from SunZhaonan/master
Add local mode for hdfs extract
2016-05-31 15:07:36 -07:00
SunZhaonan
bec1c5cee0 Add local mode for hdfs extract 2016-05-31 14:44:32 -07:00
jbai
f614fe8314 fix the Elastic search index issue 2016-05-31 12:40:08 -07:00
Rafal Kluszczynski
457b0c0a2e refactor: apply code review remarks 2016-05-24 22:39:59 +02:00
Rafal Kluszczynski
ee1e36219a fix: generate tree also when elastic is not configured 2016-05-24 16:04:31 +02:00
jerrybai2009
8a9eeb1bb8 Merge pull request #126 from jerrybai2009/master
support the elasticsearch as search engine
2016-05-23 18:12:55 -07:00
jbai
a2e42d60f3 add the elasticsearch index build and update file 2016-05-23 17:58:37 -07:00
SunZhaonan
fb1198e4ae Fix hive field ETL bug 2016-05-20 17:52:45 -07:00
SunZhaonan
0b5c421311 Fix Hive column parser parent path bug 2016-05-19 16:36:30 -07:00
SunZhaonan
9d6a1b2649 Add optional config of ETL job white list 2016-05-12 16:28:23 -07:00
SunZhaonan
22fcc7ebcb paginate the LDAP fetch process 2016-05-04 17:15:45 -07:00
SunZhaonan
31de21ddcf pass parameter through file. 2016-05-03 16:25:56 -07:00
Zhaonan Sun
a7187a42bf Merge pull request #108 from SunZhaonan/master
Innodb engine DDL. Add config for timeout and load sample.
2016-04-06 15:07:44 -07:00
Arkadiusz Osinski
4d9f1681f0 missing letter in property name hive.metastore.username 2016-04-06 08:46:33 +02:00
SunZhaonan
b202832741 Innodb engine DDL. Add config for timeout and load sample. 2016-04-05 12:43:02 -07:00
SunZhaonan
c66b00e2f6 Fix dataset insert API bug. Fix load sql bug. 2016-03-28 16:27:43 -07:00
SunZhaonan
4a4894a192 Use Kerberos login 2016-03-17 12:31:58 -07:00
SunZhaonan
a0b7cb9d57 Fix process hanging bug. Add hive field ETL process. 2016-03-16 19:12:21 -07:00
SunZhaonan
aff8f323e4 Scheduler check previous job is finished. Redirect remote outputstream into log. Fix avro parser bugs 2016-03-16 19:09:53 -07:00
SunZhaonan
6b024196cd Fix Hive extract disorder bug. Add Hive database optional whitelist params 2016-03-16 19:09:52 -07:00
SunZhaonan
4b3b344b96 enable travis 2016-03-15 16:20:04 -07:00
SunZhaonan
c4671d2579 Add field comments ETL
Fix API bug of tech_matrix_id
Add key in comment table
2016-03-14 14:23:33 -07:00
SunZhaonan
5e9ae37952 Change to multi processing instead of multi thread. Fix hive ETL bug 2016-02-29 16:37:03 -08:00
SunZhaonan
8a76f9f931 Move test config file to a external file 2016-02-29 16:37:03 -08:00
SunZhaonan
4574e89de9 Fix sample data schema inconsistency bug. Add clean up. Parameterize number of Actor 2016-02-22 17:40:07 -08:00
Zhaonan Sun
8921cc0375 Merge pull request #33 from SunZhaonan/master
Parameterize dataset source derived process. Fix bug in API. Exclude conf file when build.
2016-02-18 16:45:45 -08:00