Shuya Tsukamoto
e0f324c8e4
Implement ParquetFileAnalyzer for hadoop-dataset-extractor-standalone (#483)
* Implement ParquetFileAnalyzer for hadoop-dataset-extractor-standalone
* Move location
* Update build.gradle
2017-05-04 10:08:51 -07:00
Eric Sun
7b36d09b58
Add get_schema_literal_from_url() to fetch schema literal based on schema url (#268)
* use schema_url_helper to fetch avro schema from hdfs or http location
* trim space
* add dfs.namenode.kerberos.principal.pattern; include htrace for SchemaUrlHelper
2016-11-07 08:14:45 -08:00
Yi Wang
664e4072bb
Upgrade to play 2.4.8
2016-10-19 17:42:28 -07:00
Eric Sun
53d40c8392
add a few new hdfs directory patterns
2016-08-03 16:16:58 -07:00
Eric Sun
1cd5872369
temp fix for hdfs_schema_crawler getRuntime().exec() hangs problem; exclude log4j
2016-08-03 15:50:00 -07:00
Eric Sun
1573fdb212
rename hive dependency to hive_exec; reuse metadata-etl/extralibs; test travis ci;
2016-06-28 18:03:02 -07:00
SunZhaonan
4a4894a192
Use Kerberos login
2016-03-17 12:31:58 -07:00
SunZhaonan
a0b7cb9d57
Fix process hanging bug. Add hive field ETL process.
2016-03-16 19:12:21 -07:00
SunZhaonan
aff8f323e4
Scheduler check previous job is finished. Redirect remote outputstream into log. Fix avro parser bugs
2016-03-16 19:09:53 -07:00
SunZhaonan
dfeefba213
Parameterize dataset source derived process.
2016-02-17 16:06:24 -08:00
SunZhaonan
de4d4cd0c1
Add documentation on important Constants and Classes
2016-02-12 16:57:12 -08:00
SunZhaonan
b5d7c38b7d
Eclipse integration. Resolve circular dependency. wherehows-common-test configure.
2016-02-09 15:50:49 -08:00
Zhen Chen
5bfb5adb71
close input stream and fix import older version jar in the standalone module
2015-12-03 16:59:38 -08:00
SunZhaonan
d5c3d87d00
Initial commit
2015-11-19 14:39:21 -08:00