Shuya Tsukamoto
e0f324c8e4
Implement ParquetFileAnalyzer for hadoop-dataset-extractor-standalone ( #483 )
* Implement ParquetFileAnalyzer for hadoop-dataset-extractor-standalone
* Move location
* Update build.gradle
2017-05-04 10:08:51 -07:00
Eric Sun
7b36d09b58
Add get_schema_literal_from_url() to fetch schema literal based on schema url ( #268 )
* Use schema_url_helper to fetch the Avro schema from an HDFS or HTTP location
* Trim whitespace
* Add dfs.namenode.kerberos.principal.pattern; include HTrace for SchemaUrlHelper
2016-11-07 08:14:45 -08:00
Eric Sun
53d40c8392
Add a few new HDFS directory patterns
2016-08-03 16:16:58 -07:00
SunZhaonan
4a4894a192
Use Kerberos login
2016-03-17 12:31:58 -07:00
SunZhaonan
a0b7cb9d57
Fix a process-hanging bug. Add the Hive field ETL process.
2016-03-16 19:12:21 -07:00
SunZhaonan
aff8f323e4
Make the scheduler check that the previous job has finished. Redirect the remote output stream into the log. Fix Avro parser bugs.
2016-03-16 19:09:53 -07:00
SunZhaonan
dfeefba213
Parameterize the dataset source derivation process.
2016-02-17 16:06:24 -08:00
SunZhaonan
de4d4cd0c1
Add documentation on important Constants and Classes
2016-02-12 16:57:12 -08:00
SunZhaonan
d5c3d87d00
Initial commit
2015-11-19 14:39:21 -08:00