9 Commits

e0f324c8e4  Shuya Tsukamoto  2017-05-04 10:08:51 -07:00
  Implement ParquetFileAnalyzer for hadoop-dataset-extractor-standalone (#483)
  * Implement ParquetFileAnalyzer for hadoop-dataset-extractor-standalone
  * Move location
  * Update build.gradle

7b36d09b58  Eric Sun  2016-11-07 08:14:45 -08:00
  Add get_schema_literal_from_url() to fetch schema literal based on schema url (#268)
  * use schema_url_helper to fetch avro schema from hdfs or http location
  * trim space
  * add dfs.namenode.kerberos.principal.pattern; include htrace for SchemaUrlHelper

53d40c8392  Eric Sun  2016-08-03 16:16:58 -07:00
  add a few new hdfs directory patterns

4a4894a192  SunZhaonan  2016-03-17 12:31:58 -07:00
  Use Kerberos login

a0b7cb9d57  SunZhaonan  2016-03-16 19:12:21 -07:00
  Fix process hanging bug. Add hive field ETL process.

aff8f323e4  SunZhaonan  2016-03-16 19:09:53 -07:00
  Scheduler check previous job is finished. Redirect remote outputstream into log. Fix avro parser bugs

dfeefba213  SunZhaonan  2016-02-17 16:06:24 -08:00
  Parameterize dataset source derived process.

de4d4cd0c1  SunZhaonan  2016-02-12 16:57:12 -08:00
  Add documentation on important Constants and Classes

d5c3d87d00  SunZhaonan  2015-11-19 14:39:21 -08:00
  Initial commit
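
Commit 7b36d09b58 (#268) above describes a get_schema_literal_from_url() helper that resolves a schema URL and returns the Avro schema literal, fetching it from either an HDFS path or an HTTP location. The Java sketch below only illustrates that idea under those assumptions; the class and method names are hypothetical and do not reproduce the repository's actual SchemaUrlHelper.

```java
import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Hypothetical sketch (not the repository's SchemaUrlHelper): given a schema URL
 * pointing at an Avro schema file on HDFS or an HTTP(S) endpoint, return the
 * schema literal as a string.
 */
public class SchemaUrlFetchSketch {

  // Hadoop configuration; on a kerberized cluster this would also carry settings
  // such as dfs.namenode.kerberos.principal.pattern, mentioned in commit 7b36d09b58.
  private final Configuration conf;

  public SchemaUrlFetchSketch(Configuration conf) {
    this.conf = conf;
  }

  /** Fetch the Avro schema literal behind the given hdfs:// or http(s):// URL. */
  public String getSchemaLiteralFromUrl(String schemaUrl) throws Exception {
    InputStream in;
    if (schemaUrl.startsWith("hdfs://")) {
      // Read the schema file through the Hadoop FileSystem API.
      FileSystem fs = FileSystem.get(URI.create(schemaUrl), conf);
      in = fs.open(new Path(schemaUrl));
    } else if (schemaUrl.startsWith("http://") || schemaUrl.startsWith("https://")) {
      // Plain HTTP(S) fetch of the schema document.
      in = new URL(schemaUrl).openStream();
    } else {
      throw new IllegalArgumentException("Unsupported schema url: " + schemaUrl);
    }
    try (BufferedReader reader =
        new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
      // Return the content trimmed, in line with the "trim space" note in the same commit.
      return reader.lines().collect(Collectors.joining("\n")).trim();
    }
  }
}
```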