198 Commits

Author SHA1 Message Date
Harshal Sheth
0063c04460 gometa-ingest -> datahub ingest 2021-02-15 18:29:27 -08:00
Harshal Sheth
b91d0cf63b Add bigquery and refactor others 2021-02-15 18:29:27 -08:00
Harshal Sheth
cbbdf0930a Add snowflake 2021-02-15 18:29:27 -08:00
Harshal Sheth
d12497a3ff Add postgres source 2021-02-15 18:29:27 -08:00
Harshal Sheth
5666a50a7b Add hive 2021-02-15 18:29:27 -08:00
Harshal Sheth
68da294514 Update README.md 2021-02-15 18:29:27 -08:00
Shirshanka Das
1bbaecfae1 Modifying README to bring in old content 2021-02-15 18:29:27 -08:00
Harshal Sheth
d0bc3c55db Setup CI 2021-02-15 18:29:27 -08:00
Harshal Sheth
08ed46eb69 Add docs for each source/sink 2021-02-15 18:29:27 -08:00
Harshal Sheth
e0560e27ba Start updating readme 2021-02-15 18:29:27 -08:00
Harshal Sheth
df3e3da45b More autofixes 2021-02-15 18:29:27 -08:00
Harshal Sheth
761b27893b Update readme with python 3.6 info 2021-02-15 18:29:27 -08:00
Harshal Sheth
0660991fb8 More python 3.6 compat 2021-02-15 18:29:27 -08:00
Harshal Sheth
29c1cfac4d Rename yaml -> yml 2021-02-15 18:29:27 -08:00
Harshal Sheth
2ef62149ea Create examples directory 2021-02-15 18:29:27 -08:00
Shirshanka Das
063f513997 Update README.md 2021-02-15 18:29:27 -08:00
Shirshanka Das
e03a9e25f8 Update README.md 2021-02-15 18:29:27 -08:00
Harshal Sheth
4b83fc6591 adding allow deny patterns to sql config 2021-02-15 18:29:27 -08:00
Harshal Sheth
62bb7f012f Quick readme updates 2021-02-15 18:29:27 -08:00
Shirshanka Das
5a8bb3cfac adding docker commands 2021-02-15 18:29:27 -08:00
Harshal Sheth
4fb673925c Start using avro producer 2021-02-15 18:29:27 -08:00
Shirshanka Das
faf472aa64 adding some TODOs 2021-02-15 18:29:27 -08:00
Shirshanka Das
128781942d Firstdrop of ingest (#1) 2021-02-15 18:29:27 -08:00
Harshal Sheth
082c86463e Move old metadata ingestion scripts out of the way 2021-02-15 18:29:27 -08:00
Mars Lan
7a786c185b Drop obsolete info on mysql-etl (#2072) 2021-01-29 09:03:53 -08:00
John Plaisted
6ece2d6469 Start adding java ETL examples, starting with kafka etl. (#1805)
We've had a few requests to start providing Java examples rather than Python due to type safety.

I've also started to add these to metadata-ingestion-examples to make it clearer these are *examples*. They can be used directly or as a basis for other things.

As we port to Java we'll move examples to contrib.
2020-09-11 13:04:21 -07:00
Chris Lee
381c3e7fcd Update README.md 2020-07-31 12:29:39 -07:00
Mars Lan
682bb87a7e feat(ingest): replace custom hive-etl with sql-based ETL (#1713)
This offloads most of the heavy lifting to SQLAlchemy.
Also add a docker file for testing
2020-06-25 19:04:56 -07:00
Mars Lan
fa9fe5e110 refactor(py3): Refactor all ETL scripts to using Python 3 exclusively (#1710)
* refactor(py3): Refactor all ETL scripts to using Python 3 exclusively

Fix https://github.com/linkedin/datahub/issues/1688

* Update requirements.txt
2020-06-25 15:16:04 -07:00
Mars Lan
8e6665fc94 Update README.md 2020-06-22 21:26:38 -07:00
Mars Lan
4fea6083f8 feature(etl): add SQLAlchemy-based ingestion script (#1708)
This replaces the old incomplete rdbms ETL script.
2020-06-22 21:25:55 -07:00
Mars Lan
b6589ab1d1 Update README.md 2020-06-03 13:52:56 -07:00
Mars Lan
aa81e774fd doc: fix example MCEs 2020-04-02 19:39:12 -07:00
Kerem Sahin
9b536ecf80 Small doc fix 2020-02-06 18:28:29 -08:00
Kerem Sahin
165d4aef95 Documentation update part-1 2019-12-18 18:57:18 -08:00
Chris Lee
f344b31f49 Built docker-compose in mce ingestion. 2019-12-16 17:44:12 -08:00
Chris Lee
250ced5eeb Introduced RDBMS ETL as a metadata ingestion example. 2019-12-04 16:47:31 -08:00
Chris Lee
89c6e35502 Performed --user flag to avoid the permission issue on requirement installation. 2019-11-13 17:01:32 -08:00
Chris Lee
1d6484da60 Released mySQL ETL in metadata-ingestion. 2019-10-24 05:17:28 -07:00
Chris Lee
646630b599 Enabled the kafka ingestion pipeline. 2019-09-24 16:41:06 -07:00
Kerem Sahin
162d52a421 Updated DataHub wiki link conf 2019-09-17 16:30:37 -07:00
Chris Lee
72b18aa890 Introduced the hive ETL pipeline. 2019-09-17 16:14:09 -07:00
Chris Lee
493cbe16d6 Added the runbook for the requirements.txt. 2019-09-16 13:27:16 -07:00
Chris Lee
81a38f87f7 Classified the ETL jobs in metadata-ingestion. 2019-09-16 12:52:37 -07:00
Chris Lee
430f13108d Enabled the configuration lists in the LDAP ETL. 2019-09-12 10:16:41 -07:00
Chris Lee
9df9d336c9 Ingested metadata from LDAP server to Data Hub. 2019-09-10 18:58:22 -07:00
Kerem Sahin
0dc5bd9fe0 Add documentation 2019-09-08 20:25:58 -07:00
Kerem Sahin
693ee97c6c Fix ingestion script by pointing to correct MCE schema
Refactor for metadata-ingestion module
Adding readme for metadata-ingestion
2019-09-08 05:27:19 -07:00