fabiofilz
340c54317c
1849 support ssl to mce cli.py (#1857)
* Adding SSL support to mce_cli.py
* Kafka Config option
* Adding space and removing the commented line
Co-authored-by: Fabio de Simoni <fabio.desimoni@kindredgroup.com>
2020-09-04 12:17:27 -07:00
Mars Lan
7d6fde4f37
feat: add MCE ingestion support for CorpGroup (#1837)
* feat: add MCE ingestion support for CorpGroup
Also use consistent camel case for corp user URNs in bootstrap MCE data
Fixes https://github.com/linkedin/datahub/issues/1822
2020-08-31 10:08:58 -07:00
Mars Lan
03e3d49445
feat(ingest): add example crawler for MS SQL (#1803)
Also fix the incorrect assumption on column comments & add sample docker-compose file
2020-08-12 08:51:39 -07:00
Chris Lee
381c3e7fcd
Update README.md
2020-07-31 12:29:39 -07:00
Chris Lee
4143fb901e
<refactor>[ingestions]: align the default kafka topics with PR #1756 ( #1758 )
2020-07-29 20:26:01 -07:00
cobolbaby
5dc61658f8
fix: correct the way to catch the exception (#1727)
* fix: modify the etl script dependency
* fix: Correct the way to catch the exception
* fix: Compatible with the following kafka cluster when the Kafka Topic message Key cannot be empty
* fix: Adjust the kafka message key; Improve the comment of field
* fix: Avro schema required for key
Co-authored-by: Cobolbaby <Zhang.Xing-Long@inventec.com>
2020-07-10 07:56:19 -07:00
cobolbaby
ed128080e2
fix: modify the etl script dependency (#1726)
Co-authored-by: Cobolbaby <Zhang.Xing-Long@inventec.com>
2020-07-08 21:51:42 -07:00
Kerem Sahin
2dc11a51f4
fix(py3): Bump ingestion Docker py dependency to 3.6 (#1716)
2020-06-29 08:22:50 -07:00
Mars Lan
65bf623b8b
feat(ingest): add snowflake ETL script (#1714)
2020-06-25 19:05:38 -07:00
Mars Lan
682bb87a7e
feat(ingest): replace custom hive-etl with sql-based ETL (#1713)
This offloads most of the heavy lifting to SQLAlchemy.
Also add a docker file for testing
2020-06-25 19:04:56 -07:00
Mars Lan
5da55fe8d3
Update README.md
2020-06-25 16:32:22 -07:00
Mars Lan
52a54b9fda
feat(ingest): add PostgreSQL ETL script (#1712)
Also add a simple docker file for testing
2020-06-25 15:28:42 -07:00
Mars Lan
221c9af220
feature(ingest): add bigquery ETL script (#1711)
Also fix minor issues in the common script
2020-06-25 15:28:13 -07:00
Mars Lan
fa9fe5e110
refactor(py3): Refactor all ETL scripts to using Python 3 exclusively (#1710)
* refactor(py3): Refactor all ETL scripts to using Python 3 exclusively
Fix https://github.com/linkedin/datahub/issues/1688
* Update requirements.txt
2020-06-25 15:16:04 -07:00
Mars Lan
8e6665fc94
Update README.md
2020-06-22 21:26:38 -07:00
Mars Lan
4fea6083f8
feature(etl): add SQLAlchemy-based ingestion script (#1708)
This replaces the old incomplete rdbms ETL script.
2020-06-22 21:25:55 -07:00
Kerem Sahin
f79b2c958a
fix(ingestion): Fix sample MCE for data process
2020-06-11 01:04:52 -07:00
Liangjun Jiang
92c4a3689e
Data process entity (#1680)
* add job info as aspect of a dataset
* add job urn def., aspect and entity
* job entity with upstream and downstream lineage
* use job urn in upstream & downstream
* add Job entity rest APIs
* rest.li api, impl and factory for job entity
* code cleanup
* use pdl; onboard data process entity
* add es index json
* fix gradlew build ignored tasks
* add a comment about data process info field
* fix style warning issues
* update content based on PR
* checked in generated snapshot json
* updated based on PR feedback
* update data process data format
* updated based on code review feedback
* revert back gms & mce-job docker image
* delete temp files
* update based pr feedback
* file name and a typo
* format with linkedin style
Co-authored-by: Liangjun <liajiang@expediagroup.com>
2020-06-09 15:42:08 -07:00
Mars Lan
867dbd0d36
fix: use tuple notations for union types
2020-06-03 15:36:07 -07:00
Mars Lan
b6589ab1d1
Update README.md
2020-06-03 13:52:56 -07:00
Chris Lee
2a59070d54
fix(metadata-ingestion): pass schema_record to mce-cli consumer (#1646)
2020-04-24 14:34:16 -07:00
Mars Lan
aa81e774fd
doc: fix example MCEs
2020-04-02 19:39:12 -07:00
Chris Lee
ba33c7a5cd
Specify python version in mce-cli requirement.txt
2020-03-27 13:33:22 -07:00
Chris Lee
d1cf62854d
Fix: Docker Quickstart - Sample Data Loading Error
Specify the python version for the required confluent-Kafka library.
2020-03-27 13:14:23 -07:00
Jay Sen
1579a209b3
specify explicit avro lib for compatibility issue (#1605)
2020-03-23 09:50:46 -07:00
Kerem Sahin
a745c4035f
Update metadata for bootstrap datasets
2020-02-11 00:23:25 -08:00
Kerem Sahin
8704e3dd62
Update bootstrap data
2020-02-07 18:11:10 -08:00
Kerem Sahin
9b536ecf80
Small doc fix
2020-02-06 18:28:29 -08:00
Kerem Sahin
165d4aef95
Documentation update part-1
2019-12-18 18:57:18 -08:00
Chris Lee
f344b31f49
Built docker-compose in mce ingestion.
2019-12-16 17:44:12 -08:00
Chris Lee
e7ec56096a
Provided primitive fieldPaths against the platform.
2019-12-09 12:07:03 -08:00
Chris Lee
250ced5eeb
Introduced RDBMS ETL as a metadata ingestion example.
2019-12-04 16:47:31 -08:00
Chris Lee
89c6e35502
Performed --user flag to avoid the permission issue on requirement installation.
2019-11-13 17:01:32 -08:00
Chris Lee
1d6484da60
Released mySQL ETL in metadata-ingestion.
2019-10-24 05:17:28 -07:00
Kerem Sahin
8928e7a147
Add more samples to bootstrap data to be able to show upstreams/downstreams support
2019-09-27 11:47:59 -07:00
Chris Lee
646630b599
Enabled the kafka ingestion pipeline.
2019-09-24 16:41:06 -07:00
Chris Lee
49534ea590
Aligned the long type in time format.
2019-09-19 09:48:43 -07:00
Chris Lee
600c2f04ab
Organized the output layout.
2019-09-18 11:03:48 -07:00
Kerem Sahin
162d52a421
Updated DataHub wiki link conf
2019-09-17 16:30:37 -07:00
Chris Lee
72b18aa890
Introduced the hive ETL pipeline.
2019-09-17 16:14:09 -07:00
Chris Lee
57976cc901
Introduced the hive ETL pipeline.
2019-09-17 16:13:35 -07:00
Chris Lee
9cc7a1e2f8
Imported the confluent-kafka[avro] as a requirement.
2019-09-17 15:06:56 -07:00
Chris Lee
5c2e03b162
Imported the confluent-kafka[avro] as a requirement.
2019-09-17 15:03:25 -07:00
Chris Lee
493cbe16d6
Added the runbook for the requirements.txt.
2019-09-16 13:27:16 -07:00
Chris Lee
bd32ec20a1
Supplied the requirements.txt for metadata-ingestion.
2019-09-16 13:13:42 -07:00
Chris Lee
81a38f87f7
Classified the ETL jobs in metadata-ingestion.
2019-09-16 12:52:37 -07:00
Chris Lee
430f13108d
Enabled the configuration lists in the LDAP ETL.
2019-09-12 10:16:41 -07:00
Chris Lee
9df9d336c9
Ingested metadata from LDAP server to Data Hub.
2019-09-10 18:58:22 -07:00
Kerem Sahin
8bfb086e09
Set displayName fields for corp user MCEs
2019-09-09 00:31:09 -07:00
Kerem Sahin
0dc5bd9fe0
Add documentation
2019-09-08 20:25:58 -07:00