3089 Commits

Author SHA1 Message Date
Harshal Sheth
ebe7409897
fix(cli): prevent click from suppressing errors (#2560) 2021-05-17 11:50:38 -07:00
Harshal Sheth
3dfe3d375b
feat(ingest): add options for Airflow lineage backend (#2557) 2021-05-13 20:02:47 -07:00
Kevin Hu
5ab1cbbbb2
feat(ingest): MongoDB schema inference (#2546) 2021-05-13 19:44:33 -07:00
Fredrik Sannholm
133577557c
feat(ingest): Looker view and dashboard ingestion (#2493) 2021-05-13 11:42:53 -07:00
Harshal Sheth
e3c190e772
fix(ingest): register custom Hive types (#2543) 2021-05-12 17:54:59 -07:00
Harshal Sheth
a671001824
refactor(ingest): move Airflow into datahub_provider module (#2521) 2021-05-12 15:01:11 -07:00
Albert Franzi
7fce505ffb
feat(ingest): define Redshift as a Postgres Source (#2540) 2021-05-12 10:00:34 -07:00
Harshal Sheth
2811d23e45
feat(ingest): add a transformer for adding ownership (#2532) 2021-05-11 17:46:39 -07:00
Harshal Sheth
35841d2d85
feat(ingest): check in generated schema files (#2503) 2021-05-10 19:36:23 -07:00
Harshal Sheth
4b2a6bd46a
fix(ingest): generate Airflow tags correctly (#2522) 2021-05-10 19:26:55 -07:00
Harshal Sheth
0ddf0ab262
fix(ingest): add support for custom postgres types (#2524) 2021-05-10 18:06:56 -07:00
Harshal Sheth
50ea58c32d
docs: improve airflow explanations and examples (#2509) 2021-05-06 19:12:19 -07:00
Harshal Sheth
0e8b3d97ab
fix(ingest): remove double edges from Airflow lineage backend (#2508) 2021-05-06 19:08:02 -07:00
Harshal Sheth
1facfbd5a3
feat(ingest): capture table properties if available (#2497) 2021-05-05 14:07:08 -07:00
Harshal Sheth
7f0656fd5e
fix(ingest): replace ImportError with ModuleNotFoundError (#2498)
Using the more specific exception will prevent us from accidentally
ignoring errors that should be handled.
2021-05-05 14:05:16 -07:00
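Note on #2498: a minimal sketch of why the narrower exception matters, not the code from that commit. The helper names and the plugin module name are hypothetical. `ModuleNotFoundError` is a subclass of `ImportError`, so catching `ImportError` also hides imports that fail for reasons other than a missing package.

```python
import importlib


def broad_probe(importer):
    """Treat any ImportError as 'optional dependency not installed'."""
    try:
        importer()
        return True
    except ImportError:
        return False


def narrow_probe(importer):
    """Treat only a missing module as 'not installed'; other errors surface."""
    try:
        importer()
        return True
    except ModuleNotFoundError:
        return False


def missing_module():
    # Hypothetical module name, assumed not to be installed anywhere.
    importlib.import_module("hypothetical_plugin_that_is_not_installed")


def broken_module():
    # The module exists but the requested name does not, so this raises a
    # plain ImportError rather than ModuleNotFoundError.
    from os import name_that_does_not_exist  # noqa: F401
```

With these helpers, `broad_probe(broken_module)` quietly returns `False`, while `narrow_probe(broken_module)` lets the `ImportError` propagate so the real problem is visible.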
Harshal Sheth
9f4de4b20a
fix(ingest): remove datahub.metadata import shortcut (#2449) 2021-04-30 21:10:12 -07:00
Harshal Sheth
201ffd4979
test: add smoke test (#2464) 2021-04-29 23:27:03 -07:00
Harshal Sheth
df9e7c594f
fix(ingest): guess hook type from name (#2475) 2021-04-29 23:23:19 -07:00
Harshal Sheth
e48a74b80a
test(ingest): add test names and IDs using pytest (#2476) 2021-04-29 23:18:55 -07:00
Harshal Sheth
5553dc820c
fix(ingest): use postgres data platform urn (#2472) 2021-04-28 11:34:45 -07:00
Harshal Sheth
50aee5c05a
fix(ingest): support Airflow 1.10.x style lineage in Airflow 2 (#2455) 2021-04-26 23:08:43 -07:00
Harshal Sheth
83fdc6417f
feat(ingest): capture default values in Avro schemas (#2463) 2021-04-26 17:07:29 -07:00
Harshal Sheth
d415234a8c
fix(ingest): fields with defaults should be optional (#2461) 2021-04-26 16:45:48 -07:00
Dexter Lee
6554f15e67
fix(docker): Nuke ingestion containers when calling docker/nuke.sh (#2459) 2021-04-26 16:29:25 -07:00
Harshal Sheth
f6c7195ac5
fix(cli): check docker setup containers (#2457) 2021-04-26 16:28:49 -07:00
Harshal Sheth
a857d3b9d8
fix(ingest): various updates to datahub rest sink (#2445) 2021-04-23 23:48:44 -07:00
Harshal Sheth
663dfe9a7c
fix(ingest): add snowflake warehouse and role to config (#2444) 2021-04-23 23:46:31 -07:00
Gabe Lyons
851e00ba9f
feat(lineage): implement support for datasets, charts and dashboards downstream lineage fetching in a generic way (#2397)
Co-authored-by: Dexter Lee <dexter@acryl.io>
Co-authored-by: Brian <brianwebtek@gmail.com>
Co-authored-by: John Joyce <john@acryl.io>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2021-04-23 00:18:39 -07:00
adriaanslechten
1295c44615
feat(ingest) LDAP groups ingestion (#2434) 2021-04-22 13:56:30 -07:00
Harshal Sheth
034c33a050
fix(ingest): use entrypoints lib instead of pkg_resources (#2438) 2021-04-22 00:13:47 -07:00
Gabe Lyons
c7b49de67b
feat(ingest): adding superset ingestion source (#2425) 2021-04-22 00:11:54 -07:00
Harshal Sheth
56b66ed328
fix(ingest): support custom snowflake types (#2436) 2021-04-21 15:14:59 -07:00
Harshal Sheth
79daec29b7
fix(ingest): ensure upstreams in airflow lineage emission are entities (#2427) 2021-04-20 20:44:38 -07:00
Harshal Sheth
7d1ec520e5
fix(ingest): include database info for snowflake (#2426) 2021-04-20 20:40:30 -07:00
Thomas Larsson
7869a8f142
feature(ingestion): Adding the concept of transformers (#2411)
Fixes: #2410

Co-authored-by: thomas.larsson <thomas.larsson@klarna.com>
2021-04-18 11:15:05 -07:00
Harshal Sheth
91a2f69310
fix(ingest): properly handle fieldDiscriminator with restli (#2408) 2021-04-16 09:42:52 -07:00
Thomas Larsson
89fb538fa5
feature(ingestion): Make origin/fabric_type configurable (#2405)
Fixes: #2394

Co-authored-by: thomas.larsson <thomas.larsson@klarna.com>
2021-04-15 10:47:28 -07:00
Harshal Sheth
ffe03e6758
fix(ingest): streamline codegen init methods (#2400) 2021-04-14 19:25:57 -07:00
Harshal Sheth
9c2a30c3a1
fix(ingest): add db name to postgres URNs (#2401) 2021-04-14 15:04:13 -07:00
Harshal Sheth
2af4603e49
fix(ingest): enable mypy disallow_incomplete_defs and disallow_untyped_decorators (#2393) 2021-04-14 13:40:24 -07:00
Harshal Sheth
a11329d5b8
refactor(ingest): update test harness to use a compose file per test (#2392) 2021-04-13 17:30:24 -07:00
Harshal Sheth
fb6f74b1da
feat(ingest): add generic sqlalchemy source (#2389) 2021-04-13 08:01:38 -07:00
Harshal Sheth
eeee8aa34e
fix(ingest): report correct version status in dev mode (#2388) 2021-04-12 19:35:40 -07:00
Harshal Sheth
41cd52f9e2
feat(ingest): add Airflow lineage backend (#2368) 2021-04-12 17:40:15 -07:00
Thomas Larsson
6610666496
fix(ingestion): dont crash on non-RecordSchema topics (#2372)
Fixes: #2371

Co-authored-by: thomas.larsson <thomas.larsson@klarna.com>
2021-04-09 17:36:01 -07:00
Thomas Larsson
e02a17aecf
fix(ingestion): Support mapping from avro "boolean" and "map" types t… (#2364)
Fixes: #2363

Co-authored-by: thomas.larsson <thomas.larsson@klarna.com>
2021-04-08 14:23:12 -07:00
Thomas Larsson
4215dcd53c
fix(ingestion): properly detect optional fields in avro schemas (#2343)
Co-authored-by: thomas.larsson <thomas.larsson@klarna.com>
2021-04-08 14:00:01 -07:00
Harshal Sheth
bfe345da42
fix(ingest): add test for avro serialization and deserialization (#2351) 2021-04-07 21:30:21 -07:00
Harshal Sheth
e29082bf55
feat(cli): Add support for checking docker memory usage (#2361) 2021-04-07 16:26:21 -07:00
Harshal Sheth
518de354d9
fix(ingest): support python3 -m datahub (#2359) 2021-04-07 14:58:58 -07:00