3904 Commits

Author SHA1 Message Date
cburroughs
cc0772f8d8
feat(ingest): unbundle airflow plugin emitter dependencies (#7493)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-03-07 09:07:42 -08:00
mohdsiddique
de719663ff
feat(ingestion): powerbi # support Google BigQuery table lineage (#7502)
Co-authored-by: MohdSiddiqueBagwan <mohdsiddique.bagwan@gslab.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-03-07 09:07:00 -08:00
Peter Szalai
1d33392761
feat(cli): introduce remote config for quickstart (#7424)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-03-07 13:14:24 +01:00
Mayuri Nehate
406b11a9ed
feat(ingest/GX): add urn lowercasing option for GX assertions (#7472) 2023-03-06 20:42:23 -08:00
Harshal Sheth
01ee351c4c
fix(ingest): prevent logging from blowing up on TypeErrors (#7497) 2023-03-03 14:36:55 -08:00
Harshal Sheth
195196ddf1
fix(ingest): redact auth info in curl commands (#7496) 2023-03-03 14:35:29 -08:00
Harshal Sheth
795d76b56c
fix(ingest/tableau): load project workbook hierarchy correctly (#7483) 2023-03-03 11:54:42 -08:00
Harshal Sheth
b4dd1d7d82
chore(ingest): pin acryl-datahub-classify (#7485) 2023-03-03 11:36:20 -05:00
Andrew Sikowitz
7a71b84296
refactor(ingest): Convert FileBackedDict to dataclass for cleaner init (#7469)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-03-02 19:53:05 -08:00
mohdsiddique
29d171106b
feat(ingest/tableau): project path and container support (#7426)
Co-authored-by: mayurinehate <mayuri.nehate@gslab.com>
Co-authored-by: MohdSiddiqueBagwan <mohdsiddique.bagwan@gslab.com>
Co-authored-by: John Joyce <john@acryl.io>
2023-03-02 16:53:19 -08:00
Kevin G
622688916c
fix(ingest/dbt): check for nodes key before accessing (#7462)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-03-02 11:40:13 -08:00
Aseem Bansal
1adbc2cab0
chore(ci): upgrade GE version (#7290) 2023-03-02 10:47:38 -08:00
Tamas Nemeth
3f88fb7d16
feat(ingest/bigquery) - Capture dataset labels in bigquery (#7460) 2023-03-02 11:41:19 +01:00
Harshal Sheth
49029943f9
fix(ingest): remove extraneous platform configs (#7454) 2023-03-02 01:10:35 -08:00
Thomas Memenga
18dd7298ad
fix(ingest/s3): propagate s3 endpoint to profiling (#7431) 2023-03-02 01:05:17 -08:00
Tony Ouyang
4f651b0d3d
fix(ingest/bigquery): update bigquery platform_instance capability (#7467) 2023-03-02 00:52:40 -08:00
Harshal Sheth
c648f7376a
refactor(ingest): use auto_stale_entity_removal in json schema source (#7465) 2023-03-02 08:25:41 +01:00
Harshal Sheth
619fad0ae1
fix(ingest/dbt): remove deprecated backcompat_skip_source_on_lineage_edge option (#7466) 2023-03-02 08:24:50 +01:00
Andrew Sikowitz
8101f0d47a
feat(ingest): Introduce FileBackedDict for offloading data to disk (#7461)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
Also includes minor refactoring to the bigquery connector
2023-03-01 19:09:51 -05:00
Harshal Sheth
45feb01e3b
fix(ingest/bigquery): simplify type annotations for bigquery usage (#7457)
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-03-01 08:38:36 -08:00
Harshal Sheth
2c3e3c203f
docs(ingest): add details about backwards compatibility guarantees (#7439) 2023-02-28 13:33:58 -08:00
Shirshanka Das
17e85979dd
refactor(ingest): subtypes - standardize (#7437) 2023-02-28 13:11:07 -08:00
Harshal Sheth
376fffeebc
docs(ingest): add more guidelines for writing sources (#7451) 2023-02-28 11:53:43 -08:00
Harshal Sheth
73493c577b
refactor(ingest): avoid allowing extras for all DataHubGraphConfig (#7448) 2023-02-28 10:42:31 -08:00
Harshal Sheth
639bbcfa86
chore(ingest/glue): cleanup deprecated underlying_platform config (#7449) 2023-02-28 10:41:54 -08:00
nachiket-juneja
e07cd2090b
Feat/s3 ingestion enhancement to update schema from latest partition (#7410)
Co-authored-by: Prashant Singh Thakur <prashant.thakur@nucleusteq.com>
2023-02-28 08:58:28 +01:00
Harshal Sheth
3b8b5e8aa4
chore(ingest): cleanup unused files/vars in tests (#7450) 2023-02-28 08:07:34 +01:00
Tamas Nemeth
62e33e03a3
fix(ingest/unity): Use assigned metastore if not metastore listed in unity catalog (#7446) 2023-02-28 08:06:28 +01:00
Tamas Nemeth
77d072b522
fix(ingest/athena): Fix athena source if dbname is not specified in the connection string (#7417) 2023-02-27 22:15:29 +01:00
Tamas Nemeth
1b53c03794
fix(ingest/snowflake): fixing Snowflake state issue (#7443) 2023-02-27 13:59:30 +01:00
Tamas Nemeth
14a660428e
fix(ingest/bigquery): Querying table metadata details in batch properly (#7429) 2023-02-27 11:10:24 +01:00
Harshal Sheth
d02701d91c
docs(ingest): add ingestion configs guide (#7438) 2023-02-26 16:04:23 -08:00
Shirshanka Das
221b1ae801
fix(ingest): lookml - add support for includes, extends, view_name i… (#7428) 2023-02-24 12:05:21 -08:00
Tamas Nemeth
3a4c9a69f6
fix(ingest/bigquery): Fixing double quoting in profiling approx count query (#7416) 2023-02-24 09:39:52 +01:00
Shirshanka Das
95750317e1
refactor(ingest): lookml - fix up golden files in normalized form (#7423) 2023-02-24 00:10:18 -08:00
Andrew Sikowitz
0532cc9056
fix(ingest/bigquery) Filter upstream lineage by list of existing tables (#7415)
Co-authored-by: mayurinehate <mayuri.nehate@gslab.com>
- Creates global stores table_refs and view_upstream_tables when extracting lineage
- Moves lineage processing to the end, after schema processing
- Adds `project_ids` config option to specify multiple projects to ingest; adds corresponding tests
- Changes `created` timestamps to `auditStamp` on `UpstreamClass`; uses VIEW type for lineage identified through view ddl parsing
2023-02-23 19:40:00 -05:00
Tamas Nemeth
4c1bf18f9a
feat(ingest/bigquery) - Emit cross-project usage from gcp logs (#7364) 2023-02-22 18:53:35 -05:00
Andrew Sikowitz
e82e284982
fix(ingest/kafka): Remove topic from kafka browse path (#7398) 2023-02-22 18:38:08 -05:00
Mayuri Nehate
d436ab9f9b
feat(ingest/kafka-connect): add config to lowercase urns, do not emit… (#7393)
Co-authored-by: John Joyce <john@acryl.io>
2023-02-22 11:42:44 -08:00
Mayuri Nehate
5db133619f
fix(ingest/bigquery): Prefer parsed lineage for view over lineage from audit logs (#7408) 2023-02-22 11:51:04 -05:00
Andrew Sikowitz
c5c2bdb983
fix(ingest/bigquery): Correctly upsert lineage_map when parsing view ddl (#7403) 2023-02-22 11:57:01 +01:00
Andrew Sikowitz
2764c44977
fix(ingest): Do not require platform_instance for stateful ingestion (#7397) 2023-02-21 21:27:44 -05:00
Chris Collins
2de779adbf
fix(docs): Update transformers docs to note not minting urns (#7399) 2023-02-21 13:29:36 -08:00
서재권(Data Platform)
3068e7f0b1
fix(ingest/oracle) add database name to oracle urn name (#7016) 2023-02-21 13:50:24 -05:00
Andrew Sikowitz
1402e88e3a
build(idea): mark metadata-ingestion sources and tests (#7394) 2023-02-21 09:50:03 -05:00
Aseem Bansal
f8a73005d4
chore(ci): relax bigquery dependency (#7309) 2023-02-21 08:33:00 +01:00
Tamas Nemeth
097d4e6bbd
fix(dep/json-schema): Fixing json-schema dependencies (#7383) 2023-02-20 14:02:08 -08:00
John Joyce
08a215951c
feat(queries): Overhaul Queries Tab (#7366) 2023-02-20 11:10:18 -08:00
Andrew Sikowitz
8fd2cc5f20
fix(ingest/snowflake): Improve memory usage of metadata extraction (#7349) 2023-02-20 14:46:10 +01:00
Aseem Bansal
986086ae00
test(cli): add check for missing init files (#7378) 2023-02-20 18:41:12 +05:30