1604 Commits

Author SHA1 Message Date
Harshal Sheth
af566e1184
feat(model): fully populate the entity registry (#7818) 2023-04-15 13:33:05 -07:00
Andrew Sikowitz
1ac1ccf26e
perf(ingest/bigquery): Improve bigquery usage disk usage and speed (#7825) 2023-04-14 18:09:43 -07:00
Andrew Sikowitz
e839ac4c40
fix(ingest/bigquery): Handle null values from usage aggregation (#7827) 2023-04-14 16:54:22 -07:00
Harshal Sheth
204727a6ee
feat(ingest/unity): support extracting ownership (#7801) 2023-04-12 19:45:41 -07:00
Harshal Sheth
3079f0a7e1
feat(sdk): support executing graphql via DataHubGraph (#7753)
Co-authored-by: Hyejin Yoon <0327jane@gmail.com>
2023-04-12 11:30:05 -07:00
Andrew Sikowitz
73016ebff9
test(ingest/bigquery): Add sql parser xfail test to fix later (#7792) 2023-04-12 10:51:29 -07:00
Tamas Nemeth
0cc12bcce7
feat(ingest): redshift - Redshift rework (#6906) 2023-04-12 19:15:43 +02:00
Andrew Sikowitz
54f047e1a8
test(ingest/snowflake): fix tests around host_port (#7791)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-04-11 16:06:35 -07:00
David Sanchez
a50c71264d
feat(ingest/tableau): extract lineage from csql queries (#7561)
Co-authored-by: Mayuri Nehate <33225191+mayurinehate@users.noreply.github.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-04-11 11:12:15 -07:00
Harshal Sheth
e99875cac6
chore(ingest): enable flake8 bugbear linting (#7763) 2023-04-10 14:14:42 -07:00
mohdsiddique
5e145cbb2d
feat(ingestion/okta): okta stateful ingestion (#7736)
Co-authored-by: MohdSiddiqueBagwan <mohdsiddique.bagwan@gslab.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
Co-authored-by: John Joyce <john@acryl.io>
2023-04-07 13:44:32 -07:00
Mayuri Nehate
5fd7981532
fix(ingest/snowflake): fix incorrect tag urn case, improve tag display name (#7758) 2023-04-07 13:07:08 -07:00
Andrew Sikowitz
087855f374
fix(ingest/bigquery): Support cross project usage using FileBackedDict (#7663)
Includes major refactor of bigquery usage ingestion, minor refactor of the source as a whole, and reporting cleanup.
Includes bigquery performance testing changes.
2023-04-07 12:18:26 -07:00
Mayuri Nehate
1fda92441f
feat(snowflake): improve snowflake lineage perf and memory, push down to snowflake (#7710) 2023-04-07 11:06:06 -07:00
Andrew Sikowitz
44663fa035
fix(ingest/bigquery): Raise report_failure threshold; add robustness around table parsing (#7772)
- Converted getting views and tables to iterators
- Catches exception around table expiration time being impossible to represent in python because it's too far in the future
2023-04-06 13:24:22 -07:00
Tamas Nemeth
96bacfc5d7
fix(ingest/redshift): Fixing adding back db name in redshift urn (#7765) 2023-04-06 11:45:10 +02:00
Tamas Nemeth
29d2492667
fix(ingest/bigquery): Lineage edges use datetime with timezone; correctly parse last_altered (#7762) 2023-04-06 02:46:50 +00:00
Aseem Bansal
a11a7fa9d0
feat(snowflake): better error message on key pair authentication (#7734)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-04-05 00:46:07 +00:00
Harshal Sheth
8d99babf75
feat(ingest/dbt): update subtypes for dbt (#7750) 2023-04-04 17:11:23 -07:00
Andrew Sikowitz
ce1ac7fa12
refactor(ingest): Use sqlite.Row row_factory for FileBackedCollections (#7739) 2023-04-04 11:53:56 -07:00
Harshal Sheth
f860ce95c0
feat(ingest): emit state payloads as soft-deleted (#7714) 2023-04-04 17:06:21 +00:00
Harshal Sheth
8394dcb538
chore(ingest): change kafka connect mapped ports (#7728) 2023-04-04 18:38:30 +05:30
Harshal Sheth
1634edaf25
feat(ingest/dbt): include dbt unique_id in properties (#7737) 2023-04-04 13:37:13 +05:30
Harshal Sheth
f780da4c0a
feat(ingest/lookml): support views with derived_table.explore_source (#7704) 2023-04-03 16:18:39 -07:00
Andrew Sikowitz
de587b2bfe
refactor(ingest): Minor cleanup of File, CsvEnricher, BusinessGlossary, and FileLineage sources (#7718)
- Adds auto_workunit_reporter to each source
- Standardizes comments around remote paths
- Adds back AuditStamp to FileLineage source
- Some generic refactoring
2023-03-31 15:49:24 -07:00
Harshal Sheth
f6d7e1a325
feat(ingest/snowflake): hide host_port from snowflake docs (#7717) 2023-03-31 15:58:52 +05:30
xiphl
7d240c600a
feat(ingestion) Allow for ingestion to read files remotely (#7552)
Co-authored-by: xiphl <xiphlerl9@gmail.com>
Allows the CsvEnricher, BusinessGlossary, File, and LineageFile sources to read from URLs.
2023-03-29 18:10:46 -07:00
Andrew Sikowitz
54a372795b
test(ingest/bigquery): Add performance testing framework for bigquery usage (#7690)
- Creates metadata-ingestion/tests/performance directory
- Excludes metadata-ingestion/tests from docs generation
- Updates bigquery reporting around project state
2023-03-29 14:13:43 -07:00
Harshal Sheth
94fa62d431
chore(ingest): formatting + cleanup MCPW usages (#7706) 2023-03-29 11:43:25 -07:00
mohdsiddique
c0f7ba2f85
feat(ingestion): azure-ad stateful ingestion (#7701) 2023-03-29 21:50:31 +05:30
Harshal Sheth
2eb9fe408a
docs(): generate docs for our Python SDK (#7612) 2023-03-28 20:23:20 -07:00
Mayuri Nehate
fc238c2513
feat(ingest/postgres): support extracting metadata from all databases in single recipe (#7581)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-03-28 14:16:12 -07:00
Harshal Sheth
b2689b7514
test(ingest/dbt): add test for column meta match (#7673) 2023-03-28 21:51:31 +05:30
Andrew Sikowitz
c7d35ffd66
perf(ingest): Improve FileBackedDict iteration performance; minor refactoring (#7689)
- Adds dirty bit to cache, only writes data if dirty
- Refactors __iter__
- Adds sql_query_iterator
- Adds items_snapshot, more performant `items()` that allows for filtering
- Renames connection -> shared_connection
- Removes unnecessary flush during close if connection is not shared
- Adds Closeable mixin
2023-03-27 17:20:34 -04:00
Harshal Sheth
d1bab5616c
feat(ingest/looker): enable looker usage ingestion by default (#7684) 2023-03-27 00:02:25 +00:00
Andrew Sikowitz
419bee8614
fix(ingest/bigquery): Fix BigQueryTableType enum accesses (#7685) 2023-03-25 00:08:11 +00:00
Harshal Sheth
c8abf9a1d4
fix(ingest/dbt): enable incremental lineage by default (#7674) 2023-03-24 18:14:19 -04:00
Mayuri Nehate
301c8616ed
refactor(ingest/bigquery): add inline comments + refactor in table name parsing (#7609) 2023-03-24 14:44:30 -04:00
Shirshanka Das
3d81539c7e
fix(ingest): json-schema - nullability handling (#7667) 2023-03-23 23:07:30 +00:00
Andrew Sikowitz
95f99198af
fix(ingest/bigquery): Pass whether view is materialized; pass last_altered correctly (#7660) 2023-03-22 13:40:57 -04:00
mohdsiddique
6d6d59141e
feat(ingestion): powerbi # uniquely identify the multiple instance of same platform (#7632)
Co-authored-by: MohdSiddiqueBagwan <mohdsiddique.bagwan@gslab.com>
Co-authored-by: John Joyce <john@acryl.io>
2023-03-21 09:27:29 -07:00
mohdsiddique
7efac2215d
feat(ingestion): powerbi # support platform instance (#7583)
Co-authored-by: MohdSiddiqueBagwan <mohdsiddique.bagwan@gslab.com>
Co-authored-by: John Joyce <john@acryl.io>
2023-03-21 09:07:31 -07:00
david-leifker
697e8e2647
fix(misc): misc fixes (#7633) 2023-03-21 19:42:50 +05:30
Harshal Sheth
482431bcf4
fix(ingest/superset): support superset v2 (#7588)
Co-authored-by: John Joyce <john@acryl.io>
2023-03-20 19:49:32 -07:00
Harshal Sheth
cbd8e14b78
feat(ingest): add auto_materialize_referenced_tags helper (#7626) 2023-03-20 16:34:22 -07:00
alex-magno
6ab606b748
fix(ingest/dbt): introduce lowercase column urn option (#7418)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-03-20 10:37:19 -07:00
mohdsiddique
fc8757d25e
feat(ingestion): powerbi # Amazon Redshift lineage support (#7562)
Co-authored-by: MohdSiddiqueBagwan <mohdsiddique.bagwan@gslab.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-03-20 10:24:34 -07:00
Shirshanka Das
104c9811f5
fix(ingest/docs): improve matcher to include types with spaces in them (#7631) 2023-03-18 12:59:43 -07:00
Shirshanka Das
41d4c0b074
feat(ingest/docs): json-schema fixes, improvements to ingestion doc generation (#7615) 2023-03-17 15:58:14 +01:00
Harshal Sheth
89734587f7
feat(ingest): add urn modification helper (#7440)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-03-16 13:27:08 -07:00