3904 Commits

Author SHA1 Message Date
Tamas Nemeth
37b350c8e3
fix(ingest/redshift): fixing sql which extracts lineage from insert queries (#7770) 2023-04-06 16:34:34 +02:00
Tamas Nemeth
96bacfc5d7
fix(ingest/redshift): Fixing adding back db name in redshift urn (#7765) 2023-04-06 11:45:10 +02:00
Tamas Nemeth
29d2492667
fix(ingest/bigquery): Lineage edges use datetime with timezone; correctly parse last_altered (#7762) 2023-04-06 02:46:50 +00:00
Harshal Sheth
2840cba68b
docs(ingest/lookml): update error message for Looker connection fetch (#7756) 2023-04-05 13:41:17 -07:00
Harshal Sheth
5bb0e60bd3
fix(ingest/dbt-cloud): use correct dbt cloud IDE urls (#7755) 2023-04-05 13:40:56 -07:00
Harshal Sheth
e06117af66
fix(ingest/demo-data): fix bug in path type (#7749) 2023-04-04 23:16:15 -07:00
Mayuri Nehate
20504aae70
fix(ingest/bigquery): fix and refactor exported audit logs query (#7699) 2023-04-05 11:17:25 +05:30
Harshal Sheth
e71c0d3490
feat(sdk): fix ownership emission for groups (#7751) 2023-04-05 11:15:06 +05:30
Aseem Bansal
a11a7fa9d0
feat(snowflake): better error message on key pair authentication (#7734)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-04-05 00:46:07 +00:00
Harshal Sheth
8d99babf75
feat(ingest/dbt): update subtypes for dbt (#7750) 2023-04-04 17:11:23 -07:00
Andrew Sikowitz
06bc1c32e0
refactor(ingest/bigquery): Standardize audit log parsing and make TopKDict a DefaultDict (#7738)
- Moves get_sanitized_table_ref calls to ReadEvent / QueryEvent creation
- Standardizes how the audit log is read and parsed, unifying code when reading from gcp logging vs audit metadata (exported logs)
- Adds error handling around the parsing of each event, to catch errors from the new get_sanitized_table_ref calls
- Makes TopKDict inherit from DefaultDict and cleans up calls around that.
2023-04-04 11:58:48 -07:00
Andrew Sikowitz
ce1ac7fa12
refactor(ingest): Use sqlite.Row row_factory for FileBackedCollections (#7739) 2023-04-04 11:53:56 -07:00
Tim
23e57fffa2
fix(sdk): remove rest emitter to graph cache in CorpGroup (#7743) 2023-04-04 10:32:15 -07:00
Harshal Sheth
f860ce95c0
feat(ingest): emit state payloads as soft-deleted (#7714) 2023-04-04 17:06:21 +00:00
Harshal Sheth
82dc2b6393
feat(docs): clear up source configs (#7720) 2023-04-04 18:40:19 +05:30
Harshal Sheth
8394dcb538
chore(ingest): change kafka connect mapped ports (#7728) 2023-04-04 18:38:30 +05:30
Harshal Sheth
1634edaf25
feat(ingest/dbt): include dbt unique_id in properties (#7737) 2023-04-04 13:37:13 +05:30
Harshal Sheth
f780da4c0a
feat(ingest/lookml): support views with derived_table.explore_source (#7704) 2023-04-03 16:18:39 -07:00
Andrew Sikowitz
de587b2bfe
refactor(ingest): Minor cleanup of File, CsvEnricher, BusinessGlossary, and FileLineage sources (#7718)
- Adds auto_workunit_reporter to each source
- Standardizes comments around remote paths
- Adds back AuditStamp to FileLineage source
- Some generic refactoring
2023-03-31 15:49:24 -07:00
Andrew Sikowitz
a2f8c76388
feat(ingest/bigquery): Capture all operation types when ingesting operational stats (#7723) 2023-03-31 16:01:28 +05:30
Harshal Sheth
f6d7e1a325
feat(ingest/snowflake): hide host_port from snowflake docs (#7717) 2023-03-31 15:58:52 +05:30
Aseem Bansal
f0a675f9aa
docs(okta): add how to use email in urns (#7708) 2023-03-31 15:55:22 +05:30
xiphl
7d240c600a
feat(ingestion) Allow for ingestion to read files remotely (#7552)
Co-authored-by: xiphl <xiphlerl9@gmail.com>
Allows the CsvEnricher, BusinessGlossary, File, and LineageFile sources to read from URLs.
2023-03-29 18:10:46 -07:00
Harshal Sheth
575909e41c
feat(docs): support inlining code snippets from files (#7712) 2023-03-30 00:02:21 +00:00
Sergio Gómez Villamor
25808478cb
fix(ingestion): fix AssertionError in base_transformer (#7702)
Co-authored-by: Sergio Gomez Villamor <sergio.gomez.villamor@adevinta.com>
2023-03-29 16:15:57 -07:00
Andrew Sikowitz
54a372795b
test(ingest/bigquery): Add performance testing framework for bigquery usage (#7690)
- Creates metadata-ingestion/tests/performance directory
- Excludes metadata-ingestion/tests from docs generation
- Updates bigquery reporting around project state
2023-03-29 14:13:43 -07:00
Harshal Sheth
94fa62d431
chore(ingest): formatting + cleanup MCPW usages (#7706) 2023-03-29 11:43:25 -07:00
mohdsiddique
c0f7ba2f85
feat(ingestion): azure-ad stateful ingestion (#7701) 2023-03-29 21:50:31 +05:30
Tamas Nemeth
f348113b38
fix(ingest/redshift): Lineage query fix to work with the latest redshift (#7698) 2023-03-29 09:32:09 +02:00
Harshal Sheth
2eb9fe408a
docs(): generate docs for our Python SDK (#7612) 2023-03-28 20:23:20 -07:00
Mayuri Nehate
fc238c2513
feat(ingest/postgres): support extracting metadata from all databases in single recipe (#7581)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-03-28 14:16:12 -07:00
Harshal Sheth
b2689b7514
test(ingest/dbt): add test for column meta match (#7673) 2023-03-28 21:51:31 +05:30
Andrew Sikowitz
c7d35ffd66
perf(ingest): Improve FileBackedDict iteration performance; minor refactoring (#7689)
- Adds dirty bit to cache, only writes data if dirty
- Refactors __iter__
- Adds sql_query_iterator
- Adds items_snapshot, more performant `items()` that allows for filtering
- Renames connection -> shared_connection
- Removes unnecessary flush during close if connection is not shared
- Adds Closeable mixin
2023-03-27 17:20:34 -04:00
Mayuri Nehate
279f38a2fd
fix(ingest/bigquery): quote string constants in query (#7694) 2023-03-27 09:32:20 -07:00
Aseem Bansal
a62889b39a
chore(lint): fix lint in looker (#7695) 2023-03-27 14:08:51 +02:00
Harshal Sheth
d1bab5616c
feat(ingest/looker): enable looker usage ingestion by default (#7684) 2023-03-27 00:02:25 +00:00
Harshal Sheth
6d04511949
fix(ingest/looker): correct looker/lookml capability reports (#7683) 2023-03-26 23:36:12 +00:00
Andrew Sikowitz
419bee8614
fix(ingest/bigquery): Fix BigQueryTableType enum accesses (#7685) 2023-03-25 00:08:11 +00:00
Harshal Sheth
c8abf9a1d4
fix(ingest/dbt): enable incremental lineage by default (#7674) 2023-03-24 18:14:19 -04:00
Harshal Sheth
d71463041f
fix(ingest/looker): skip empty user ids for usage (#7686) 2023-03-24 21:40:11 +00:00
Mayuri Nehate
301c8616ed
refactor(ingest/bigquery): add inline comments + refactor in table name parsing (#7609) 2023-03-24 14:44:30 -04:00
Aseem Bansal
1324231e70
chore(ci): add coverage code for python (#7681) 2023-03-24 07:10:45 +00:00
Hyejin Yoon
918718e7d0
feat(docs-website): add docs on creating users and groups (#7574)
Co-authored-by: Hyejin Yoon <yoonhyejin@Hyejins-MacBook-Pro.local>
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2023-03-24 11:52:12 +09:00
Harshal Sheth
39bc8a3e93
fix(cli): allow usage without kafka (#7677) 2023-03-24 02:36:56 +00:00
Hyejin Yoon
864ac2da9f
docs: add guides on deleting entities (#7636)
Co-authored-by: Hyejin Yoon <yoonhyejin@Hyejins-MacBook-Pro.local>
Co-authored-by: Hyejin Yoon <yoonhyejin@ip-192-168-0-10.us-west-2.compute.internal>
2023-03-24 08:49:34 +09:00
Andrew Sikowitz
589d354a57
perf(ingest): Increase default rest sink parallelism (#7675) 2023-03-23 19:43:20 -04:00
Shirshanka Das
3d81539c7e
fix(ingest): json-schema - nullability handling (#7667) 2023-03-23 23:07:30 +00:00
Mayuri Nehate
4ec34fc73b
feat(ingest/bigquery): emit create operation event for view (#7656) 2023-03-23 11:04:41 -07:00
Harshal Sheth
cf40aecb84
fix(cli/delete): include aspect name in dry-run message (#7664) 2023-03-22 22:25:33 -07:00
Harshal Sheth
3b519924e8
fix(cli): protect against timeseries get_aspects (#7665) 2023-03-22 22:04:11 -07:00