935 Commits

Author SHA1 Message Date
sagar-salvi-apptware
da72ba2113
fix(ingestion/transformer): replace the externalUrl container (#11013) 2024-07-30 15:17:04 +05:30
sagar-salvi-apptware
a09575fb6f
fix(ingestion/glue): Add support for missing config options for profiling in Glue (#10858) 2024-07-29 16:04:07 +05:30
Harshal Sheth
f816a14a98
fix(ingest): fix graph config loading (#11002)
Co-authored-by: Pedro Silva <pedro@acryl.io>
2024-07-26 11:15:46 -07:00
Harshal Sheth
8c3bfd996d
feat(ingest/bigquery): improve handling of information schema in sql parser (#10985) 2024-07-25 17:58:16 -07:00
Tamas Nemeth
71d1cdbe3b
fix(ingest/s3): Fixing container creation when there is no folder in path (#10993) 2024-07-25 23:38:10 +02:00
Pedro Silva
dd732d0d46
feat(cli): Make consistent use of DataHubGraphClientConfig (#10466)
Deprecates get_url_and_token() in favor of a more complete option, load_graph_config(), which returns a full DatahubClientConfig.
This change was then propagated to all previous usages of get_url_and_token(), so that client connections to the DataHub server respect the full range of configuration specified by DatahubClientConfig.

For example, you can now specify disable_ssl_verification: true in your ~/.datahubenv file so that all CLI calls to the server work with SSL certificate verification disabled.

Fixes #9705
2024-07-25 19:06:14 +00:00
Harshal Sheth
1fa7998ed3
feat(ingest): support domains in meta -> "datahub" section (#10967) 2024-07-25 09:31:19 -07:00
sagar-salvi-apptware
348d449d8a
fix(ingest/Glue): column upstream lineage between S3 and Glue (#10895) 2024-07-19 14:39:19 +05:30
Harshal Sheth
7f3da47e90
fix(ingest/snowflake): fix test connection (#10927) 2024-07-17 11:57:58 -07:00
Harshal Sheth
bccfd8f0a5
feat(ingest/snowflake): integrate snowflake-queries into main source (#10905) 2024-07-17 10:22:14 -07:00
sagar-salvi-apptware
ec788df328
fix(ingest/bigquery): handle quota exceeded for project.list requests (#10912) 2024-07-17 17:17:52 +02:00
Patrick Franco Braz
4b83adfa9f
fix(ingest/bigquery): changes helper function to decode unicode escape sequences (#10845) 2024-07-16 15:50:54 -07:00
Mayuri Nehate
ff1c6b895e
feat(ingest/BigQuery): refactor+parallelize dataset metadata extraction (#10884) 2024-07-16 11:46:42 -07:00
Harshal Sheth
a4bce6af1c
feat(ingest): add snowflake-queries source (#10835) 2024-07-12 15:08:51 -07:00
Harshal Sheth
351e434856
fix(ingest/dbt): always encode tag urns (#10799) 2024-07-11 16:32:16 -07:00
Harshal Sheth
82bd3c248f
fix(ingest): only populate audit stamps where accurate (#10604) 2024-07-11 13:26:57 -07:00
haeniya
3e86192b29
feat(ingestion/tableau): optionally ingest multiple sites and create site containers (#10498)
Co-authored-by: Yanik Häni <Yanik.Haeni1@swisscom.com>
2024-07-09 11:49:41 -07:00
Shubham Jagtap
b6c7fe8267
refactor(ingestion): remove company domain for security reasons (#10839) 2024-07-08 21:15:20 -07:00
Aseem Bansal
41b9e15235
feat(ingest/audit): add client id and version in system metadata props (#10829) 2024-07-08 09:38:12 -07:00
John Joyce
fa3e381f83
refactor(ingest): Refactor structured logging to support infos, warnings, and failures structured reporting to UI (#10828)
Co-authored-by: John Joyce <john@Johns-MBP.lan>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2024-07-03 19:20:58 -07:00
sagar-salvi-apptware
b8af2b9d69
fix(ingestion/glue): ensure date formatting works on all platforms for aws glue (#10836) 2024-07-03 18:05:37 +05:30
skrydal
099021c7a3
feat(ingest/glue): allow ingestion of empty databases from Glue (#10666)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2024-07-03 11:43:12 +05:30
sagar-salvi-apptware
640d42dc65
feat(ingest/transformer): tags to terms transformer (#10758)
Co-authored-by: Aseem Bansal <asmbansal2@gmail.com>
2024-07-02 15:30:05 +05:30
Oleksandr Simonchuk
8b4e302881
feat(ingest): add and use file system abstraction in file source (#8415)
Co-authored-by: oleksandrsimonchuk <oleksandr.si@appsflyer.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2024-07-01 10:47:07 -07:00
Harshal Sheth
f4be88d0a9
feat(ingest): set pipeline name in system metadata (#10190)
Co-authored-by: david-leifker <114954101+david-leifker@users.noreply.github.com>
2024-06-27 15:00:35 -07:00
Harshal Sheth
fa2ab1bcee
fix(ingest): add status aspect to dataProcessInstance (#10757) 2024-06-27 12:07:28 -07:00
Harshal Sheth
0d677e4992
fix(ingest/snowflake): fix column batcher (#10781) 2024-06-25 22:21:54 -07:00
Harshal Sheth
724907b8f4
feat(ingest): add async batch mode to the rest sink (#10733) 2024-06-25 15:49:00 -07:00
Harshal Sheth
0dc0bc5761
feat(ingest/snowflake): performance improvements (#10746) 2024-06-25 14:46:55 -07:00
Eric L (CCCS)
79ba0b1720
fix(ingest/iceberg): add support for nested dictionaries when configuring pyiceberg (#10762) 2024-06-21 14:38:01 -07:00
ethan-cartwright
c58be155f3
feat(ingest/bigquery): Support for View Labels (#10648)
Co-authored-by: Ethan Cartwright <ethan.cartwright@acryl.io>
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
2024-06-17 18:36:41 +05:30
Harshal Sheth
62c6704f69
feat(ingest/snowflake): refactor + parallel schema extraction (#10653) 2024-06-14 13:23:07 -07:00
Harshal Sheth
6329153e36
fix(ingest): fix redshift query urns + reduce memory usage (#10691) 2024-06-13 11:27:06 -07:00
Shubham Jagtap
05aee03f3f
perf(ingestion/fivetran): Connector performance optimization (#10556) 2024-06-11 20:19:57 -07:00
skrydal
b9e71a61b1
feat(ingest/glue): database parameters extraction (#10665) 2024-06-11 11:50:46 -07:00
aabharti-visa
8a905774f7
feat(ingestion/kafka): Add support for ingesting schemas from schema registry (#10612) 2024-06-11 14:00:12 +02:00
Harshal Sheth
e842161849
feat(ingest): add fast query fingerprinting (#10619) 2024-06-05 13:47:44 -07:00
Eric L (CCCS)
c04b3bc2e4
fix(ingest/iceberg): update iceberg source to support newer versions of pyiceberg at runtime (#10614) 2024-06-04 09:45:29 -07:00
Mayuri Nehate
81b655c82d
feat(open assertion spec): MVP for Snowflake DMF Assertions: update models, add assertions cli with snowflake integration (#10602) 2024-05-31 12:03:22 -07:00
Harshal Sheth
e873104b80
feat(ingest): fetch connections from the backend (#10511) 2024-05-29 10:32:29 -07:00
Tony Ouyang
a5515c5d47
feat(ingestion/SageMaker): Remove deprecated apis and add stateful ingestion capability (#10573) 2024-05-28 12:16:28 +02:00
Harshal Sheth
2e14f70864
test(ingest/sql): refactor CLL generator + add tests (#10580) 2024-05-23 18:11:22 -07:00
Harshal Sheth
b8023a93a4
refactor(ingest): defer ctx.graph initialization (#10504) 2024-05-21 17:01:35 -07:00
Harshal Sheth
2b6c78b776
feat(ingest): bump acryl-sqlglot dep (#10554) 2024-05-21 23:52:33 +02:00
Harshal Sheth
187ef12182
fix(ingest/dbt): improve handling for CLL via ephemeral nodes (#10535) 2024-05-20 13:33:25 -07:00
Sergio Gómez Villamor
0059960720
feat(ingestion/glue): delta schemas (#10299)
Co-authored-by: Mayuri Nehate <33225191+mayurinehate@users.noreply.github.com>
2024-05-17 14:21:35 +02:00
Harshal Sheth
3d5735cbc5
chore(ingest): run pyupgrade for python 3.8 (#10513) 2024-05-15 22:31:05 -07:00
Harshal Sheth
bc9250c904
fix(ingest): fix bug in incremental lineage (#10515) 2024-05-15 22:30:47 -07:00
sagar-salvi-apptware
5fbf781558
fix(ingest/transformer): Add dataset domains based on tags using transformer (#10458) 2024-05-15 14:13:03 +05:30
Tamas Nemeth
897e648eae
fix(ingest/mode): Improve query lineage (#10284)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2024-05-07 22:02:37 -07:00