722 Commits

Author SHA1 Message Date
Shuixi Li
f147b51fc8
feat(ingest): add preset source (#10954)
Co-authored-by: MARK CHENG <hcheng@wealthsimple.com>
Co-authored-by: hwmarkcheng <94201005+hwmarkcheng@users.noreply.github.com>
2024-10-09 20:27:31 -07:00
Mayuri Nehate
26bbe02e44
feat(ingest/stateful): omit irrelevant urns for deletion (#11558) 2024-10-09 08:46:58 -07:00
Shirshanka Das
f3a348a231
sdk(platform-resource): add entity type for ease of use (#11541) 2024-10-07 20:30:14 -07:00
skrydal
134ad21afe
fix(ingestion/nifi): Fix for incremental lineage ingestion for nifi (#11517) 2024-10-04 17:29:06 +05:30
skrydal
e1514d5e8e
fix(ingestion/nifi): Improve nifi lineage extraction performance (#11490) 2024-10-01 21:51:00 +02:00
sagar-salvi-apptware
660fbf8e57
fix(ingestion/transformer): Add container support for ownership and domains (#11375) 2024-10-01 11:39:07 -07:00
Harshal Sheth
07034caf09
feat(ingest): support DATAHUB_INCLUDE_ENV_IN_CONTAINER_PROPERTIES (#11476) 2024-09-27 10:24:22 -07:00
Mayuri Nehate
6a58493011
fix(ingest/bq): do not query PARTITIONS for biglake tables (#11463) 2024-09-27 16:46:37 +05:30
sid-acryl
9fb2df11f3
fix(ingest): sort by last modified not working in the UI (#11343) 2024-09-23 10:06:05 -07:00
Harshal Sheth
aec5e1b249
fix(ingest/dbt): handle null index values (#11433) 2024-09-19 16:05:44 -07:00
Sergio Gómez Villamor
31edb46dbc
feat(ingestion): adds env property in ContainerProperties (#11214)
Co-authored-by: siladitya2 <siladitya2@gmail.com>
2024-09-18 14:56:52 +05:30
Harshal Sheth
38bcd9c381
feat(ingest): default to ASYNC_BATCH mode in datahub-rest sink (#11369) 2024-09-17 07:11:58 +01:00
Harshal Sheth
3755731f0e
chore(ingest): improve code formatting (#11326) 2024-09-11 10:48:57 -07:00
Harshal Sheth
311ea10833
feat(ingest): maintain ordering in file-backed dict (#11346) 2024-09-10 13:53:38 -07:00
Mayuri Nehate
837d00d391
fix(ingest/bq): fix ordering of queries for use_queries_v2 (#11333) 2024-09-10 12:17:23 -07:00
Harshal Sheth
f4033707d4
chore(ingest): bump acryl-sqlglot (#11331) 2024-09-09 21:09:44 -07:00
Mayuri Nehate
cf49f80e77
feat(ingest/sql): auto extract and use mode query user metadata (#11307)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2024-09-09 12:38:24 -07:00
Harshal Sheth
28310bb64b
feat(ingest): support full urns without owner_type in meta mapping (#11298)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2024-09-09 11:18:07 -07:00
sid-acryl
3150d90bd1
fix(ingestion/tableau): restructure the tableau graphql datasource query (#11230)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2024-09-09 10:45:06 -07:00
david-leifker
ede9520b87
feat(schemaField): populate schemaFields with side effects (#10928) 2024-09-04 15:36:12 -05:00
Harshal Sheth
401b787427
fix(ingest): add custom StrEnum type (#11270) 2024-09-04 08:09:36 -07:00
Mayuri Nehate
a7fc7f519a
feat(ingest/bq): integrate bigquery-queries into main source (#11247) 2024-08-30 18:16:45 -07:00
Mayuri Nehate
223650dd7a
feat(ingest): add bigquery-queries source (#10994)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2024-08-25 22:51:00 -07:00
Harshal Sheth
e0c13fda27
feat(ingest/dbt): add support for urns in add_owner directive (#11221) 2024-08-23 13:52:21 -04:00
Tamas Nemeth
ef6a410091
feat(ingest/s3): Partition support improvements (#11083)
- Partition autodetection
- Option to find min/max/min-max partition of a dataset
- Generating Partition aspects
2024-08-22 17:55:43 +02:00
sagar-salvi-apptware
50ed448861
fix(ingest/sagemaker): ensure consistent STS token usage with refresh mechanism (#11170)
Co-authored-by: Aseem Bansal <asmbansal2@gmail.com>
2024-08-22 15:42:13 +05:30
Mayuri Nehate
9568a4254d
feat: separate great-expectations action package (#11096)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2024-08-21 12:13:36 -04:00
sid-acryl
627c5abfd6
feat(ingestion/bigquery): Add ability to filter GCP project ingestion based on project labels (#11169)
Co-authored-by: Alice Naghshineh <alice.naghshineh@nytimes.com>
Co-authored-by: Alice Naghshineh <45885699+anaghshineh@users.noreply.github.com>
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
Co-authored-by: david-leifker <114954101+david-leifker@users.noreply.github.com>
2024-08-20 14:42:00 -04:00
skrydal
8f7642b910
fix(ingestion/tableau): Tableau field type parsing (#11202) 2024-08-20 13:24:16 +02:00
Harshal Sheth
897173f270
feat(dbt): support prefer_sql_parser_lineage with sources enabled (#11168) 2024-08-13 13:54:50 -07:00
sid-acryl
b1f16f9b11
fix(ingestion/lookml): fix for sql parsing error (#11079)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2024-08-09 15:06:42 -07:00
Harshal Sheth
840b15083a
fix(sql-parser): prevent bad urns from alter table lineage (#11092) 2024-08-08 14:05:55 -07:00
Felix Lüdin
9619553e2d
fix(ingest): use correct native data type in all SQLAlchemy sources by compiling data type using dialect (#10898)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2024-08-06 12:52:20 -07:00
Harshal Sheth
50139431be
fix(ingest): set lastObserved in sdk when unset (#11071) 2024-08-02 15:48:10 -07:00
Harshal Sheth
89933fee1e
feat(ingest/dbt-cloud): update metadata_endpoint inference (#11041) 2024-07-31 14:16:18 -07:00
sid-acryl
dffdef2eaa
fix(ingestion/powerbi): fix issue with broken report lineage (#10910) 2024-07-31 11:40:09 -07:00
Harshal Sheth
4b9844da1b
feat(ingest/dbt): add experimental prefer_sql_parser_lineage flag (#11039) 2024-07-31 09:23:02 -07:00
sagar-salvi-apptware
da72ba2113
fix(ingestion/transformer): replace the externalUrl container (#11013) 2024-07-30 15:17:04 +05:30
sagar-salvi-apptware
a09575fb6f
fix(ingestion/glue): Add support for missing config options for profiling in Glue (#10858) 2024-07-29 16:04:07 +05:30
Harshal Sheth
f816a14a98
fix(ingest): fix graph config loading (#11002)
Co-authored-by: Pedro Silva <pedro@acryl.io>
2024-07-26 11:15:46 -07:00
Harshal Sheth
8c3bfd996d
feat(ingest/bigquery): improve handling of information schema in sql parser (#10985) 2024-07-25 17:58:16 -07:00
Tamas Nemeth
71d1cdbe3b
fix(ingest/s3): Fixing container creation when there is no folder in path (#10993) 2024-07-25 23:38:10 +02:00
Pedro Silva
dd732d0d46
feat(cli): Make consistent use of DataHubGraphClientConfig (#10466)
Deprecates get_url_and_token() in favor of a more complete option, load_graph_config(), which returns a full DatahubClientConfig.
This change was then propagated to all previous usages of get_url_and_token(), so that client connections to the DataHub server respect the full breadth of configuration specified by DatahubClientConfig.

For example, you can now specify disable_ssl_verification: true in your ~/.datahubenv file so that all CLI functions that talk to the server work when SSL certificate verification is disabled.

Fixes #9705
2024-07-25 19:06:14 +00:00
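The motivation behind #10466 can be illustrated with a small, self-contained sketch. It does not use DataHub's actual code: ClientConfigSketch, get_url_and_token_sketch, load_graph_config_sketch, the "gms" key layout, and the example URL are all hypothetical stand-ins, chosen only to show why returning a full config object preserves settings such as disable_ssl_verification that a (url, token) pair would drop.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Tuple


@dataclass(frozen=True)
class ClientConfigSketch:
    """Hypothetical stand-in for a full client config (not DataHub's class)."""

    server: str
    token: Optional[str] = None
    disable_ssl_verification: bool = False


def get_url_and_token_sketch(raw: Dict[str, Any]) -> Tuple[str, Optional[str]]:
    """Old style: only the URL and token survive; other settings are dropped."""
    gms = raw.get("gms", {})
    return gms["server"], gms.get("token")


def load_graph_config_sketch(raw: Dict[str, Any]) -> ClientConfigSketch:
    """New style: every recognized setting is carried in one config object."""
    gms = raw.get("gms", {})
    return ClientConfigSketch(
        server=gms["server"],
        token=gms.get("token"),
        disable_ssl_verification=bool(gms.get("disable_ssl_verification", False)),
    )


# Parsed contents of a config file, written here as a dict (assumed layout).
raw_config = {
    "gms": {
        "server": "https://datahub.example.com:8080",
        "token": None,
        "disable_ssl_verification": True,
    }
}

url, token = get_url_and_token_sketch(raw_config)
config = load_graph_config_sketch(raw_config)

# The tuple has no place to hold the SSL flag; the config object does.
print(url, token)
print(config.disable_ssl_verification)
```

In this sketch, any caller built on the tuple-returning helper has no way to learn that SSL verification should be skipped, which is the kind of gap the commit message describes closing.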
Harshal Sheth
1fa7998ed3
feat(ingest): support domains in meta -> "datahub" section (#10967) 2024-07-25 09:31:19 -07:00
sagar-salvi-apptware
348d449d8a
fix(ingest/Glue): column upstream lineage between S3 and Glue (#10895) 2024-07-19 14:39:19 +05:30
Harshal Sheth
7f3da47e90
fix(ingest/snowflake): fix test connection (#10927) 2024-07-17 11:57:58 -07:00
Harshal Sheth
bccfd8f0a5
feat(ingest/snowflake): integrate snowflake-queries into main source (#10905) 2024-07-17 10:22:14 -07:00
sagar-salvi-apptware
ec788df328
fix(ingest/bigquery): handle quota exceeded for project.list requests (#10912) 2024-07-17 17:17:52 +02:00
Patrick Franco Braz
4b83adfa9f
fix(ingest/bigquery): changes helper function to decode unicode escape sequences (#10845) 2024-07-16 15:50:54 -07:00
Mayuri Nehate
ff1c6b895e
feat(ingest/BigQuery): refactor+parallelize dataset metadata extraction (#10884) 2024-07-16 11:46:42 -07:00