935 Commits

Author SHA1 Message Date
Andrew Sikowitz
a4e726872b
fix(ingest/bigquery): Filter out fine grained lineage with no upstreams (#8758) 2023-08-31 12:44:24 -04:00
Harshal Sheth
21b2851be7
feat(sql-parser): schema-aware output column casing (#8760)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-08-31 09:43:39 -07:00
Mayuri Nehate
e867dbc3da
ci: separate airflow build and test (#8688)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-08-30 14:08:42 -07:00
Andrew Sikowitz
026f7abe9c
feat(ingest/usage): Make cumulative query character limit configurable (#8751) 2023-08-30 15:53:08 -04:00
Andrew Sikowitz
40d17f00ea
feat(ingest/datahub): Improvements, bug fixes, and docs (#8735) 2023-08-29 14:33:40 -04:00
Harshal Sheth
7b66c32b70
feat(ingest): support writing configs to files (#8696) 2023-08-28 09:55:50 -07:00
Mayuri Nehate
cc94ffbf6c
fix(ingest): stateful redundant run skip handler (#8467) 2023-08-28 15:03:31 +05:30
Mayuri Nehate
e285da3e75
feat(ingest/snowflake): tables from snowflake shares as siblings (#8531) 2023-08-24 10:23:07 -04:00
Andrew Sikowitz
22c35f1a23
fix(ingest/bigquery): Add config option to create DataPlatformInstance, default off (#8659) 2023-08-24 14:46:06 +05:30
Alexander
c0addf6eef
feat(ingest/bigquery): add tag to BigQuery clustering columns (#8495)
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
2023-08-17 12:44:15 -04:00
kr_Deepankar
94ce753bb0
fix(ingest/kafka): use SchemaReference properties instead of dict access (#8615)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-08-16 10:33:09 +05:30
Andrew Sikowitz
526e626146
feat(ingest): Add DataHub source (#8561)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-08-15 17:49:20 -04:00
Mayuri Nehate
ddcd5109dc
feat(ingest): allow relative start time config (#8562)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-08-14 17:48:06 -07:00
alplatonov
11fdfcf956
Fix(ingestion/clickhouse) move to two tier sqlalchemy (#8300)
Co-authored-by: Mayuri Nehate <33225191+mayurinehate@users.noreply.github.com>
2023-08-11 16:11:40 -04:00
Aseem Bansal
dac89fb1fb
feat(ingest): allow lower freq profiling based on date of month/day of week (#8489) 2023-08-04 10:13:48 +05:30
Aseem Bansal
bb33f015ca
fix(ingest/s3): wrong sorting in case of multi-partition key (#8536) 2023-08-02 09:54:33 +05:30
Jarod Smilkstein
f51bd01a70
feat(ingest): add ability to read other method types than GET for OAS ingest recipes (#8303) 2023-08-02 09:54:09 +05:30
Kirill Popov
eec89a884a
feat(ingest): Add metabase database id to platform instance mapping (#8359) 2023-08-02 09:53:48 +05:30
Benjamin Dornel
2e2a6748ac
fix(ingest/json-schema): convert non-string enums to strings (#8479) 2023-08-01 19:35:40 +05:30
Harshal Sheth
66074341f7
test(ingest): test case statements with sql parser (#8437) 2023-08-01 19:34:48 +05:30
Tamas Nemeth
1a47a51f1b
fix(ingest/build): Fix sagemaker mypy and flake8 issues (#8530) 2023-07-31 16:13:07 +02:00
Harshal Sheth
89f23d3c36
chore(ingest): bump sqllineage and sqlparse (#8481) 2023-07-28 13:10:19 -07:00
Harshal Sheth
9718505fc7
fix(ingest): respect max_threads for ingestion reporter (#8521) 2023-07-28 13:09:32 -07:00
Andrew Sikowitz
bf9f380350
fix(ingest): Generate browse paths v2 for more sources; properly pass platform_instance (#8501) 2023-07-25 11:35:34 +05:30
Mayuri Nehate
f4fde21168
feat(ingest/nifi): add support for basic auth in nifi (#8457) 2023-07-20 14:12:33 -04:00
Aseem Bansal
9df70d7355
ingest(elasticsearch): add basic profiling (#8351) 2023-07-20 08:25:30 +05:30
Harshal Sheth
4fb77e4a25
fix(ingest): tweak ingestion exit codes (#8418) 2023-07-14 15:47:16 -07:00
Mayuri Nehate
a2fc02294c
fix(ingest/snowflake): fix azure cloud region ids in external url (#8376)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-07-12 17:11:33 +05:30
Harshal Sheth
d4135d57b7
feat(ingest/bigquery): support column-level lineage (#8382) 2023-07-11 11:12:51 -07:00
Andrew Sikowitz
759d0c86af
test(ingest/airflow): Fix test for airflow 2.6.3 (#8393) 2023-07-11 12:55:32 -04:00
Andrew Sikowitz
2261531e31
test(ingest): Aspect level golden file comparison (#8310) 2023-07-11 10:39:47 -04:00
Harshal Sheth
3e47b3d228
feat(ingest): schema-aware SQL parsing for column-level lineage (#8334) 2023-07-07 16:24:35 -07:00
Andrew Sikowitz
1f84bf5b2b
fix(ingest/sql-common): Fix profile_table_level_only (#8331)
Co-authored-by: Aseem Bansal <asmbansal2@gmail.com>
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-07-07 19:05:50 -04:00
Andrew Sikowitz
3a21c27f06
feat(ingest): Turn on browse path v2 creation (#8342)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-07-06 16:43:42 -04:00
Mayuri Nehate
8cf778dc9b
fix(ingest): update pydantic helpers to address unique name issue (#8324)
Co-authored-by: Aseem Bansal <asmbansal2@gmail.com>
2023-07-06 13:16:07 -07:00
Harshal Sheth
08d4e904a8
feat(ingest): add YamlFileUpdater utility (#8266) 2023-06-29 13:15:34 -07:00
Mayuri Nehate
711efde2c0
feat(ingest/snowflake): snowpipe s3 lineage (#8262)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-06-28 10:59:01 -04:00
Andrew Sikowitz
aa5e02d0ec
feat(ingest): Create zero usage aspects (#8205)
Co-authored-by: Mayuri Nehate <33225191+mayurinehate@users.noreply.github.com>
2023-06-22 17:07:50 -04:00
Andrew Sikowitz
2751a09284
fix(ingest): pass platform correctly to browse path v2 helper (#8244) 2023-06-15 20:10:15 -07:00
Andrew Sikowitz
66806a805e
feat(ingest/unity): Set external url for containers and datasets (#8238) 2023-06-15 09:05:49 +02:00
Harshal Sheth
2d7692a245
feat(sdk): support patches as MCPs in file source (#8220)
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2023-06-14 14:56:27 -07:00
Andrew Sikowitz
c5cc53b99a
feat(ingest/bigquery_v2): enable platform instance using project id (#8216)
Co-authored-by: Adrián Pertíñez <khurzak92@gmail.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-06-14 09:50:21 -07:00
Andrew Sikowitz
f2c66fd8d3
feat(ingest): Produce browse paths v2 on demand and with platform instance (#8173)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
Co-authored-by: Pedro Silva <pedro@acryl.io>
2023-06-09 10:35:54 -07:00
Andrew Sikowitz
369a04ae30
revert(ingest/bigquery): Do not emit DataPlatformInstance; remove references to platform_instance (#8196) 2023-06-09 13:44:24 +05:30
Andrew Sikowitz
9fa8489cb8
feat(ingest/snowflake): Okta OAuth support; update docs (#8157) 2023-06-07 01:09:05 -07:00
Mayuri Nehate
983a8ca675
feat(classification): support for regex based custom infotypes (#8177) 2023-06-06 14:41:51 +02:00
Adrián Pertíñez
743439c11d
feat(ingest/bigquery_v2): enable platform instance using project id (#8142)
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
2023-06-05 15:17:40 -07:00
Andrew Sikowitz
802c91a0a7
feat(ingest): Create Browse Paths V2 under flag (#8120) 2023-06-02 12:50:38 -07:00
Mayuri Nehate
fe1ff71318
fix(ingest/nifi): allow nifi site url with context path (#8156) 2023-06-02 15:43:33 +02:00
Harshal Sheth
690ed083d9
feat(ingest): add more fail-safes to stateful ingestion (#8111) 2023-05-31 18:49:48 -07:00