942 Commits

Author SHA1 Message Date
Mayuri Nehate
5c40390a92
feat(ingest/kafka): support metadata mapping from kafka avro schemas (#8825)
Co-authored-by: Daniel Messias <danielcmessias@gmail.com>
Co-authored-by: Deepankarkr <deepankar.kumar@gslab.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-09-22 17:11:42 -07:00
Harshal Sheth
2a0200b047
feat(ingest): bump acryl-sqlglot (#8882) 2023-09-21 14:28:51 -07:00
Mayuri Nehate
cdb9f5ba62
feat(bigquery): add better timers around every API call (#8626) 2023-09-15 11:55:39 -07:00
Harshal Sheth
0e8000cf18
feat(ingest): drop sql_metadata parser (#8765) 2023-09-07 11:32:28 -07:00
Harshal Sheth
4ffad4d9b9
chore(ingest): upgrade sqlglot fork (#8775)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-09-06 12:49:44 -07:00
Mayuri Nehate
e680a97046
fix(ingest/bigquery): fix partition and median queries for profiling (#8778) 2023-09-06 12:48:11 -07:00
cccs-eric
6fe60a274e
feat(iceberg): Upgrade Iceberg ingestion source to pyiceberg 0.4.0 (#8357)
Co-authored-by: cccs-Dustin <96579982+cccs-Dustin@users.noreply.github.com>
Co-authored-by: Fokko Driesprong <fokko@apache.org>
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
2023-08-31 13:01:05 -04:00
Andrew Sikowitz
a4e726872b
fix(ingest/bigquery): Filter out fine grained lineage with no upstreams (#8758) 2023-08-31 12:44:24 -04:00
Harshal Sheth
21b2851be7
feat(sql-parser): schema-aware output column casing (#8760)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-08-31 09:43:39 -07:00
Mayuri Nehate
e867dbc3da
ci: separate airflow build and test (#8688)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-08-30 14:08:42 -07:00
Andrew Sikowitz
026f7abe9c
feat(ingest/usage): Make cumulative query character limit configurable (#8751) 2023-08-30 15:53:08 -04:00
Andrew Sikowitz
40d17f00ea
feat(ingest/datahub): Improvements, bug fixes, and docs (#8735) 2023-08-29 14:33:40 -04:00
Harshal Sheth
7b66c32b70
feat(ingest): support writing configs to files (#8696) 2023-08-28 09:55:50 -07:00
Mayuri Nehate
cc94ffbf6c
fix(ingest): stateful redundant run skip handler (#8467) 2023-08-28 15:03:31 +05:30
Mayuri Nehate
e285da3e75
feat(ingest/snowflake): tables from snowflake shares as siblings (#8531) 2023-08-24 10:23:07 -04:00
Andrew Sikowitz
22c35f1a23
fix(ingest/bigquery): Add config option to create DataPlatformInstance, default off (#8659) 2023-08-24 14:46:06 +05:30
Alexander
c0addf6eef
feat(ingest/bigquery): add tag to BigQuery clustering columns (#8495)
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
2023-08-17 12:44:15 -04:00
kr_Deepankar
94ce753bb0
fix(ingest/kafka): use SchemaReference properties instead of dict access (#8615)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-08-16 10:33:09 +05:30
Andrew Sikowitz
526e626146
feat(ingest): Add DataHub source (#8561)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-08-15 17:49:20 -04:00
Mayuri Nehate
ddcd5109dc
feat(ingest): allow relative start time config (#8562)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-08-14 17:48:06 -07:00
alplatonov
11fdfcf956
Fix(ingestion/clickhouse) move to two tier sqlalchemy (#8300)
Co-authored-by: Mayuri Nehate <33225191+mayurinehate@users.noreply.github.com>
2023-08-11 16:11:40 -04:00
Aseem Bansal
dac89fb1fb
feat(ingest): allow lower freq profiling based on date of month/day of week (#8489) 2023-08-04 10:13:48 +05:30
Aseem Bansal
bb33f015ca
fix(ingest/s3): wrong sorting in case of multi-partition key (#8536) 2023-08-02 09:54:33 +05:30
Jarod Smilkstein
f51bd01a70
feat(ingest): add ability to read other method types than GET for OAS ingest recipes (#8303) 2023-08-02 09:54:09 +05:30
Kirill Popov
eec89a884a
feat(ingest): Add metabase database id to platform instance mapping (#8359) 2023-08-02 09:53:48 +05:30
Benjamin Dornel
2e2a6748ac
fix(ingest/json-schema): convert non-string enums to strings (#8479) 2023-08-01 19:35:40 +05:30
Harshal Sheth
66074341f7
test(ingest): test case statements with sql parser (#8437) 2023-08-01 19:34:48 +05:30
Tamas Nemeth
1a47a51f1b
fix(ingest/build): Fix sagemaker mypy and flake8 issues (#8530) 2023-07-31 16:13:07 +02:00
Harshal Sheth
89f23d3c36
chore(ingest): bump sqllineage and sqlparse (#8481) 2023-07-28 13:10:19 -07:00
Harshal Sheth
9718505fc7
fix(ingest): respect max_threads for ingestion reporter (#8521) 2023-07-28 13:09:32 -07:00
Andrew Sikowitz
bf9f380350
fix(ingest): Generate browse paths v2 for more sources; properly pass platform_instance (#8501) 2023-07-25 11:35:34 +05:30
Mayuri Nehate
f4fde21168
feat(ingest/nifi): add support for basic auth in nifi (#8457) 2023-07-20 14:12:33 -04:00
Aseem Bansal
9df70d7355
ingest(elasticsearch): add basic profiling (#8351) 2023-07-20 08:25:30 +05:30
Harshal Sheth
4fb77e4a25
fix(ingest): tweak ingestion exit codes (#8418) 2023-07-14 15:47:16 -07:00
Mayuri Nehate
a2fc02294c
fix(ingest/snowflake): fix azure cloud region ids in external url (#8376)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-07-12 17:11:33 +05:30
Harshal Sheth
d4135d57b7
feat(ingest/bigquery): support column-level lineage (#8382) 2023-07-11 11:12:51 -07:00
Andrew Sikowitz
759d0c86af
test(ingest/airflow): Fix test for airflow 2.6.3 (#8393) 2023-07-11 12:55:32 -04:00
Andrew Sikowitz
2261531e31
test(ingest): Aspect level golden file comparison (#8310) 2023-07-11 10:39:47 -04:00
Harshal Sheth
3e47b3d228
feat(ingest): schema-aware SQL parsing for column-level lineage (#8334) 2023-07-07 16:24:35 -07:00
Andrew Sikowitz
1f84bf5b2b
fix(ingest/sql-common): Fix profile_table_level_only (#8331)
Co-authored-by: Aseem Bansal <asmbansal2@gmail.com>
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-07-07 19:05:50 -04:00
Andrew Sikowitz
3a21c27f06
feat(ingest): Turn on browse path v2 creation (#8342)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-07-06 16:43:42 -04:00
Mayuri Nehate
8cf778dc9b
fix(ingest): update pydantic helpers to address unique name issue (#8324)
Co-authored-by: Aseem Bansal <asmbansal2@gmail.com>
2023-07-06 13:16:07 -07:00
Harshal Sheth
08d4e904a8
feat(ingest): add YamlFileUpdater utility (#8266) 2023-06-29 13:15:34 -07:00
Mayuri Nehate
711efde2c0
feat(ingest/snowflake): snowpipe s3 lineage (#8262)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-06-28 10:59:01 -04:00
Andrew Sikowitz
aa5e02d0ec
feat(ingest): Create zero usage aspects (#8205)
Co-authored-by: Mayuri Nehate <33225191+mayurinehate@users.noreply.github.com>
2023-06-22 17:07:50 -04:00
Andrew Sikowitz
2751a09284
fix(ingest): pass platform correctly to browse path v2 helper (#8244) 2023-06-15 20:10:15 -07:00
Andrew Sikowitz
66806a805e
feat(ingest/unity): Set external url for containers and datasets (#8238) 2023-06-15 09:05:49 +02:00
Harshal Sheth
2d7692a245
feat(sdk): support patches as MCPs in file source (#8220)
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2023-06-14 14:56:27 -07:00
Andrew Sikowitz
c5cc53b99a
feat(ingest/bigquery_v2): enable platform instance using project id (#8216)
Co-authored-by: Adrián Pertíñez <khurzak92@gmail.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-06-14 09:50:21 -07:00
Andrew Sikowitz
f2c66fd8d3
feat(ingest): Produce browse paths v2 on demand and with platform instance (#8173)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
Co-authored-by: Pedro Silva <pedro@acryl.io>
2023-06-09 10:35:54 -07:00