1604 Commits

Author SHA1 Message Date
Harshal Sheth
d13553f53a
feat(sqlparser): extract CLL from updates (#9078) 2023-10-24 12:24:50 -07:00
Andrew Sikowitz
adf8c8db38
refactor(ingest): Move sqlalchemy import out of sql_types.py (#9065) 2023-10-24 08:59:56 +02:00
Harshal Sheth
8fb95e88a1
feat(sqlparser): parse create DDL statements (#9002) 2023-10-23 12:40:42 -07:00
Andrew Sikowitz
35d7770462
test(ingest/delta-lake): Fix integration tests (#9056) 2023-10-20 01:40:23 -04:00
Tim
1eaf9c8c5f
feature(ingest/athena): introduce support for complex and nested schemas in Athena (#8137)
Co-authored-by: dnks23 <dominik.s23@live.de>
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
Co-authored-by: Tim <tim@MBP-von-Tim.fritz.box>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-10-18 09:39:59 -07:00
Andrew Sikowitz
d2eb42373f
fix(ingest/sqlalchemy): Fix URL parsing when sqlalchemy_uri provided (#9032) 2023-10-18 17:34:45 +02:00
VISHAL KUMAR
5937937819
feat(ingestion/Vertica): Fixed vertica integration test Updated vertica dialect (#9011) 2023-10-18 14:52:07 +05:30
Mayuri Nehate
c81a339bfc
build(ingest): remove ratelimiter dependency (#9008) 2023-10-16 09:27:57 -07:00
Andrew Sikowitz
6bc7425353
feat(cli/datacontract): Add data quality assertion support (#8968)
Co-authored-by: Mayuri Nehate <33225191+mayurinehate@users.noreply.github.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
Co-authored-by: Aseem Bansal <asmbansal2@gmail.com>
2023-10-13 12:36:18 -04:00
Tamas Nemeth
a8f0080c08
feat(ingest/teradata): Teradata source (#8977) 2023-10-12 15:14:45 -07:00
Tamas Nemeth
c381806110
feat(ingestion): Adding config option to auto lowercase dataset urns (#8928) 2023-10-12 13:56:30 +02:00
Tamas Nemeth
dd418de76d
fix(ingest/bigquery): Fix shard regexp to match without underscore as well (#8934) 2023-10-12 13:10:59 +02:00
Harshal Sheth
84bba4dc44
feat(ingest): add output schema inference for sql parser (#8989) 2023-10-11 22:31:17 -07:00
Sergio Gómez Villamor
245c5c0008
fix(ingest/looker): stop emitting tag owner (#8942) 2023-10-11 17:06:19 -07:00
siddiquebagwan-gslab
10a190470e
feat(ingestion/redshift): CLL support in redshift (#8921) 2023-10-10 20:24:08 -07:00
Andrew Sikowitz
1a72fa499c
feat(ingest/tableau): Allow parsing of database name from fullName (#8981) 2023-10-10 17:34:06 -04:00
Mayuri Nehate
57f855ecd1
feat(ingest): refactor + simplify incremental lineage helper (#8976) 2023-10-09 23:48:21 -07:00
Mayuri Nehate
8d175ef7ef
feat(ingest): incremental lineage source helper (#8941)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-10-09 13:34:25 -07:00
Harshal Sheth
3cede10ab3
feat(ingest/dbt): support use_compiled_code and test_warnings_are_errors (#8956) 2023-10-05 10:29:47 -07:00
Aseem Bansal
2bc685d3b9
ci: tweak ci to decrease wait time of devs (#8945) 2023-10-05 09:31:32 +05:30
Harshal Sheth
817c371fbf
feat: data contracts models + CLI (#8923)
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
Co-authored-by: John Joyce <john@acryl.io>
2023-10-04 20:11:06 -07:00
ethan-cartwright
e2afd44bfe
feat(dbt-ingestion): add documentation link from dbt source to institutionalMemory (#8686)
Co-authored-by: Ethan Cartwright <ethan.cartwright@acryl.io>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-10-04 20:38:58 +00:00
Andrew Sikowitz
301d3e6b1c
test(ingest/unity): Add Unity Catalog memory performance testing (#8932) 2023-10-04 10:23:13 -04:00
Upendra Rao Vedullapalli
13508a9d88
feat(bigquery): excluding projects without any datasets from ingestion (#8535)
Co-authored-by: Upendra Vedullapalli <upendra.rao.vedullapalli@entur.org>
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
2023-10-04 09:20:51 -04:00
Harshal Sheth
a300b39f15
feat(ingest/airflow): airflow plugin v2 (#8853) 2023-10-04 16:23:15 +05:30
siddiquebagwan-gslab
c415d63dda
feat(ingestion/powerbi): column level lineage extraction for M-Query (#8796) 2023-10-04 16:22:51 +05:30
Aseem Bansal
ad313ad282
feat(transfomer): add transformer to get ownership from tags (#8748) 2023-10-04 14:06:03 +05:30
Harshal Sheth
9deb7be3fc
fix(ingest): refactor test markers + fix disk space issues in CI (#8938) 2023-10-03 20:17:49 -07:00
hariishaa
622816dcb8
feat(metadata-ingestion): implement mlflow source (#7971)
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
2023-09-26 13:51:30 -04:00
Mayuri Nehate
874109f76e
feat(ingest/snowflake): allow shares config without platform instance (#8803) 2023-09-25 14:04:05 +05:30
Shubham Jagtap
501522d891
feat(ingest/kafka-connect): Lineage for Kafka Connect > Snowflake (#8811)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-09-22 17:12:48 -07:00
Mayuri Nehate
5c40390a92
feat(ingest/kafka): support metadata mapping from kafka avro schemas (#8825)
Co-authored-by: Daniel Messias <danielcmessias@gmail.com>
Co-authored-by: Deepankarkr <deepankar.kumar@gslab.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-09-22 17:11:42 -07:00
Harshal Sheth
2a0200b047
feat(ingest): bump acryl-sqlglot (#8882) 2023-09-21 14:28:51 -07:00
Tony Ouyang
f4da93988e
feat(ingestion/dynamodb): Add DynamoDB as new metadata ingestion source (#8768)
Co-authored-by: Mayuri Nehate <33225191+mayurinehate@users.noreply.github.com>
2023-09-15 13:26:17 -07:00
Mayuri Nehate
cdb9f5ba62
feat(bigquery): add better timers around every API call (#8626) 2023-09-15 11:55:39 -07:00
cjm98332
a021053a72
fix(ingest/mssql): Add UNIQUEIDENTIFIER data type as String (#8642)
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
2023-09-12 19:23:39 +05:30
siddiquebagwan-gslab
95b2d437ca
feat(ingestion/looker): Add view file-path as option in view_naming_pattern config (#8713) 2023-09-11 16:55:17 +05:30
Harshal Sheth
0e8000cf18
feat(ingest): drop sql_metadata parser (#8765) 2023-09-07 11:32:28 -07:00
Harshal Sheth
4ffad4d9b9
chore(ingest): upgrade sqlglot fork (#8775)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-09-06 12:49:44 -07:00
Mayuri Nehate
e680a97046
fix(ingest/bigquery): fix partition and median queries for profiling (#8778) 2023-09-06 12:48:11 -07:00
cccs-eric
6fe60a274e
feat(iceberg): Upgrade Iceberg ingestion source to pyiceberg 0.4.0 (#8357)
Co-authored-by: cccs-Dustin <96579982+cccs-Dustin@users.noreply.github.com>
Co-authored-by: Fokko Driesprong <fokko@apache.org>
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
2023-08-31 13:01:05 -04:00
Andrew Sikowitz
a4e726872b
fix(ingest/bigquery): Filter out fine grained lineage with no upstreams (#8758) 2023-08-31 12:44:24 -04:00
Harshal Sheth
21b2851be7
feat(sql-parser): schema-aware output column casing (#8760)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-08-31 09:43:39 -07:00
Mayuri Nehate
e867dbc3da
ci: separate airflow build and test (#8688)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-08-30 14:08:42 -07:00
Andrew Sikowitz
026f7abe9c
feat(ingest/usage): Make cumulative query character limit configurable (#8751) 2023-08-30 15:53:08 -04:00
Andrew Sikowitz
40d17f00ea
feat(ingest/datahub): Improvements, bug fixes, and docs (#8735) 2023-08-29 14:33:40 -04:00
Tamas Nemeth
d86b336e70
chore(ingest/s3) Bump Deequ and Pyspark version (#8638)
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
2023-08-29 18:11:37 +02:00
Kirill Popov
3acd25ba1d
feat(ingest/metabase): detect source table for cards sourced from other cards (#8577) 2023-08-28 13:02:41 -04:00
Harshal Sheth
7b66c32b70
feat(ingest): support writing configs to files (#8696) 2023-08-28 09:55:50 -07:00
Mayuri Nehate
cc94ffbf6c
fix(ingest): stateful redundant run skip handler (#8467) 2023-08-28 15:03:31 +05:30