905 Commits

Author SHA1 Message Date
Harshal Sheth
84bba4dc44
feat(ingest): add output schema inference for sql parser (#8989) 2023-10-11 22:31:17 -07:00
siddiquebagwan-gslab
10a190470e
feat(ingestion/redshift): CLL support in redshift (#8921) 2023-10-10 20:24:08 -07:00
Mayuri Nehate
57f855ecd1
feat(ingest): refactor + simplify incremental lineage helper (#8976) 2023-10-09 23:48:21 -07:00
Mayuri Nehate
8d175ef7ef
feat(ingest): incremental lineage source helper (#8941)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-10-09 13:34:25 -07:00
Harshal Sheth
3cede10ab3
feat(ingest/dbt): support use_compiled_code and test_warnings_are_errors (#8956) 2023-10-05 10:29:47 -07:00
Harshal Sheth
817c371fbf
feat: data contracts models + CLI (#8923)
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
Co-authored-by: John Joyce <john@acryl.io>
2023-10-04 20:11:06 -07:00
ethan-cartwright
e2afd44bfe
feat(dbt-ingestion): add documentation link from dbt source to institutionalMemory (#8686)
Co-authored-by: Ethan Cartwright <ethan.cartwright@acryl.io>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-10-04 20:38:58 +00:00
Andrew Sikowitz
301d3e6b1c
test(ingest/unity): Add Unity Catalog memory performance testing (#8932) 2023-10-04 10:23:13 -04:00
Upendra Rao Vedullapalli
13508a9d88
feat(bigquery): excluding projects without any datasets from ingestion (#8535)
Co-authored-by: Upendra Vedullapalli <upendra.rao.vedullapalli@entur.org>
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
2023-10-04 09:20:51 -04:00
Harshal Sheth
a300b39f15
feat(ingest/airflow): airflow plugin v2 (#8853) 2023-10-04 16:23:15 +05:30
Aseem Bansal
ad313ad282
feat(transfomer): add transformer to get ownership from tags (#8748) 2023-10-04 14:06:03 +05:30
hariishaa
622816dcb8
feat(metadata-ingestion): implement mlflow source (#7971)
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
2023-09-26 13:51:30 -04:00
Mayuri Nehate
874109f76e
feat(ingest/snowflake): allow shares config without platform instance (#8803) 2023-09-25 14:04:05 +05:30
Mayuri Nehate
5c40390a92
feat(ingest/kafka): support metadata mapping from kafka avro schemas (#8825)
Co-authored-by: Daniel Messias <danielcmessias@gmail.com>
Co-authored-by: Deepankarkr <deepankar.kumar@gslab.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-09-22 17:11:42 -07:00
Harshal Sheth
2a0200b047
feat(ingest): bump acryl-sqlglot (#8882) 2023-09-21 14:28:51 -07:00
Mayuri Nehate
cdb9f5ba62
feat(bigquery): add better timers around every API call (#8626) 2023-09-15 11:55:39 -07:00
Harshal Sheth
0e8000cf18
feat(ingest): drop sql_metadata parser (#8765) 2023-09-07 11:32:28 -07:00
Harshal Sheth
4ffad4d9b9
chore(ingest): upgrade sqlglot fork (#8775)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-09-06 12:49:44 -07:00
Mayuri Nehate
e680a97046
fix(ingest/bigquery): fix partition and median queries for profiling (#8778) 2023-09-06 12:48:11 -07:00
cccs-eric
6fe60a274e
feat(iceberg): Upgrade Iceberg ingestion source to pyiceberg 0.4.0 (#8357)
Co-authored-by: cccs-Dustin <96579982+cccs-Dustin@users.noreply.github.com>
Co-authored-by: Fokko Driesprong <fokko@apache.org>
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
2023-08-31 13:01:05 -04:00
Andrew Sikowitz
a4e726872b
fix(ingest/bigquery): Filter out fine grained lineage with no upstreams (#8758) 2023-08-31 12:44:24 -04:00
Harshal Sheth
21b2851be7
feat(sql-parser): schema-aware output column casing (#8760)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-08-31 09:43:39 -07:00
Mayuri Nehate
e867dbc3da
ci: separate airflow build and test (#8688)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-08-30 14:08:42 -07:00
Andrew Sikowitz
026f7abe9c
feat(ingest/usage): Make cumulative query character limit configurable (#8751) 2023-08-30 15:53:08 -04:00
Andrew Sikowitz
40d17f00ea
feat(ingest/datahub): Improvements, bug fixes, and docs (#8735) 2023-08-29 14:33:40 -04:00
Harshal Sheth
7b66c32b70
feat(ingest): support writing configs to files (#8696) 2023-08-28 09:55:50 -07:00
Mayuri Nehate
cc94ffbf6c
fix(ingest): stateful redundant run skip handler (#8467) 2023-08-28 15:03:31 +05:30
Mayuri Nehate
e285da3e75
feat(ingest/snowflake): tables from snowflake shares as siblings (#8531) 2023-08-24 10:23:07 -04:00
Andrew Sikowitz
22c35f1a23
fix(ingest/bigquery): Add config option to create DataPlatformInstance, default off (#8659) 2023-08-24 14:46:06 +05:30
Alexander
c0addf6eef
feat(ingest/bigquery): add tag to BigQuery clustering columns (#8495)
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
2023-08-17 12:44:15 -04:00
kr_Deepankar
94ce753bb0
fix(ingest/kafka): use SchemaReference properties instead of dict access (#8615)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-08-16 10:33:09 +05:30
Andrew Sikowitz
526e626146
feat(ingest): Add DataHub source (#8561)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-08-15 17:49:20 -04:00
Mayuri Nehate
ddcd5109dc
feat(ingest): allow relative start time config (#8562)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-08-14 17:48:06 -07:00
alplatonov
11fdfcf956
Fix(ingestion/clickhouse) move to two tier sqlalchemy (#8300)
Co-authored-by: Mayuri Nehate <33225191+mayurinehate@users.noreply.github.com>
2023-08-11 16:11:40 -04:00
Aseem Bansal
dac89fb1fb
feat(ingest): allow lower freq profiling based on date of month/day of week (#8489) 2023-08-04 10:13:48 +05:30
Aseem Bansal
bb33f015ca
fix(ingest/s3): wrong sorting in case of multi-partition key (#8536) 2023-08-02 09:54:33 +05:30
Jarod Smilkstein
f51bd01a70
feat(ingest): add ability to read other method types than GET for OAS ingest recipes (#8303) 2023-08-02 09:54:09 +05:30
Kirill Popov
eec89a884a
feat(ingest): Add metabase database id to platform instance mapping (#8359) 2023-08-02 09:53:48 +05:30
Benjamin Dornel
2e2a6748ac
fix(ingest/json-schema): convert non-string enums to strings (#8479) 2023-08-01 19:35:40 +05:30
Harshal Sheth
66074341f7
test(ingest): test case statements with sql parser (#8437) 2023-08-01 19:34:48 +05:30
Tamas Nemeth
1a47a51f1b
fix(ingest/build): Fix sagemaker mypy and flake8 issues (#8530) 2023-07-31 16:13:07 +02:00
Harshal Sheth
89f23d3c36
chore(ingest): bump sqllineage and sqlparse (#8481) 2023-07-28 13:10:19 -07:00
Harshal Sheth
9718505fc7
fix(ingest): respect max_threads for ingestion reporter (#8521) 2023-07-28 13:09:32 -07:00
Andrew Sikowitz
bf9f380350
fix(ingest): Generate browse paths v2 for more sources; properly pass platform_instance (#8501) 2023-07-25 11:35:34 +05:30
Mayuri Nehate
f4fde21168
feat(ingest/nifi): add support for basic auth in nifi (#8457) 2023-07-20 14:12:33 -04:00
Aseem Bansal
9df70d7355
ingest(elasticsearch): add basic profiling (#8351) 2023-07-20 08:25:30 +05:30
Harshal Sheth
4fb77e4a25
fix(ingest): tweak ingestion exit codes (#8418) 2023-07-14 15:47:16 -07:00
Mayuri Nehate
a2fc02294c
fix(ingest/snowflake): fix azure cloud region ids in external url (#8376)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-07-12 17:11:33 +05:30
Harshal Sheth
d4135d57b7
feat(ingest/bigquery): support column-level lineage (#8382) 2023-07-11 11:12:51 -07:00
Andrew Sikowitz
759d0c86af
test(ingest/airflow): Fix test for airflow 2.6.3 (#8393) 2023-07-11 12:55:32 -04:00