1732 Commits

Author SHA1 Message Date
Andrew Sikowitz
301d3e6b1c
test(ingest/unity): Add Unity Catalog memory performance testing (#8932) 2023-10-04 10:23:13 -04:00
Upendra Rao Vedullapalli
13508a9d88
feat(bigquery): excluding projects without any datasets from ingestion (#8535)
Co-authored-by: Upendra Vedullapalli <upendra.rao.vedullapalli@entur.org>
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
2023-10-04 09:20:51 -04:00
Harshal Sheth
a300b39f15
feat(ingest/airflow): airflow plugin v2 (#8853) 2023-10-04 16:23:15 +05:30
siddiquebagwan-gslab
c415d63dda
feat(ingestion/powerbi): column level lineage extraction for M-Query (#8796) 2023-10-04 16:22:51 +05:30
Aseem Bansal
ad313ad282
feat(transfomer): add transformer to get ownership from tags (#8748) 2023-10-04 14:06:03 +05:30
Harshal Sheth
9deb7be3fc
fix(ingest): refactor test markers + fix disk space issues in CI (#8938) 2023-10-03 20:17:49 -07:00
hariishaa
622816dcb8
feat(metadata-ingestion): implement mlflow source (#7971)
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
2023-09-26 13:51:30 -04:00
Mayuri Nehate
874109f76e
feat(ingest/snowflake): allow shares config without platform instance (#8803) 2023-09-25 14:04:05 +05:30
Shubham Jagtap
501522d891
feat(ingest/kafka-connect): Lineage for Kafka Connect > Snowflake (#8811)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-09-22 17:12:48 -07:00
Mayuri Nehate
5c40390a92
feat(ingest/kafka): support metadata mapping from kafka avro schemas (#8825)
Co-authored-by: Daniel Messias <danielcmessias@gmail.com>
Co-authored-by: Deepankarkr <deepankar.kumar@gslab.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-09-22 17:11:42 -07:00
Harshal Sheth
2a0200b047
feat(ingest): bump acryl-sqlglot (#8882) 2023-09-21 14:28:51 -07:00
Tony Ouyang
f4da93988e
feat(ingestion/dynamodb): Add DynamoDB as new metadata ingestion source (#8768)
Co-authored-by: Mayuri Nehate <33225191+mayurinehate@users.noreply.github.com>
2023-09-15 13:26:17 -07:00
Mayuri Nehate
cdb9f5ba62
feat(bigquery): add better timers around every API call (#8626) 2023-09-15 11:55:39 -07:00
cjm98332
a021053a72
fix(ingest/mssql): Add UNIQUEIDENTIFIER data type as String (#8642)
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
2023-09-12 19:23:39 +05:30
siddiquebagwan-gslab
95b2d437ca
feat(ingestion/looker): Add view file-path as option in view_naming_pattern config (#8713) 2023-09-11 16:55:17 +05:30
Harshal Sheth
0e8000cf18
feat(ingest): drop sql_metadata parser (#8765) 2023-09-07 11:32:28 -07:00
Harshal Sheth
4ffad4d9b9
chore(ingest): upgrade sqlglot fork (#8775)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-09-06 12:49:44 -07:00
Mayuri Nehate
e680a97046
fix(ingest/bigquery): fix partition and median queries for profiling (#8778) 2023-09-06 12:48:11 -07:00
cccs-eric
6fe60a274e
feat(iceberg): Upgrade Iceberg ingestion source to pyiceberg 0.4.0 (#8357)
Co-authored-by: cccs-Dustin <96579982+cccs-Dustin@users.noreply.github.com>
Co-authored-by: Fokko Driesprong <fokko@apache.org>
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
2023-08-31 13:01:05 -04:00
Andrew Sikowitz
a4e726872b
fix(ingest/bigquery): Filter out fine grained lineage with no upstreams (#8758) 2023-08-31 12:44:24 -04:00
Harshal Sheth
21b2851be7
feat(sql-parser): schema-aware output column casing (#8760)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-08-31 09:43:39 -07:00
Mayuri Nehate
e867dbc3da
ci: separate airflow build and test (#8688)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-08-30 14:08:42 -07:00
Andrew Sikowitz
026f7abe9c
feat(ingest/usage): Make cumulative query character limit configurable (#8751) 2023-08-30 15:53:08 -04:00
Andrew Sikowitz
40d17f00ea
feat(ingest/datahub): Improvements, bug fixes, and docs (#8735) 2023-08-29 14:33:40 -04:00
Tamas Nemeth
d86b336e70
chore(ingest/s3) Bump Deequ and Pyspark version (#8638)
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
2023-08-29 18:11:37 +02:00
Kirill Popov
3acd25ba1d
feat(ingest/metabase): detect source table for cards sourced from other cards (#8577) 2023-08-28 13:02:41 -04:00
Harshal Sheth
7b66c32b70
feat(ingest): support writing configs to files (#8696) 2023-08-28 09:55:50 -07:00
Mayuri Nehate
cc94ffbf6c
fix(ingest): stateful redundant run skip handler (#8467) 2023-08-28 15:03:31 +05:30
Mayuri Nehate
e285da3e75
feat(ingest/snowflake): tables from snowflake shares as siblings (#8531) 2023-08-24 10:23:07 -04:00
RChygir
43d48ddde4
feat(ingest/mssql): load jobs and stored procedures (#5363) 2023-08-24 14:48:03 +05:30
Alexander
bcef25acd3
feat(ingest/looker): Record observed lineage timestamps for Looker and LookML sources (#7735) 2023-08-24 14:47:04 +05:30
Andrew Sikowitz
22c35f1a23
fix(ingest/bigquery): Add config option to create DataPlatformInstance, default off (#8659) 2023-08-24 14:46:06 +05:30
Andrew Sikowitz
8141e2d649
remove(ingest/snowflake): Remove legacy snowflake lineage (#8653)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
Co-authored-by: Aseem Bansal <asmbansal2@gmail.com>
2023-08-23 15:57:46 -04:00
siddiquebagwan-gslab
8ee58af0c2
feat(ingestion/powerbi): support multiple tables as upstream in native SQL parsing (#8592) 2023-08-23 14:38:58 +05:30
Andrew Sikowitz
439cf4d7dc
test(ingest/vertica): Skip integration test failing CI; support arm Macs (#8694) 2023-08-22 16:27:46 -04:00
Alexander
c0addf6eef
feat(ingest/bigquery): add tag to BigQuery clustering columns (#8495)
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
2023-08-17 12:44:15 -04:00
Jinlin Yang
6748aecdc0
fix(ingest/s3): emit data_platform_instance aspect if the config has platform_instance (#8585) 2023-08-17 10:40:54 +05:30
kr_Deepankar
23ac9062fe
feat(ingestion/ldap): flag to ingest ldap users with email instead of username (#8606) 2023-08-16 10:33:18 +05:30
kr_Deepankar
94ce753bb0
fix(ingest/kafka): use SchemaReference properties instead of dict access (#8615)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-08-16 10:33:09 +05:30
Andrew Sikowitz
526e626146
feat(ingest): Add DataHub source (#8561)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-08-15 17:49:20 -04:00
Mayuri Nehate
ddcd5109dc
feat(ingest): allow relative start time config (#8562)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-08-14 17:48:06 -07:00
alplatonov
11fdfcf956
Fix(ingestion/clickhouse) move to two tier sqlalchemy (#8300)
Co-authored-by: Mayuri Nehate <33225191+mayurinehate@users.noreply.github.com>
2023-08-11 16:11:40 -04:00
alplatonov
b58f9bb396
Feat(ingest/ldap)fix list index out of range error (#8525)
Co-authored-by: Mayuri Nehate <33225191+mayurinehate@users.noreply.github.com>
2023-08-09 13:13:27 -04:00
Mayuri Nehate
b4e104f190
fix(ingest/snowflake): maintain qualified name casing, do not lowercase (#8574) 2023-08-04 10:43:22 -07:00
Aseem Bansal
dac89fb1fb
feat(ingest): allow lower freq profiling based on date of month/day of week (#8489) 2023-08-04 10:13:48 +05:30
mohdsiddique
6a36118b4f
feat(ingestion/snowflake): use user email-id in urn generation for top users stat (#8513)
Co-authored-by: MohdSiddiqueBagwan <mohdsiddique.bagwan@gslab.com>
2023-08-03 08:30:50 +05:30
mohdsiddique
05ef7db45e
fix(ingetion/mssql): convert dataset urns to lowercase (#8551)
Co-authored-by: MohdSiddiqueBagwan <mohdsiddique.bagwan@gslab.com>
2023-08-02 13:44:05 -07:00
TusharM
e2919cfd06
feat(ingest/kafka-connect): add support for Confluent S3 Sink Connector (#8298) 2023-08-02 15:09:50 +05:30
Harshal Sheth
08a39c4b0d
fix(ingest/presto): fix presto on hive test failures (#8548) 2023-08-02 12:46:39 +05:30
Aseem Bansal
bb33f015ca
fix(ingest/s3): wrong sorting in case of multi-partition key (#8536) 2023-08-02 09:54:33 +05:30