22 Commits

Author SHA1 Message Date
sid-acryl
9fb2df11f3
fix(ingest): sort by last modified not working in the UI (#11343)
2024-09-23 10:06:05 -07:00
Sergio Gómez Villamor
31edb46dbc
feat(ingestion): adds env property in ContainerProperties (#11214)
Co-authored-by: siladitya2 <siladitya2@gmail.com>
2024-09-18 14:56:52 +05:30
Tamas Nemeth
ef6a410091
feat(ingest/s3): Partition support improvements (#11083)
- Partition autodetection
- Option to find min/max/min-max partition of a dataset
- Generating Partition aspects
2024-08-22 17:55:43 +02:00
Tamas Nemeth
7e5610f358
feat(ingest/dagster): Dagster source (#10071)
Co-authored-by: shubhamjagtap639 <shubham.jagtap@gslab.com>
2024-03-25 13:28:35 +01:00
Harshal Sheth
b0163c4885
feat(ingest): utilities for query logs (#10036)
2024-03-12 23:20:46 -07:00
Tamas Nemeth
d86b336e70
chore(ingest/s3) Bump Deequ and Pyspark version (#8638)
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
2023-08-29 18:11:37 +02:00
Jinlin Yang
6748aecdc0
fix(ingest/s3): emit data_platform_instance aspect if the config has platform_instance (#8585)
2023-08-17 10:40:54 +05:30
Andrew Sikowitz
bf9f380350
fix(ingest): Generate browse paths v2 for more sources; properly pass platform_instance (#8501)
2023-07-25 11:35:34 +05:30
Andrew Sikowitz
3a21c27f06
feat(ingest): Turn on browse path v2 creation (#8342)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-07-06 16:43:42 -04:00
Harshal Sheth
4e9c652707
feat(ingest): add env to container properties (#8027)
2023-05-22 12:07:16 -07:00
Tamas Nemeth
bdd4bc7b92
feat(ingest/s3) - Stateful ingestion and last-updated support (#8022)
2023-05-19 13:10:15 +02:00
Harsha Mandadi
bf36c935fa
feat(ingest/s3): support path_specs of different S3 buckets in the same recipe (#7514)
2023-03-14 21:55:57 -07:00
John Joyce
18f387c6ea
fix(cli): Adding exit code to correctly return failure or success (#7520)
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
Co-authored-by: Aseem Bansal <asmbansal2@gmail.com>
2023-03-13 13:32:40 -07:00
nachiket-juneja
e07cd2090b
Feat/s3 ingestion enhancement to update schema from latest partition (#7410)
Co-authored-by: Prashant Singh Thakur <prashant.thakur@nucleusteq.com>
2023-02-28 08:58:28 +01:00
Mayuri Nehate
e79b4e8c2b
feat(ingest): s3 - add status aspect for detected s3 datasets (#6402)
2022-11-13 17:29:42 -08:00
Harshal Sheth
09616ee2b3
feat(ingest): include instance in container dataPlatform when provided (#6083)
2022-10-13 11:29:54 -07:00
Shirshanka Das
e9c4c823d8
fix(ingest): bigquery-beta - ensure that status aspect is emitted for… (#6154)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2022-10-08 16:00:45 -07:00
Harshal Sheth
3c0f63c50a
fix(ingest): hide deprecated path_spec option from config (#5944)
2022-10-04 12:14:00 -07:00
Jordan Wolinsky
3a86ff3485
Fix profiling when using {table}. (#5531)
* profiling fix for when using {table}

Co-authored-by: Shirshanka Das <shirshanka@apache.org>
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
2022-08-08 13:16:59 -07:00
Tamas Nemeth
be91e2341f
feat(ingest): s3 - speeding up ingestion with sampling (#4927)
2022-05-24 22:17:10 -07:00
Tamas Nemeth
56ee4d9651
feat(ingest): s3 - add support for multiple pathspecs in one recipe (#4777)
2022-05-05 10:09:47 -07:00
MugdhaHardikar-GSLab
37aedfc87c
feat(s3): add s3 source (#4490)
* feat(data-lake): add containers and folder level dataset support

* docs(data-lake): Update readme for data lake

* doc(data-lake): fix examples, update doc

* lint fix

* feat(s3): add s3 source, restore old data-lake source

Co-authored-by: Mayuri N <mayuri.nehate@gslab.com>
2022-03-29 11:52:57 +02:00