Sergio Gómez Villamor
|
8cae980286
|
tests(ingestion): moving some tests so they are available for sdk users (#13540)
|
2025-05-19 08:39:53 +02:00 |
|
Jonny Dixon
|
132ff7081f
|
feat(ingestion/s3): Add externalUrls for datasets in s3 and gcs (#12763)
|
2025-05-17 17:03:40 +01:00 |
|
Austin SeungJun Park
|
41895fe24f
|
feat(ingest/s3): add table filtering (#12661)
|
2025-03-20 07:57:43 +01:00 |
|
sid-acryl
|
9fb2df11f3
|
fix(ingest): sort by last modified not working in the UI (#11343)
|
2024-09-23 10:06:05 -07:00 |
|
Sergio Gómez Villamor
|
31edb46dbc
|
feat(ingestion): adds env property in ContainerProperties (#11214)
Co-authored-by: siladitya2 <siladitya2@gmail.com>
|
2024-09-18 14:56:52 +05:30 |
|
Harshal Sheth
|
3755731f0e
|
chore(ingest): improve code formatting (#11326)
|
2024-09-11 10:48:57 -07:00 |
|
Andrew Sikowitz
|
fa1164aa63
|
feat(ingest/s3): Support reading S3 file type (#11177)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
|
2024-08-30 12:15:12 +02:00 |
|
Tamas Nemeth
|
ef6a410091
|
feat(ingest/s3): Partition support improvements (#11083)
- Partition autodetection
- Option to find min/max/min-max partition of a dataset
- Generating Partition aspects
|
2024-08-22 17:55:43 +02:00 |
|
Tamas Nemeth
|
7e5610f358
|
feat(ingest/dagster): Dagster source (#10071)
Co-authored-by: shubhamjagtap639 <shubham.jagtap@gslab.com>
|
2024-03-25 13:28:35 +01:00 |
|
Harshal Sheth
|
05930560cc
|
feat(ingest/s3): set default spark version (#10057)
|
2024-03-18 14:27:01 -07:00 |
|
Harshal Sheth
|
b0163c4885
|
feat(ingest): utilities for query logs (#10036)
|
2024-03-12 23:20:46 -07:00 |
|
Tamas Nemeth
|
d86b336e70
|
chore(ingest/s3) Bump Deequ and Pyspark version (#8638)
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
|
2023-08-29 18:11:37 +02:00 |
|
Jinlin Yang
|
6748aecdc0
|
fix(ingest/s3): emit data_platform_instance aspect if the config has platform_instance (#8585)
|
2023-08-17 10:40:54 +05:30 |
|
Andrew Sikowitz
|
bf9f380350
|
fix(ingest): Generate browse paths v2 for more sources; properly pass platform_instance (#8501)
|
2023-07-25 11:35:34 +05:30 |
|
Tamas Nemeth
|
a91c78cf31
|
fix(ingest/s3): fix test flakiness (#8416)
|
2023-07-14 00:42:00 +02:00 |
|
Tamas Nemeth
|
54c7aef1bc
|
feat(ingest/presto-on-hive): Extracting all the table properties from Hive Metastore (#8348)
Co-authored-by: Pedro Silva <pedro@acryl.io>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
|
2023-07-12 15:56:13 -03:00 |
|
Andrew Sikowitz
|
3a21c27f06
|
feat(ingest): Turn on browse path v2 creation (#8342)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
|
2023-07-06 16:43:42 -04:00 |
|
Tamas Nemeth
|
74ab1bea06
|
fix(ingest/s3): Fix for flaky s3 test - uploading s3 files in consistent order (#8367)
|
2023-07-04 19:19:39 +02:00 |
|
Tamas Nemeth
|
d50a99935b
|
fix(ingest/s3): Path spec aware folder traversal (#8095)
|
2023-05-30 16:20:49 +02:00 |
|
Harshal Sheth
|
4e9c652707
|
feat(ingest): add env to container properties (#8027)
|
2023-05-22 12:07:16 -07:00 |
|
Tamas Nemeth
|
bdd4bc7b92
|
feat(ingest/s3) - Stateful ingestion and last-updated support (#8022)
|
2023-05-19 13:10:15 +02:00 |
|
Tamas Nemeth
|
dec54bf098
|
feat(ingest/s3): Inferring schema from the alphabetically last folder (#8005)
|
2023-05-10 21:55:05 +02:00 |
|
Harshal Sheth
|
e99875cac6
|
chore(ingest): enable flake8 bugbear linting (#7763)
|
2023-04-10 14:14:42 -07:00 |
|
Harsha Mandadi
|
bf36c935fa
|
feat(ingest/s3): support path_specs of different S3 buckets in the same recipe (#7514)
|
2023-03-14 21:55:57 -07:00 |
|
John Joyce
|
18f387c6ea
|
fix(cli): Adding exit code to correctly return failure or success (#7520)
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
Co-authored-by: Aseem Bansal <asmbansal2@gmail.com>
|
2023-03-13 13:32:40 -07:00 |
|
Shirshanka Das
|
26cf0a71ab
|
fix(test): suppress s3 golden file test for specific paths (#7551)
|
2023-03-12 10:43:02 -07:00 |
|
Harshal Sheth
|
49029943f9
|
fix(ingest): remove extraneous platform configs (#7454)
|
2023-03-02 01:10:35 -08:00 |
|
nachiket-juneja
|
e07cd2090b
|
Feat/s3 ingestion enhancement to update schema from latest partition (#7410)
Co-authored-by: Prashant Singh Thakur <prashant.thakur@nucleusteq.com>
|
2023-02-28 08:58:28 +01:00 |
|
Aseem Bansal
|
372f673aef
|
chore(ci): mark tests correctly (#7337)
|
2023-02-15 16:32:53 +05:30 |
|
Mayuri Nehate
|
e79b4e8c2b
|
feat(ingest): s3 - add status aspect for detected s3 datasets (#6402)
|
2022-11-13 17:29:42 -08:00 |
|
Harshal Sheth
|
09616ee2b3
|
feat(ingest): include instance in container dataPlatform when provided (#6083)
|
2022-10-13 11:29:54 -07:00 |
|
Harshal Sheth
|
e70c0ac4b6
|
feat(ingest): include raw s3 paths if s3 source (#6168)
|
2022-10-11 15:55:00 -07:00 |
|
Shirshanka Das
|
e9c4c823d8
|
fix(ingest): bigquery-beta - ensure that status aspect is emitted for… (#6154)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
|
2022-10-08 16:00:45 -07:00 |
|
Harshal Sheth
|
3c0f63c50a
|
fix(ingest): hide deprecated path_spec option from config (#5944)
|
2022-10-04 12:14:00 -07:00 |
|
Mayuri Nehate
|
a14617b6a4
|
fix(ingest): continue validation of s3 path_specs even if platform is set (#5951)
|
2022-09-16 12:03:57 -07:00 |
|
Ravindra Lanka
|
228f3b50ea
|
feat(ingestion): send reports of ingestion runs to datahub (#5639)
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
|
2022-08-19 09:08:17 -07:00 |
|
Jordan Wolinsky
|
3a86ff3485
|
Fix profiling when using {table}. (#5531)
* profiling fix for when using {table}
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
|
2022-08-08 13:16:59 -07:00 |
|
Mayuri Nehate
|
2c48329810
|
feat(model): dashboard usage model, is_null condition added (#5397)
|
2022-07-15 15:37:06 +05:30 |
|
Aseem Bansal
|
4541379024
|
feat(build): changes to decrease build time, cancel runs in case of multiple commits (#5187)
|
2022-06-17 18:05:10 +05:30 |
|
Tamas Nemeth
|
be91e2341f
|
feat(ingest): s3 - speeding up ingestion with sampling (#4927)
|
2022-05-24 22:17:10 -07:00 |
|
Tamas Nemeth
|
56ee4d9651
|
feat(ingest): s3 - add support for multiple pathspecs in one recipe (#4777)
|
2022-05-05 10:09:47 -07:00 |
|
mayurinehate
|
c34a1ba735
|
fix(s3): improved handling for corner cases (#4774)
|
2022-04-29 12:25:41 -07:00 |
|
Jordan Wolinsky
|
bbac4a7a11
|
feat(ingestion): glue/s3 - Ingest Tags from s3 bucket on an AWS Glue job and S3 Data Lake Ingest Job (#4689)
|
2022-04-29 10:09:06 +02:00 |
|
MugdhaHardikar-GSLab
|
37aedfc87c
|
feat(s3): add s3 source (#4490)
* feat(data-lake): add containers and folder level dataset support
* docs(data-lake): Update readme for data lake
* doc(data-lake): fix examples, update doc
* lint fix
* feat(s3): add s3 source, restore old data-lake source
Co-authored-by: Mayuri N <mayuri.nehate@gslab.com>
|
2022-03-29 11:52:57 +02:00 |
|