12 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Michael Maltese | 5da54bf14d | feat(s3/ingest): performance improvements for get_dir_to_process and get_folder_info (#14709) | 2025-10-02 15:51:02 +02:00 |
| Michael Maltese | da127b92df | feat(s3): support wildcards in bucket name component of path_specs (#14549) | 2025-08-26 16:29:52 -04:00 |
| Tamas Nemeth | e331e807c6 | fix(ingest/s3): Fix ingestion when path_spec had a wildcard character in the path (#13940) | 2025-07-07 13:12:21 +02:00 |
| Jonny Dixon | 132ff7081f | feat(ingestion/s3): Add externalUrls for datasets in s3 and gcs (#12763) | 2025-05-17 17:03:40 +01:00 |
| Austin SeungJun Park | e65f133667 | refactor(ingest/s3): enhance readability (#12686) | 2025-02-28 14:19:46 +01:00 |
| Austin SeungJun Park | 5ed4b5bce9 | feat(ingest/s3): ignore depth mismatched path (#12326) | 2025-02-05 11:19:44 -08:00 |
| Austin SeungJun Park | d8e7cb25e0 | fix(ingestion/s3): groupby group-splitting issue (#12254) Co-authored-by: Sergio Gómez Villamor <sgomezvillamor@gmail.com> | 2025-01-10 09:41:28 +01:00 |
| Tamas Nemeth | ef6a410091 | feat(ingest/s3): Partition support improvements (#11083) - Partition autodetection - Option to find min/max/min-max partition of a dataset - Generating Partition aspects | 2024-08-22 17:55:43 +02:00 |
| Tamas Nemeth | 71d1cdbe3b | fix(ingest/s3): Fixing container creation when there is no folder in path (#10993) | 2024-07-25 23:38:10 +02:00 |
| Aseem Bansal | bb33f015ca | fix(ingest/s3): wrong sorting in case of multi-partition key (#8536) | 2023-08-02 09:54:33 +05:30 |
| Tamas Nemeth | d50a99935b | fix(ingest/s3): Path spec aware folder traversal (#8095) | 2023-05-30 16:20:49 +02:00 |
| Tamas Nemeth | f8be9f6aee | feat(ingest/s3): type aware directory sorting (#8089) | 2023-05-23 08:59:46 +02:00 |