20 Commits

Author SHA1 Message Date
Mayur Singal
7760663b22
MINOR: Change ingestion licence header (#20549)
2025-04-03 10:39:47 +05:30
Mayur Singal
840a102887
Fix #17195: Support automated unstructured files ingestion & tags (#17196)
2024-07-31 00:05:58 +05:30
Matt Chamberlin
d757aa9d77
Fixes 16652: add GCS storage service (#16917)
* FEAT-16652: add GCS storage service

* reformat

* update connection tests

* fix tests

* relax google-cloud-storage version constraint

* fix GCP config in tests

---------

Co-authored-by: Matthew Chamberlin <mchamberlin@ginkgobioworks.com>
Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2024-07-10 14:03:28 +02:00
Pere Miquel Brull
cb72a22b59
Fix - e2e tests for pydantic V2 (#16551)
* Fix - e2e tests for pydantic V2

* add correct default

* add correct default

* revert datetime aware

* revert datetime aware

* revert datetime aware

* revert datetime aware

* revert datetime aware

* revert datetime aware

* revert datetime aware

* revert datetime aware

* fix apis

* format
2024-06-06 19:36:17 -07:00
Pere Miquel Brull
d8e2187980
#15243 - Pydantic V2 & Airflow 2.9 (#16480)
* pydantic v2

* pydanticv2

* fix parser

* fix annotated

* fix model dumping

* mysql ingestion

* clean root models

* clean root models

* bump airflow

* bump airflow

* bump airflow

* optionals

* optionals

* optionals

* jdk

* airflow migrate

* fab provider

* fab provider

* fab provider

* some more fixes

* fixing tests and imports

* model_dump and model_validate

* model_dump and model_validate

* model_dump and model_validate

* union

* pylint

* pylint

* integration tests

* fix CostAnalysisReportData

* integration tests

* tests

* missing defaults

* missing defaults
2024-06-05 21:18:37 +02:00
Pere Miquel Brull
53185fd30b
MINOR - Add Integration Test for S3 Storage (#16277)
* MINOR - Add Integration Test for S3 Storage

* MINOR - Add Integration Test for S3 Storage

* MINOR - Add Integration Test for S3 Storage

* format

* format
2024-05-16 10:03:27 +02:00
Mayur Singal
6b90c245d4
MINOR: Add support for json schema parsing for datalake & s3 (#15615)
2024-03-26 10:03:21 +05:30
Mayur Singal
b643206bba
Fix #11905: Automated lineage between external table and container snowflake (#15537)
2024-03-15 00:52:41 +05:30
Teddy
9a4a9df836
Fix #14895 - Get Metadata from Parquet Schema (#14956)
* linting: fix python linting

* fix: get column types from parquet schema for parquet files

* style: python linting

* fix: remove displayType check in test as variation depending on OS
2024-02-01 09:02:52 +01:00
Pere Miquel Brull
d915254fac
Prepare Storage Connector for ADLS & Docs (#13376)
* Prepare Storage Connector for ADLS & Docs

* Format

* Fix test
2023-10-02 12:15:09 +02:00
Cristian Calugaru
5d8457b597
Fixes ISSUE-10587: global manifest option for storage services (#12017)
* global manifest option for storage services

* added a no metadata config source option for global manifest s3 services option

* merge fixes

* more merge fixes.

* black stuff

* test fixes

* formatting

---------

Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2023-09-28 07:55:40 +02:00
Ayush Shah
5fea08cd33
Datalake: Add manifest file support, fix profiler metrics, add array and json column type support (#13017)
2023-09-13 15:15:49 +05:30
Pere Miquel Brull
6c0e9f5061
Part of #7272 - Centralize Workflows, Status, and Exception Management (#13029)
* Prep changes

* Prep changes

* prep changes

* Update imports

* Format

* Prep delete

* Prep delete

* Fix sink

* Prep test

* Commit

* passing either

* passing either

* Prep Either

* Metadata source with Either

* Update status

* Merge remote-tracking branch 'upstream/main' into issue-7272

* Format

* Linting

* Linting

* Linting

* Linting

* Fix tests

* Fix tests

* Fix tests

* Fix tests

* Fix tests

* Fix tests

* Fix tests

* Comments
2023-08-30 15:49:42 +02:00
Onkar Ravgan
5b47fd4acf
Added source url to entities (#12901)
* Added source url to entities

* added support to create and update sourceUrl

* fixed pytests

---------

Co-authored-by: 07Himank <himank07mehta@gmail.com>
2023-08-18 10:17:38 +02:00
Pere Miquel Brull
a183fc67e2
Fix ADLS parquet reads (#12840)
* Fix ADLS parquet reads

* Generalize service methods

* Fix tests
2023-08-14 19:57:06 -07:00
Ayush Shah
f80eaf3a26
Fixes 11068: mysql & postgres iam auth (#11937)
2023-06-16 13:18:12 +05:30
Ayush Shah
ad7258e7be
Fixes 10949: return Chunks for file formats & Centralize logic for different auth configs (#11639)
* Centralize Auth and File formats datalake
2023-05-19 18:54:28 +05:30
Pere Miquel Brull
5152db488d
Add partition columns details (#11062)
2023-04-14 13:06:56 +02:00
Pere Miquel Brull
47cef52fa8
Handle container parents (#11026)
2023-04-12 18:36:04 +02:00
Pere Miquel Brull
b5cb1d464a
Deprecate location and old storage service (#11004)
* Deprecate location and old storage service

* Format

* Fix test

* Refactor

* Clean location

* Rename object store to storage

* Rename object store to storage

* Rename object store to storage

* Format

* Format

* Refactor object store for storage

* Refactor object store for storage

* Rename object store to storage

* Fix test

* Fix test

* Format

* chore(ui): change Objectstore to Storage

* Fixes

* Fix test

* Remove storage service from Glue cypress

---------

Co-authored-by: Sachin Chaurasiya <sachinchaurasiyachotey87@gmail.com>
2023-04-12 11:44:46 +02:00