harshsoni2024
3b382c1bd9
issue-20737: datalake parquet different extensions (#21048)
2025-05-13 11:23:46 +05:30
Mayur Singal
7760663b22
MINOR: Change ingestion licence header (#20549)
2025-04-03 10:39:47 +05:30
Pere Miquel Brull
d8e2187980
#15243 - Pydantic V2 & Airflow 2.9 (#16480)
* pydantic v2
* pydanticv2
* fix parser
* fix annotated
* fix model dumping
* mysql ingestion
* clean root models
* clean root models
* bump airflow
* bump airflow
* bump airflow
* optionals
* optionals
* optionals
* jdk
* airflow migrate
* fab provider
* fab provider
* fab provider
* some more fixes
* fixing tests and imports
* model_dump and model_validate
* model_dump and model_validate
* model_dump and model_validate
* union
* pylint
* pylint
* integration tests
* fix CostAnalysisReportData
* integration tests
* tests
* missing defaults
* missing defaults
2024-06-05 21:18:37 +02:00
Ayush Shah
b79e5c064b
Fix 15576 - Eval Data Type issue fix (#15702)
2024-04-03 15:51:19 +05:30
Mayur Singal
6b90c245d4
MINOR: Add support for json schema parsing for datalake & s3 (#15615)
2024-03-26 10:03:21 +05:30
Imri Paran
aeb5fbe303
fixes #12591: add BigTable (#15122)
* feat(connector): add BigTable
* bigtable work
  1. docstrings
  2. tests
  3. created a Row BaseModel
  4. implemented a ClassConverter
* docs moved to separate PR
* format files
* minor cosmetic
  - removed TODO
  - changed headers' year to 2024 for new files
  - fixed typos
* format
* formatting and comments
  1. added missing docstrings.
  2. abstracted the _find_instance method.
  3. aliased the IDs used in the BigTable connection
* added comment regarding private key
* added comments regarding column families
* enclose get_schema_name_list in `try/except/else`
* format
* streamlined get_schema_name_list to include all logic in the try block
2024-02-13 08:28:01 +01:00
Mayur Singal
7197356101
Minor: Fix pyarrow import error (#15004)
2024-02-02 16:28:40 +05:30
Teddy
9a4a9df836
Fix #14895 - Get Metadata from Parquet Schema (#14956)
* linting: fix python linting
* fix: get column types from parquet schema for parquet files
* style: python linting
* fix: remove displayType check in test as variation depending on OS
2024-02-01 09:02:52 +01:00
Pere Miquel Brull
b250cd8808
Fix #13699 - Add separator for Storage Container manifest (#13924)
* Fix #13699 - Add separator for Storage Container manifest
* Fix #13906 - Fix add_mlmodel_lineage description field
* Add separator
* Add separator
2023-11-10 10:44:47 +01:00
Teddy
1cbdfb3ae7
Fixes #12601 - column filter for profiler workflow (#13535)
* fix: sample data ingestion to match entity profiler column setting
* fix: python linting
* fix: updated fn call
* fix: added logic to handle json field in datalake connector
* fix: handle NA values in parsing
* fix: reverted sampler changes from #13338
* fix: reverted metric changes from #13338
* fix: added datalake profiler ingestion test
* fix: python linting
* fix: removed normalization of json blob in NoSQL db
2023-10-12 14:51:38 +02:00
Ayush Shah
08d7ee6d55
Fixes #13052: Datalake Nested Columns Sample Data ingestion (#13338)
2023-10-08 20:08:51 +05:30
Ayush Shah
5fea08cd33
Datalake: Add manifest file support, fix profiler metrics, add array and json column type support (#13017)
2023-09-13 15:15:49 +05:30
Mayur Singal
f6d5c7413f
Fix #6700: Add support for table properties: file format for datalake (#12920)
* Fix #6700: Add support for table properties: file format for datalake & storage
* pylint fix
* resolve review comments
2023-08-22 09:46:22 +02:00
Pere Miquel Brull
e97d4befb1
Fix #12770 - Cleanup DL structure & Readers & Python 3.8 (#12776)
2023-08-09 16:07:16 +05:30
Pere Miquel Brull
10f2567fe9
Fixes #12555 - Fix DL test suite (#12727)
* Fix DL test suite
* Fix linting
2023-08-03 11:48:22 +02:00
Ayush Shah
83e9b6c310
Fixes 10395: Validation of yaml workflow configs (#11985)
2023-06-20 11:20:59 +05:30
Mayur Singal
7fa963eec3
Fix #1076: Add mongodb support (#11943)
2023-06-15 11:14:22 +05:30
Ayush Shah
ad7258e7be
Fixes 10949: return Chunks for file formats & Centralize logic for different auth configs (#11639)
* Centralize Auth and File formats datalake
2023-05-19 18:54:28 +05:30