11 Commits

Author SHA1 Message Date
NiharDoshi99
9145054dc6
Fix: refactor datalake datatypes and s3 for parquet (#9578)
* Fix: refactor datalake datatypes

* Fix: s3 files for parquet
2023-01-04 22:12:00 +05:30
Mayur Singal
3dc357561e
Fix #8868: Support json.gz files for datalake (#8869)
* Fix #8868: Support  files

* Review comments
2022-11-18 10:31:25 +00:00
Ayush Shah
5be0f8ee76
Dl Profiler (#8694)
* DQ commit

* Add DL Profiler

* Fix Ingestion and Profiling pylint checks

* Fix Tests

* PyFormat files

* Fix Tests

* Resolve Comments

* Fix Tests and Format Files

* Resolve Comments

* Fix Pylint and Code smells

* Resolve Comments

* Fix S3 parquet

* Fix Metrics Code Smell
2022-11-15 16:01:10 +01:00
Mayur Singal
0b6e3741b3
Fix Datalake Json Error (#8246)
2022-10-19 14:12:23 +05:30
Onkar Ravgan
107eeef8c7
Added fixes according to pylint (#8009)
Co-authored-by: Onkar Ravgan <onkarravgan@Onkars-MacBook-Pro.local>
2022-10-10 12:53:47 +02:00
Nahuel
a878aa911c
Fix #6212: Retrieve connection params from secret manager in CLI commands (#6441)
* Retrieve connection params from secret manager for database connectors

* Retrieve connection params from secret manager for all services except database connectors

* Stop retrieving connection from SM in Airflow rest plugin

* Retrieve connection params from secret manager for dashboard services

* Retrieve connection params when initializing Workflow/ProfilerWorkflow objects

* Align services topologies + comment changes in topology runner

* Address SonarCloud bug detected

* Update database service topology

* Address PR comments

* Address PR comments

* Address PR comments
2022-08-02 09:13:46 +02:00
Mayur Singal
bafca3b7b6
Fix #6387: Clean Ingestion (#6405)
* Fix #6387: Clean Ingestion

* postgres fix

* Fixed Location in Sample Data
2022-07-29 08:51:58 +02:00
Francisco J. Jurado Moreno
ae491f747f
[TASK-6295] Reduce memory footprint for S3 ingestion (#6308)
* Reduce memory footprint for parquet & CSV formats

* Add pagination, remove local var

* Add jsonl parser
2022-07-25 07:24:57 +02:00
Onkar Ravgan
8c9dc91ccf
Refactored Datalake and Deltalake for Topology (#6034)
* rebasing with main

* refactored deltalake for topology

* using requests instead of urllib

* formatting fixes

Co-authored-by: Onkar Ravgan <onkarravgan@Onkars-MacBook-Pro.local>
2022-07-19 06:37:27 +02:00
Mayur Singal
4b5b184177
Fix #6091: Fix Datalake arrays must be of the same length (#6092)
2022-07-17 18:26:54 +02:00
Abhishek Pandey
e8975aac01
datalake-csv-files-ingestion-added (#5343)
2022-06-15 08:57:21 +02:00