IceS2
e79c54e6a5
MINOR: Add injection to profiler ( #21738 )
* Initial implementation for our Connection Class
* Implement the Initial Connection class
* Add Unit Tests
* Implement Dependency Injection for the Ingestion Framework
* Fix Test
* Fix Profile Test Connection
* Add Injection to Metrics in Profiler
* Add Injection to the Profiler
* Fix UnitTests
* Fix Pytests
* Fix Tests
* Fix types
2025-06-17 19:01:00 +02:00
Mayur Singal
7760663b22
MINOR: Change ingestion licence header ( #20549 )
2025-04-03 10:39:47 +05:30
Teddy
ef131d7e20
MINOR: Wrong attribute name in SampleConfig model ( #19641 )
* fix: wrong attribute name in SampleConfig model
* fix: test attribute
* fix: failing tests
* fix: trino filter error + adjust test to take into account null value
* fix: mssql and azuresql tablesample on views
2025-02-04 10:40:40 +01:00
Teddy
58699063db
MINOR -- Fix DQ Partition Issue ( #18641 )
* fix: renamed `random_sample` to `get_dataset` and change dunder method access for SQA Table object
* fix: removed handle_partition decorator
* fix: fixed DQ partition issue + moved to `tablesample` method
* style: ran python linting
* style: fix python format check issues
* feat: added postgres tablesample
* style: ran python linting
* fix: sampling delta
* fix: merge conflicts
* fix: resolved conflicts
* style: ran python linting
* fix: patch orm call in test case
* fix: mock build_table_orm call in tests
* fix: test case failures and errors
* fix: removed unused import
* fix: patch typo
* fix: trino table schema retrieval
* fix: remove tuple context manager for 3.8 test support
2024-11-27 08:50:54 +01:00
Pere Miquel Brull
c68a45e7d8
Create new Auto Classification Workflow ( #18610 )
2024-11-19 08:10:45 +01:00
Imri Paran
a3d6c1dd20
MINOR: tests(datalake): use minio ( #17805 )
* tests(datalake): use minio
1. use minio instead of moto for mimicking s3 behavior.
2. removed moto dependency as it is not compatible with aiobotocore (https://github.com/getmoto/moto/issues/7070#issuecomment-1828484982 )
* - moved test_datalake_profiler_e2e.py to datalake/test_profiler
- use minio instead of moto
* fixed tests
* fixed tests
* removed default name for minio container
2024-09-12 07:13:01 +02:00
Teddy
e4c01c5702
fix: region typo in test ( #17766 )
2024-09-09 17:54:07 +05:30
IceS2
f0049853ec
FIXES 14885: Initial deltalake implementation for s3 ( #16665 )
* Initial deltalake implementation for s3
* Fix styles
* Fix test_amundsen
* Fix UnitTests
* Fix Checkstyle
* Fix integration tests due to datalake client refactor
* Fix unit tests
* Fix tests
* Fix Integration DeltaLake Storage test
* Skip delta storage integration test for python 3.8
* DeltaLake JSONSchema changes migrations
* Update import name
* Add some comments based on sonarcloud suggestions
* Update DeltaLake documentation
* Resolve some comments
2024-06-20 12:08:21 +05:30
Mayur Singal
7359d6210c
MINOR: Fix Profiler for SSL Enabled Source ( #16613 )
2024-06-12 11:40:30 +05:30
Pere Miquel Brull
cb72a22b59
Fix - e2e tests for pydantic V2 ( #16551 )
* Fix - e2e tests for pydantic V2
* add correct default
* add correct default
* revert datetime aware
* revert datetime aware
* revert datetime aware
* revert datetime aware
* revert datetime aware
* revert datetime aware
* revert datetime aware
* revert datetime aware
* fix apis
* format
2024-06-06 19:36:17 -07:00
Pere Miquel Brull
d8e2187980
#15243 - Pydantic V2 & Airflow 2.9 ( #16480 )
* pydantic v2
* pydanticv2
* fix parser
* fix annotated
* fix model dumping
* mysql ingestion
* clean root models
* clean root models
* bump airflow
* bump airflow
* bump airflow
* optionals
* optionals
* optionals
* jdk
* airflow migrate
* fab provider
* fab provider
* fab provider
* some more fixes
* fixing tests and imports
* model_dump and model_validate
* model_dump and model_validate
* model_dump and model_validate
* union
* pylint
* pylint
* integration tests
* fix CostAnalysisReportData
* integration tests
* tests
* missing defaults
* missing defaults
2024-06-05 21:18:37 +02:00
Ayush Shah
b79e5c064b
Fix 15576 - Eval Data Type issue fix ( #15702 )
2024-04-03 15:51:19 +05:30
Teddy
9a4a9df836
Fix #14895 - Get Metadata from Parquet Schema ( #14956 )
* linting: fix python linting
* fix: get column types from parquet schema for parquet files
* style: python linting
* fix: remove displayType check in test as variation depending on OS
2024-02-01 09:02:52 +01:00
Teddy
d228a93fbf
fix: increase floating point precision ( #14827 )
2024-01-24 09:19:19 +01:00
Teddy
c7ac28f2c2
Fixes #11357 - Implement profiler custom metric processing ( #14021 )
* feat: add backend support for custom metrics
* feat: fix python test
* feat: support custom metrics computation
* feat: updated tests for custom metrics
* feat: added dl support for min max of datetime
* feat: added is safe query check for query sampler
* feat: added support for custom metric computation in dl
* feat: added explicit addProper for pydantic model import of Extra
* feat: added custom metric to returned obj
* feat: wrapped trino import in __init__
* feat: fix python linting
* feat: fix typing in 3.8
2023-11-17 17:51:39 +01:00
Teddy
f3da919329
Feat: Backend Support for Custom Metrics ( #13965 )
* feat: add backend support for custom metrics
* feat: fix python test
2023-11-17 19:16:35 +05:30
Mayur Singal
a8145a82fa
Fix #13603 : Configurable Sample Data Rows for Profiler ( #13807 )
* Fix #13603 : Configurable Sample Data Rows
* Fix #13603 : Configurable Sample Data Rows for Profiler
* fix table config
* support configurable overwriting of sample data
* add support for schema and database profiler configuration
* chore(ui): put sampleDataStorageConfig under advanced config
* fix tests
* py format
* chore(ui): add sampleDataCount in table profiler config
* fix tests
* pylint & tests
* feat(ui): add profiler settings tab in database and database schema page
* chore(ui): show different inputs for profile sample type
* schema changes to make default storage config null
* add unit test
* schema changes to fix api
* update profiler setting schema
* move profiler settings to manage button
* sync locals
* fix(ui): unit tests
* fix tests
* py format
* fix lint
* minor improvements
* chore(ui): update profiler settings schema
* resolve review comments
* pytest
---------
Co-authored-by: Sachin Chaurasiya <sachinchaurasiyachotey87@gmail.com>
2023-11-09 18:49:42 +05:30
Teddy
1cbdfb3ae7
Fixes #12601 - column filter for profiler workflow ( #13535 )
* fix: sample data ingestion to match entity profiler column setting
* fix: python linting
* fix: updated fn call
* fix: added logic to handle json field in datalake connector
* fix: handle NA values in parsing
* fix: reverted sampler changes from #13338
* fix: reverted metric changes from #13338
* fix: added datalake profiler ingestion test
* fix: python linting
* fix: removed normalization of json blob in NoSQL db
2023-10-12 14:51:38 +02:00
Ayush Shah
08d7ee6d55
Fixes #13052 : Datalake Nested Columns Sample Data ingestion ( #13338 )
2023-10-08 20:08:51 +05:30
Ayush Shah
5fea08cd33
Datalake: Add manifest file support, fix profiler metrics, add array and json column type support ( #13017 )
2023-09-13 15:15:49 +05:30
Ayush Shah
ab1ec50c2c
Fixes Mssql Ntext, text and Image ( #12490 )
2023-07-20 13:34:35 +05:30
Ayush Shah
27a0c9e802
Fix Docker Import ( #12455 )
2023-07-17 12:50:11 +05:30
Teddy
42a426226e
Fixes Issue #11803 #12103 - Add BigQuery Struct Support ( #12435 )
* ref: implemented interface for profiler components + removed struct logic
* ref: ran python linting
* ref: added UML diagram to readme.md
* ref: empty commit for labeler check
* ref: remove multiple context manager for 3.7 3.8 compatibility
* ref: remove
* fix: mapper logic for BQ struct types
* feat: added BQ support for structs
* feat: clean code smell + handle null self.col.table value
* feat: ran python linting
* feat: updated test for profiler handler + disabled flaky test
* Update ingestion/tests/unit/profiler/pandas/test_sample.py
Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>
---------
Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>
2023-07-14 09:12:46 +02:00
Teddy
54fbe250a1
fix: import error + BQ E2E CLI ( #12420 )
2023-07-13 13:35:37 +02:00
Teddy
b89cf64f14
Clean up profiler ( #12369 )
* ref: implemented interface for profiler components + removed struct logic
* ref: ran python linting
* ref: added UML diagram to readme.md
* ref: empty commit for labeler check
* ref: remove multiple context manager for 3.7 3.8 compatibility
* ref: remove
2023-07-12 17:02:32 +02:00