* Initial implementation for Dimensionality on Data Quality Tests
* Fix ColumnValuesToBeUnique and create TestCaseResult API
* Refactor dimension result
* Initial E2E Implementation without Impact Score
* Dimensionality Thin Slice
* Update generated TypeScript types
* Update generated TypeScript types
* Remove redundant method in favor of the existing one
* Fix Pandas Dimensionality checks
* Remove useless comments
* Implement PR comments, fix Tests
* Improve the code a bit
* Fix imports
* Implement Dimensionality for ColumnMeanToBeBetween
* Removed useless comments and improved minor things
* Implement unit tests
* Fixes
* Moved import pandas to type checking
* Fix Min/Max being optional
* Fix unit tests
* Small fixes
* Fix unit tests
* Fix issue with counting total rows on mean
* Improve code
* Fix Merge
* Removed unused type
* Refactor to reduce code repetition and complexity
* Fix conflict
* Rename method
* Refactor some metrics
* Implement Dimensionality for ColumnLengthToBeBetween
* Implement Dimensionality for ColumnMedianToBeBetween in Pandas
* Implement Median Dimensionality for SQL
* Add database tests
* Fix median metric
* Implement Dimensionality SumToBeBetween
* Implement dimensionality for Column Values not In Set
* Implement Dimensionality for ColumnValuesToMatchRegex and ColumnValuesToNotMatchRegex
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* fix: show all categories in the cardinality distribution graph
* feat: enhance CardinalityDistributionChart with category selection and custom Y-axis ticks
* fix: update cursor fill color in visualisation charts for better visibility
---------
Co-authored-by: Harsh Vador <58542468+harsh-vador@users.noreply.github.com>
* Update columnValueLengthsToBeBetween.py
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* Fix FQN parsing problem in ClickHouse and add more logging
* Run Python formatting commands
* Fix Python formatting issues
---------
Co-authored-by: Nancy Amandi <nancy.amandi@moniepoint.com>
Co-authored-by: Teddy <teddy.crepineau@gmail.com>
* Update schemas
* Remove the allowedEmailRegistrationDomains, allowedDomains, useRolesFromProvider fields from hidden state
* Refactor the SSO Configuration Form and add tests
* Fix code smells and refactor the code for SSOConfigurationForm
* Fix the code smells
* Remove the custom functions to create patch for SSO configurations
* Add mock for structuredClone
* Update generated TypeScript types
* Empty commit
* Fix the unnecessary cleanup of data before saving
* Update the default values for oidc configs
* Fix unit test
* Remove the unnecessary util function
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* Refactor previous tests for shared resources
* Add validation result models
This also includes a method for merging them, useful when running validation in batches
* Added `DataFrameValidationEngine` for running tests
This also includes a registry for mapping test names to pandas test classes
* Implement the DataFrameValidator facade
This includes the logic to load tests from different sources (OpenMetadata or code) and pass them down to the engine.
It also includes tests for the integration with OpenMetadata
* Add examples for the API
* Apply comments
* Implement Ingestion side to return a flag when all values are unique
* Update generated TypeScript types
* feat: Enhance CardinalityDistributionChart to display messages when all values are unique
- Added logic to check if all values are unique for both first day and current day data.
- Implemented a placeholder message when all values are unique, indicating no distribution available.
- Updated tests to cover scenarios for unique values and ensure correct rendering of charts and messages.
- Added localization for the new message in multiple languages.
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Shailesh Parmar <shailesh.parmar.webdev@gmail.com>
* refactor: use hashing to reduce API calls, replace DISTINCT with GROUP BY to optimize lineage queries, and apply minor code optimizations
* Update generated TypeScript types
* fix: self.job_table_lineage defaultdict function
* refactor: improved hashing
* fix: added _table_lookup_cache and _dlt_table_cache in tests
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>