* feat(statistics-profiler): use statistics tables to profile trino tables
- implemented the collaborative root class
- added the "useStatistics" profiler parameter
- added the "supportsStatistics" database connection property
- implemented the ProfilerWithStatistics and StoredStatisticsSource to add this functionality to specific profilers
- implemented TrinoStoredStatisticsSource for specific trino statistics logic
* added ABC to terminal classes in collaborative root
* fixed docstring for TestSuiteInterface
* reverted unintended changes
* typo
* tests(datalake): use minio
1. use minio instead of moto for mimicking s3 behavior.
2. removed moto dependency as it is not compatible with aiobotocore (https://github.com/getmoto/moto/issues/7070#issuecomment-1828484982)
* - moved test_datalake_profiler_e2e.py to datalake/test_profiler
- use minio instead of moto
* fixed tests
* fixed tests
* removed default name for minio container
* fix: Allow non-numeric values (NaN) to be sent via JSON, replace NaN values with None in SQAProfilerInterface
Replace NaN values with None in the SQAProfilerInterface class to maintain database parity. NaN values will be cast to null in OpenMetadata. This change ensures that data handling processes account for this conversion.
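A minimal sketch of the NaN-to-None conversion described above. The helper name `replace_nan_with_none` and the sample `profile` dict are hypothetical and only illustrate the idea; this is not the actual SQAProfilerInterface code.

```python
import json
import math
from typing import Any


def replace_nan_with_none(value: Any) -> Any:
    """Recursively replace float NaN with None (hypothetical helper)."""
    if isinstance(value, float) and math.isnan(value):
        return None
    if isinstance(value, dict):
        return {k: replace_nan_with_none(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        # Tuples are returned as lists, which is how JSON would encode them anyway.
        return [replace_nan_with_none(v) for v in value]
    return value


# Sample profiler output containing NaN values (made-up data).
profile = {"mean": float("nan"), "max": 10.0, "histogram": [1.0, float("nan")]}

# allow_nan=False makes json.dumps raise ValueError if any NaN slipped through,
# so the cleaned payload is guaranteed to be strict JSON with null values.
payload = json.dumps(replace_nan_with_none(profile), allow_nan=False)
```

Using `allow_nan=False` is a design choice for the sketch: it turns a missed NaN into an explicit error instead of emitting the non-standard `NaN` token.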
* fix: histogram overflow error
* test: Add Unit Test for Null and Null Ratio Metric
* chore: Address comments
* chore: Address comments
* fix: checkstyle and message
* fix: failing tests as null count works as expected
* mysql integration tests
* fix(data-quality): accept between with no bounds
add between filters only when the bounds are defined. if they are not (i.e. they resolve to 'inf' values), do not add any filters
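A rough sketch of the bound handling, assuming filters are plain (column, operator, value) tuples. `build_between_filters` is a hypothetical name, and treating each bound independently is an assumption of this sketch, not necessarily what the validator does.

```python
import math
from typing import List, Optional, Tuple

Filter = Tuple[str, str, float]


def build_between_filters(
    column: str, min_bound: Optional[float], max_bound: Optional[float]
) -> List[Filter]:
    """Return filters only for bounds that are defined and finite (hypothetical helper)."""
    filters: List[Filter] = []
    # A bound that is missing or resolves to +/-inf contributes no filter.
    if min_bound is not None and math.isfinite(min_bound):
        filters.append((column, ">=", min_bound))
    if max_bound is not None and math.isfinite(max_bound):
        filters.append((column, "<=", max_bound))
    return filters


# Both bounds unbounded: no filters are added at all.
no_filters = build_between_filters("age", float("-inf"), float("inf"))
# Only the lower bound is defined: a single filter is added.
lower_only = build_between_filters("age", 18, float("inf"))
```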
* format
* consolidated ingestion_config
* format
* fixed handling of date and time columns
* fixed tests
* tests: refactor
refactor tests and consolidate common functionality in integrations.conftest
this enables writing tests more concisely.
demonstrated with postgres and mssql.
will migrate more
* format
* removed helpers
* changed scope of fixtures
* added profiler test for mssql
* fixed import in data_quality test
* json safe serialization
* format
* set MARS_Connection
* fix(data-quality): empty test suite
do not raise for empty test suite
* format
* don't need to check length in _get_test_cases_from_test_suite
* fix
* added warning if no test cases are found
* use SerializableTableData instead of TableData
* deleted file test_postgres.py
* fixed tests
* added more test cases
* format
* changed name test_models.py
* removed the logic for serializing table data
* wip
* changed mapping in common type map
* changed mapping in common type map
* reverted TableData imports
* reverted TableData imports
* reverted TableData imports
* fix(data-quality): incompatible columns
gracefully fail when a column of incompatible type is submitted for a test case
* format
* added condition to handle only column test cases
* fixed tests
* format
* feat: add tableDiff test case
This change introduces a "table diff" test case which
compares two tables and fails if they are not identical.
The comparison is based on a specific "key" (because the test only makes sense when performed on ordered collections).
1. Added the `tableDiff` test definition.
2. Implemented a "runtime" parameters feature which injects additional parameters for the test at runtime.
3. Integration tests (because of course).
This feature was not tested end-to-end yet because "array" data
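To illustrate the key-based comparison, here is a plain-Python sketch. It is not the tableDiff implementation and does not use the data diff dependency; `diff_tables_by_key`, `tables_identical`, and the sample rows are hypothetical.

```python
from typing import Any, Dict, Iterable, List

Row = Dict[str, Any]


def diff_tables_by_key(
    left: Iterable[Row], right: Iterable[Row], key: str
) -> Dict[str, List[Any]]:
    """Compare two row collections by key and report differing keys (hypothetical helper)."""
    left_by_key = {row[key]: row for row in left}
    right_by_key = {row[key]: row for row in right}
    shared = left_by_key.keys() & right_by_key.keys()
    return {
        # Keys present only in the right table.
        "added": sorted(right_by_key.keys() - left_by_key.keys()),
        # Keys present only in the left table.
        "removed": sorted(left_by_key.keys() - right_by_key.keys()),
        # Keys present in both tables whose rows differ.
        "changed": sorted(k for k in shared if left_by_key[k] != right_by_key[k]),
    }


def tables_identical(diff: Dict[str, List[Any]]) -> bool:
    """The test passes only when no key was added, removed, or changed."""
    return not any(diff.values())


table_a = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 3, "name": "c"}]
table_b = [{"id": 1, "name": "a"}, {"id": 2, "name": "B"}, {"id": 4, "name": "d"}]
diff = diff_tables_by_key(table_a, table_b, key="id")
```

Matching rows by key rather than by position is what makes the comparison independent of row order.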
* pydantic v2
* format
* format
* format and added data diff to setup.py
* format
* fixed param issue which has type ARRAY
* fixed runtime_parameter_setter
* moved models to parent directory
* handle errors in table diff
* fixed issue with edit test case
* format
* added more details to pytest skip
* format
* refactor: Improve createTestCaseParameters function in DataQualityUtils
* fixed unit test
* removed unused fixture
* removed validator.py
* fixed tests
* added validate kwarg to tests_mixin
* removed "postgres" data diff extra as they interfere with psycopg2-binary
* fixed tests
* pinned tenacity for tests
* reverted tenacity pinning
* added ui support for test diff
* fixed dq cypress and added edit flow
* organized the test case
* added dialect support
* fixed tests
* option style fix
* fixed calculation for passing/failing rows
* restrict the tableDiff test to limited services
* set where to None if blank string
* fixed where clause
* fixed tests for where clause
* use displayName in place of name in edit form
* added docs for RuntimeParameterSetter
* fixed cypress
---------
Co-authored-by: Shailesh Parmar <shailesh.parmar.webdev@gmail.com>
* added trino integration test
* - removed warnings for classes which are not real tests
- removed "helpers" as its being used
* use a docker network instead of host
* print logs for hive failure
* removed superset unit tests
* try pinning requests for test
* try pinning requests for test
* wait for hive to be ready
* fix trino fixture
* - reduced testcontainers_config.max_tries to 5
- remove intermediate containers
* print with logs
* disable capture logging
* updated db host
* removed debug stuff
* removed debug stuff
* removed version pin for requests
* reverted superset
* ignore trino integration on python 3.8