18 Commits

Author SHA1 Message Date
Imri Paran
16875853a0
test(data-diff): fix flaky test (#18898)
use 99.5 CI for data diff sampling
2024-12-06 18:55:27 +05:30
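
A minimal sketch of how a 99.5% interval could bound the size of a random sample in a test, assuming "CI" in the commit above means confidence interval and that rows are kept independently with a fixed probability. The function name, its parameters, and the normal approximation are illustrative assumptions, not taken from the commit.

```python
from math import sqrt
from statistics import NormalDist


def sampled_count_interval(n_rows: int, fraction: float, confidence: float = 0.995):
    """Return (low, high) bounds for the number of rows a Bernoulli sample keeps.

    Hypothetical helper: uses the normal approximation to the binomial.
    """
    # Two-sided interval: split the leftover probability across both tails.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    mean = n_rows * fraction
    std = sqrt(n_rows * fraction * (1 - fraction))
    return mean - z * std, mean + z * std


# Bounds for sampling 10% of 1000 rows.
low, high = sampled_count_interval(n_rows=1000, fraction=0.1)
```

A test asserting that the sampled row count falls between `low` and `high` should then fail by chance only rarely, which is one plausible way to reduce flakiness.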
Imri Paran
d6470b7800
MINOR: fix(data-diff): get added columns (#18694)
* fix(data-diff): get added columns

- use both columns to calculate schema diff

* fix tests
2024-11-25 15:53:50 +01:00
Pere Miquel Brull
c68a45e7d8
Create new Auto Classification Workflow (#18610)
2024-11-19 08:10:45 +01:00
Imri Paran
bde6ee4125
MINOR: Data diff sample fix (#18632)
* fix(data-diff): sampling configuration

handle the sampling condition separately for each of the two tables, which allows sampling on columns whose names differ in case

* format
2024-11-15 08:22:13 +01:00
Teddy
45d27a377d
GEN 1184 - Added Workflow Classification and Metric LevelConfig (#18572)
2024-11-11 15:59:42 +01:00
Imri Paran
cdaa5c10af
[GEN-1996] feat(data-quality): use sampling config in data diff (#18532)
* feat(data-quality): use sampling config in data diff

- get the table profiling config
- use hashing to sample deterministically the same ids from each table
- use dirty-equals to assert results of stochastic processes

* - reverted missing md5
- added missing database service type

* - use a custom substr sql function

* fixed nonce

* added failure for mssql with sampling because it requires a larger change in the data-diff library

* fixed unit tests

* updated range for sampling
2024-11-11 10:07:23 +01:00
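
The idea of hashing ids so that both tables select the same rows can be sketched as follows. This is an illustration in plain Python under stated assumptions: the commit mentions md5, a substr function, and a nonce, but the real logic presumably runs as SQL inside the data-diff query. The function `in_sample`, the nonce value, and the 8-digit prefix are hypothetical.

```python
import hashlib


def in_sample(row_id, fraction: float, nonce: str = "example-nonce") -> bool:
    """Decide deterministically whether a row id belongs to the sample."""
    digest = hashlib.md5(f"{nonce}:{row_id}".encode("utf-8")).hexdigest()
    # Take a prefix of the hash (the substr step) and map it to [0, 1).
    bucket = int(digest[:8], 16) / 16**8
    return bucket < fraction


# Two tables holding the same ids end up with the same sampled ids.
table_a_ids = range(1, 1001)
table_b_ids = range(1, 1001)
sample_a = {i for i in table_a_ids if in_sample(i, 0.1)}
sample_b = {i for i in table_b_ids if in_sample(i, 0.1)}
```

Because the decision depends only on the id and the nonce, no coordination between the two tables is needed, while the sample size still varies around the requested fraction.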
Imri Paran
be82086e25
MINOR: add column case sensitivity parameter (#18115)
* fix(data-quality): table diff

- added handling for case-insensitive columns
- added handling for different numeric types (int/float/Decimal)
- added handling of boolean test case parameters

* add migrations for table diff

* add migrations for table diff

* removed cross type diff for now. it appears to be flaky

* fixed migrations

* use casefold() instead of lower()

* - implemented utils.get_test_case_param_value
- fixed params for case sensitive column

* handle bool test case parameters

* format

* testing

* format

* list -> List

* list -> List

* - change caseSensitiveColumns default to false
- added migration to stay backward compatible

* - removed migration files
- updated logging message for table diff migration

* changed bool test case parameters default to always be false

* format

* docs: data diff

- added the caseSensitiveColumns parameter

requires: https://github.com/open-metadata/OpenMetadata/pull/18115

* fixed test_get_bool_test_case_param
2024-10-15 16:29:43 +02:00
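
A short sketch of why `casefold()` can matter over `lower()` when matching column names case-insensitively. The helper `match_columns` and its `case_sensitive` flag are hypothetical and only loosely mirror the `caseSensitiveColumns` parameter named in the commit above.

```python
def match_columns(left_columns, right_columns, case_sensitive=False):
    """Pair column names from two tables, optionally ignoring case."""

    def key(name):
        # casefold() is more aggressive than lower(), e.g. it maps "ß" to "ss".
        return name if case_sensitive else name.casefold()

    right_by_key = {key(name): name for name in right_columns}
    return {
        name: right_by_key[key(name)]
        for name in left_columns
        if key(name) in right_by_key
    }


pairs = match_columns(["ID", "Straße"], ["id", "STRASSE"])
strict = match_columns(["ID"], ["id"], case_sensitive=True)
```

Here `"Straße"` and `"STRASSE"` are paired because both casefold to `"strasse"`, whereas `lower()` would leave the `ß` in place and miss the match.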
Imri Paran
71720ebc51
fix(table-diff): support cross database (#18085)
fixed table diff url to include database in all cases
2024-10-04 15:31:17 +02:00
Teddy
33c50efdbf
GEN-1192 - Move Test Case to its Own Resource (#17862)
* feat: indexed test case results

* feat: added indexation logic for test case results

* style: ran java linting

* fix: IDE warnings

* chore: added test case results migration

* style: ran java linting

* fix: postgres migration column json ref

* empty commit to trigger queued

* chore: extracted test case results to its own resource

* chore: fix failing tests

* chore: move testCaseResult state from testSuite and testCase to dynamic field fetched from test case results search index

* chore: clean up test case repository

* style: ran java linting

* chore: removed testCaseResultSummary and testCaseResult state from db

* fix: test failures

* chore: fix index mapping type for result value

* chore: fix test failure
2024-09-18 11:58:59 +02:00
Imri Paran
7508848376
fix(dq): data types for unique columns (#17431)
1. remove json and array from supported data types of unique column test.
2. migrations.
3. tests.
2024-08-19 14:28:42 +02:00
Imri Paran
d59b83f9d1
MINOR[GEN-978]: Fix empty test suites (#16975)
* tests: refactor

refactor tests and consolidate common functionality in integrations.conftest

this enables writing tests more concisely.
demonstrated with postgres and mssql.
will migrate more

* format

* removed helpers

* changed scope of fixtures

* changed scope of fixtures

* added profiler test for mssql

* fixed import in data_quality test

* json safe serialization

* format

* set MARS_Connection

* fix(data-quality): empty test suite

do not raise for empty test suite

* format

* don't need to check length in _get_test_cases_from_test_suite

* fix

* added warning if no test cases are found
2024-07-19 12:12:34 +02:00
Imri Paran
0fee79b200
MINOR: fix sample data issue with Pydantic v2 and refactor python integration tests (#16943)
* tests: refactor

refactor tests and consolidate common functionality in integrations.conftest

this enables writing tests more concisely.
demonstrated with postgres and mssql.
will migrate more

* format

* removed helpers

* changed scope of fixtures

* changed scope of fixtures

* added profiler test for mssql

* fixed import in data_quality test

* json safe serialization

* format

* set MARS_Connection

* use SerializableTableData instead of TableData

* deleted file test_postgres.py

* fixed tests

* added more test cases

* format

* changed name test_models.py

* removed the logic for serializing table data

* wip

* changed mapping in common type map

* changed mapping in common type map

* reverted TableData imports

* reverted TableData imports

* reverted TableData imports
2024-07-17 08:11:34 +02:00
Imri Paran
d08af1f86d
MINOR: Fix data diff with threshold (#16926)
* fix: table-diff

passed threshold and diff count in wrong order. test was not covering this due to how the parameters were configured.
2024-07-05 07:51:24 +02:00
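
One way to guard against swapped arguments of the kind described above is to make them keyword-only. This is a sketch, not the fix applied in the commit; `diff_passes` and its pass condition (diff count not exceeding the threshold) are assumptions.

```python
def diff_passes(*, diff_count: int, threshold: int) -> bool:
    """Pass when the number of differing rows does not exceed the threshold."""
    return diff_count <= threshold


# Asymmetric values expose a swap; equal values would hide it.
within_limit = diff_passes(diff_count=3, threshold=5)
over_limit = diff_passes(diff_count=5, threshold=3)
```

Calling `diff_passes(3, 5)` positionally raises `TypeError`, so the order can no longer be mixed up silently. Testing with values that differ between the two parameters is what makes a swap visible.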
Pere Miquel Brull
7e98ece3e5
MINOR - Pydantic V2 warnings and better exception msg (#16916)
2024-07-04 14:54:41 +02:00
Imri Paran
9b5ce3560c
MINOR: Pydantic equal assert util (#16918)
* tests: pydantic object assertion

added util for comparing pydantic objects

* fixed test_data_diff.py
2024-07-04 09:59:46 +05:30
Imri Paran
2c9aeebcb8
MINOR: add column diff for table diff test case (#16809)
* feat(table-diff): added column validation

added column validation for table diff that will be carried out before running the row level diff. If a diff for the column exists, it will short circuit the test and report.

* fixed unit tests

* format

* - resolve column types more robustly
- changed test result metric to include "rows" or "columns"
2024-07-02 10:36:03 +00:00
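
The column validation described above, which runs before the row-level diff and short-circuits on a mismatch, might look roughly like this. All names (`ColumnDiff`, `column_diff`, `run_table_diff`) and the result dictionary layout are hypothetical; only the "rows" or "columns" metric labels come from the commit message.

```python
from typing import Callable, Dict, List, NamedTuple, Optional


class ColumnDiff(NamedTuple):
    added: List[str]
    removed: List[str]
    changed: List[str]


def column_diff(left: Dict[str, str], right: Dict[str, str]) -> Optional[ColumnDiff]:
    """Compare two {column name: type} mappings; return None when they match."""
    added = sorted(right.keys() - left.keys())
    removed = sorted(left.keys() - right.keys())
    changed = sorted(n for n in left.keys() & right.keys() if left[n] != right[n])
    if added or removed or changed:
        return ColumnDiff(added, removed, changed)
    return None


def run_table_diff(left_schema, right_schema, row_diff: Callable[[], list]) -> dict:
    """Validate columns first; only run the row-level diff if they agree."""
    diff = column_diff(left_schema, right_schema)
    if diff is not None:
        # Short circuit: report the column mismatch and skip the row diff.
        return {"status": "failed", "metric": "columns", "diff": diff}
    differing = row_diff()
    status = "failed" if differing else "success"
    return {"status": status, "metric": "rows", "diff": differing}


result = run_table_diff(
    {"id": "int", "name": "text"},
    {"id": "int", "email": "text"},
    row_diff=lambda: [],
)
```

Passing the row-level diff as a callable keeps the potentially expensive comparison from running at all when the schemas already disagree.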
Teddy
141ceb4c8d
MINOR add common test elements to _openmetadata_testutils module (#16758)
* fix: add common test to _testutils module

* fix: renamed _testutils to _openmetadata_testutils
2024-06-21 15:11:34 +02:00
Imri Paran
b960b60965
Fix #16421: add tableDiff test case (#16554)
* feat: add tableDiff test case

This change introduces a "table diff" test case which
compares two tables and fails if they are not identical.
The comparison is based on a specific "key" (because the test only makes sense when performed on ordered collections).

1. Added the `tableDiff` test definition.
2. Implemented a "runtime" parameters feature which injects additional parameters for the test at runtime.
3. Integration tests (because of course).

This feature was not tested end-to-end yet because "array" data

* pydantic v2

* format

* format

* format and added data diff to setup.py

* format

* fixed param issue which has type ARRAY

* fixed runtime_parameter_setter

* moved models to parent directory

* handle errors in table diff

* fixed issue with edit test case

* format

* added more details to pytest skip

* format

* refactor: Improve createTestCaseParameters function in DataQualityUtils

* fixed unit test

* removed unused fixture

* removed validator.py

* fixed tests

* added validate kwarg to tests_mixin

* removed "postgres" data diff extra as they interfere with psycopg2-binary

* fixed tests

* pinned tenacity for tests

* reverted tenacity pinning

* added ui support for test diff

* fixed dq cypress and added edit flow

* organized the test case

* added dialect support

* fixed tests

* option style fix

* fixed calculation for passing/failing rows

* restrict the tableDiff test to limited services

* set where to None if blank string

* fixed where clause

* fixed tests for where clause

* use displayName in place of name in edit form

* added docs for RuntimeParameterSetter

* fixed cypress

---------

Co-authored-by: Shailesh Parmar <shailesh.parmar.webdev@gmail.com>
2024-06-20 16:54:12 +02:00
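
As an illustration of comparing two tables on a key, the following in-memory sketch indexes rows by key and reports what was added, removed, or changed. It is an assumption-laden toy: the actual test case relies on the data-diff library and database queries, and `keyed_diff` is a hypothetical name.

```python
def keyed_diff(left_rows, right_rows, key):
    """Compare two lists of row dicts, matching rows by the given key column."""
    left = {row[key]: row for row in left_rows}
    right = {row[key]: row for row in right_rows}
    return {
        "removed": sorted(left.keys() - right.keys()),
        "added": sorted(right.keys() - left.keys()),
        "changed": sorted(
            k for k in left.keys() & right.keys() if left[k] != right[k]
        ),
    }


table_1 = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}, {"id": 3, "v": "c"}]
table_2 = [{"id": 2, "v": "B"}, {"id": 3, "v": "c"}, {"id": 4, "v": "d"}]
diff = keyed_diff(table_1, table_2, key="id")
# The test would fail if any of the three lists is non-empty.
tables_identical = not any(diff.values())
```

Without a key there is no stable way to decide which row in one table corresponds to which row in the other, which is why the comparison is defined per key.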