* fix(data-diff): sampling configuration
handle the sampling condition separately for the 2 tables allowing to apply sampling on columns with mismatching cases
* format
* fix(redshift-system): redshift return type
* fixed bigquery profiler
* fixed snowflake profiler
* job id action does not support matrix. using plain action summary.
* reverted gha change
* feat(data-quality): use sampling config in data diff
- get the table profiling config
- use hashing to sample deterministically the same ids from each table
- use dirty-equals to assert results of stochastic processes
* - reverted missing md5
- added missing database service type
* - use a custom substr sql function
* fixed nounce
* added failure for mssql with sampling because it requires a larger change in the data-diff library
* fixed unit tests
* updated range for sampling
* feat(statistics-profiler): use statistics tables to profile trino tables
- implemented the collaborative root class
- added the "useStatistics" profiler parameter
- added the "supportsStatistics" database connection property
- implemented the ProfilerWithStatistics and StoredStatisticsSource to add this functionality to specific profilers
- implemented TrinoStoredStatisticsSource for specific trino statistics logic
* added ABC to terminal classes in collaborative root
* fixed docstring for TestSuiteInterface
* reverted unintended changes
* typo
* feat: added column value to be in expected location test
* fix: renamed value -> values
* doc: added 1.6 documentatio entry
* style: ran python linting
* fix: move data packaging to pyproject.yaml
* fix: add init file back for data package
* fix: failing test case
* Add flake.nix
* Add lockfile for flake
* Update nix environment and document usage
* Add schema for exasol connector
* Add Exasol definitions to databaseService
* Fix error in exasol connector schema
* Add additional connection options/settings to exasol connector
* Add exasol-connector to ui
* Add depdencies for exasol-connector
* Update notes
* Update ingestion code
* Add Basic Documentation for Exasol Connector
* Update flake file
* Add developer notes
* Add python script which can be used as entry point for debugging in ide
* Add config file which can be used for debugging (manual execution)
* Update debug script
* Update developer notes
* Remove old developer notes
* Add .venv to gitignore
* Update dev notes
* Update development notes
* Update ExasolSource
* Establish basic connection to Exasol DB from connector
* Update exasol connector connection settings
* Add service_spec for exasol plugin
* Remove development files
* Remove unused module
* Applied code formatter
* Update exasol dependency constraint(s)
* Add unit test for exasol connection url(s)
* Fixed test expectations for exasol connection url test(s)
* Adjust the test query for the Exasol connection test
* ref(profiler): use di for system profile
- use source classes that can be overridden in system profiles
- use a manifest class instead of factory to specify which class to resolve for connectors
- example usage can be seen in redshift and snowflake
* - added manifests for all custom profilers
- used super() dependency injection in order for system metrics source
- formatting
* - implement spec for all source types
- added docs for the new specification
- added some pylint ignores in the importer module
* remove TYPE_CHECKING in core.py
* - deleted valuedispatch function
- deleted get_system_metrics_by_dialect
- implemented BigQueryProfiler with a system metrics source
- moved import_source_class to BaseSpec
* - removed tests related to the profiler factory
* - reverted start_time
- removed DML_STAT_TO_DML_STATEMENT_MAPPING
- removed unused logger
* - reverted start_time
- removed DML_STAT_TO_DML_STATEMENT_MAPPING
- removed unused logger
* fixed tests
* format
* bigquery system profile e2e tests
* fixed module docstring
* - removed import_side_effects from redshift. we still use it in postgres for the orm conversion maps.
- removed leftover methods
* - tests for BaseSpec
- moved get_class_path to importer
* - moved constructors around to get rid of useless kwargs
* - changed test_system_metric
* - added linage and usage to service_spec
- fixed postgres native lineage test
* add comments on collaborative constructors
* fix(data-quality): table diff
- added handling for case-insensitive columns
- added handling for different numeric types (int/float/Decimal)
- added handling of boolean test case parameters
* add migrations for table diff
* add migrations for table diff
* removed cross type diff for now. it appears to be flaky
* fixed migrations
* use casefold() instead of lower()
* - implemented utils.get_test_case_param_value
- fixed params for case sensitive column
* handle bool test case parameters
* format
* testing
* format
* list -> List
* list -> List
* - change caseSensitiveColumns default to fase
- added migration to stay backward compatible
* - removed migration files
- updated logging message for table diff migration
* changed bool test case parameters default to always be false
* format
* docs: data diff
- added the caseSensitiveColumns parameter
requires: https://github.com/open-metadata/OpenMetadata/pull/18115
* fixed test_get_bool_test_case_param
* ref(profiler): redshift system metrics
- moved redshift system metrics to the redshift source module
- use Timestamp in data quality
- added plugin feature to test utils
* use timezone.utc
* format
* reverted unintended snowflake changes
* fixed import test_system_metrics.py
* revert
* fixed import in tests
* GEN-1322: API Entity - Remove Beta
* minor: add doc for the metadata pipeline
* api service refactor
* api service refactor backend changes
* add apiconnection in test service connection
* pytest fix
* fix java file formatting
* Fix casing of REST in ApiServiceRest.spec.ts
* Refactor REST to Rest in API classes
* minor change
* minor change
* minor change
* fix cashing for API to Api
* add playwright test for api service ingestion
* fix: playwright test
---------
Co-authored-by: harshsoni2024 <harshsoni2024@gmail.com>
* fix snowflake system metrics
* format
* add link to logs and commit
fixed the dq cli test
* reverted bad formatting
* fixed models.py
* removed version pinning for data diff in tests
* fix(data-quality): snowflake data diff
- fixed schema in snowflake URL for data diff
- added e2e for snowflake data quality
* reverted unintended change
* fix import issue
* chore: add raise from status on ingestion step in tests
---------
Co-authored-by: Chirag Madlani <12962843+chirag-madlani@users.noreply.github.com>
* feat: indexed test case results
* feat: added indexation logic for test case results
* style: ran java linting
* fix: IDE warnigns
* chore: added test case results migration
* style: ran java linting
* fix: postgres migration column json ref
* empty commit to trigger queued
* chore: extracted test case results to its own resource
* chore: fix failing tests
* chore: move testCaseResult state from testSuite and testCase to dynamic field fetched from test case results search index
* chore: clean up test case repository
* style: ran java linting
* chore: removed testCaseResultSummary and testCaseResult state from db
* fix: test failures
* chore: fix index mapping type for result value
* chore: fix test failure
* GEN-1493 - Fix opensearch pagination
* GEN-1494 - Add CI for py-tests with Postgres and Opensearch
* GEN-1494 - Add CI for py-tests with Postgres and Opensearch