OpenMetadata

mirror of https://github.com/open-metadata/OpenMetadata.git synced 2025-07-07 09:09:30 +00:00

Author	SHA1	Message	Date
Mayur Singal	7760663b22	MINOR: Change ingestion licence header (#20549)	2025-04-03 10:39:47 +05:30
Imri Paran	cd74d8f55a	MINOR: ref(data-quality): modularized test case validator import (#18716) * ref(data-quality): modularized test case validator import - removed test_suite_factory - implemented TestCaseImporter - removed SQAValidatorBuilder and PandasValidatorBuilder in favor of a SourceType enum - removed the orm table creation from test suite source * format * IValidatorBuilder -> ValidatorBuilder * use the table from the sampler in the test suite interface * linting * fixed the profiler with similar solution * removed unused inheritance * removed unneeded super().__init__() * removed all instances of orm_table * fixed tests * add reportExplicitAny=false * fixed tests	2024-11-27 16:25:12 +01:00
Teddy	58699063db	MINOR -- Fix DQ Partition Issue (#18641) * fix: renamed `random_sample` to `get_dataset` and change dunder method access for SQA Table object * fix: removed handle_partition decorator * fix: fixed DQ partition issue + moved to `tablesample` method * style: ran python linting * style: fix python format check issues * feat: added postgres tablesample * style: ran python linting * fix: sampling delta * fix: merge conflicts * fix: resolved conflicts * style: ran python linting * fix: patch orm call in test case * fix: mock build_table_orm call in tests * fix: test case failures and errors * fix: removed unused import * fix: patch typo * fix: trino table schema retrieval * fix: remove tuple context manager for 3.8 test support	2024-11-27 08:50:54 +01:00
Pere Miquel Brull	c68a45e7d8	Create new Auto Classification Workflow (#18610)	2024-11-19 08:10:45 +01:00
Teddy	d579008c99	GEN 1683 - Add Column Value to be At Expected Location Test (#18524) * feat: added column value to be in expected location test * fix: renamed value -> values * doc: added 1.6 documentatio entry * style: ran python linting * fix: move data packaging to pyproject.yaml * fix: add init file back for data package * fix: failing test case	2024-11-06 11:17:13 +01:00
Imri Paran	b960b60965	Fix #16421: add tableDiff test case (#16554) * feat: add tableDiff test case This changed introduces a "table diff" test case which compares two tables and fails if they are not identical. The similarity is made based on a specific "key" (because the test only makes sense when performed on ordered collections). 1. Added the `tableDiff` test definition. 2. Implemented a "runtime" parameters feature which injects additional parameters for the test at runtime. 3. Integration tests (because of course). This feature was not tested end-to-end yet because "array" data * pydantic v2 * format * format * format and added data diff to setup.py * format * fixed param issue which has type ARRAY * fixed runtime_parameter_setter * moved models to parent directory * handle errors in table diff * fixed issue with edit test case * format * added more details to pytest skip * format * refactor: Improve createTestCaseParameters function in DataQualityUtils * fixed unit test * removed unused fixture * removed validator.py * fixed tests * added validate kwarg to tests_mixin * removed "postgres" data diff extra as they interfere with psycopg2-binary * fixed tests * pinned tenacity for tests * reverted tenacity pinning * added ui support for test diff * fixed dq cypress and added edit flow * organized the test case * added dialect support * fixed tests * option style fix * fixed calculation for passing/failing rows * restrict the tableDiff test to limited services * set where to None if blank string * fixed where clause * fixed tests for where clause * use displayName in place of name in edit form * added docs for RuntimeParameterSetter * fixed cypress --------- Co-authored-by: Shailesh Parmar <shailesh.parmar.webdev@gmail.com>	2024-06-20 16:54:12 +02:00
Ayush Shah	a15da7ec98	Issue #14812: Add support for empty string as missing count (#16017)	2024-04-25 09:45:26 +05:30
Teddy	3dc642989c	Fixes #7729 - Add logic to compute passed/failed rows (#14472) * feat: add test case resolution task workflow * chore: add migration for test case resolution feature * fix: removed required field for object compatibiity in older migrations * fix: minor testCaseResolution status logic * chore: revert migration for test case incident * chore: update migration file * style: renamed variables * feat: added logic to compute failed/passed rows * feat: add support for row level computation in schema * chore: add test definition migration * feat: add logic to explicitly compute row level failure * chore: clean up code * style: fix java * style: fix pyton format * fix: unhidde API for incident manager * style: fix java styling	2023-12-27 13:38:51 +01:00
Pere Miquel Brull	b786064bc2	#11857 - Store workflow status in the Ingestion Pipeline Status (#14462) * Register StackTraceError in spec * Register StackTraceError in spec * Register StackTraceError in spec * Add todos * Update status * docs * format * Fix tests * Fix tests * Fix tests * Ignore generated * Fix tests * Fix tests * Tests * Try constants * Try constants * Print * Print * Print * order * Fix service name * fix ui error --------- Co-authored-by: Chirag Madlani <12962843+chirag-madlani@users.noreply.github.com>	2023-12-22 15:43:50 +01:00
Ayush Shah	ebc0a551e5	Fixes 12947: Add Support For DQ and Profiler in Databricks Unity Catalog (#14424)	2023-12-20 21:18:05 +05:30
Teddy	3bbf55fcda	FIXES #14049 - Split test case resolution status from test case result (#14204) * refactor: entityFQN as ListFilter condition * feat: implement resolution entity timeseries * fix: rename to testCaseResolutionStatus * ref: extracted ES query builder into private method * ref: extract OS query builder in its own method * ref: remove ingestion logic for test case resolution * fix: reorganize json schemas to fix circular import in Python * ref: object names in typescript code * feat: added indexing of test case resolution * feat: added test case resolution sample data * fix: test case resolution api logic * fix: audit logger for entityTimeSeriesInterface * fix: DDL generation * style: python linting * fix: skip UI test case resolution tests * fix: remove extension field * fix: renamed testCaseFailureStatus to testCaseResolutionStatus * fix: remove reviewer * fix: rename sequenceId to stateId * fix: re adjust search weights * fix: removed InReview status * style: ran python linting	2023-12-04 23:18:01 -08:00
Ayush Shah	ab1ec50c2c	Fixes Mssql Ntext, text and Image (#12490)	2023-07-20 13:34:35 +05:30
Teddy	4b9f213dbf	Fixes Issue #11863 - Add Status to DQ (#11893) * feat: added entityReference field in testSuite to link testSuite to an entity when the testSuite is executable. * feat: added `executableEntityReference` as an entity reference for executable test suite to their entity * feat: add status object to test case results * feat: ran python linting * feat: fixed update to	2023-06-06 10:09:16 +00:00
Teddy	721869428e	Revert "Fixe Issue #11863 - Add Status logic for test case results (#11881)" (#11892) This reverts commit 06735fe8dbaac5b267c9a2cf744ca154f88a9247.	2023-06-06 09:56:12 +02:00
Teddy	06735fe8db	Fixe Issue #11863 - Add Status logic for test case results (#11881) * feat: added entityReference field in testSuite to link testSuite to an entity when the testSuite is executable. * feat: added `executableEntityReference` as an entity reference for executable test suite to their entity * feat: add status object to test case results * feat: ran python linting	2023-06-06 09:45:49 +02:00
Teddy	d0cffdcd66	Fixes Issue #11438 - Implement threshold and startegy for custom SQL (#11847) * feat: Add threshold and strategy logic on the custom SQL object test * feat: ran python linting * feat: added safety checks for custom sql query * feat: ran python linting	2023-06-02 09:41:31 +02:00
Teddy	c98a15ca19	Fixes #11705 - Update ingestion and backend to match new DQ flow (#11836) * feat: refactor ingestion flow logic * feat: ran python linting * feat: update tests to match new workflow * feat: ran python linting * feat: update sample data test suite name * feat: Added backend logic to support logical and executable test suites * feat: clean up java and json code * feat: added sample data for logical and executable test suites * feat: remove executable from CreateTestSuite * feat: ran python and java linting * feat: added README info for data quality structure * skipping cypress to keep main green * fixed typescript type issue --------- Co-authored-by: Shailesh Parmar <shailesh.parmar.webdev@gmail.com>	2023-06-01 23:19:13 -07:00
Pere Miquel Brull	b988f39152	Fix test usage resources (#11014)	2023-04-12 05:46:29 +00:00
Ayush Shah	9d11029ec8	Fixes 10351: Fixes Metrics Computation, Samping, test suites and partioning (#10603) Co-authored-by: Teddy Crepineau <teddy.crepineau@gmail.com>	2023-04-11 20:58:31 +05:30
Teddy	9b4e9132ae	fixed #9656 - Add support for date type to column values to be between (#10890) * fix: renamed to submodule * fix: linting * fix: columnValuesToBeBetween test for date column type	2023-04-04 17:16:44 +02:00
Teddy	5208b6f684	Fixes #4368 - Add Histogram Metric (#10422)	2023-03-03 21:56:32 +01:00
Teddy	83be5d933b	Fixes #9301 - Refactor TestSuite and Remove Pandas from Base Requirements (#10244) * feat(testSuite): extracted out column test for SQA type * refactor(testSuite): extracted SQA column and table tests into their own classes * refactor(testSuite): Added pkutil namespace package style for test suite classes * refactor(testSuite): added dynamic importer function for test cases * refactor(testSuite): black formatting * refactor(testSuite): fixed linting issues * refactor(testSuite): refactor metrics for dataframe * refactor(testSuite): Added Mixins and base methods * refactor(testSuite): extrcated out get bound for floats * refactor(testSuite): Added pandas column test cases * refactor(testSuite): Deleted old column tests * refactor(testSuite): Added table tests for datalake * refactor(testSuite): Removed old tests definition * refactor(testSuite): changed registry to dynamic class inport * refactor(testSuite): renamed dl_fn to df_fn * refactor(testSuite): updated registry unit test * refactor(testSuite): updated import path to sqa like column * refactor(testSuite): cleaned up imports in old files * refactor(testSuite): harmonzied SQALikeColumn object to replicate SQA Column object * refactor(testSuite): linting * refactor(testSuite): linting * refactor(testSuite): raise expection on DQ exception * refactor(testSuite): linting * refactor(testSuite): removed pandas from base requirements * refactor(testSuite): Added __futur__ for py3.7 type hint * refactor(testSuite): added `df` to good-names * refactor(testSuite): renamed Handler to Validator * refactor(testSuite): Added test inheritance for column tests * refactor(testSuite): cleaned up column type check * refactor(testSuite): cleaned up typo * refactor(testSuite): extracted main table test logic into parent class * refactor(testSuite): linting * refactor(testSuite): linting fixes * refactor(testSuite): address doc string and linting issues	2023-02-22 09:42:34 +01:00
Teddy	ba08302ea1	Issue #7291 - Implements Table Rows Inserted to be Between test (#9813) * staging commit * staging commit * refactor: partitioning logic * refactor (tests): move to parametrized tests for test validations * refactor: local variables into global * (feat): Added logic for table row inserted test * (feat): fix python checkstyle * feature: extracted get_query_filter logic into its own function	2023-01-31 15:57:51 +01:00
Onkar Ravgan	b539b299ee	Integrated schema parsers (#9305) * Integrated schema parsers * Addressed review comments * fixed pytests	2022-12-15 16:54:55 +05:30
Ayush Shah	a6ae9fd11a	Add Test Suite Implementation for Datalake (#9235)	2022-12-14 21:14:51 +05:30
Teddy	3cad959e44	Fixes #6760 -- Implements REGEX for regex test (#9033) feat(testCase): impelemented regex logic for test suite	2022-11-29 13:00:28 +01:00
Teddy	989f2911c2	Fixes #7810 - Allow to only pass min or max (#8474) * ISSUE-7810 Added default values for min and max For all data validations on columns:- min_bound is set to float("-inf"), if there is no next value max_bound is set to float("inf"), if there is no next value * Fixed PR errors by removing tuple + added tests Co-authored-by: demi <deepak1212365@gmail.com>	2022-11-01 13:26:51 +01:00
Teddy	f883863b8a	Fixes #7490 - Split Profiler and TestSuite Interface (#8032) * Clean up test suite workflow and interface * Fixed tests * Split profiler and testSuite interfaces * Cleaned up workflows and runners * Fixed code formatting * - remove old code - remove `table` attribute used for testing and used mock instead * Fixed execution bugs from refactor * Fixed static type checking for profiler/api/workflow.py * Fixed linting * Added __init__ files	2022-10-11 15:57:25 +02:00
Teddy	15f7c4aa41	Fix param name for median test (#7942) * Fixed param name for median test * Fixed unite test for median DQ	2022-10-05 06:32:28 +02:00
Teddy	f2bf5194bb	Fixes #7623 -- Added logic to encode and decode entityLink (#7670) * Encode entityLink string when processing request * Added logic to decode column type from entityLink * mvn code formating * Extracted unquote step into its own function	2022-09-23 09:42:33 +02:00
Teddy	1ba6e284fe	Fixes #7118 by cleaning up test names (#7494) * Cleaned up tests names and add registry name tests * Updated documentation for test types supported by OM	2022-09-16 07:04:56 +02:00
Sriharsha Chintalapani	656b50dd3a	Fix #7469: Refactor OpenMetadata code modules (#7474)	2022-09-14 23:14:02 -07:00
Teddy	9dbcb3911b	Fix minor column data quality test bugs (#7111) * Fixed test name issue + filtered out partition details for non BQ tables * Exclude non BQ table from partition processing * Fixed test + formating	2022-09-01 13:47:00 +02:00
Teddy	ef41382cb1	Fixes #7094 by fixing minior bugs in table tests (#7095)	2022-08-31 21:35:33 +02:00
Teddy	a39c4db8e7	Add partial support for BQ partitioned table (#7066) * Added support for BQ time based partition (not ingestion) * Fixed minor errors in test suite workflow	2022-08-30 11:39:15 -07:00
Teddy	ce578e73d4	Fixes #5831 by implenting testSuite workflow logic (#6911) * Added database filter in workflow * Removed association between profiler and data quality * fixed tests with removed association * Fixed sonar code smells and bugs * Updated profiler workflow to: - support only running profiler (removed test run) - support column inclusion and exclusion - added back support for partitioned table and sample * moved status to workflow * Fixed tests * removed test logic from profiler sink * Added logic to return sample from workflow sample value * Added profiler examples * Updated documentation for profiler * Fixed code smells * commited changed to profiler * initial commit of the revamp workflow * Fixed python formating * cleaned up profiler submodule by removing test related files and functions * Added airflow DAG logic for testSuite workflow * Fixed code smells + added airflow ingestion tests + fixed comments	2022-08-25 10:01:28 +02:00

36 Commits