* feat(data-quality): use sampling config in data diff
- get the table profiling config
- use hashing to deterministically sample the same ids from each table
- use dirty-equals to assert results of stochastic processes
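A minimal sketch of the hash-based sampling idea, not the code from this change: hashing each id with a shared nonce and keeping only the ids that fall in a fixed bucket range selects the same rows from both tables. The helper name `in_sample`, the nonce value, and the 100-bucket scheme are assumptions made for illustration.

```python
import hashlib


def in_sample(row_id, percent, nonce="example-nonce"):
    """Return True when row_id falls into the sampled bucket range.

    Hypothetical helper: the md5 digest of nonce + id is mapped to a
    bucket in [0, 100) and compared against the sampling percentage.
    """
    digest = hashlib.md5(f"{nonce}{row_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) % 100
    return bucket < percent


# Two "tables" holding the same ids in a different order.
table_a = list(range(1, 1001))
table_b = list(reversed(table_a))

sample_a = {i for i in table_a if in_sample(i, 10)}
sample_b = {i for i in table_b if in_sample(i, 10)}
sample_wide = {i for i in table_a if in_sample(i, 50)}
```

Because the decision depends only on the id and the nonce, both sides pick identical ids without coordinating, and a smaller percentage always yields a subset of a larger one.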
* - reverted missing md5
- added missing database service type
* - use a custom substr sql function
* fixed nonce
* added failure for mssql with sampling because it requires a larger change in the data-diff library
* fixed unit tests
* updated range for sampling
* feat(statistics-profiler): use statistics tables to profile trino tables
- implemented the collaborative root class
- added the "useStatistics" profiler parameter
- added the "supportsStatistics" database connection property
- implemented the ProfilerWithStatistics and StoredStatisticsSource to add this functionality to specific profilers
- implemented TrinoStoredStatisticsSource for specific trino statistics logic
* added ABC to terminal classes in collaborative root
* fixed docstring for TestSuiteInterface
* reverted unintended changes
* typo
* feat: added column value to be in expected location test
* fix: renamed value -> values
* doc: added 1.6 documentation entry
* style: ran python linting
* fix: move data packaging to pyproject.toml
* fix: add init file back for data package
* fix: failing test case
* Add flake.nix
* Add lockfile for flake
* Update nix environment and document usage
* Add schema for exasol connector
* Add Exasol definitions to databaseService
* Fix error in exasol connector schema
* Add additional connection options/settings to exasol connector
* Add exasol-connector to ui
* Add dependencies for exasol-connector
* Update notes
* Update ingestion code
* Add Basic Documentation for Exasol Connector
* Update flake file
* Add developer notes
* Add python script which can be used as entry point for debugging in ide
* Add config file which can be used for debugging (manual execution)
* Update debug script
* Update developer notes
* Remove old developer notes
* Add .venv to gitignore
* Update dev notes
* Update development notes
* Update ExasolSource
* Establish basic connection to Exasol DB from connector
* Update exasol connector connection settings
* Add service_spec for exasol plugin
* Remove development files
* Remove unused module
* Applied code formatter
* Update exasol dependency constraint(s)
* Add unit test for exasol connection url(s)
* Fixed test expectations for exasol connection url test(s)
* Adjust the test query for the Exasol connection test
* ref(profiler): use di for system profile
- use source classes that can be overridden in system profiles
- use a manifest class instead of factory to specify which class to resolve for connectors
- example usage can be seen in redshift and snowflake
* - added manifests for all custom profilers
- used super() dependency injection for the system metrics source
- formatting
* - implement spec for all source types
- added docs for the new specification
- added some pylint ignores in the importer module
* remove TYPE_CHECKING in core.py
* - deleted valuedispatch function
- deleted get_system_metrics_by_dialect
- implemented BigQueryProfiler with a system metrics source
- moved import_source_class to BaseSpec
* - removed tests related to the profiler factory
* - reverted start_time
- removed DML_STAT_TO_DML_STATEMENT_MAPPING
- removed unused logger
* fixed tests
* format
* bigquery system profile e2e tests
* fixed module docstring
* - removed import_side_effects from redshift. we still use it in postgres for the orm conversion maps.
- removed leftover methods
* - tests for BaseSpec
- moved get_class_path to importer
* - moved constructors around to get rid of useless kwargs
* - changed test_system_metric
* - added lineage and usage to service_spec
- fixed postgres native lineage test
* add comments on collaborative constructors
* fix(data-quality): table diff
- added handling for case-insensitive columns
- added handling for different numeric types (int/float/Decimal)
- added handling of boolean test case parameters
* add migrations for table diff
* add migrations for table diff
* removed cross type diff for now. it appears to be flaky
* fixed migrations
* use casefold() instead of lower()
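A short illustration of why `casefold()` is the safer choice for case-insensitive column matching, using made-up column names rather than anything from this change: `casefold()` applies full Unicode case folding, while `lower()` leaves some characters unchanged.

```python
# Hypothetical column names that differ only by case and by the German sharp s.
left_column = "STRASSE"
right_column = "straße"

# lower() keeps the sharp s, so the names still differ.
match_with_lower = left_column.lower() == right_column.lower()

# casefold() folds the sharp s to "ss", so the names compare equal.
match_with_casefold = left_column.casefold() == right_column.casefold()
```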
* - implemented utils.get_test_case_param_value
- fixed params for case sensitive column
* handle bool test case parameters
* format
* testing
* format
* list -> List
* list -> List
* - change caseSensitiveColumns default to false
- added migration to stay backward compatible
* - removed migration files
- updated logging message for table diff migration
* changed bool test case parameters default to always be false
* format
* docs: data diff
- added the caseSensitiveColumns parameter
requires: https://github.com/open-metadata/OpenMetadata/pull/18115
* fixed test_get_bool_test_case_param
* ref(profiler): redshift system metrics
- moved redshift system metrics to the redshift source module
- use Timestamp in data quality
- added plugin feature to test utils
* use timezone.utc
* format
* reverted unintended snowflake changes
* fixed import test_system_metrics.py
* revert
* fixed import in tests
* GEN-1322: API Entity - Remove Beta
* minor: add doc for the metadata pipeline
* api service refactor
* api service refactor backend changes
* add apiconnection in test service connection
* pytest fix
* fix java file formatting
* Fix casing of REST in ApiServiceRest.spec.ts
* Refactor REST to Rest in API classes
* minor change
* minor change
* minor change
* fix casing for API to Api
* add playwright test for api service ingestion
* fix: playwright test
---------
Co-authored-by: harshsoni2024 <harshsoni2024@gmail.com>
* Issue-15768: Support Metric Entity
* Issue-15768: Support Metric Entity
* Issue-15768: Support Metric Entity
* Fix tests
* Fix tests
* Fix tests
* Minor: Fix tests
* ui: add metricsAPI rest utils
* ui: metric list page part 1
* feat: Add metric translations for multiple languages
* chore: Add "metric" field to SearchIndexingApplication schema
* ui: add create metric page
* ui: metric details page patch 1
* ui: add custom property and lineage support for metric entity
* ui: add expression component
* ui: add metric summary component
* chore: Update tab labels in MetricDetails and MetricVersion components
* ui: show other info like metric type, granularity, etc
* feat: Add support for metric entity in search dropdown
* feat: Rename custom property to Metric in MetricEntity.md
* feat: Add OwnerLabel component to MetricListPage
* Fix expression field in Metric
* chore: update expression to metricExpression
* ui: add metric header component with edit option
* Add metric to SearchIndexApp
* chore: Update expression to metricExpression
* ui: allow metric expression edit
* ui: update metric icon
* minor improvements
* Fix lineage indexing for Metric
* Update GlobalSettingsClassBase.ts to use MetricIcon for metrics in the global settings menu
* Fix error handling in MetricListPage component
* add related metrics
* minor improvements
* Fix relatedTerms patch
* Fix relatedTerms validation
* Add Boolean for deleted
* filter active entity from related metric list
* playwright e2e part 1
* Refactor MetricSummary component to include RelatedMetrics in the summary panel
* test: add playwright test for metric special cases
* Add 'Metrics' to Explore Tree
* test: add e2e for add metric page
* test: add test for metric listing page content
* Add Boolean for deleted, remove deleted from suggests
* Refactor LineageProvider to handle deleted flag properly
* add playwright for metric listing
* fix test
* Add colored metric icon and update its usage in GlobalSettingsClassBase
* Fixed py_test test_ometa_endpoint for metric
---------
Co-authored-by: Sachin Chaurasiya <sachinchaurasiyachotey87@gmail.com>
Co-authored-by: mohitdeuex <mohit.y@deuexsolutions.com>
Co-authored-by: Chirag Madlani <12962843+chirag-madlani@users.noreply.github.com>
Co-authored-by: SumanMaharana <sumanmaharana786@gmail.com>
Co-authored-by: Ashish Gupta <ashish@getcollate.io>
* support side effects on source classes by always importing source class
* streamlined error message
* fixed service type extraction for test suite pipeline
* - replaced "custom" with constant
- added quotes for the plugin exception for copy/paste ergonomics
* tests(datalake): use minio
1. use minio instead of moto for mimicking s3 behavior.
2. removed moto dependency as it is not compatible with aiobotocore (https://github.com/getmoto/moto/issues/7070#issuecomment-1828484982)
* - moved test_datalake_profiler_e2e.py to datalake/test_profiler
- use minio instead of moto
* fixed tests
* fixed tests
* removed default name for minio container
* configure api service metadata
* add rest api service
* fix test connection, pyformat changes
* add models, fix test connection
* improve test connection
* add docs, side doc
* fix model data parse, url error fix
* add tests
* fix pytest errors
---------
Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
* fix(profiler): snowflake
resolve tables using the snowflake engine instead of OpenMetadata
* added env for cleaning up dbs in E2E
* moved system metric method to profiler. all the rest stays in snowflake
* format
* revert unnecessary changes
* removed test for previous resolution method
* use shutdown39
* fix(bigquery): unquote and convert any escaped characters to their actual representations
* test: bigquery description with multiple lines
---------
Co-authored-by: Imri Paran <imri.paran@gmail.com>
* fix: Allow non-numeric numbers to be sent via JSON, replace NaN values with None in SQAProfilerInterface
Replace NaN values with None in the SQAProfilerInterface class to maintain database parity. NaN values will be cast to null in OpenMetadata. This change ensures that data handling processes account for this conversion.
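The replacement described above can be sketched as follows. This is an illustrative stand-in, not the SQAProfilerInterface code: the helper `nan_to_none` and the sample profile row are assumptions.

```python
import math


def nan_to_none(value):
    """Return None for float NaN so it serializes as JSON null; pass other values through."""
    if isinstance(value, float) and math.isnan(value):
        return None
    return value


# Hypothetical profiler output row containing a NaN metric.
row = {"mean": float("nan"), "count": 3, "name": "col"}
cleaned = {key: nan_to_none(value) for key, value in row.items()}
```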
* fix: histogram overflow error
* test: Add Unit Test for Null and Null Ratio Metric
* chore: Address comments
* chore: Address comments
* fix: checkstyle and message
* fix: failing tests as null count works as expected
* fixes arrayDataType must be not null, adding db name to queries as it fails
* Fix Pydantic Issue
* Partial: Add Unity Catalog Topology Test
* Fix lint
* Fix Tests, Fix UnityCatalog Array Column issue
* Fix Tests
* Address comments, add logger to the exception
* introduce gitlab option to lookml ingestion
* fix reader and disable test
* fix copy paste in test case
* fix file read and keyset pagination for tree
* fix credentials to include gitlab credentials
* uncomment arguments for unused credentials to fix validation error
* fix credentials test
* fix credentials test
---------
Co-authored-by: Sriharsha Chintalapani <harshach@users.noreply.github.com>