1212 Commits

Author SHA1 Message Date
Ayush Shah
17ffdf9850
fix: modify fqn to allow quotes with dots (#18719) 2024-11-22 09:33:50 +05:30
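To illustrate what quoting names that contain dots involves, here is a minimal Python sketch. The helpers `quote_name`, `build_fqn` and `split_fqn` are hypothetical and are not the functions touched by this commit. The sketch assumes that a part containing a dot is wrapped in double quotes and that parts contain no double quotes themselves.

```python
from typing import List


def quote_name(name: str) -> str:
    """Wrap a name in double quotes when it contains the separator."""
    return f'"{name}"' if "." in name else name


def build_fqn(*parts: str) -> str:
    """Join parts with dots, quoting any part that itself contains a dot."""
    return ".".join(quote_name(part) for part in parts)


def split_fqn(fqn: str) -> List[str]:
    """Split on dots that sit outside double quotes (assumed quoting rule)."""
    parts: List[str] = []
    current: List[str] = []
    in_quotes = False
    for ch in fqn:
        if ch == '"':
            # Toggle quoted state; the quote character itself is dropped.
            in_quotes = not in_quotes
        elif ch == "." and not in_quotes:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts
```

Under these assumptions, building and then splitting a name returns the original parts, even when one of them contains a dot.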
Pere Miquel Brull
61021be98a
TEST - Add autoClassification for e2e (#18722) 2024-11-21 15:07:04 +01:00
IceS2
35ce7d7602
MINOR: Update Glossary Term tests (#18698)
* Update Glossary Term tests

* Remove unused code

* Fix test
2024-11-20 15:24:53 +01:00
Pere Miquel Brull
c68a45e7d8
Create new Auto Classification Workflow (#18610) 2024-11-19 08:10:45 +01:00
Ayush Shah
6f1df37ba1
Fixes GEN-1260: Add Validators while creating table to escape special characters (#18456) 2024-11-18 15:02:57 +05:30
Sriharsha Chintalapani
88c8fb48f3
Add Edit glossary terms, Edit Tier , Edit Tags as separate permissions (#18331)
* Add EditGlossaryTerms Permission

* Fix #18330: Add EDIT_GLOSSARY_TERM permission and enforce EDIT_TIER permission

* add edit glossary term permission check in UI

* revert EDIT_GLOSSARY_TERMS operation

* Add EDIT_GLOSSARY_TERMS to common operations

* Add EDIT_TIER to common operations

* add default empty array for tags field, as patch calls can run into issues

* Fix tests

* Fix tests

* added glossary terms

* fix conflicts

* fix permission check for data model

* Add EditGlossaryTerms to DataConsumerPolicy

* Add EditGlossaryTerms,EditTier to DataConsumerPolicy

* fix tests

* Fix migrations for EditTier,EditGlossaryTerms

* add edit tier permission to data consumer

* Fix tests

* fix pytests

* missing test_dbt.py

---------

Co-authored-by: karanh37 <karanh37@gmail.com>
Co-authored-by: Karan Hotchandani <33024356+karanh37@users.noreply.github.com>
Co-authored-by: ulixius9 <mayursingal9@gmail.com>
2024-11-15 10:50:15 -08:00
Suman Maharana
a218bbf5cb
Minor: Fix Mysql cli Update table count (#18582) 2024-11-15 14:27:02 +05:30
mgorsk1
3d2dfeb583
feat: use native trino client authentication classes (#16196)
---------

Co-authored-by: ulixius9 <mayursingal9@gmail.com>
2024-11-15 12:54:42 +05:30
Imri Paran
bde6ee4125
MINOR: Data diff sample fix (#18632)
* fix(data-diff): sampling configuration

handle the sampling condition separately for the two tables, which allows sampling on columns whose names differ in case

* format
2024-11-15 08:22:13 +01:00
Mayur Singal
75d417d267
MINOR: Fix user search - exclude bots (#18645) 2024-11-14 21:25:21 +05:30
Akash Verma
fa30be0589
fix #17726 Databricks schema name with hyphen issue (#18598) 2024-11-14 20:02:20 +05:30
harshsoni2024
cd3fcb5d22
MINOR: quicksight e2e fix (#18629) 2024-11-14 16:31:11 +05:30
Ayush Shah
6fa03ee66a
Fixes GEN-1994: Remove View Lineage from Metadata Ingestion flow (#18558) 2024-11-13 00:08:55 +05:30
Mayur Singal
f4fdafeb8a
MINOR: Athena & Tableau E2E fix (#18596) 2024-11-12 19:14:45 +05:30
Imri Paran
70c7880dfa
fixed bigquery system metrics e2e test (#18601) 2024-11-12 14:06:54 +01:00
Teddy
45d27a377d
GEN 1184 - Added Workflow Classification and Metric LevelConfig (#18572) 2024-11-11 15:59:42 +01:00
Imri Paran
a6d97b67a8
MINOR: fix system profile return types (#18470)
* fix(redshift-system): redshift return type

* fixed bigquery profiler

* fixed snowflake profiler

* job id action does not support matrix. using plain action summary.

* reverted gha change
2024-11-11 10:49:42 +01:00
Imri Paran
cdaa5c10af
[GEN-1996] feat(data-quality): use sampling config in data diff (#18532)
* feat(data-quality): use sampling config in data diff

- get the table profiling config
- use hashing to sample deterministically the same ids from each table
- use dirty-equals to assert results of stochastic processes

* - reverted missing md5
- added missing database service type

* - use a custom substr sql function

* fixed nonce

* added failure for mssql with sampling because it requires a larger change in the data-diff library

* fixed unit tests

* updated range for sampling
2024-11-11 10:07:23 +01:00
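The idea of "use hashing to sample deterministically the same ids from each table" can be sketched in plain Python. This is only an illustration under stated assumptions, not the code from this commit: the function names `in_sample` and `sample_keys`, the `salt` parameter and the 8-hex-digit bucket are all hypothetical choices.

```python
import hashlib
from typing import Iterable, List


def in_sample(key: str, percent: float, salt: str = "") -> bool:
    """Return True when the key's hash falls inside the sampled bucket."""
    digest = hashlib.md5(f"{salt}{key}".encode("utf-8")).hexdigest()
    # Interpret the first 8 hex digits as an integer in [0, 16**8).
    bucket = int(digest[:8], 16)
    return bucket < (percent / 100.0) * 16**8


def sample_keys(keys: Iterable[str], percent: float, salt: str = "") -> List[str]:
    """Keep only the keys selected by the hash; row order does not matter."""
    return [key for key in keys if in_sample(key, percent, salt)]
```

Because the decision depends only on the key and the salt, the same ids are selected from both tables regardless of row order, so the sampled rows stay comparable.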
Mayur Singal
efed932d97
Mask SQL Queries in Usage & Lineage Workflow (#18565) 2024-11-11 11:44:47 +05:30
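As a rough picture of what masking a query means, the sketch below replaces string and numeric literals with a placeholder. It is a regex-based approximation written for illustration only; `mask_query` and both patterns are hypothetical and do not describe how this commit implements masking.

```python
import re

# Single-quoted string literals, allowing '' as an escaped quote.
_STRING_LITERAL = re.compile(r"'(?:[^']|'')*'")
# Standalone integer or decimal literals (not digits inside identifiers).
_NUMBER_LITERAL = re.compile(r"\b\d+(?:\.\d+)?\b")


def mask_query(sql: str) -> str:
    """Replace literal values with '?' so the query shape is kept."""
    masked = _STRING_LITERAL.sub("?", sql)
    return _NUMBER_LITERAL.sub("?", masked)
```

A regex cannot handle every SQL dialect (comments and dollar-quoted strings, for example), so a real implementation would more likely rely on a SQL parser.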
Mayur Singal
b02c64931e
MINOR: Fix table not found error (#18560) 2024-11-09 20:33:32 +05:30
Imri Paran
b92b950060
Fix 18434: feat(statistics-profiler): use statistics tables to profile trino tables (#18433)
* feat(statistics-profiler): use statistics tables to profile trino tables

- implemented the collaborative root class
- added the "useStatistics" profiler parameter
- added the "supportsStatistics" database connection property
- implemented the ProfilerWithStatistics and StoredStatisticsSource to add this functionality to specific profilers
- implemented TrinoStoredStatisticsSource for specific trino statistics logic

* added ABC to terminal classes in collaborative root

* fixed docstring for TestSuiteInterface

* reverted unintended changes

* typo
2024-11-07 18:37:31 +01:00
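The "collaborative root class" mentioned above can be read as the usual cooperative `super()` pattern in Python. The sketch below shows that general pattern only. All class names (`Root`, `BaseProfiler`, `StatisticsMixin`, `TrinoProfiler`) and parameters are hypothetical and are not the classes added in this commit.

```python
class Root:
    """Terminal class of the cooperative chain: it stops the forwarding."""

    def __init__(self, *args, **kwargs):
        # Deliberately does not forward to object.__init__, so keyword
        # arguments left over from sibling classes do not raise TypeError.
        pass


class BaseProfiler(Root):
    def __init__(self, table: str, **kwargs):
        # Take what this class needs and pass the rest along the MRO.
        super().__init__(**kwargs)
        self.table = table


class StatisticsMixin(Root):
    def __init__(self, use_statistics: bool = False, **kwargs):
        super().__init__(**kwargs)
        self.use_statistics = use_statistics


class TrinoProfiler(StatisticsMixin, BaseProfiler):
    """Combines both behaviours without writing its own constructor."""


profiler = TrinoProfiler(table="orders", use_statistics=True)
```

Each constructor consumes its own keyword arguments and forwards the remainder, so a mixin can be added to a specific profiler without changing the other classes.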
Teddy
d579008c99
GEN 1683 - Add Column Value to be At Expected Location Test (#18524)
* feat: added column value to be in expected location test

* fix: renamed value -> values

* doc: added 1.6 documentation entry

* style: ran python linting

* fix: move data packaging to pyproject.yaml

* fix: add init file back for data package

* fix: failing test case
2024-11-06 11:17:13 +01:00
IceS2
dccba20101
Return s3 endpoint as str() instead of Url (#18521) 2024-11-05 17:39:50 +00:00
Katarzyna Kałek
47c75fe6a7
Enhanced Glue ingestion with external table features (#18511)
* added fileFormat, locationPath and external table lineage to Glue ingestion

* Improve Lineage Label

---------

Co-authored-by: Katarzyna Kałek <kkalek@olx.pl>
Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>
2024-11-05 21:48:20 +05:30
Imri Paran
84391e7078
MINOR: tests: fix Tuple in bigquery e2e cli (#18499)
* tests: fix Tuple in bigquery e2e cli

* tests: fix Tuple in bigquery e2e cli

* fix workflow condition
2024-11-04 09:54:10 -08:00
Nicola Coretti
7ebc62dca7
feat: Add support for exasol datasource (#17166)
* Add flake.nix

* Add lockfile for flake

* Update nix environment and document usage

* Add schema for exasol connector

* Add Exasol definitions to databaseService

* Fix error in exasol connector schema

* Add additional connection options/settings to exasol connector

* Add exasol-connector to ui

* Add dependencies for exasol-connector

* Update notes

* Update ingestion code

* Add Basic Documentation for Exasol Connector

* Update flake file

* Add developer notes

* Add python script which can be used as entry point for debugging in ide

* Add config file which can be used for debugging (manual execution)

* Update debug script

* Update developer notes

* Remove old developer notes

* Add .venv to gitignore

* Update dev notes

* Update development notes

* Update ExasolSource

* Establish basic connection to Exasol DB from connector

* Update exasol connector connection settings

* Add service_spec for exasol plugin

* Remove development files

* Remove unused module

* Applied code formatter

* Update exasol dependency constraint(s)

* Add unit test for exasol connection url(s)

* Fixed test expectations for exasol connection url test(s)

* Adjust the test query for the Exasol connection test
2024-10-31 08:11:30 +01:00
Mayur Singal
9d91325af8
Lineage-1: Move view lineage processing to lineage workflow (#18220)
Co-authored-by: Sachin Chaurasiya <sachinchaurasiyachotey87@gmail.com>
2024-10-28 18:18:22 +05:30
Imri Paran
95982b9395
[GEN-356] Use ServiceSpec for loading sources based on connectors (#18322)
* ref(profiler): use di for system profile

- use source classes that can be overridden in system profiles
- use a manifest class instead of factory to specify which class to resolve for connectors
- example usage can be seen in redshift and snowflake

* - added manifests for all custom profilers
- used super() dependency injection in order for system metrics source
- formatting

* - implement spec for all source types
- added docs for the new specification
- added some pylint ignores in the importer module

* remove TYPE_CHECKING in core.py

* - deleted valuedispatch function
- deleted get_system_metrics_by_dialect
- implemented BigQueryProfiler with a system metrics source
- moved import_source_class to BaseSpec

* - removed tests related to the profiler factory

* - reverted start_time
- removed DML_STAT_TO_DML_STATEMENT_MAPPING
- removed unused logger

* - reverted start_time
- removed DML_STAT_TO_DML_STATEMENT_MAPPING
- removed unused logger

* fixed tests

* format

* bigquery system profile e2e tests

* fixed module docstring

* - removed import_side_effects from redshift. we still use it in postgres for the orm conversion maps.
- removed leftover methods

* - tests for BaseSpec
- moved get_class_path to importer

* - moved constructors around to get rid of useless kwargs

* - changed test_system_metric

* - added lineage and usage to service_spec
- fixed postgres native lineage test

* add comments on collaborative constructors
2024-10-24 07:47:50 +02:00
Pere Miquel Brull
5074f6588f
MINOR - Validate app runner init (#18316) 2024-10-18 09:40:06 +02:00
Pere Miquel Brull
7012e73d75
GEN-1166 - Improve Ingestion Workflow Error Summary (#18280)
* GEN-1166 - Improve Ingestion Workflow Error Summary

* fix test

* docs

* comments
2024-10-16 18:15:50 +02:00
Imri Paran
be82086e25
MINOR: add column case sensitivity parameter (#18115)
* fix(data-quality): table diff

- added handling for case-insensitive columns
- added handling for different numeric types (int/float/Decimal)
- added handling of boolean test case parameters

* add migrations for table diff

* add migrations for table diff

* removed cross type diff for now. it appears to be flaky

* fixed migrations

* use casefold() instead of lower()

* - implemented utils.get_test_case_param_value
- fixed params for case sensitive column

* handle bool test case parameters

* format

* testing

* format

* list -> List

* list -> List

* - change caseSensitiveColumns default to false
- added migration to stay backward compatible

* - removed migration files
- updated logging message for table diff migration

* changed bool test case parameters default to always be false

* format

* docs: data diff

- added the caseSensitiveColumns parameter

requires: https://github.com/open-metadata/OpenMetadata/pull/18115

* fixed test_get_bool_test_case_param
2024-10-15 16:29:43 +02:00
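The "use casefold() instead of lower()" step matters for case-insensitive column matching because `str.casefold()` normalizes more aggressively than `str.lower()`. A minimal sketch, where the helper `same_column` is a hypothetical name and not a function from this commit:

```python
def same_column(left: str, right: str) -> bool:
    """Compare two column names case-insensitively using casefold()."""
    return left.casefold() == right.casefold()


# lower() keeps the German sharp s, so these two spellings do not match...
lower_matches = "Straße".lower() == "STRASSE".lower()
# ...while casefold() maps it to "ss", so they do.
casefold_matches = same_column("Straße", "STRASSE")
```

For plain ASCII column names the two methods behave the same; the difference only shows up with characters that have special case-folding rules.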
Onkar Ravgan
e6705f25b3
fixed dbt tag name (#18273) 2024-10-15 16:43:03 +05:30
Onkar Ravgan
2ee015e426
Add array support for json schema parser (#18255) 2024-10-15 07:30:16 +05:30
Suman Maharana
dd08bc9ffd
GEN-895: Added Glue Pipeline Lineage (#18063) 2024-10-14 13:08:17 +05:30
Suman Maharana
69b34684b5
Fixed mysql E2E (#18229) 2024-10-11 10:49:03 +00:00
Imri Paran
d0ca05efbf
MINOR: add data quality tests (#18193)
* test: add dq tests

* wip

* fixed test_all_definition_exists
2024-10-11 12:07:58 +05:30
Pere Miquel Brull
f9e99f49e4
TEST - Add ES pagination with multiple filters (#18162)
* add filtering test

* adding test to paginate with filters
2024-10-10 17:14:22 +02:00
Imri Paran
68e71cb3dc
GEN-970: Refactor redshift system metrics to support freshness test (#17981)
* ref(profiler): redshift system metrics

- moved redshift system metrics to the redshift source module
- use Timestamp in data quality
- added plugin feature to test utils

* use timezone.utc

* format

* reverted unintended snowflake changes

* fixed import test_system_metrics.py

* revert

* fixed import in tests
2024-10-10 08:32:07 +02:00
Sachin Chaurasiya
457f3d919a
GEN-1322: API Entity - Remove Beta (#17967)
* GEN-1322: API Entity - Remove Beta

* minor: add doc for the metadata pipeline

* api service refactor

* api service refactor backend changes

* add apiconnection in test service connection

* pytest fix

* fix java file formatting

* Fix casing of REST in ApiServiceRest.spec.ts

* Refactor REST to Rest in API classes

* minor change

* minor change

* minor change

* fix casing for API to Api

* add playwright test for api service ingestion

* fix: playwright test

---------

Co-authored-by: harshsoni2024 <harshsoni2024@gmail.com>
2024-10-08 14:39:55 +05:30
Ethan
49fceb4674
Fixes #18104: replace deprecated parse_obj and assertEquals (#18105)
* change deprecationwarning

* fix format python

* fix replace module

* change : java function name
2024-10-07 09:02:41 +02:00
Imri Paran
71720ebc51
fix(table-diff): support cross database (#18085)
fixed table diff url to include database in all cases
2024-10-04 15:31:17 +02:00
Onkar Ravgan
23c6f1a6c1
AlationSink conn improvements (#18091) 2024-10-03 16:20:35 +05:30
Suman Maharana
bc6f4824ea
Added DBT tests with versionless and fixed v7 parsing (#18028) 2024-09-27 19:53:27 +05:30
sam-mccarty-mavenclinic
0dd3e97170
Fix 17911: Looker parsing improvements for liquid templating and view/model aliasing (#17912)
* Looker parsing improvements for liquid templating and view/model aliasing

* add python-liquid dependency to looker plugin requirements

* move to static method with 'openmetadata' context and add rendering tests

* remove backtick stripping

---------

Co-authored-by: Imri Paran <imri.paran@gmail.com>
2024-09-27 13:55:15 +02:00
Pere Miquel Brull
d26449576a
GEN-1234 - Clean up suggestions when a user is deleted (#17988)
* GEN-1234 - Clean up suggestions when a user is deleted

* add method

* add method

* fix postgres query
2024-09-26 16:22:36 +02:00
Imri Paran
25284e0232
MINOR: fix snowflake system metrics (#17989)
* fix snowflake system metrics

* format

* add link to logs and commit
fixed the dq cli test

* reverted bad formatting

* fixed models.py

* removed version pinning for data diff in tests
2024-09-26 11:55:17 +00:00
Suman Maharana
37b6dc8290
Add Sigma Dashboard Connector (#17855)
* Add Sigma Dashboard Connector

* changed to id instead of name in dashboard entity

* Address Comments

* addressed comments

* Added Docs

* yaml file changes

* fix ui changes
2024-09-26 16:29:35 +05:30
IceS2
d36f01abf6
Fix tearDown by using the proper file loader (#17994) 2024-09-25 17:37:56 +02:00
Pere Miquel Brull
4cccaae446
GEN-996 - Allow PII Processor without storing Sample Data (#17927)
* GEN-996 - Allow PII Processor without storing Sample Data

* fix import

* fix import
2024-09-20 16:05:29 +02:00
Pere Miquel Brull
1e56c76c0e
FIX #17896 - Python lineage SDK to work with Uuid & FQN models (#17928) 2024-09-20 10:37:41 +02:00