377 Commits

Author SHA1 Message Date
Mayur Singal
5140203311
Fix #16692: Override Lineage Support for View & Dashboard Lineage (#17064) 2024-07-22 20:42:38 +05:30
Imri Paran
d59b83f9d1
MINOR[GEN-978]: Fix empty test suites (#16975)
* tests: refactor

refactor tests and consolidate common functionality in integrations.conftest

this enables writing tests more concisely.
demonstrated with postgres and mssql.
will migrate more

* format

* removed helpers

* changed scope of fixtures

* added profiler test for mssql

* fixed import in data_quality test

* json safe serialization

* format

* set MARS_Connection

* fix(data-quality): empty test suite

do not raise for empty test suite

* format

* don't need to check length in _get_test_cases_from_test_suite

* fix

* added warning if no test cases are found
2024-07-19 12:12:34 +02:00
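A minimal sketch of the behaviour described in d59b83f9d1, where an empty test suite produces a warning instead of an exception. The function name `get_test_cases` and the dict-shaped suite are hypothetical and are not taken from the project's code.

```python
import logging

logger = logging.getLogger("data_quality_sketch")


def get_test_cases(test_suite: dict) -> list:
    """Return the suite's test cases; warn (do not raise) when there are none."""
    # Hypothetical shape: a suite is a dict with an optional "tests" list.
    test_cases = test_suite.get("tests") or []
    if not test_cases:
        logger.warning(
            "No test cases found for test suite %s", test_suite.get("name")
        )
    return test_cases
```

Callers can then iterate over the returned list without a separate length check, since an empty suite simply yields no work.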
Imri Paran
0fee79b200
MINOR: fix sample data issue with Pydantic v2 and refactor python integration tests (#16943)
* tests: refactor

refactor tests and consolidate common functionality in integrations.conftest

this enables writing tests more concisely.
demonstrated with postgres and mssql.
will migrate more

* format

* removed helpers

* changed scope of fixtures

* added profiler test for mssql

* fixed import in data_quality test

* json safe serialization

* format

* set MARS_Connection

* use SerializableTableData instead of TableData

* deleted file test_postgres.py

* fixed tests

* added more test cases

* format

* changed name test_models.py

* removed the logic for serializing table data

* wip

* changed mapping in common type map

* reverted TableData imports
2024-07-17 08:11:34 +02:00
Pere Miquel Brull
2aef457785
FIX #16481 - Truncate ingestion pipeline status (#16997)
* FIX #16481 - Truncate ingestion pipeline status
2024-07-12 09:44:21 +02:00
Mayur Singal
4eadcfdc5d
Fix #16590: Allow only team groups to be owner (#16995) 2024-07-11 14:17:13 +05:30
Mayur Singal
afafb4af92
MINOR: Add support for s3 unstructured files (#16936) 2024-07-08 15:24:39 +05:30
Imri Paran
d08af1f86d
MINOR: Fix data diff with threshold (#16926)
* fix: table-diff

The threshold and diff count were passed in the wrong order. The test did not cover this because of how its parameters were configured.
2024-07-05 07:51:24 +02:00
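One way to guard against the kind of argument-order bug fixed in d08af1f86d is to make both values keyword-only. This is a sketch under assumptions: the function name `diff_passes` and the `<=` comparison are illustrative and are not the project's implementation.

```python
def diff_passes(*, diff_count: int, threshold: int) -> bool:
    """Pass when the number of differing rows does not exceed the threshold.

    Both parameters are keyword-only, so a call site cannot silently swap them.
    """
    return diff_count <= threshold
```

A test that uses distinct values for the two parameters, such as `diff_count=5, threshold=3`, would also expose a swap that a test with equal values cannot.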
Pere Miquel Brull
7e98ece3e5
MINOR - Pydantic V2 warnings and better exception msg (#16916) 2024-07-04 14:54:41 +02:00
Imri Paran
9b5ce3560c
MINOR: Pydantic equal assert util (#16918)
* tests: pydantic object assertion

added util for comparing pydantic objects

* fixed test_data_diff.py
2024-07-04 09:59:46 +05:30
Onkar Ravgan
00d74d1776
Fix #15721: Added Override flag to force update of Description, Tags and Owner from Source System (#16815) 2024-07-03 11:48:06 +05:30
Imri Paran
2c9aeebcb8
MINOR: add column diff for table diff test case (#16809)
* feat(table-diff): added column validation

added column validation for table diff that will be carried out before running the row level diff. If a diff for the column exists, it will short circuit the test and report.

* fixed unit tests

* format

* - resolve column types more robustly
- changed test result metric to include "rows" or "columns"
2024-07-02 10:36:03 +00:00
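The short circuit described in 2c9aeebcb8 can be sketched as follows: column validation runs first, and the row-level diff runs only when the columns match. The names `changed_columns` and `run_table_diff`, and the schema-as-dict representation, are assumptions for illustration only.

```python
from typing import Callable, Dict, List, Tuple


def changed_columns(left: Dict[str, str], right: Dict[str, str]) -> List[str]:
    """Columns that are missing on one side or whose types differ."""
    return sorted(
        name
        for name in left.keys() | right.keys()
        if left.get(name) != right.get(name)
    )


def run_table_diff(
    left_schema: Dict[str, str],
    right_schema: Dict[str, str],
    row_diff: Callable[[], int],
) -> Tuple[str, int]:
    """Return ("columns", n) or ("rows", n), checking columns first."""
    changed = changed_columns(left_schema, right_schema)
    if changed:
        # Short circuit: the (possibly expensive) row diff is never called.
        return ("columns", len(changed))
    return ("rows", row_diff())
```

Passing the row diff as a callable keeps the expensive comparison lazy, so it costs nothing when the schemas already disagree.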
Imri Paran
404b40ad53
Fix #16700: Fail ingestion gracefully when column is not compatible with test type (#16806)
* fix(data-quality): incompatible columns

gracefully fail when a column of incompatible type is submitted for a test case

* format

* added condition to handle only column test cases

* fixed tests

* format
2024-07-02 09:56:35 +02:00
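A rough sketch of the graceful failure described in 404b40ad53: an incompatible column type yields an aborted result with a message instead of an exception that stops ingestion. `CaseResult`, `check_column_max`, the status strings, and the `NUMERIC_TYPES` set are all hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence

# Assumed set of type names that a numeric test can handle.
NUMERIC_TYPES = {"INT", "BIGINT", "FLOAT", "DOUBLE", "DECIMAL"}


@dataclass
class CaseResult:
    status: str
    message: str = ""


def check_column_max(
    column_name: str, column_type: str, values: Sequence, max_allowed: float
) -> CaseResult:
    """Check that all values are <= max_allowed, aborting on non-numeric columns."""
    if column_type.upper() not in NUMERIC_TYPES:
        # Report the problem and return, rather than raising.
        return CaseResult(
            "Aborted",
            f"Column {column_name} of type {column_type} "
            "is not compatible with this test",
        )
    passed = all(value <= max_allowed for value in values)
    return CaseResult("Success" if passed else "Failed")
```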
Ayush Shah
fe04b0a201
Fixes #16435: Fix pyodbc error for mssql (#16800) 2024-06-27 15:46:06 +05:30
Ayush Shah
527b714d34
Fixes #16760: Remove maxLength for tagFQN (#16794) 2024-06-26 21:14:07 +05:30
Mayur Singal
bd2bf4a044
MINOR: Include default values in custom pydantic model (#16795) 2024-06-26 20:31:02 +05:30
Imri Paran
5e5c811ef2
moved int_admin_ometa to a dedicated module (#16768) 2024-06-25 11:21:22 +05:30
Imri Paran
54ca82f64d
MINOR: raise lineage error when table does not exist (#16756)
* raise lineage error when table does not exist

* added test case for partial success

* format

* format

* fixed tests
2024-06-24 21:41:59 +05:30
Teddy
141ceb4c8d
MINOR: add common test elements to _openmetadata_testutils module (#16758)
* fix: add common test to _testutils module

* fix: renamed _testutils to _openmetadata_testutils
2024-06-21 15:11:34 +02:00
Onkar Ravgan
ceaf4bf08a
MINOR: Add method to list custom properties for an entity for python sdk (#16753)
* List custom properties for an entity

* added test

* fixed test
2024-06-21 16:34:49 +05:30
Matt Chamberlin
ac6ddbf6c4
MINOR: support JSONL datalake file types (#16614)
* fix: support JSONL datalake file types

* add jsonl zip file types

* update fileFormat enum in table schema

* add tests

* fix test data ref

* reformat

* fix tests

---------

Co-authored-by: Matthew Chamberlin <mchamberlin@ginkgobioworks.com>
Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>
2024-06-21 09:54:19 +02:00
Imri Paran
b960b60965
Fix #16421: add tableDiff test case (#16554)
* feat: add tableDiff test case

This change introduces a "table diff" test case which
compares two tables and fails if they are not identical.
The comparison is based on a specific "key" (because the test only makes sense when performed on ordered collections).

1. Added the `tableDiff` test definition.
2. Implemented a "runtime" parameters feature which injects additional parameters for the test at runtime.
3. Integration tests (because of course).

This feature was not tested end-to-end yet because "array" data

* pydantic v2

* format

* format

* format and added data diff to setup.py

* format

* fixed param issue which has type ARRAY

* fixed runtime_parameter_setter

* moved models to parent directory

* handle errors in table diff

* fixed issue with edit test case

* format

* added more details to pytest skip

* format

* refactor: Improve createTestCaseParameters function in DataQualityUtils

* fixed unit test

* removed unused fixture

* removed validator.py

* fixed tests

* added validate kwarg to tests_mixin

* removed "postgres" data diff extra as they interfere with psycopg2-binary

* fixed tests

* pinned tenacity for tests

* reverted tenacity pinning

* added ui support for test diff

* fixed dq cypress and added edit flow

* organized the test case

* added dialect support

* fixed tests

* option style fix

* fixed calculation for passing/failing rows

* restrict the tableDiff test to limited services

* set where to None if blank string

* fixed where clause

* fixed tests for where clause

* use displayName in place of name in edit form

* added docs for RuntimeParameterSetter

* fixed cypress

---------

Co-authored-by: Shailesh Parmar <shailesh.parmar.webdev@gmail.com>
2024-06-20 16:54:12 +02:00
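The keyed comparison described in b960b60965 can be illustrated with plain dictionaries. This is a sketch, not the `tableDiff` implementation: `keyed_diff` and the list-of-dicts row format are assumptions, and the key values are assumed to be unique and mutually sortable.

```python
from typing import Dict, Iterable, List

Row = Dict[str, object]


def keyed_diff(left: Iterable[Row], right: Iterable[Row], key: str) -> List:
    """Return the key values whose rows differ or exist on only one side."""
    left_by_key = {row[key]: row for row in left}
    right_by_key = {row[key]: row for row in right}
    return sorted(
        k
        for k in left_by_key.keys() | right_by_key.keys()
        if left_by_key.get(k) != right_by_key.get(k)
    )
```

Because rows are matched by key rather than by position, two tables holding the same rows in a different order compare as identical, and the tables would be considered identical exactly when the returned list is empty.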
IceS2
f0049853ec
FIXES 14885: Initial deltalake implementation for s3 (#16665)
* Initial deltalake implementation for s3

* Fix styles

* Fix test_amundsen

* Fix UnitTests

* Fix Checkstyle

* Fix integration tests due to datalake client refactor

* Fix unit tests

* Fix tests

* Fix Integration DeltaLake Storage test

* Skip delta storage integration test for python 3.8

* DeltaLake JSONSchema changes migrations

* Update import name

* Add some comments based on sonarcloud suggestions

* Update DeltaLake documentation

* Resolve some comments
2024-06-20 12:08:21 +05:30
Imri Paran
95d2d0f82f
skip mssql test for python (#16683) 2024-06-17 13:19:04 +00:00
Imri Paran
18206393e2
MINOR: added lineage ingestion test for mssql (#16436)
* added test case for SQL Server lineage

* reasons for failing tests
2024-06-17 08:56:28 +02:00
Ayush Shah
b3eae8c1b9
Minor: Fix Deprecated utcnow to timezone support (#16607) 2024-06-14 15:23:51 +05:30
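For context on b3eae8c1b9: `datetime.utcnow()` is deprecated as of Python 3.12 and returns a naive datetime, while `datetime.now(timezone.utc)` returns a timezone-aware one. A minimal sketch of the replacement pattern follows; the helper name `utc_now` is hypothetical.

```python
from datetime import datetime, timezone


def utc_now() -> datetime:
    """Timezone-aware replacement for the deprecated datetime.utcnow()."""
    return datetime.now(timezone.utc)
```

Aware and naive datetimes cannot be compared or subtracted directly, so call sites that mix the two need to be migrated together.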
Mayur Singal
e3fa340c8f
MINOR: Pydantic fixes for redshift & kafka (#16638) 2024-06-14 14:08:59 +05:30
Trs
fc9033b953
Fixes(ingestion/source/dbt): Handle None Type in get_tag_labels Function for DBT Metadata Processing (#16648)
* fix condition

* fix

* lint
2024-06-13 17:19:46 +05:30
Onkar Ravgan
38e2793705
MINOR: Enabled pbit file test (#16531)
* Enabled pbit file test

* Updated test

* update test

* fixed pylint
2024-06-07 17:59:45 +05:30
Pere Miquel Brull
cb72a22b59
Fix - e2e tests for pydantic V2 (#16551)
* Fix - e2e tests for pydantic V2

* add correct default

* revert datetime aware

* fix apis

* format
2024-06-06 19:36:17 -07:00
Mohit Yadav
9ec3d94e3b
[FIX] GlossaryTerm reviewers should be user or team only (#16372)
* add teams as reviewer

* Check Users to be reviewers

* Reviewers can be a team or user

* Fix check by id or name

* Review can be team or user both

* Validate Reviewers

* add multi select control

* - Fix Reviewers

* - Centralize Reviewer Relationship to EntityRepository

* - Sort

* add team as reviewer for glossary terms

* locales

* cleanup

* - Update Reviewer should remove existing reviewers

* fix selectable owner control

* fix code smells

* fix reviewer issue

* add glossary cypress

* fix patch issue on reviewers set to null

* update cypress tests

* fix cypress

* fix cypress

* fix reviewers in glossary task and supported cypress

* fix pytest

* Fix

* fix cypress

* fix code smells

* Inherited Reviewers need to be present always

* filter out inherited users

* fix cypress

* fix backend tests failure

* fix backend tests failure -checkstyle

* restrict owner to accept task in case of reviewer present

* fix pytest

---------

Co-authored-by: karanh37 <karanh37@gmail.com>
Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
Co-authored-by: Karan Hotchandani <33024356+karanh37@users.noreply.github.com>
Co-authored-by: Ashish Gupta <ashish@getcollate.io>
Co-authored-by: ulixius9 <mayursingal9@gmail.com>
Co-authored-by: sonikashah <sonikashah94@gmail.com>
2024-06-06 20:23:37 +05:30
Pere Miquel Brull
d8e2187980
#15243 - Pydantic V2 & Airflow 2.9 (#16480)
* pydantic v2

* pydanticv2

* fix parser

* fix annotated

* fix model dumping

* mysql ingestion

* clean root models

* clean root models

* bump airflow

* bump airflow

* bump airflow

* optionals

* optionals

* optionals

* jdk

* airflow migrate

* fab provider

* fab provider

* fab provider

* some more fixes

* fixing tests and imports

* model_dump and model_validate

* model_dump and model_validate

* model_dump and model_validate

* union

* pylint

* pylint

* integration tests

* fix CostAnalysisReportData

* integration tests

* tests

* missing defaults

* missing defaults
2024-06-05 21:18:37 +02:00
Imri Paran
067fb510ab
MINOR: test case for usage->delete->usage (#16409)
* test: added test case to demonstrate cache issue
2024-05-28 11:23:43 +02:00
Imri Paran
a4c516d2c7
Fixes 16305: Added Test Case for Matching Enum (#16362)
* Added Test Case for Matching Enum

1. Implemented the test case using the `matchEnum` parameter.
2. Added integration tests.
3. Added migrations.

* fix tests

* fixed tests

* format

* fixed tests

* clear search cache before running ingestion

* format

* changed scope of aws fixture

* moved migrations to 1.5.0
2024-05-28 09:30:30 +02:00
juntao
8dd613caa5
Fixes #16235: need quote fullyQualifiedName in Ingestion Framework (#16273)
* Fixes #16235: need quote fullyQualifiedName in Ingestion Framework

* MINOR: fix UT issue

* revert: fix UT issue

* revert code

* revert code

* format code
2024-05-23 17:45:47 +02:00
Imri Paran
d5bf30ccd3
MINOR: trino integration test (#16291)
* added trino integration test

* - removed warnings for classes which are not real tests
- removed "helpers" as its being used

* use a docker network instead of host

* print logs for hive failure

* removed superset unit tests

* try pinning requests for test

* try pinning requests for test

* wait for hive to be ready

* fix trino fixture

* - reduced testcontainers_config.max_tries to 5
- remove intermediate containers

* print with logs

* disable capture logging

* updated db host

* removed debug stuff

* removed debug stuff

* removed version pin for requests

* reverted superset

* ignore trino integration on python 3.8
2024-05-22 15:12:00 +00:00
Mayur Singal
89829949ce
MINOR: Fix flaky pytests (#16379) 2024-05-22 14:12:34 +05:30
Imri Paran
7af7f9322f
MINOR: skip pbit test (#16312)
* skip superset test in CI

* fixed pytest mark

* format

* format

* skip pbi

---------

Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2024-05-17 07:27:12 +02:00
Pere Miquel Brull
53185fd30b
MINOR - Add Integration Test for S3 Storage (#16277)
* MINOR - Add Integration Test for S3 Storage

* format

* format
2024-05-16 10:03:27 +02:00
Imri Paran
c277233ef1
MINOR: use archive instead of volume for postgres test (#16245)
* using archive instead of volume for postgres test

* format

* remove usage of request
2024-05-14 09:11:16 +00:00
Onkar Ravgan
4a6849a05d
MINOR: Added custom property EntityReference support to python sdk (#16132)
* Added cust prop entityref to python sdk

* Added name and displayName fields to entityref
2024-05-07 17:35:39 +05:30
Mohit Yadav
0769d71ee7
Fixes Test Suite Reference in Table Schema (#16129)
* Fixes Test Suite Reference in Table Schema

* fix: fix test suite to interact with entity reference

---------

Co-authored-by: Teddy Crepineau <teddy.crepineau@gmail.com>
2024-05-06 19:03:23 +05:30
IceS2
795879d776
MINOR: Fix issue with SQLAlchemy types not being correctly mapped to OM Type on the profiler (#16122)
* Fix issue with SQLAlchemy types not being correctly mapped to OM Types on the profiler

* Fix checkstyle
2024-05-03 17:05:52 +02:00
Onkar Ravgan
ceaa9d3e8a
Fix #15611 Parse PowerBI Dax files for lineage (#15975) 2024-04-29 14:55:06 +05:30
Onkar Ravgan
828e9abc97
Added enum support in custom prop python sdk (#16026) 2024-04-25 14:46:55 +05:30
Ayush Shah
3621407642
Fixes #15732: Modify Reference for Tags to EntityName (#15938) 2024-04-25 11:53:46 +05:30
Ayush Shah
0963a111fe
Fixes #12127: Add Support for Complex types of Databricks & UnityCatalog in profiler (#15976) 2024-04-23 15:54:36 +05:30
Pere Miquel Brull
df5d5e1866
MINOR - Fix datamodel lineage call (#15991)
* MINOR - Fix datamodel lineage call

* amend merge
2024-04-23 09:56:24 +02:00
Mayur Singal
85b6983eee
Fix #15062 & #14810: Fix Column level lineage overwrites pipeline Lineage & manual col lineage (#15897) 2024-04-23 09:37:43 +05:30
Teddy
449a5f2de3
FIX #11951 - ingestion logic for global profiler config (#15948)
* feat: add global metric configuration for the profiler

* style: ran java linting

* fix: renamed disable to disabled

* style: ran java linting

* feat: ometa sdk for profiler setting

* test: ingestion profiler global config tests

* fix: update metric name to use MetricType Enum

* fix: allow bot to retrieve settings

* fix: exclude GX artifacts

* feat: implement global profiler setting logic for ingestion side

* fix: exclude metrics if Metric is empty

* style: ran python linting

* style: ran python linting

* fix: skip empty metrics

* style: ran python linting

* fix: moved GET profiler config to separate endpoint in system resource

* fix: moved compute metric filter to MetricFilter + renamed container

* fix: test failures

* fix: profiler test case
2024-04-22 22:35:37 +02:00
Imri Paran
93ec391f5c
MINOR: Dynamodb sample data (#15264)
* feat(nosql-profiler): row count

1. Implemented the NoSQLProfilerInterface as an entrypoint for the nosql profiler.
2. Added the NoSQLMetric as an abstract class.
3. Implemented the interface for the MongoDB database source.
4. Implemented an e2e test using testcontainers.

* added profiler support for mongodb connection

* doc

* use int_admin_ometa in test setup

* - fixed linting issue in gx
- removed unused inheritance

* moved the nosql function into the metric class

* feat(profiler): add dynamodb row count

* feat(profiler): add dynamodb row count

* formatting

* validate_compose: raise exception for bad status code.

* fixed import

* format

* feat(nosql-profiler): added sample data

1. Implemented the NoSQL sampler.
2. Some naming changes to the NoSQL adaptor to avoid fixing names with the profiler interface.
3. Tests.

* added default sample limit

* formatting

* fixed import

* feat(profiler): dynamodb sample data

* tests for dynamo db sample data

* format

* format

* use service connection for nosql adaptor factory

* fixed tests

* format

* fixed after merge
2024-04-22 17:46:40 +02:00