428 Commits

Author SHA1 Message Date
Mayur Singal
34c43eaea0
MINOR: Fix pytests (#21807) 2025-06-17 23:44:29 +05:30
harshsoni2024
0f79d8ea1d
MINOR: pytest opt out flaky test (#21800)
* remove mlflow test until fixed

* alationsink test count fixed

* pylint fix gx
2025-06-17 14:23:28 +05:30
Pere Menal-Ferrer
44e09e41a2
Revert "FIX #1464 (#21520)" (#21726)
This reverts commit 1e86f9870fd663122b9bbb64f3cf17cf32619c7f.
2025-06-13 17:27:32 +02:00
IceS2
891ff4184d
MINOR: Initial implementation for our Connection Class (#21581)
* Initial implementation for our Connection Class

* Implement the Initial Connection class

* Add Unit Tests

* Fix Test

* Fix Profile Test Connection

* Remove unit test

* Remove comment

* Fix tests and missing changes
2025-06-13 14:52:29 +02:00
Teddy
c09a8b27ae
ISSUE #16676 - Add Tag to CreateTestCase (#21366)
* refactor: removed testSuite field from CreateTestCase

BREAKING CHANGE: when creating a test case, the test suite is now derived from entityLink (fetched or created)

* feat: allow setting tags when creating a test case

* style: ran linters

* fix: compiling error

* fix: failing test case

* fix: failing tests

* removed testSuite from required field

* fixed ui side

* style: ran java linting

* deprecation: remove testSuite param from ingestion

* fix: remove test suite field

* fix: remove test_suite field

---------

Co-authored-by: Shailesh Parmar <shailesh.parmar.webdev@gmail.com>
2025-06-11 09:59:08 +02:00
Pere Menal-Ferrer
1e86f9870f
FIX #1464 (#21520)
* Add PIICategoryTags and some utilities on top of them.

* Fix static-check

* Add test for fqn representation

* Add NEREntityGeneralTags.json from Collate

* Add test to check PIICategoryTags agree with the ones used by OM server

* Add LabelExtractor

* Fix style

* Add ignore superfluous-parens for pylint

* Add comment as per PR review

* Fix not-updated PII-IT

* Remove duplicated IT test for PII

---------

Co-authored-by: Pere Menal <pere.menal@getcollate.io>
Co-authored-by: Sriharsha Chintalapani <harshach@users.noreply.github.com>
2025-06-09 16:05:35 -07:00
Teddy
5078a2fbb9
DEPRECATION: Remove testCaseResults endpoint from testCaseResource (#21527)
* deprecation: remove testCaseResults endpoint from testCaseResource

* fix: path in e2e test

* fix: endpoint name to testCaseResults

* style: fix java linting
2025-06-07 21:02:54 +02:00
Teddy
2a120c166a
MINOR: Py failing test cases (#21437)
* fix: failing test cases

* fix: skip test for now
2025-05-28 17:52:32 +02:00
Pere Menal-Ferrer
ca812852d6
ci/nox-setup-testing (#21377)
* Make pytest use code from src rather than from the installed package

* Fix test_amundsen: missing None

* Update pytest configuration to use importlib mode

* Fix custom_basemodel_validation to check model_fields on type(values) to prevent noisy warnings

* Refactor referencedByQueries validation to use field_validator as per deprecation warning

* Update ColumnJson to use model_rebuild as the replacement for forward reference updates, as per deprecation warning

* Move superset test to integration test as they are using testcontainers

* Update coverage source path

* Fix wrong import.

* Add install_dev_env target to Makefile for development dependencies

* Add test-unit as extra in setup.py

* Modify dependencies in dev environment.

* Ignore all airflow tests

* Remove coverage in unit_ingestion_dev_env. Revert coverage source to prevent broken CI.

* Add nox for running unit test

* Fix PowerBI integration test to use pathlib for resource paths instead of os.getcwd, to prevent failures when not executed from the right path

* Move test_helpers.py to unit test, as it is not an integration test.

* Remove utils empty folder in integration tests

* Refactor testcontainers configuration to avoid pitfalls with max_tries setting

* Add nox unit testing basic setup

* Add format check session

* Refactor nox-unit and add plugins tests

* Add GHA for py-nox-ci

* Add comment to GHA

* Restore conftest.py file

* Clarify comment

* Simplify function

* Fix matrix strategy and nox mismatch

* Improve python version strategy with nox and GHA

---------

Co-authored-by: Pere Menal <pere.menal@getcollate.io>
2025-05-27 10:56:52 +02:00
Pere Menal-Ferrer
6ea630d7ef
DevEx: Ingestion development improvement (focus on unit testing) (#21362)
* Fix test_amundsen: missing None

* Fix custom_basemodel_validation to check model_fields on type(values) to prevent noisy warnings

* Refactor referencedByQueries validation to use field_validator as per deprecation warning

* Update ColumnJson to use model_rebuild as the replacement for forward reference updates, as per deprecation warning

* Move superset test to integration test as they are using testcontainers

* Add install_dev_env target to Makefile for development dependencies

* Add test-unit as extra in setup.py

* Skip failing IT test. Requires further investigation.
2025-05-26 10:38:17 +02:00
Pere Menal-Ferrer
5d2dfa712a
feature/pii-processor-improvement (#21248)
* Add PII Tag and Sensitivity Level enums.

* Add feature-extraction for PII classification tasks

* Add faker as test dependency

* Add unit tests for presidio tag extractor

* Add PIISensitivityTags enum and update sensitivity mapping logic

* Add Presidio utility functions for PII analysis

* Extend column name regexes for PII

* Add tests for PAN, NIF, SSN entities

* Fix version of faker to prevent flaky tests. Fix failing tests.

* Add Generated to State enum

* Integrate PIISensitive classifier to PIIProcessor
2025-05-19 17:52:17 +00:00
Suman Maharana
f81ee52ec4
Chore Ingestion Tableau library change (#21076) 2025-05-15 17:48:39 +05:30
Teddy
cd6434dd73
ISSUE #21146 - Properly handle connection on sampler (#21186)
* fix: properly close connection on sampler ingestion

* fix: dangling connection test

* style: ran python linting

* fix: revert to 9
2025-05-15 12:21:01 +02:00
Teddy
209793f315
MINOR - Add support for GX 1.4 (#20934)
* fix: add support for GX 0.18.22 and GX 1.4.x

* fix: add support for GX 0.18.22 and GX 1.4.x

* style: ran python linting

* fix: skip test if GX version is not installed
2025-04-24 11:55:04 +02:00
Mayur Singal
40ab1814c0
MINOR: Always Include DDL for Views (#20784) 2025-04-15 12:59:50 +05:30
Pere Miquel Brull
c38209c63b
FIX CL-#1427 - PATCH applies inherited owners (#20759)
* FIX CL-#1427 - PATCH applies inherited owners

* FIX CL-#1427 - PATCH applies inherited owners

* format
2025-04-13 06:56:33 +02:00
Mayur Singal
4a407f6d0d
MINOR: Implement column validation in lineage patch api (#20545) 2025-04-07 21:24:46 +05:30
Pere Miquel Brull
3186937cc2
MINOR - Update Auto Classification defaults for sample data & classif… (#20587)
* MINOR - Update Auto Classification defaults for sample data & classification

* fix tests
2025-04-07 15:56:57 +02:00
Mayur Singal
ee5d8eee8b
Revert "MINOR: Implement Column Validation in Lineage (#20544)" (#20658) 2025-04-07 17:13:35 +05:30
Imri Paran
f6441ad404
fix: trino data diff paths (#20457)
requires https://github.com/open-metadata/collate-data-diff/pull/6
2025-04-03 15:48:10 +02:00
Mayur Singal
7760663b22
MINOR: Change ingestion licence header (#20549) 2025-04-03 10:39:47 +05:30
Mayur Singal
7991715135
MINOR: Implement Column Validation in Lineage (#20544) 2025-04-02 17:40:40 +05:30
Imri Paran
663839bd85
test: assert dangling db connections (#20458)
added dangling connection assertions for mysql integration test
2025-04-02 08:38:17 +02:00
Pere Miquel Brull
c08273b4ad
MINOR: Allow loading ometa from env (#20511) 2025-03-31 12:06:33 +02:00
Mayur Singal
e6b7b89f86
Fix #20236: Handle Sample Data with non-utf8 characters (#20380) 2025-03-27 14:20:26 +05:30
Ayush Shah
7a3990f350
Fixes 19119: Enhance TableCustomSQLQueryValidator to support threshold operation (#20307) 2025-03-27 13:11:56 +05:30
Mayur Singal
fb3ba391ff
MINOR: Fix failing pytest (#20332) 2025-03-19 12:35:37 +05:30
fuzmish
7fa3e53403
Fix: Pass raw value of extraHeaders to ClientConfig (#19989) 2025-03-18 13:55:51 +05:30
Pere Miquel Brull
55d7e50441
MINOR - Add and remove data products Actions in Automator (#19948)
* MINOR - Add and remove Data Product assets in Automator config

* MINOR - Add and remove Data Product assets in Automator config

* domain mixin

* build ref

* build ref

* create types

* fix tests

* fix conflicts

---------

Co-authored-by: Karan Hotchandani <33024356+karanh37@users.noreply.github.com>
Co-authored-by: karanh37 <karanh37@gmail.com>
2025-03-05 07:11:17 +01:00
Sriharsha Chintalapani
799e49e391
Search: improve relevancy for plural/singular words, partial matches,… (#20000)
* Search: improve relevancy for plural/singular words, partial matches, exact matches

* apply to all indexes

* Fix other query patterns

* Revert changes of database and databaseSchema fields in TableIndex.getFields() and table index mapping

* add missing boost query builder in es

* fix ci

* add max_ngram_diff setting in di-assets index

* fix TestCaseResourceTest mvn test failure

---------

Co-authored-by: sonikashah <sonikashah94@gmail.com>
Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2025-02-27 16:47:08 +01:00
Imri Paran
97fad806a2
Fixes 19755: Publish app config with status (#19754)
* feat(app): add config to status

add config to the reported status of the ingestion pipeline

* added separate pipeline service client call for external apps

* fix masking of pydantic model

* - overload model_dump to mask secrets instead of a separate method
- moved tests to test_custom_pydantic.py

* fix: execution time

* fix: mask secrets in dump json

* fix: for python3.8

* fix: for python3.8

* fix: use mask_secrets=False when dumping a model for create

* format

* fix: update mask_secrets=False for workflow configurations

* fix: use context directly when using model_dump_json

* fix: default behavior when dumping json

* format

* fixed tests
2025-02-25 16:51:49 +00:00
Sriharsha Chintalapani
a924064c09
Fix #17723: Generate Incremental Change Events even when consolidation of events applied (#19550)
* Fix #17723: Generate Incremental Change Events even when consolidation of events applied

* Fix #17723: Generate Incremental Change Events even when consolidation of events applied

* fix tests

* Fix tests

* clean policy tests

* update search methods to use incrementalChangeDescription part-1

* Fix the version page playwrights

* update search methods to use incrementalChangeDescription part-2

* introduce new field incrementalChangeDescription for search part-3

* fix mvn endpoint test

* fix followers and page search test

* fix following of assets

---------

Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
Co-authored-by: sonikashah <sonikashah94@gmail.com>
Co-authored-by: Aniket Katkar <aniketkatkar97@gmail.com>
Co-authored-by: sonika-shah <58761340+sonika-shah@users.noreply.github.com>
2025-02-20 10:23:08 +05:30
Pere Miquel Brull
91b62fdc32
FIX #19798 - Shortening SQA __tablename__ to avoid hitting errors in … (#19809)
* FIX #19798 - Shortening SQA __tablename__ to avoid hitting errors in postgres

* fix tests

---------

Co-authored-by: Sriharsha Chintalapani <harshach@users.noreply.github.com>
2025-02-17 09:37:06 +01:00
sonika-shah
c0eb7d08de
GEN -19588 Sort Enum type Custom Property Values (#19637)
* GEN -19588 Sort Enum type Custom Property Values

* fix py-tests

* use streams for sorting
2025-02-11 14:29:01 +05:30
Teddy
28bd01c471
MINOR: Remove default 100 when profileSample is None (#19672)
* fix: remove default 100% percent

* fix: use get_dataset

* fix: orm_profiler tests
2025-02-05 19:14:31 +01:00
Ethan
48700ae9ea
Fixes #18075: Dockerfile lint warning (#18077)
* fix docker warning

* for running actions

---------

Co-authored-by: Akash Jain <15995028+akash-jain-10@users.noreply.github.com>
2025-02-04 15:28:36 +05:30
Teddy
ef131d7e20
MINOR: Wrong attribute name in SampleConfig model (#19641)
* fix: wrong attribute name in SampleConfig model

* fix: test attribute

* fix: failing tests

* fix: trino filter error + adjust test to take into account null value

* fix: mssql and azuresql tablesample on views
2025-02-04 10:40:40 +01:00
Imri Paran
41b1ec081d
tests(e2e): increase CI for sampling test (#19519)
based on experiment in https://gist.github.com/sushi30/3083e96c9081371fa55e55b0847b96d2
2025-01-27 09:31:43 +00:00
Akash Verma
9ecc8a8afe
Added integration testcontainer test for mongodb (#19282) 2025-01-10 10:10:11 +05:30
Pere Miquel Brull
e56f477a4a
Fix #19147 - Executable Test Suites (#19221)
* backend

* format & tests

* rename backend

* migrations and ingestion

* format & tests

* format & tests

* tests

* format & tests

* tests

* updated ui side of changes

* addressing comment

* fixed failing unit test

* fix test list

* added e2e test, and fixed existing test

---------

Co-authored-by: Shailesh Parmar <shailesh.parmar.webdev@gmail.com>
2025-01-07 17:59:54 +01:00
Pere Miquel Brull
9dc56c3bb0
TEST - Add bots search RBAC validation (#19159)
* TEST - Add bots search RBAC validation

* format

* format
2025-01-02 15:32:03 +01:00
Akash Verma
39dcb5baef
Feature : Cockroach db connector (#18961) 2025-01-02 13:07:55 +05:30
Mayur Singal
a49aab7111
MINOR: User search should only look in name & displayName (#19121)
* MINOR: User search should only look in name & displayname

* py_format

* pyformat

---------

Co-authored-by: Suman Maharana <sumanmaharana786@gmail.com>
2024-12-18 16:44:54 +05:30
Keshav Mohta
cde3a7dd1e
Feature: Cassandra Connector (#18943) 2024-12-12 15:12:55 +05:30
Imri Paran
16875853a0
test(data-diff): fix flaky test (#18898)
use 99.5 CI for data diff sampling
2024-12-06 18:55:27 +05:30
Pere Miquel Brull
7aacfe032c
MINOR - FQN encoding in ometa_api, TestSuite pipeline creation & serialization of test case results (#18877)
* DOCS - Update ES config

* MINOR - Add missing FQN encoding & force types

* MINOR - Add missing FQN encoding & force types

* format

* fix tests
2024-12-02 17:17:21 +01:00
Mayur Singal
9b9509f4b9
MINOR: Mysql Lineage Support Main (#18780)
* MINOR: Mysql Lineage Support Main

* fix test

* fix test

---------

Co-authored-by: Teddy <teddy.crepineau@gmail.com>
2024-11-29 20:48:42 +05:30
Pere Miquel Brull
460d20a856
MINOR - Fix clean_uri and add before pagination (#18826)
* print

* MINOR - Fix clean_uri and add before pagination

* MINOR - Fix clean_uri and add before pagination
2024-11-28 09:35:41 +01:00
Imri Paran
cd74d8f55a
MINOR: ref(data-quality): modularized test case validator import (#18716)
* ref(data-quality): modularized test case validator import

- removed test_suite_factory
- implemented TestCaseImporter
- removed SQAValidatorBuilder and PandasValidatorBuilder in favor of a SourceType enum
- removed the orm table creation from test suite source

* format

* IValidatorBuilder -> ValidatorBuilder

* use the table from the sampler in the test suite interface

* linting

* fixed the profiler with similar solution

* removed unused inheritance

* removed unneeded super().__init__()

* removed all instances of orm_table

* fixed tests

* add reportExplicitAny=false

* fixed tests
2024-11-27 16:25:12 +01:00
Teddy
58699063db
MINOR -- Fix DQ Partition Issue (#18641)
* fix: renamed `random_sample` to `get_dataset` and change dunder method access for SQA Table object

* fix: removed handle_partition decorator

* fix: fixed DQ partition issue + moved to `tablesample` method

* style: ran python linting

* style: fix python format check issues

* feat: added postgres tablesample

* style: ran python linting

* fix: sampling delta

* fix: merge conflicts

* fix: resolved conflicts

* style: ran python linting

* fix: patch orm call in test case

* fix: mock build_table_orm call in tests

* fix: test case failures and errors

* fix: removed unused import

* fix: patch typo

* fix: trino table schema retrieval

* fix: remove tuple context manager for 3.8 test support
2024-11-27 08:50:54 +01:00