1037 Commits

Author SHA1 Message Date
Suman Maharana
f414665976 Fix - switch to collate-dbt-artifacts-parser (#19647)
* Switch to collate-dbt-artifacts-parser

(cherry picked from commit 3e3c7029426f6973b0d800f45ebf15b5bdb225ef)
2025-02-04 06:29:01 +00:00
Mayur Singal
d5aed31a2d Fix #19633: Fix databricks schema not found (#19646)
(cherry picked from commit 208c40be09bdc7cdd68f8c93b3a5cbca58002499)
2025-02-04 06:13:18 +00:00
Mayur Singal
d68711d36f Fix #19489: Optimise multithreading for lineage (#19524) 2025-01-28 15:47:58 +05:30
harshsoni2024
b813294bf9 issue-16744: salesforce column description with toggle api (#19527)
(cherry picked from commit b1d481f2f1461cef05998ead1084a59f0029199b)
2025-01-27 11:25:42 +00:00
olof-nn
f61f62919a ISSUE-19454: Fixes broken looker lineage (#19456)
* ISSUE-19454: Fixes the broken lineage in looker when backticks enclosed table refs

* refactor

* use isort

* Update ingestion/src/metadata/ingestion/source/dashboard/looker/metadata.py

Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>

---------

Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>
2025-01-24 12:41:19 +05:30
agriev
cbbbca5472 Adds percona server for postgresql support (#19322)
* percona server for postgresql support

The only meaningful difference in Percona Server for PostgreSQL is the version string. This commit therefore proposes a universal and safe way to detect the server version from its integer representation, rather than complicated parsing of an unformatted string.

* updated tests with get_server_version_num

commented outdated tests

---------

Co-authored-by: Sriharsha Chintalapani <harshach@users.noreply.github.com>
(cherry picked from commit dcebc41e3f845933aaa9b9de8761eeb8fc52f9ee)
2025-01-14 01:52:45 +00:00
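
The integer-based version detection described in #19322 can be illustrated with a minimal sketch. PostgreSQL exposes `server_version_num` as an integer (readable with `SHOW server_version_num`), so no vendor-specific version string has to be parsed. The helper name `parse_server_version_num` is hypothetical and is not taken from this repository; it only shows how such an integer could be split into its parts.

```python
from typing import Tuple


def parse_server_version_num(version_num: int) -> Tuple[int, int]:
    """Split PostgreSQL's integer server_version_num into (major, minor).

    From PostgreSQL 10 onward the number is major * 10000 + minor
    (e.g. 150004 is 15.4). Older releases encode three parts
    (e.g. 90603 is 9.6.3); for those, (9, 6) is returned and the
    patch level is dropped.
    """
    if version_num >= 100000:
        return version_num // 10000, version_num % 10000
    return version_num // 10000, (version_num // 100) % 100
```

Because the integer has the same layout regardless of how a distribution formats its human-readable version string, a comparison such as `parse_server_version_num(n) >= (12, 0)` should behave the same on stock PostgreSQL and on a derived server.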
Suman Maharana
f763575cfe Fixes #17747: dbt update owners (#19144)
* Fixes 17747: dbt update owners

* update messages

* addressed comments

* py_format

* py_format

* Added tests
2025-01-08 11:49:48 +05:30
Mayur Singal
dfe34e191c MINOR: User search should only look in name & displayName (#19121)
* MINOR: User search should only look in name & displayname

* py_format

* pyformat

---------

Co-authored-by: Suman Maharana <sumanmaharana786@gmail.com>
2025-01-08 09:21:37 +05:30
Pere Miquel Brull
8fc6e8f52b Fix #19147 - Executable Test Suites (#19221)
* backend

* format & tests

* rename backend

* migrations and ingestion

* format & tests

* format & tests

* tests

* format & tests

* tests

* updated ui side of changes

* addressing comment

* fixed failing unit test

* fix test list

* added e2e test, and fixed existing test

---------

Co-authored-by: Shailesh Parmar <shailesh.parmar.webdev@gmail.com>
2025-01-07 21:35:23 +01:00
Imri Paran
16875853a0
test(data-diff): fix flaky test (#18898)
use 99.5 CI for data diff sampling
2024-12-06 18:55:27 +05:30
Imri Paran
e30571cf4e
[GEN-2187] fix(data-diff): added MD5 handling for bigquery (#18904)
* fix(data-diff): added MD5 handling for bigquery

- added MD5 handling for bigquery
- use URL instead of Engine because it requires fewer steps and is less prone to failure

* added e2e test for data diff with sampling in bigquery
2024-12-06 14:21:33 +01:00
Teddy
610322ffed
MINOR - MSSQL timestamp type profiler fix (#18935)
* fix: mssql timestamp processing

* fix: min/max test type on datetime column

* style: fix python format
2024-12-06 08:03:42 +01:00
Teddy
03bd8e9dc4
FEAT: added TABLESAMPLE for MSSQL (#18926)
* feat: added TABLESAMPLE for sqlserver

* fix: class name

* test: added test to generated sample query
2024-12-05 14:17:39 +01:00
Ayush Shah
8664c8df75
Fixes GEN-2199: Allow Fivetran filtering of pipelines using name instead of id (#18929) 2024-12-05 10:55:11 +05:30
Pere Miquel Brull
7aacfe032c
MINOR - FQN encoding in ometa_api, TestSuite pipeline creation & serialization of test case results (#18877)
* DOCS - Update ES config

* MINOR - Add missing FQN encoding & force types

* MINOR - Add missing FQN encoding & force types

* format

* fix tests
2024-12-02 17:17:21 +01:00
Mayur Singal
9b9509f4b9
MINOR: Mysql Lineage Support Main (#18780)
* MINOR: Mysql Lineage Support Main

* fix test

* fix test

---------

Co-authored-by: Teddy <teddy.crepineau@gmail.com>
2024-11-29 20:48:42 +05:30
Teddy
ac2f6d7132
MINOR - Fix sqa table reference (#18839)
* fix: sqa table reference

* style: ran python linting

* fix: added raw dataset to query runner

* fix: get table and schema name from orm object

* fix: get table level config for table tests
2024-11-28 18:49:11 +01:00
mgorsk1
da176767a8
feat: add dbt freshness check test (#18730)
* add dbt freshness check

* docs

* run linting

* add test case param definition

* fix test case param definition

* add config for dbt http, fix linting

* refactor (only create freshness test definition when user executed one)

* fix dbt files class

* fix dbt files class 2

* fix dbt objects class

* fix linting

* fix pylint

* fix linting once and for all

---------

Co-authored-by: Teddy <teddy.crepineau@gmail.com>
2024-11-28 18:30:11 +01:00
harshsoni2024
cb33f274fc
Connector: rename microstrategy connector (#18604) 2024-11-28 18:50:42 +05:30
Suman Maharana
9a21e77e15
Added dbt cloud multi projects and jobs filter (#18801)
* Added dbt cloud multi project and jobs filter

* added tests

* change to array type

* updated yaml config

* added migrations
2024-11-28 16:10:34 +05:30
Pere Miquel Brull
460d20a856
MINOR - Fix clean_uri and add before pagination (#18826)
* print

* MINOR - Fix clean_uri and add before pagination

* MINOR - Fix clean_uri and add before pagination
2024-11-28 09:35:41 +01:00
Imri Paran
cd74d8f55a
MINOR: ref(data-quality): modularized test case validator import (#18716)
* ref(data-quality): modularized test case validator import

- removed test_suite_factory
- implemented TestCaseImporter
- removed SQAValidatorBuilder and PandasValidatorBuilder in favor of a SourceType enum
- removed the orm table creation from test suite source

* format

* IValidatorBuilder -> ValidatorBuilder

* use the table from the sampler in the test suite interface

* linting

* fixed the profiler with similar solution

* removed unused inheritance

* removed unneeded super().__init__()

* removed all instances of orm_table

* fixed tests

* add reportExplicitAny=false

* fixed tests
2024-11-27 16:25:12 +01:00
Teddy
58699063db
MINOR -- Fix DQ Partition Issue (#18641)
* fix: renamed `random_sample` to `get_dataset` and change dunder method access for SQA Table object

* fix: removed handle_partition decorator

* fix: fixed DQ partition issue + moved to `tablesample` method

* style: ran python linting

* style: fix python format check issues

* feat: added postgres tablesample

* style: ran python linting

* fix: sampling delta

* fix: merge conflicts

* fix: resolved conflicts

* style: ran python linting

* fix: patch orm call in test case

* fix: mock build_table_orm call in tests

* fix: test case failures and errors

* fix: removed unused import

* fix: patch typo

* fix: trino table schema retrieval

* fix: remove tuple context manager for 3.8 test support
2024-11-27 08:50:54 +01:00
Imri Paran
d6470b7800
MINOR: fix(data-diff): get added columns (#18694)
* fix(data-diff): get added columns

- use both columns to calculate schema diff

* fix tests
2024-11-25 15:53:50 +01:00
Imri Paran
ee7d043035
[GEN-2109] feat(mongo): added ssl support (#18731)
* feat(mongo): added ssl support

Added SSL support for MongoDB using the SSL manager.

Attached a video demo.

- [Example repository for setting up mongodb with SSL](https://github.com/sushi30/mongodb-docker-ssl-example)
- [MongoDB TLS documentation](https://www.mongodb.com/docs/manual/tutorial/configure-ssl/)

* fixed test_doris.py
2024-11-22 08:54:13 -08:00
Ayush Shah
17ffdf9850
fix: modify fqn to allow quotes with dots (#18719) 2024-11-22 09:33:50 +05:30
Pere Miquel Brull
61021be98a
TEST - Add autoClassification for e2e (#18722) 2024-11-21 15:07:04 +01:00
IceS2
35ce7d7602
MINOR: Update Glossary Term tests (#18698)
* Update Glossary Term tests

* Remove unused code

* Fix test
2024-11-20 15:24:53 +01:00
Pere Miquel Brull
c68a45e7d8
Create new Auto Classification Workflow (#18610) 2024-11-19 08:10:45 +01:00
Ayush Shah
6f1df37ba1
Fixes GEN-1260: Add Validators while creating table to escape special characters (#18456) 2024-11-18 15:02:57 +05:30
Sriharsha Chintalapani
88c8fb48f3
Add Edit glossary terms, Edit Tier , Edit Tags as separate permissions (#18331)
* Add EditGlossaryTerms Permission

* Fix #18330: Add EDIT_GLOSSARY_TERM permission and enforce EDIT_TIER permission

* add edit glossary term permission check in UI

* revert EDIT_GLOSSARY_TERMS operation

* Add EDIT_GLOSSARY_TERMS to common operations

* Add EDIT_TIER to common operations

* add default empty array for tags field, as patch calls can run into issues

* Fix tests

* Fix tests

* added glossary terms

* fix conflicts

* fix permission check for data model

* Add EditGlossaryTerms to DataConsumerPolicy

* Add EditGlossaryTerms,EditTier to DataConsumerPolicy

* fix tests

* Fix migrations for EditTier,EditGlossaryTerms

* add edit tier permission to data consumer

* Fix tests

* fix pytests

* missing test_dbt.py

---------

Co-authored-by: karanh37 <karanh37@gmail.com>
Co-authored-by: Karan Hotchandani <33024356+karanh37@users.noreply.github.com>
Co-authored-by: ulixius9 <mayursingal9@gmail.com>
2024-11-15 10:50:15 -08:00
Suman Maharana
a218bbf5cb
Minor: Fix Mysql cli Update table count (#18582) 2024-11-15 14:27:02 +05:30
mgorsk1
3d2dfeb583
feat: use native trino client authentication classes (#16196)
---------

Co-authored-by: ulixius9 <mayursingal9@gmail.com>
2024-11-15 12:54:42 +05:30
Imri Paran
bde6ee4125
MINOR: Data diff sample fix (#18632)
* fix(data-diff): sampling configuration

handle the sampling condition separately for the two tables, allowing sampling to be applied on columns with mismatching cases

* format
2024-11-15 08:22:13 +01:00
Mayur Singal
75d417d267
MINOR: Fix user search - exclude bots (#18645) 2024-11-14 21:25:21 +05:30
Akash Verma
fa30be0589
fix #17726 Databricks schema name with hyphen issue (#18598) 2024-11-14 20:02:20 +05:30
harshsoni2024
cd3fcb5d22
MINOR: quicksight e2e fix (#18629) 2024-11-14 16:31:11 +05:30
Ayush Shah
6fa03ee66a
Fixes GEN-1994: Remove View Lineage from Metadata Ingestion flow (#18558) 2024-11-13 00:08:55 +05:30
Mayur Singal
f4fdafeb8a
MINOR: Athena & Tableau E2E fix (#18596) 2024-11-12 19:14:45 +05:30
Imri Paran
70c7880dfa
fixed bigquery system metrics e2e test (#18601) 2024-11-12 14:06:54 +01:00
Teddy
45d27a377d
GEN 1184 - Added Workflow Classification and Metric LevelConfig (#18572) 2024-11-11 15:59:42 +01:00
Imri Paran
a6d97b67a8
MINOR: fix system profile return types (#18470)
* fix(redshift-system): redshift return type

* fixed bigquery profiler

* fixed snowflake profiler

* job id action does not support matrix. using plain action summary.

* reverted gha change
2024-11-11 10:49:42 +01:00
Imri Paran
cdaa5c10af
[GEN-1996] feat(data-quality): use sampling config in data diff (#18532)
* feat(data-quality): use sampling config in data diff

- get the table profiling config
- use hashing to sample deterministically the same ids from each table
- use dirty-equals to assert results of stochastic processes

* - reverted missing md5
- added missing database service type

* - use a custom substr sql function

* fixed nonce

* added failure for mssql with sampling because it requires a larger change in the data-diff library

* fixed unit tests

* updated range for sampling
2024-11-11 10:07:23 +01:00
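
The idea in #18532 of using hashing to sample the same ids deterministically from each table can be sketched as follows. This is an illustration under assumptions, not the code from this repository or from the data-diff library: the function name `in_sample`, the choice of MD5 with an 8-hex-digit prefix, and the toy id lists are all hypothetical.

```python
import hashlib


def in_sample(row_id: str, sample_percent: float) -> bool:
    """Return True if row_id falls in a deterministic hash-based sample.

    The first 8 hex digits of the MD5 digest are read as an integer in
    [0, 16**8) and compared against a threshold, so the same id is always
    either kept or dropped, regardless of which table it comes from.
    """
    bucket = int(hashlib.md5(row_id.encode("utf-8")).hexdigest()[:8], 16)
    return bucket < (sample_percent / 100.0) * 16**8


# Two toy "tables" holding the same ids in a different order (illustrative data).
table_a = [str(i) for i in range(1000)]
table_b = [str(i) for i in reversed(range(1000))]

sample_a = {i for i in table_a if in_sample(i, 10)}
sample_b = {i for i in table_b if in_sample(i, 10)}
```

Since membership depends only on the id value, both sides of a diff select the same rows without sharing any state, which is what makes a sampled comparison between two tables meaningful.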
Mayur Singal
efed932d97
Mask SQL Queries in Usage & Lineage Workflow (#18565) 2024-11-11 11:44:47 +05:30
Mayur Singal
b02c64931e
MINOR: Fix table not found error (#18560) 2024-11-09 20:33:32 +05:30
Imri Paran
b92b950060
Fix 18434: feat(statistics-profiler): use statistics tables to profile trino tables (#18433)
* feat(statistics-profiler): use statistics tables to profile trino tables

- implemented the collaborative root class
- added the "useStatistics" profiler parameter
- added the "supportsStatistics" database connection property
- implemented the ProfilerWithStatistics and StoredStatisticsSource to add this functionality to specific profilers
- implemented TrinoStoredStatisticsSource for specific trino statistics logic

* added ABC to terminal classes in collaborative root

* fixed docstring for TestSuiteInterface

* reverted unintended changes

* typo
2024-11-07 18:37:31 +01:00
Teddy
d579008c99
GEN 1683 - Add Column Value to be At Expected Location Test (#18524)
* feat: added column value to be in expected location test

* fix: renamed value -> values

* doc: added 1.6 documentation entry

* style: ran python linting

* fix: move data packaging to pyproject.yaml

* fix: add init file back for data package

* fix: failing test case
2024-11-06 11:17:13 +01:00
IceS2
dccba20101
Return s3 endpoint as str() instead of Url (#18521) 2024-11-05 17:39:50 +00:00
Katarzyna Kałek
47c75fe6a7
Enhanced Glue ingestion with external table features (#18511)
* added fileFormat, locationPath and external table lineage to Glue ingestion

* Improve Lineage Label

---------

Co-authored-by: Katarzyna Kałek <kkalek@olx.pl>
Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>
2024-11-05 21:48:20 +05:30
Imri Paran
84391e7078
MINOR: tests: fix Tuple in bigquery e2e cli (#18499)
* tests: fix Tuple in bigquery e2e cli

* tests: fix Tuple in bigquery e2e cli

* fix workflow condition
2024-11-04 09:54:10 -08:00