3502 Commits

Author SHA1 Message Date
Imri Paran
089fa785a8
build(setup-py): update pydantic version (#18541)
Update pydantic version to ">=2.7.0" in order to include `IncEx` that was introduced in 3d1355f168
2024-11-13 10:14:06 +01:00
Ayush Shah
6fa03ee66a
Fixes GEN-1994: Remove View Lineage from Metadata Ingestion flow (#18558) 2024-11-13 00:08:55 +05:30
Mayur Singal
f4fdafeb8a
MINOR: Athena & Tableau E2E fix (#18596) 2024-11-12 19:14:45 +05:30
Imri Paran
70c7880dfa
fixed bigquery system metrics e2e test (#18601) 2024-11-12 14:06:54 +01:00
Teddy
45d27a377d
GEN 1184 - Added Workflow Classification and Metric LevelConfig (#18572) 2024-11-11 15:59:42 +01:00
Imri Paran
a6d97b67a8
MINOR: fix system profile return types (#18470)
* fix(redshift-system): redshift return type

* fixed bigquery profiler

* fixed snowflake profiler

* job id action does not support matrix. using plain action summary.

* reverted gha change
2024-11-11 10:49:42 +01:00
Suman Maharana
fc79d60d83
Fixes: Added Sigma Column Level Lineage and Datamodels (#18571) 2024-11-11 14:42:57 +05:30
Imri Paran
cdaa5c10af
[GEN-1996] feat(data-quality): use sampling config in data diff (#18532)
* feat(data-quality): use sampling config in data diff

- get the table profiling config
- use hashing to sample deterministically the same ids from each table
- use dirty-equals to assert results of stochastic processes

* - reverted missing md5
- added missing database service type

* - use a custom substr sql function

* fixed nonce

* added failure for mssql with sampling because it requires a larger change in the data-diff library

* fixed unit tests

* updated range for sampling
2024-11-11 10:07:23 +01:00
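
The commit above mentions using hashing to sample the same ids deterministically from each table. A minimal Python sketch of that idea follows. It is an illustration only, not the repository's implementation: the helper `in_sample`, the `nonce` value, and the use of the first 8 hex characters of an MD5 digest are all assumptions made for the example.

```python
import hashlib


def in_sample(row_id, percent, nonce="example-nonce"):
    """Return True if row_id falls inside the sampled fraction.

    Hypothetical helper: hashes the nonce plus the id, maps the first
    32 bits of the digest to a number in [0, 1), and compares it to the
    requested percentage. The same id always gets the same answer.
    """
    digest = hashlib.md5(f"{nonce}{row_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 2**32  # value in [0, 1)
    return bucket < percent / 100.0


# Two "tables" holding the same ids, sampled independently.
table_a_ids = list(range(1, 1001))
table_b_ids = list(range(1, 1001))

sample_a = [i for i in table_a_ids if in_sample(i, 10)]
sample_b = [i for i in table_b_ids if in_sample(i, 10)]

# Both sides select exactly the same ids, so a diff compares like with like.
print(sample_a == sample_b)
```

Because membership depends only on the id and the nonce, both tables can be sampled separately and still yield matching rows. The sample size is only approximately the requested percentage.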
Mayur Singal
efed932d97
Mask SQL Queries in Usage & Lineage Workflow (#18565) 2024-11-11 11:44:47 +05:30
Mayur Singal
b02c64931e
MINOR: Fix table not found error (#18560) 2024-11-09 20:33:32 +05:30
Suman Maharana
da039b197f
Add: Azure Data factory Connector (#18543)
* Added Azure Data factory Connector

* Added Lineage data factory

* removed not required files

* removed not required files

* Removed datafactory ui changes from oss

* resolve merge conflicts

* resolve merge conflicts

* added python requirements
2024-11-08 07:38:45 +01:00
Imri Paran
b92b950060
Fix 18434: feat(statistics-profiler): use statistics tables to profile trino tables (#18433)
* feat(statistics-profiler): use statistics tables to profile trino tables

- implemented the collaborative root class
- added the "useStatistics" profiler parameter
- added the "supportsStatistics" database connection property
- implemented the ProfilerWithStatistics and StoredStatisticsSource to add this functionality to specific profilers
- implemented TrinoStoredStatisticsSource for specific trino statistics logic

* added ABC to terminal classes in collaborative root

* fixed docstring for TestSuiteInterface

* reverted unintended changes

* typo
2024-11-07 18:37:31 +01:00
Imri Paran
729a06b5f0
fix: use enum.Enum instead of sqlalchemy enum (#18464) 2024-11-07 11:42:03 +01:00
Mayur Singal
8d40d8ea77
MINOR: Fix Materialized View Lineage (#18539) 2024-11-07 09:21:54 +01:00
Mayur Singal
66cf003cc3
MINOR: Fix pytest 3.11 taking 2hr (#18533) 2024-11-06 19:28:48 +05:30
Mayur Singal
f813ab730e
MINOR: Airflow dependency Fix (#18530) 2024-11-06 15:51:43 +05:30
Teddy
d579008c99
GEN 1683 - Add Column Value to be At Expected Location Test (#18524)
* feat: added column value to be in expected location test

* fix: renamed value -> values

* doc: added 1.6 documentation entry

* style: ran python linting

* fix: move data packaging to pyproject.yaml

* fix: add init file back for data package

* fix: failing test case
2024-11-06 11:17:13 +01:00
Mayur Singal
5660a751e3
GEN-2000: Add Support for PowerBI Report Server (#18513) 2024-11-06 14:55:05 +05:30
Suman Maharana
426ad2000b
Fix #17778 : Databricks query run optimisation (#18467)
* Fix : Databricks query run optimization

* Fixed dialect error

* fix get columns

* py format

---------

Co-authored-by: ulixius9 <mayursingal9@gmail.com>
2024-11-06 10:10:01 +05:30
IceS2
dccba20101
Return s3 endpoint as str() instead of Url (#18521) 2024-11-05 17:39:50 +00:00
Katarzyna Kałek
47c75fe6a7
Enhanced Glue ingestion with external table features (#18511)
* added fileFormat, locationPath and external table lineage to Glue ingestion

* Improve Lineage Label

---------

Co-authored-by: Katarzyna Kałek <kkalek@olx.pl>
Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>
2024-11-05 21:48:20 +05:30
Imri Paran
84391e7078
MINOR: tests: fix Tuple in bigquery e2e cli (#18499)
* tests: fix Tuple in bigquery e2e cli

* tests: fix Tuple in bigquery e2e cli

* fix workflow condition
2024-11-04 09:54:10 -08:00
Teddy
9a685d5f19
fix: pass row and result computation for inSet test (#18466) 2024-10-31 08:15:18 +00:00
Nicola Coretti
7ebc62dca7
feat: Add support for exasol datasource (#17166)
* Add flake.nix

* Add lockfile for flake

* Update nix environment and document usage

* Add schema for exasol connector

* Add Exasol definitions to databaseService

* Fix error in exasol connector schema

* Add additional connection options/settings to exasol connector

* Add exasol-connector to ui

* Add dependencies for exasol-connector

* Update notes

* Update ingestion code

* Add Basic Documentation for Exasol Connector

* Update flake file

* Add developer notes

* Add python script which can be used as entry point for debugging in ide

* Add config file which can be used for debugging (manual execution)

* Update debug script

* Update developer notes

* Remove old developer notes

* Add .venv to gitignore

* Update dev notes

* Update development notes

* Update ExasolSource

* Establish basic connection to Exasol DB from connector

* Update exasol connector connection settings

* Add service_spec for exasol plugin

* Remove development files

* Remove unused module

* Applied code formatter

* Update exasol dependency constraint(s)

* Add unit test for exasol connection url(s)

* Fixed test expectations for exasol connection url test(s)

* Adjust the test query for the Exasol connection test
2024-10-31 08:11:30 +01:00
Imri Paran
016a840b2f
MINOR Fix snowflake profiler by using case-insensitive strings (#18438)
* use snowflake system metrics computer instead of source

* reverted pylint

* use case-insensitive strings equality for snowflake filters
2024-10-29 18:33:36 +01:00
Suman Maharana
67a9e63439
Minor: Fixed dbtcloud test connection and improved docs (#18408) 2024-10-29 14:39:52 +05:30
Onkar Ravgan
4a0c8406e9
[ER Diagrams] Add ER diagram APIs and sample data (#18021)
* Add ER diag APIs and sample data

* fix pylint

* formatting fixes2

* fixed es client return

* fixed os client return

* supported TableDetailPage tabs as classBase for supporting collate only tabs

* Added schema Apis

* change the base class to .ts and move the component in the util files

* beautify function arguments

* Added optimizations

* Ingestion changes

* svg dimension change

* supported class base tab in databaseSchema

* supported classBase action button in schema table name column

* added further keys data for constraint modal

* fix sonar issue

* remove old method to override edit action on column and shifted to DisplayNameModal for fields

* supported table right panel component to further extends on collate side

* minor fix around duplicate constraint

* added support to update table constraints and column constraints in the UI

* code optimization and minor fixes

* review comments and multi col fix

* added queryFilter option in NodeSuggestion and tableConstrainst to fetch and use only in service tables

---------

Co-authored-by: Ashish Gupta <ashish@getcollate.io>
2024-10-28 20:26:19 +05:30
Mayur Singal
9d91325af8
Lineage-1: Move view lineage processing to lineage workflow (#18220)
Co-authored-by: Sachin Chaurasiya <sachinchaurasiyachotey87@gmail.com>
2024-10-28 18:18:22 +05:30
Nicola Coretti
24de281026
Fix docstring in the Doris metadata module (#18421) 2024-10-28 17:58:04 +05:30
Mayur Singal
4083838056
MINOR: Couchbase Secondary Index Fix (#18398) 2024-10-24 20:34:55 +05:30
harshsoni2024
1a8bba6058
GEN-1911: Quicksight lineage source fix (#18348) 2024-10-24 11:41:37 +05:30
Imri Paran
95982b9395
[GEN-356] Use ServiceSpec for loading sources based on connectors (#18322)
* ref(profiler): use di for system profile

- use source classes that can be overridden in system profiles
- use a manifest class instead of factory to specify which class to resolve for connectors
- example usage can be seen in redshift and snowflake

* - added manifests for all custom profilers
- used super() dependency injection in order for system metrics source
- formatting

* - implement spec for all source types
- added docs for the new specification
- added some pylint ignores in the importer module

* remove TYPE_CHECKING in core.py

* - deleted valuedispatch function
- deleted get_system_metrics_by_dialect
- implemented BigQueryProfiler with a system metrics source
- moved import_source_class to BaseSpec

* - removed tests related to the profiler factory

* - reverted start_time
- removed DML_STAT_TO_DML_STATEMENT_MAPPING
- removed unused logger

* - reverted start_time
- removed DML_STAT_TO_DML_STATEMENT_MAPPING
- removed unused logger

* fixed tests

* format

* bigquery system profile e2e tests

* fixed module docstring

* - removed import_side_effects from redshift. we still use it in postgres for the orm conversion maps.
- removed leftover methods

* - tests for BaseSpec
- moved get_class_path to importer

* - moved constructors around to get rid of useless kwargs

* - changed test_system_metric

* - added lineage and usage to service_spec
- fixed postgres native lineage test

* add comments on collaborative constructors
2024-10-24 07:47:50 +02:00
Ayush Shah
51347a981a
fixes: Mode test connection returns data in dict instead of json (#18386) 2024-10-24 11:11:39 +05:30
Vijay Lakshmanan
4f2ef6fe5c
Fixes #16263: Fixed Mode dashboard ingestion API call (#18355) 2024-10-23 12:03:08 +05:30
Pere Miquel Brull
5e80ad9fc3
MINOR - Only timeout on main threads (#18341) 2024-10-21 15:18:33 +02:00
Teddy
dcf71aa0ea
fix: lazy load classes from factory method (#18321) 2024-10-21 11:29:03 +02:00
Mayur Singal
a4d62f6d85
MINOR: Add location path to table entity (#18307) 2024-10-21 10:31:27 +05:30
Pere Miquel Brull
c2929e67e6
MINOR - Return TestConnectionResult from test_connection_fn (#18320)
* MINOR - Return TestConnectionResult from test_connection fn

* MINOR - Return TestConnectionResult from test_connection fn
2024-10-18 09:54:07 +02:00
Pere Miquel Brull
5074f6588f
MINOR - Validate app runner init (#18316) 2024-10-18 09:40:06 +02:00
Katarzyna Kałek
c9995eecb6
FIX #18309: fixed task deserialization in Airflow metadata ingestion (#18310)
* fixed task deserialization in Airflow metadata ingestion

* fixed formatting

---------

Co-authored-by: Katarzyna Kałek <kkalek@olx.pl>
2024-10-17 14:51:55 -07:00
Pere Miquel Brull
7012e73d75
GEN-1166 - Improve Ingestion Workflow Error Summary (#18280)
* GEN-1166 - Improve Ingestion Workflow Error Summary

* fix test

* docs

* comments
2024-10-16 18:15:50 +02:00
Pere Miquel Brull
89b6c1c1cd
MINOR - Pass timeout to test connection and return TestConnectionStep (#18236)
* update connections

* MINOR - Pass timeout in test connection and return TestConnectionStep

* format

* comments

* comments
2024-10-16 18:15:28 +02:00
harshsoni2024
4f89dc582b
salesforce table description from label if not through query (#18286) 2024-10-16 12:56:44 +05:30
harshsoni2024
51448452d0
MINOR: Fix pinotdb col. datatype error (#18268) 2024-10-16 11:35:27 +05:30
Mayur Singal
592d7396bc
MINOR: Fix Couchbase columns not fetched (#18284) 2024-10-16 09:53:57 +05:30
Ayush Shah
40bd3bd3fa
Fixes #18186: Quicksight Ingestion Error handled (#18218) 2024-10-16 09:52:07 +05:30
Imri Paran
be82086e25
MINOR: add column case sensitivity parameter (#18115)
* fix(data-quality): table diff

- added handling for case-insensitive columns
- added handling for different numeric types (int/float/Decimal)
- added handling of boolean test case parameters

* add migrations for table diff

* add migrations for table diff

* removed cross type diff for now. it appears to be flaky

* fixed migrations

* use casefold() instead of lower()

* - implemented utils.get_test_case_param_value
- fixed params for case sensitive column

* handle bool test case parameters

* format

* testing

* format

* list -> List

* list -> List

* - change caseSensitiveColumns default to false
- added migration to stay backward compatible

* - removed migration files
- updated logging message for table diff migration

* changed bool test case parameters default to always be false

* format

* docs: data diff

- added the caseSensitiveColumns parameter

requires: https://github.com/open-metadata/OpenMetadata/pull/18115

* fixed test_get_bool_test_case_param
2024-10-15 16:29:43 +02:00
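
The commit above switches from `lower()` to `casefold()` for case-insensitive column handling. A short Python sketch of why the two differ follows. The helper `same_column` is hypothetical and is not taken from the repository.

```python
def same_column(a: str, b: str) -> bool:
    """Compare two column names case-insensitively using casefold().

    Hypothetical helper for illustration only.
    """
    return a.casefold() == b.casefold()


print(same_column("Order_ID", "order_id"))    # True
print(same_column("STRASSE", "straße"))       # True: casefold maps "ß" to "ss"
print("STRASSE".lower() == "straße".lower())  # False: lower() keeps "ß"
```

For plain ASCII names the two methods give the same result. `casefold()` applies the more aggressive Unicode case folding, so it also matches names that `lower()` would treat as different.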
Onkar Ravgan
e6705f25b3
fixed dbt tag name (#18273) 2024-10-15 16:43:03 +05:30
harshsoni2024
eb49d7a5bc
fix query for mysql con. (#18272) 2024-10-15 14:03:49 +05:30
Mayur Singal
8322c0f684
Fix #17963: Fix PinotDB Ingestion (#18266)
* Fix #17963: Fix PinotDB Ingestion

* fix conn args
2024-10-15 08:36:40 +05:30