3502 Commits

Author SHA1 Message Date
Mayur Singal
56d3c335a7
MINOR: Fix pydantic warnings in ingestion (#20822)
* MINOR: Fix pydantic warnings in ingestion

* pydantic fix
2025-04-15 06:59:05 +02:00
chrisrayrayne
b14f83940a
Fixes Issue 20189: REST connector checks updated (#20736) 2025-04-15 10:24:57 +05:30
Suman Maharana
4dd94516e7
Fix : Dashboard OverrideMetadata (#20793) 2025-04-14 12:07:02 +05:30
Pere Miquel Brull
6d02ed8b5c
MINOR - Fix py checkstyle (#20789)
* MINOR - Fix py checkstyle

* dummy commit
2025-04-13 11:58:52 +02:00
Pere Miquel Brull
c38209c63b
FIX CL-#1427 - PATCH applies inherited owners (#20759)
* FIX CL-#1427 - PATCH applies inherited owners

* FIX CL-#1427 - PATCH applies inherited owners

* format
2025-04-13 06:56:33 +02:00
Sasha Malahov
6dac3550ba
fix(ingestion): Correct Topic import in Kinesis source (#20787)
This PR fixes a bug in the Kinesis messaging source where the `Topic` class was incorrectly imported from `metadata.generated.schema.type.schema` instead of the correct entity definition path `metadata.generated.schema.entity.data.topic`.

**Problem:**
The `yield_topic_sample_data` method used the `type.schema.Topic` definition when calling `fqn.build` and `metadata.get_by_name`. These functions expect the main entity class.

**Fix:**
Changed the import statement to use `metadata.generated.schema.entity.data.topic.Topic`.

This ensures the correct type definition is used when interacting with the FQN utility and metadata API, preventing potential downstream issues.
2025-04-13 06:38:07 +02:00
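Note on 6dac3550ba: the hazard behind this fix is two unrelated classes sharing one name. The sketch below illustrates it in isolation and is not OpenMetadata code. `SchemaTopic`, `EntityTopic`, `ENDPOINTS`, and `endpoint_for` are hypothetical stand-ins, and the lookup keyed by class identity is an assumption about how such an API might dispatch, not a description of `fqn.build` or `metadata.get_by_name`.

```python
from dataclasses import dataclass


# Hypothetical stand-ins: in the real code base both classes are named
# "Topic" and live in different modules.
@dataclass
class SchemaTopic:
    """Plays the role of the type-level definition (the wrong import)."""
    name: str


@dataclass
class EntityTopic:
    """Plays the role of the entity definition (the correct import)."""
    name: str


# Assumed registry keyed by class identity; only the entity class is known.
ENDPOINTS = {EntityTopic: "/topics"}


def endpoint_for(entity_cls):
    """Return the endpoint for an entity class, or raise TypeError."""
    try:
        return ENDPOINTS[entity_cls]
    except KeyError:
        raise TypeError(
            f"{entity_cls.__module__}.{entity_cls.__name__} "
            "is not a registered entity class"
        ) from None
```

Under these assumptions, `endpoint_for(EntityTopic)` resolves, while `endpoint_for(SchemaTopic)` raises `TypeError` even though the two classes look alike at the call site.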
Akash Verma
5cbe9badef
Wherescape Connector (#20500)
* omd side ws connector files

* Removed files

* add beta tag

* update enum name

* rename connection to databaseConnection

* Revert "rename connection to databaseConnection"

This reverts commit 9f1bc74e7aa6c156bedb8eefeb1a5435fcf72319.

* rename from connection to metastore

* rename connection to dbconnection

* UI Generated files

* fix connector UI

* fix connector UI

---------

Co-authored-by: Akash Verma <akashverma@Akashs-MacBook-Pro-2.local>
Co-authored-by: ulixius9 <mayursingal9@gmail.com>
Co-authored-by: Sweta Agarwalla <swetaagarwalla13@gmail.com>
2025-04-11 18:04:58 +05:30
Keshav Mohta
d65a05e865
fix: added metadata in snowflake get_ometa_tag_and_classification (#20719) 2025-04-09 18:00:12 +05:30
Imri Paran
a0d631b7cb
chore: more lenient pinning for python neo4j (#20722) 2025-04-09 09:05:44 +00:00
Keshav Mohta
18d6701429
Fixes: Snowflake Tags Ingestion (#20710) 2025-04-09 07:12:22 +02:00
Pere Miquel Brull
bcde18b387
DEPRECATE - Remove support for Python 3.8 (#20553) 2025-04-08 07:36:07 +02:00
Mayur Singal
4a407f6d0d
MINOR: Implement column validation in lineage patch api (#20545) 2025-04-07 21:24:46 +05:30
Pere Miquel Brull
3186937cc2
MINOR - Update Auto Classification defaults for sample data & classif… (#20587)
* MINOR - Update Auto Classification defaults for sample data & classification

* fix tests
2025-04-07 15:56:57 +02:00
Mayur Singal
b7d43e7ee2
MINOR: Improve threading for lineage (#20668) 2025-04-07 18:31:52 +05:30
Mohit Tilala
f7c4cc54f4
Revert "Fixes #20649: removed memory leak in Redshift ingestion (#20650)" (#20667)
This reverts commit 6c725278e3e7bd5cb5f492108fa27dc8f9487f82.
2025-04-07 18:16:50 +05:30
Mayur Singal
ee5d8eee8b
Revert "MINOR: Implement Column Validation in Lineage (#20544)" (#20658) 2025-04-07 17:13:35 +05:30
Keshav Mohta
0796c6274b
Fixes: Databricks httpPath Required (#20611)
* fix: made databricks httpPath required and added a migration file for the same

* fix: added sql migration in postDataMigration file and fix databricks tests

* fix: added httpPath in test_source_connection.py and test_source_parsing.py files

* fix: added httpPath in test_databricks_lineage.py

* fix: table name in postgres migration
2025-04-07 13:33:55 +05:30
harshsoni2024
7953f98097
issue-20546: REST connector enhancements (#20634) 2025-04-07 10:22:45 +05:30
Katarzyna Kałek
6c725278e3
Fixes #20649: removed memory leak in Redshift ingestion (#20650)
* closing proxy result in redshift ingestion

* fixed formatting

---------

Co-authored-by: Katarzyna Kałek <kkalek@olx.pl>
2025-04-06 23:32:38 +05:30
Suman Maharana
61e500253f
Fix: Improved test suite logging (#20635)
* Fix: Improved test suite logging

* linting
2025-04-04 16:51:21 +05:30
Imri Paran
f6441ad404
fix: trino data diff paths (#20457)
requires https://github.com/open-metadata/collate-data-diff/pull/6
2025-04-03 15:48:10 +02:00
Ayush Shah
76371e4a64
Enhance ingestion setup: Add dbt plugin to Playwright dependencies (#20605) 2025-04-03 19:11:33 +05:30
Suman Maharana
5275975d31
Fix: dbt cloud latest run execution (#20573)
* Fix: dbt cloud latest run execution

* update latest run id

* set default to 100
2025-04-03 11:13:17 +05:30
Mayur Singal
7760663b22
MINOR: Change ingestion licence header (#20549) 2025-04-03 10:39:47 +05:30
Mayur Singal
7991715135
MINOR: Implement Column Validation in Lineage (#20544) 2025-04-02 17:40:40 +05:30
harshsoni2024
f267d4ef01
issue-20519: Support PowerBI Owners ingestion (#20525) 2025-04-02 16:11:27 +05:30
Mayur Singal
c16b3df547
MINOR: Fix public schema lineage for postgres (#20548) 2025-04-02 15:30:24 +05:30
Pere Miquel Brull
7402feba6f
MINOR - Remove airflow_lineage_operator from final ingestion image (#20551) 2025-04-02 11:53:55 +02:00
Abdallah Serghine
2e0822b830
ISSUE-20427: fix tableau ingestion for null upstream table queries (#20428)
For Tableau ingestion, the code does not properly handle null upstream custom table queries or null values for table OM entities.

Co-authored-by: Abdallah Serghine <abdallah.serghine@olx.pl>
Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2025-04-02 11:53:30 +02:00
Imri Paran
663839bd85
test: assert dangling db connections (#20458)
added dangling connection assertions for mysql integration test
2025-04-02 08:38:17 +02:00
Katarzyna Kałek
4ec2077bbc
Unpinned google-cloud-secret-manager version in ingestion dependencies (#19469)
* Unpinned google-cloud-secret-manager version in ingestion dependencies

* Restrict google-cloud-secret-manager version to <2.20.1 because of mlflow-skinny dependency issue

---------

Co-authored-by: Katarzyna Kałek <kkalek@olx.pl>
Co-authored-by: Teddy <teddy.crepineau@gmail.com>
Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>
Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
Co-authored-by: Mohit Tilala <tilalamohit123@gmail.com>
2025-04-02 07:53:43 +02:00
Ayush Shah
eca3770a93
MINOR: Update Playwright integration test workflows to use 'playwright' deps (#20558) 2025-04-01 23:42:31 +05:30
Teddy
fbf6377e3f
fix: switch dbx to approx_percentile (#20554) 2025-04-01 17:36:43 +02:00
Mohit Tilala
06ab82170b
Fixes #19534: Snowflake stream ingestion support (#20278) 2025-04-01 13:02:37 +05:30
Mohit Tilala
7ad97afa62
Fixes #19690: Add QlikCloud dashboard filter by space name type (#20315) 2025-04-01 13:00:50 +05:30
Pere Miquel Brull
c08273b4ad
MINOR: Allow loading ometa from env (#20511) 2025-03-31 12:06:33 +02:00
Mayur Singal
d9fc607768
MINOR: Fix python checkstyle (#20489) 2025-03-31 12:02:32 +05:30
Katarzyna Kałek
fdec8c7b6b
deleted empty warning from s3 ingestion (#20477)
Co-authored-by: Katarzyna Kałek <kkalek@olx.pl>
2025-03-27 16:36:51 +01:00
Keshav Mohta
3c17a3025c
Fixes: Ingest System Tags (#20432) 2025-03-27 20:00:33 +05:30
Mayur Singal
e6b7b89f86
Fix #20236: Handle Sample Data with non-utf8 characters (#20380) 2025-03-27 14:20:26 +05:30
Ayush Shah
7a3990f350
Fixes 19119: Enhance TableCustomSQLQueryValidator to support threshold operation (#20307) 2025-03-27 13:11:56 +05:30
Ayush Shah
653c878497
MINOR: Transform Reserved keywords like quotes to OM compatible (#20459) 2025-03-27 13:02:07 +05:30
Mayur Singal
766d0caebc
MINOR: Lineage Improvements (#20446) 2025-03-27 11:57:23 +05:30
Suman Maharana
b4ec3f29a5
Fix - DBT Better Logs and Improved Error Handling (#20433)
* Fix - DBT Better Logs and Improved Error Handling

* lint
2025-03-26 15:38:54 +01:00
Suman Maharana
f9cad9e379
Fix : Test Suite 'NoneType' object has no attribute 'id' Handling (#20444) 2025-03-26 15:38:43 +01:00
Imri Paran
1db72c2c1e
MINOR: fix: close client after query (#19711)
* fix: close client after query

use context clients in SQL sampler to close the connection once the query is complete

* use self.context_client in all sql sampler implementations

* use sqlalchemy's built-in session management

* format

* format

* use get_client directly
2025-03-25 13:48:18 +01:00
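Note on 1db72c2c1e: the "close client after query" pattern can be sketched with a context manager that always releases the connection, even when the query fails. This is a minimal illustration using the standard-library `sqlite3` module, not the sampler's actual implementation. The function names `context_client` and `fetch_sample` are hypothetical and only echo the wording of the commit message.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def context_client(database=":memory:"):
    """Yield a connection and close it once the caller's block exits."""
    conn = sqlite3.connect(database)
    try:
        yield conn
    finally:
        # Runs on success and on error, so no connection is left dangling.
        conn.close()


def fetch_sample(query):
    """Run one query and return all rows; the connection is closed on return."""
    with context_client() as client:
        return client.execute(query).fetchall()
```

The rows are fetched inside the `with` block, so nothing returned to the caller still depends on the open connection.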
Ayush Shah
1434b5dba2
Enhance SQL column processing for BigQuery ingestion (#20408)
- Refactored the handling of nested columns in `sql_column_handler.py` to prioritize source-provided children, ensuring they override any derived children.
- Removed the overridden `_process_col_type` method in `bigquery/metadata.py` to streamline column type handling, enforcing the use of the standard path for BigQuery.

This update improves the accuracy of column metadata processing and simplifies the codebase.
2025-03-25 14:40:46 +05:30
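Note on 1434b5dba2: one possible reading of "source-provided children override any derived children" is a wholesale replacement rather than a merge. The sketch below assumes that reading. `resolve_children` is a hypothetical helper and does not reproduce the logic in `sql_column_handler.py`.

```python
def resolve_children(source_children, derived_children):
    """Pick the nested columns to keep for a parent column.

    Assumption: when the source supplies children, they replace the derived
    ones entirely; derived children are only a fallback.
    """
    if source_children:
        return list(source_children)
    return list(derived_children or [])
```

For example, `resolve_children([{"name": "a"}], [{"name": "b"}])` keeps only the source-provided child `a`, while passing `None` as the first argument falls back to the derived list.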
Ayush Shah
a9d6b5760d
Fixes #18346: feat(BigQuery) Enhance project ID discovery and ADC support (#20085)
* Refactor SQL column processing and enhance BigQuery project ID handling

* Introduced a new `process_column` function in `sql_column_handler.py` to streamline column processing logic.
* Updated `BigquerySource` to improve project ID retrieval from service connections, ensuring compatibility with various credential types.
* Added handling for nested columns in BigQuery schema processing.
* Enhanced error handling and logging for better debugging during project ID setup.

* Add support for GCP Application Default Credentials in BigQuery ingestion

* Enhanced `BigquerySource` to include handling for GCP Application Default Credentials (ADC).
* Updated JSON schema for GCP credentials to define `gcpADC` and its properties.
* Improved logging for credential setup in `set_google_credentials` function.
* Added comments and TODOs for future enhancements related to project ID fetching from the resource manager.

* Update .gitignore to include cursor rules files

* Added .cursorrules and .cursor/ to the .gitignore to prevent tracking of cursor rule files in the repository.
* This change helps maintain a cleaner repository by excluding unnecessary files from version control.

* refactor: Bigquery Credentials to allow multiple project ids

* fix: Handle unknown array data types in SQL column processing
2025-03-24 15:31:10 -07:00
mgorsk1
760022c185
fix: DBT GCS Integration fails when BigQuery multi-project config is used (#20372) 2025-03-24 11:44:58 +05:30
Mayur Singal
71927cc30b
MINOR: Fix mariadb profiling with Time datatype (#20376) 2025-03-24 10:46:18 +05:30