3385 Commits

Author SHA1 Message Date
harshsoni2024
7953f98097
issue-20546: REST connector enhancements (#20634) 2025-04-07 10:22:45 +05:30
Katarzyna Kałek
6c725278e3
Fixes #20649: removed memory leak in Redshift ingestion (#20650)
* closing proxy result in redshift ingestion

* fixed formatting

---------

Co-authored-by: Katarzyna Kałek <kkalek@olx.pl>
2025-04-06 23:32:38 +05:30
Suman Maharana
61e500253f
Fix: Improved test suite logging (#20635)
* Fix: Improved test suite logging

* linting
2025-04-04 16:51:21 +05:30
Imri Paran
f6441ad404
fix: trino data diff paths (#20457)
requires https://github.com/open-metadata/collate-data-diff/pull/6
2025-04-03 15:48:10 +02:00
Ayush Shah
76371e4a64
Enhance ingestion setup: Add dbt plugin to Playwright dependencies (#20605) 2025-04-03 19:11:33 +05:30
Suman Maharana
5275975d31
Fix: dbt cloud latest run execution (#20573)
* Fix: dbt cloud latest run execution

* update latest run id

* set default to 100
2025-04-03 11:13:17 +05:30
Mayur Singal
7760663b22
MINOR: Change ingestion licence header (#20549) 2025-04-03 10:39:47 +05:30
Mayur Singal
7991715135
MINOR: Implement Column Validation in Lineage (#20544) 2025-04-02 17:40:40 +05:30
harshsoni2024
f267d4ef01
issue-20519: Support PowerBI Owners ingestion (#20525) 2025-04-02 16:11:27 +05:30
Mayur Singal
c16b3df547
MINOR: Fix public schema lineage for postgres (#20548) 2025-04-02 15:30:24 +05:30
Pere Miquel Brull
7402feba6f
MINOR - Remove airflow_lineage_operator from final ingestion image (#20551) 2025-04-02 11:53:55 +02:00
Abdallah Serghine
2e0822b830
ISSUE-20427: fix tableau ingestion for null upstream table queries (#20428)
For Tableau ingestion, the code does not properly handle null upstream custom table queries
or null values for table OM entities.

Co-authored-by: Abdallah Serghine <abdallah.serghine@olx.pl>
Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2025-04-02 11:53:30 +02:00
Imri Paran
663839bd85
test: assert dangling db connections (#20458)
added dangling connection assertions for mysql integration test
2025-04-02 08:38:17 +02:00
Katarzyna Kałek
4ec2077bbc
Unpinned google-cloud-secret-manager version in ingestion dependencies (#19469)
* Unpinned google-cloud-secret-manager version in ingestion dependencies

* Restrict google-cloud-secret-manager version to <2.20.1 because of mlflow-skinny dependency issue

---------

Co-authored-by: Katarzyna Kałek <kkalek@olx.pl>
Co-authored-by: Teddy <teddy.crepineau@gmail.com>
Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>
Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
Co-authored-by: Mohit Tilala <tilalamohit123@gmail.com>
2025-04-02 07:53:43 +02:00
Ayush Shah
eca3770a93
MINOR: Update Playwright integration test workflows to use 'playwright' deps (#20558) 2025-04-01 23:42:31 +05:30
Teddy
fbf6377e3f
fix: switch dbx to approx_percentile (#20554) 2025-04-01 17:36:43 +02:00
Mohit Tilala
06ab82170b
Fixes #19534: Snowflake stream ingestion support (#20278) 2025-04-01 13:02:37 +05:30
Mohit Tilala
7ad97afa62
Fixes #19690: Add QlikCloud dashboard filter by space name type (#20315) 2025-04-01 13:00:50 +05:30
Pere Miquel Brull
c08273b4ad
MINOR: Allow loading ometa from env (#20511) 2025-03-31 12:06:33 +02:00
Mayur Singal
d9fc607768
MINOR: Fix python checkstyle (#20489) 2025-03-31 12:02:32 +05:30
Katarzyna Kałek
fdec8c7b6b
deleted empty warning from s3 ingestion (#20477)
Co-authored-by: Katarzyna Kałek <kkalek@olx.pl>
2025-03-27 16:36:51 +01:00
Keshav Mohta
3c17a3025c
Fixes: Ingest System Tags (#20432) 2025-03-27 20:00:33 +05:30
Mayur Singal
e6b7b89f86
Fix #20236: Handle Sample Data with non-utf8 characters (#20380) 2025-03-27 14:20:26 +05:30
Ayush Shah
7a3990f350
Fixes 19119: Enhance TableCustomSQLQueryValidator to support threshold operation (#20307) 2025-03-27 13:11:56 +05:30
Ayush Shah
653c878497
MINOR: Transform Reserved keywords like quotes to OM compatible (#20459) 2025-03-27 13:02:07 +05:30
Mayur Singal
766d0caebc
MINOR: Lineage Improvements (#20446) 2025-03-27 11:57:23 +05:30
Suman Maharana
b4ec3f29a5
Fix - DBT Better Logs and Improved Error Handling (#20433)
* Fix - DBT Better Logs and Improved Error Handling

* lint
2025-03-26 15:38:54 +01:00
Suman Maharana
f9cad9e379
Fix : Test Suite 'NoneType' object has no attribute 'id' Handling (#20444) 2025-03-26 15:38:43 +01:00
Imri Paran
1db72c2c1e
MINOR: fix: close client after query (#19711)
* fix: close client after query

use context clients in SQL sampler to close the connection once the query is complete

* use self.context_client in all sql sampler implementations

* use sqlalchemy's built-in session management

* format

* format

* use get_client directly
2025-03-25 13:48:18 +01:00
Ayush Shah
1434b5dba2
Enhance SQL column processing for BigQuery ingestion (#20408)
- Refactored the handling of nested columns in `sql_column_handler.py` to prioritize source-provided children, ensuring they override any derived children.
- Removed the overridden `_process_col_type` method in `bigquery/metadata.py` to streamline column type handling, enforcing the use of the standard path for BigQuery.

This update improves the accuracy of column metadata processing and simplifies the codebase.
2025-03-25 14:40:46 +05:30
Ayush Shah
a9d6b5760d
Fixes #18346: feat(BigQuery) Enhance project ID discovery and ADC support (#20085)
* Refactor SQL column processing and enhance BigQuery project ID handling

* Introduced a new `process_column` function in `sql_column_handler.py` to streamline column processing logic.
* Updated `BigquerySource` to improve project ID retrieval from service connections, ensuring compatibility with various credential types.
* Added handling for nested columns in BigQuery schema processing.
* Enhanced error handling and logging for better debugging during project ID setup.

* Add support for GCP Application Default Credentials in BigQuery ingestion

* Enhanced `BigquerySource` to include handling for GCP Application Default Credentials (ADC).
* Updated JSON schema for GCP credentials to define `gcpADC` and its properties.
* Improved logging for credential setup in `set_google_credentials` function.
* Added comments and TODOs for future enhancements related to project ID fetching from the resource manager.

* Update .gitignore to include cursor rules files

* Added .cursorrules and .cursor/ to the .gitignore to prevent tracking of cursor rule files in the repository.
* This change helps maintain a cleaner repository by excluding unnecessary files from version control.

* refactor: Bigquery Credentials to allow multiple project ids

* fix: Handle unknown array data types in SQL column processing
2025-03-24 15:31:10 -07:00
mgorsk1
760022c185
fix: DBT GCS Integration fails when BigQuery multi-project config is used (#20372) 2025-03-24 11:44:58 +05:30
Mayur Singal
71927cc30b
MINOR: Fix mariadb profiling with Time datatype (#20376) 2025-03-24 10:46:18 +05:30
Suman Maharana
b764a3577f
Fix: Tableau CustomSQL datasource SQL not shown (#20377) 2025-03-21 09:28:40 +00:00
harshsoni2024
aca5956ab6
issue-20345: powerbi process workspace efficiently (#20346) 2025-03-21 10:52:07 +05:30
Ayush Shah
60974e4ea1
Revert "Fixes #17660: Oracle handle quotes for lowercase columns in workflow agents (#20309)" (#20364) 2025-03-20 21:02:58 +05:30
Teddy
ae176c2c0f
fix: raise error if no databases return in list ops (#20322) 2025-03-19 08:42:19 +01:00
Teddy
52382d2737
ISSUE #19185 -- Allow user to choose non random sample (#20299)
* feat: allow user to turn off randomized sample

* style: ran python linting

* fix: models default value for randomizedSample

* style: ran linting

* doc: move config to advanced
2025-03-19 08:42:00 +01:00
Mayur Singal
fb3ba391ff
MINOR: Fix failing pytest (#20332) 2025-03-19 12:35:37 +05:30
Suman Maharana
a2057077ed
Fix: Added Tableau Customsql lineage (#20317) 2025-03-19 09:28:39 +05:30
Sriharsha Chintalapani
706cebd97a
Opensearch connector (#19698)
* Fix #19667: OpenSearch Connector

* Fix #19667: OpenSearch Connector

* do not ingest any system level indexes

* fix pyformat

* Add AWS auth

* Use common schema and fix ssl config in client

* Add opensearch connector docs and update schema

* Remove api key auth type and complete docs checklist

* Remove unnecessary httpx dependency and pyformat

* Add compatible version of httpx for elasticsearch

* Fix pylint fails and py-tests validation error

---------

Co-authored-by: Mohit Tilala <tilalamohit123@gmail.com>
Co-authored-by: Mohit Tilala <63147650+mohittilala@users.noreply.github.com>
2025-03-18 18:45:25 +05:30
mgorsk1
09743368b0
🎉 Init (#19044) 2025-03-18 17:23:50 +05:30
Ayush Shah
20ab64d1f1
Fixes #17660: Oracle handle quotes for lowercase columns in workflow agents (#20309) 2025-03-18 15:48:58 +05:30
fuzmish
7fa3e53403
Fix: Pass raw value of extraHeaders to ClientConfig (#19989) 2025-03-18 13:55:51 +05:30
KC31
dddfdcb7b5
Fixes ISSUE-13953: Converted Nifi Client from requests module to OM REST client (#20039)
* ISSUE-13953 Converted Nifi Client from requests module to OM REST client

* pyformat

* lint

---------

Co-authored-by: kc <kc@kcs-MacBook-Pro.local>
Co-authored-by: ulixius9 <mayursingal9@gmail.com>
2025-03-18 11:43:20 +05:30
Raul Marquez
b2497fb36e
Fix: Handle NULL created_at (#20015)
* Fix: Handle NULL created_at

* pyformat

---------

Co-authored-by: ulixius9 <mayursingal9@gmail.com>
2025-03-18 03:42:14 +00:00
mgorsk1
31044bd06f
🎉 Init (#19267) 2025-03-17 17:06:32 -07:00
harshsoni2024
dba37820d7
MINOR: e2e fixes (#20301) 2025-03-17 21:00:26 +05:30
Akash Verma
cf7a442e32
Fixes #19891 : Added measures in powerbi (#19990) 2025-03-17 14:43:22 +05:30
Teddy
fbea55e5d3
feat: allow all as an argument in include columns (#20263)
Co-authored-by: Shailesh Parmar <shailesh.parmar.webdev@gmail.com>
2025-03-17 09:10:12 +01:00