3502 Commits

Author SHA1 Message Date
Mayur Singal
698956783b
Fix #1532: Fix Error ingesting using Datalake adls connector (#21243) 2025-05-19 12:30:56 +05:30
Pere Miquel Brull
6444ea3750
FIX - Ingestion workaround for Services with null secrets (#21260)
* FIX - Ingestion workaround for Services with null secrets

* linting
2025-05-19 08:53:37 +02:00
Suman Maharana
5a3d40f643
Fix: dbt multi owner support from manifest (#21233) 2025-05-19 12:04:22 +05:30
Mayur Singal
9ec424a3fa
Fix #1550: Metadata ingestion errors from Azure Data Lake (#21261) 2025-05-19 11:44:19 +05:30
Mayur Singal
7efa5e650b
MINOR: Add athena schema comment support (#21262) 2025-05-19 10:31:15 +05:30
Pere Miquel Brull
aa96019ab1
Rel to #1575 - LabelType Generated (#21244)
* Rel to #1575 - LabelType Generated

* migration

* format

* tests

* generate types for taglabel

---------

Co-authored-by: karanh37 <karanh37@gmail.com>
2025-05-19 06:59:13 +02:00
Mayur Singal
2157337847
MINOR: Configurable account usage for incremental metadata extraction (#21182) 2025-05-19 10:15:29 +05:30
Mohit Tilala
4c0ce77756
Fix airbyte pipeline lineage extraction (#21151) 2025-05-19 10:14:33 +05:30
Mayur Singal
703118f2b5
MINOR: Disable Flaky superset tests (#21242) 2025-05-18 23:12:42 +05:30
Teddy
2e8e79ff0a
ISSUE #17170: handle oracle unique count (#21225)
* fix: handle oracle unique count

* fix: failing test case
2025-05-16 17:44:28 +05:30
Pere Menal-Ferrer
a7e2f33adc
feature/pii-column-classifier (#21200)
* Add PII Tag and Sensitivity Level enums.

* Add feature-extraction for PII classification tasks

* Add faker as test dependency

* Add unit tests for presidio tag extractor

* Add PIISensitivityTags enum and update sensitivity mapping logic

* Add Presidio utility functions for PII analysis

* Extend column name regexes for PII

* Add column name split

* Move pii algorithms to dedicated package

* Add tests for PAN, NIF, SSN entities

* Fix linting

* Add comment on why we need to set specific language to Presidio recognizers, as per PR suggestion.

* Fix version of faker to prevent flaky tests. Fix failing tests.

* Fix wrong import

---------

Co-authored-by: Pere Menal <pere.menal@getcollate.io>
2025-05-16 14:03:49 +02:00
harshsoni2024
9c9e885d77
issue-20074: s3 objects get paginated response (#21208) 2025-05-15 18:20:10 +05:30
harshsoni2024
35c1f5aead
issue-19890: PBI dataflow support (#21207) 2025-05-15 18:17:49 +05:30
Suman Maharana
2864e0f09d
Minor: Add sql query for dbt lineage with nodes (#21214) 2025-05-15 17:49:47 +05:30
Suman Maharana
f81ee52ec4
Chore Ingestion Tableau library change (#21076) 2025-05-15 17:48:39 +05:30
Teddy
cd6434dd73
ISSUE #21146 - Properly handle connection on sampler (#21186)
* fix: properly close connection on sampler ingestion

* fix: dangling connection test

* style: ran python linting

* fix: revert to 9
2025-05-15 12:21:01 +02:00
IceS2
87463df51d
Fixes #21095: Handle Conn Retry and implement is_disconnect for MSSQL (#21185)
* Handle Conn Retry and implement is_disconnect for MSSQL

* Change log to debug
2025-05-15 12:19:58 +02:00
Keshav Mohta
199aec8d3c
fix: bigquery root= in connection (#21154) 2025-05-14 19:43:20 +00:00
Mayur Singal
7abbb73ae2
MINOR: Handle udf definition fetch exceptions (#21188) 2025-05-15 00:01:03 +05:30
Mayur Singal
618897be85
Fix #1552: Improve Fetch Oracle View Definition Query (#21177) 2025-05-14 15:48:28 +05:30
Suman Maharana
b70eeac947
Fix #21030 - Snowflake Tags Not Reattached (#21141) 2025-05-14 15:41:58 +05:30
harshsoni2024
3b382c1bd9
issue-20737: datalake parquet different extensions (#21048) 2025-05-13 11:23:46 +05:30
harshsoni2024
234d302fbd
MINOR: parquet endpoint null case error (#21015) 2025-05-12 14:46:13 +05:30
Teddy
a853561d30
MINOR: data sample ingestion bigquery (#21074)
* fix: data sample ingestion bigquery

* style: ran python linting

* fix: flaky test in topology
2025-05-06 15:58:37 +02:00
Imri Paran
d91273a30d
Fix 20325: Trigger external apps with config (#20397)
* wip

* feat: trigger external apps with override config

- Added in openmetadata-airflow-apis functionality to trigger DAG with feature.
- Modified openmetadata-airflow-apis application runner to accept override config from params.
- Added overloaded runPipeline with `Map<String,Object> config` to allow triggering apps with configuration. We might want to expand this to all ingestion pipelines. For now it's just for apps.
- Implemented an example external app that can be used to test functionality of external apps. The app can be enabled by setting the `ENABLE_APP_HelloPipelines=true` environment variable.

* fix class doc for application

* fixed README for airflow apis

* fixes

* set HelloPipelines to disabled by default

* fixed basedpywright errors

* fixed app schema

* reduced airflow client runPipeline to an overload with null config
removed duplicate call to runPipeline in AppResource

* Update openmetadata-docs/content/v1.7.x-SNAPSHOT/developers/applications/index.md

Co-authored-by: Matias Puerta <matias@getcollate.io>

* deleted documentation file

---------

Co-authored-by: Matias Puerta <matias@getcollate.io>
2025-05-06 17:41:24 +07:00
Mayur Singal
2289e04a9f
Fix #1505: Fix limit reached logs (#21046) 2025-04-30 11:34:51 +05:30
Mayur Singal
9755662240
Fix #20902: Fix duplicate constraints error (#21037) 2025-04-30 11:34:35 +05:30
harshsoni2024
970f6fbc0b
pbi dashboard sourceurl fix (#21026) 2025-04-30 10:21:38 +05:30
Mayur Singal
b3caf5b5d1
Fix #20024: Fix get schema names logic for postgres (#21036) 2025-04-29 22:49:39 +05:30
Pere Miquel Brull
d901dd2948
FIX #16284 - Toggle if we want to raise workflow errors (#20969)
* FIX #16284 - Toggle if we want to raise workflow errors

* schema

* schema

* move prop

* fix

* move prop

* improve error handling

* Update openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/IngestionPipelineRepository.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/IngestionPipelineRepository.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Add the `Raise on Error` option to the ingestion schedule step

* Revert "Update openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/IngestionPipelineRepository.java"

This reverts commit 985b73513a59695c6bb39ad41c2d273bbf4e5d22.

* Update the tests

* Fix sonar issue

---------

Co-authored-by: Aniket Katkar <aniketkatkar97@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-29 08:19:13 +02:00
Ayush Shah
7dd987a799
feat: Add logging for table processing in DatalakeSource class (#20968) 2025-04-28 11:32:26 +05:30
Keshav Mohta
6d64fbf9c5
fix: service connection config not getting updated with multiple project ids after ingestion (#20975) 2025-04-25 14:25:50 +00:00
Teddy
63a55437ae
GEN-1412: Implement load test logic (#19155)
* feat: implemented load test logic

* style: ran python linting

* fix: added locust dependency in test

* fix: skip locust in 3.8 as not supported

* fix: update gcsfs version

* fix: revert gcsfs versioning

* fix: fix gcsfs version to 2023.10

* fix: dagster graphql and gx versions

* fix: dagster version to 1.8 for py8 compatibility

* fix: fix clickhouse to 0.2 as 0.3 requires SQA 2+

* fix: revert changes from main

* fix: revert changes compared to main
2025-04-24 16:08:38 +02:00
Keshav Mohta
0488bef060
fix: string type as get system datatype (#20922) 2025-04-24 15:40:58 +05:30
Teddy
209793f315
MINOR - Add support for GX 1.4 (#20934)
* fix: add support for GX 0.18.22 and GX 1.4.x

* fix: add  support for GX 0.18.22 and GX 1.4.x

* style: ran python linting

* fix: skip test if GX version is not installed
2025-04-24 11:55:04 +02:00
harshsoni2024
17dd182cbb
e2e fix (#20952) 2025-04-24 15:23:43 +05:30
Ashish Gupta
73aaa34b75
update the snapshot to 1.8.0 (#20925) 2025-04-24 10:46:36 +05:30
Teddy
75b7e463be
ISSUE #19175: Handle pk for snowflake in data diff (#19734)
* fix: handle pk for snowflake in data diff

* fix: trino failure
2025-04-23 12:15:42 +02:00
harshsoni2024
0d4e9e0e09
MINOR: REST minor fixes (#20907) 2025-04-22 15:10:15 +05:30
Keshav Mohta
1063e019ba
Fixes: Bigquery E2E (#20863) 2025-04-17 11:43:14 +05:30
Mayur Singal
88d8553084
Revert "MINOR: Improve UDF Lineage Processing & Better Logging Time & MultiProcessing (#20848)" (#20872)
This reverts commit 5ea9f22492749867f9ea53465b817f52bd383ca2.
2025-04-17 00:35:56 +05:30
Mayur Singal
5ea9f22492
MINOR: Improve UDF Lineage Processing & Better Logging Time & MultiProcessing (#20848) 2025-04-17 00:09:52 +05:30
Sasha Malahov
105ba064a9
MINOR: Kinesis missing nexttoken 2025-04-16 18:57:31 +05:30
Mayur Singal
654529ab7a
MINOR: Suppress Pydantic Warnings (#20851) 2025-04-16 16:44:14 +05:30
Akash Jain
0f6d0523d8
feat: Bump Versions to 1.7.0-SNAPSHOT on Main Branch (#20847)
* feat: Bump Versions to 1.7.0-SNAPSHOT on Main Branch

* fix(script): Add a condition for "-SNAPSHOT" in version update script
2025-04-16 15:21:01 +05:30
harshsoni2024
fb5af8ad7c
bigquery lib fix (#20849) 2025-04-16 08:04:26 +02:00
Keshav Mohta
1a6224824b
Fixes: BQ Multiple Project E2E (#20797)
* fix: bq e2e lineage and counts

* fix: bigquery multiple project classify

* fix: tests count from 19 to 17
2025-04-15 17:35:22 +05:30
Teddy
1edeb0baf8
MINOR: classification + test workflow for BQ multiproject (#20779)
* fix: classification + test workflow for BQ multiproject

* fix: deleted e2e test as handled from the UI

* fix: failing test case
2025-04-15 10:37:29 +02:00
Mayur Singal
3e4b8f5293
MINOR: Fix 'lr_sqlparser' referenced before assignment (#20823) 2025-04-15 13:00:04 +05:30
Mayur Singal
40ab1814c0
MINOR: Always Include DDL for Views (#20784) 2025-04-15 12:59:50 +05:30