810 Commits

Author SHA1 Message Date
Keshav Mohta
8a03b00ad2
Fixes #23010: BigQuery Project Selection In Profiler & AutoClassification Workflow (#23233)
* fix: added code for separate engine and session for each project in profiler and classification and refactor billing project approach

* fix: added entity.database check, bigquery sampling tests

* fix: system metrics logic when bigquery billing project is provided
2025-09-07 23:51:13 +05:30
Mohit Tilala
d1c646ed82 [Lineage] Fix cross services lineage changes of service_names to missed methods (#23240)
* Fix cross db changes of service_names to missed methods

* Handle string value passed to service_names

(cherry picked from commit 9b2b4d2452732c4efb5cf03632eeabbb1d85fe1c)
2025-09-04 15:35:39 +00:00
Mohit Tilala
aafc6d468f Fixes #22452: [Snowflake] Add custom host support for View in Snowflake source url (#23209)
(cherry picked from commit f2fd8a9107bdaaccdb6ca35f8515e1f273c3c1f7)
2025-09-03 09:01:49 +00:00
Mohit Tilala
537dc461fc Fixes #21895 #22363 #22369: Lineage improvements with multiprocessing, stored procedure level temp table processing and lineage filtering with db & schema (#22371)
* MINOR: Improve UDF Lineage Processing & Better Logging Time & MultiProcessing (#20848)

* Fix multiprocessing with better memory management and Airflow 2+ compatibility

* Add support for both multiprocessing and multithreading for relevant platforms

* Handle conflicting cross-db lineage changes of service_name parameter change

* Handle stored proc queries without caching all and increase the thread timeout times to cover 100% lineage

* Fix `get_table_query` inheritance and pylint

* Remove mocks from db_utils tests

* Better db_utils test and fix the service_names parameter in case of schema_fallback

---------

Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>
(cherry picked from commit 04a3639e47cacf68cc6732913b2c0c1cb15e495c)
2025-09-03 05:58:05 +00:00
Suman Maharana
d8ce507873 Add ssl support to hive (#22831)
* Add ssl support to hive

* Added missing ts files

* Added version to pure transport

* Added Tests

* fix tests add missing files

(cherry picked from commit 20e18d4f9f8fb6187a4a9610a793ff8ff5c134f8)
2025-09-02 21:17:18 +00:00
Mayur Singal
9eb6cfc04a MINOR: Add Unstructured Formats Support to GCS Storage Connector (#23158)
(cherry picked from commit 08ee62a1981ab242db02c119ed92edc58bca42d4)
2025-09-02 14:27:18 +00:00
Suman Maharana
57e4464e64 Fixes #22204 - Add support for sources key metadata fetch in dbt (#23003)
* Added support for sources key metadata fetch in dbt

* address comments

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Fixes

* fixed tests

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
(cherry picked from commit 30bceee58030590811b7ed95e0ad8ec4ae9c279b)
2025-09-02 04:53:42 +00:00
Ayush Shah
dd7682a3b5 Minor: Fix Databricks & Unity Catalog, Table or View Not found (#23014)
(cherry picked from commit 29bf750bde252c5a30eec4c7966217de5ab02274)
2025-08-29 02:17:46 +00:00
IceS2
06b5c10bcf FIXES #20807: Fix Oracle DataDiff and Change Oracle Connection to BaseConnection (#23020)
* Fix Oracle DataDiff and Change Oracle Connection to BaseConnection

* Add small unittest

* Fix Test

* Fix logic to avoid other engines denormalizing table/schema names

(cherry picked from commit a696fe0111171c3079c5840c28a00073fae25003)
2025-08-26 09:05:03 +00:00
Mohit Tilala
9f8b7fcf5a Fixes #22238: [SAP HANA] Add calculated view columns' formula parsing logic (#23017)
* Add calculated view columns' formula parsing logic with correct source reference

* Handle top level column formula parsing and pass formula expression in column lineage detail

---------

Co-authored-by: Suman Maharana <sumanmaharana786@gmail.com>
(cherry picked from commit 744494968e0e4d0266b42ea9a93a54ebb7ea6718)
2025-08-26 01:50:40 +00:00
Mayur Singal
155241605e
MINOR: Improve Databricks Profiler & Test Connection (#22732) 2025-08-22 17:35:49 +05:30
Keshav Mohta
da67de51cc Fix #18491: ingestion fails for Iceberg tables with nested partition column (#23031)
* fix: ingestion fails for Iceberg tables with nested partition column

* test: added test to cover nested partition column for iceberg

* refactor: used if-else in tablePartition check

* fix: partition_column_name & column_partition_type typo

(cherry picked from commit 2f655daedc1bb82fbb97bec0e42fca15a8bb7863)
2025-08-22 12:01:21 +00:00
Ayush Shah
c99edbe290 Fixes #21677: Refactor and enhance the entity name transformation logic (#22695) 2025-08-22 15:49:31 +05:30
Mohit Tilala
4f78c1d1b7 Fixes #22112: Snowflake schema tags inheritance (#22979)
* Add schema-level tags and tag inheritance support for snowflake

* Add tests for schema tag inheritance

* Lint fixes

(cherry picked from commit 26fedbaf0eccb774837791df6b3337643c72014d)
2025-08-20 04:24:09 +00:00
Mohit Tilala
3a22bb7b38 Fixes #22238: [SAP HANA] Correction of physical schema mapping and column lookup at each layer of calculation view (#22952)
(cherry picked from commit cc4b3574444fd41a2b46ac5f19666d1e96a18294)
2025-08-19 13:16:38 +00:00
Teddy
c69c21d82e ISSUE #1753 - Add Row Count to Custom SQL Test (#22697)
* feat: add count rows support for custom SQL

* style: ran python linting

* feat: added logic for partitioned custom sql row count

* migration: partitionExpression parameter

* chore: resolve conflicts

(cherry picked from commit d58b8a63d675e9bf91a2283a5f37702648cdab7f)
2025-08-19 10:25:15 +02:00
Ariel Schulz
8d98833622 Feature/1 fix and add lineage to exasol connector (#21399)
* Add lineage to Exasol connector

* Update test_connection to return TestConnectionResult

* Add exasol tests & dependencies to tests in setup.py

* Opensearch is required for testing, so add it there

* Modify metadata

* Update documentation for lineage

* Apply formatting changes to code

* Apply make py_format
2025-08-19 13:27:02 +05:30
Mayur Singal
f4127757c3 FIX #21180: Implement Cross Service Lineage (#22665) 2025-08-12 18:23:36 +05:30
Copilot
a4da357892 Add OpenAPI YAML format support for REST API ingestion (#22304)
* Initial plan

* Implement OpenAPI YAML support with backward JSON compatibility

Co-authored-by: harshach <38649+harshach@users.noreply.github.com>

* fix tests & lint

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: harshach <38649+harshach@users.noreply.github.com>
Co-authored-by: ulixius9 <mayursingal9@gmail.com>
2025-08-12 18:23:14 +05:30
harshsoni2024
ae1a156d76 issue-22425: PowerBI parse expression along with measure (#22870) 2025-08-12 17:07:21 +05:30
Sriharsha Chintalapani
08d03b548b Fix #1093: Add Grafana Support (#22571)
* Fix #1093: Add Grafana Support

* Update generated TypeScript types

* Grafana test fix

* Update

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Akash Verma <akashverma@Mac.lan>
2025-08-12 15:57:54 +05:30
Mayur Singal
fe28faa13f
MINOR: Add support for csv.gz in datalake (#22666)
* MINOR: Add support for csv.gz in datalake

* fileformat change

* Update generated TypeScript types

* pyformat
2025-08-01 17:39:19 +05:30
IceS2
7f8298d49e
Update DataLake and PostgreSQL connection (#22682) 2025-08-01 11:08:43 +02:00
MarkSaf17
58078f582c
[WIP] Issue 22627 (#22647)
* [ISSUE-22627] fix(dbt): Support documented owner configuration with backward compatibility

* typo

* added tests

---------

Co-authored-by: marsafronov <marsafronov@ecom.tech>
Co-authored-by: SumanMaharana <sumanmaharana786@gmail.com>
2025-07-31 11:02:18 +00:00
Suman Maharana
3a90b38a26
Fix: Tableau ca cert auth (#22041)
* Fix: Tableau ca cert auth

* py_format

* Added ssl tests

* fix lint errors
2025-07-30 09:38:47 +05:30
harshsoni2024
50428e2e7b
feat-22574: Datalake ingestion fix for larger files (#22575) 2025-07-29 20:40:19 +05:30
Suman Maharana
670dc53b46
Minor: fix tableau handle none entities (#22630)
* Minor: fix tableau handle none entities

* added tests
2025-07-29 13:58:11 +02:00
Mayur Singal
199e3b981c
Fix #14830: Ignore non current columns for iceberg tables for glue & athena (#22564) 2025-07-29 16:19:09 +05:30
Mayur Singal
cc9506db20
MINOR: Postgres Implement schema fallback (#21858)
* MINOR: Postgres Implement schema fallback

* missing sql_lineage file
2025-07-29 14:45:21 +05:30
Suman Maharana
54dcdc7d82
Fix #20689: Trino Column validation errors for highly complex fields (#22421)
* Fix: Trino Column validation errors for highly complex fields

* addressed copilot comms

* fixed tests

* fixed tests and addressed comms

* missed file
2025-07-28 11:11:44 +05:30
IceS2
bad772db39
FIX #22099: enable 'Column values to be in set' test case for boolean columns (#22491)
* fix(dq): enable 'Column values to be in set' test case for boolean columns

Add BOOLEAN to supportedDataTypes array in columnValuesToBeInSet.json
to allow boolean column validation with predefined allowed values.

This enables users to enforce strict true/false validation on boolean
columns directly at the column level, resolving issue #22099.

Co-authored-by: IceS2 <IceS2@users.noreply.github.com>

* Add tests to the new feature

* Add migrations and columnValuesToBeNotInSet

---------

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
Co-authored-by: IceS2 <IceS2@users.noreply.github.com>
2025-07-25 15:17:38 +02:00
Mayur Singal
37b10a102f
MINOR: Improve ometa logging (#22586) 2025-07-25 18:26:44 +05:30
Mayur Singal
b8db86bc4f
MINOR: Fix airflow ingestion for older version (#22581) 2025-07-25 18:22:33 +05:30
Sriharsha Chintalapani
b0586f849f
Fix #22511: k8s secret support for Secrets Manager (#22516)
* Fix #22511: k8s secret support for Secrets Manager

* Update generated TypeScript types

* address comments

* pylint fix

* fix java checkstyle

* improve inCluster description in schema

* fix failing tests

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: ulixius9 <mayursingal9@gmail.com>
Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>
2025-07-24 12:40:51 +02:00
Chirag Madlani
b098395602
Data contracts support for tables & Multi Domain Migration (#22108)
* WIP - MINOR - Rule Engine

* WIP - MINOR - Rule Engine

* WIP - MINOR - Rule Engine

* WIP - MINOR - Rule Engine

* rules

* rules

* rules

* fix retrieval by entity

* test dc

* test dc

* WIP: Data contract feature

* destructure component to its own files

* WIP contract tab

* update local

* fix test

* First iteration for multi domain support

* fix inheritance fields

* fix inheritance fields

* fix create interface

* fix few more tests

* fix indexing updates

* fix domain rel

* update domain --> domains

* merge

* fix merge

* fix csv tests and createEntity interface

* Update generated TypeScript types

* Trigger Build

* migrations

* fix tests

* fix tests

* fix tests

* Update generated TypeScript types

* Trigger Build

* handle drive service

* fix pg migration

* fix domains ref after merge and clean python tests

* Update generated TypeScript types

* fix merge domains

* format

* add missing migrations

* Update generated TypeScript types

* tests

* Update generated TypeScript types

* Trigger Build

* tests

* tests

* fix py test

* migrate domain to domains and fix compilation errors

* fix domain assignment

* fix domain spec

* fix py tests

* fix data product creation issue

* fix domain tests

* fix bulk import

* fix tests

* fix tests

* fix query and domain migration

* fix py test

* fix playwrights

* fix getEntitiesWithDisplayName indexing quotes

* fix domain propagation tests

* fix domain propagation

* Fix patch api

* fix domain schema build edit playwright

* fix test

* fix test

* fix domain selection issue and console errors

* quick fix landing page changes

* fix remaining tests

* fix ui tests

* Fix adding data products

* format

---------

Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sriharsha Chintalapani <harsha@getcollate.io>
Co-authored-by: Sriharsha Chintalapani <harshach@users.noreply.github.com>
2025-07-22 09:34:50 +02:00
Mohit Tilala
64ec471e52
Fixes #22363 #22369: Stored procedure temp table processing and lineage filtering with db & schema (#22416)
* Process temp table graph in stored procedure processor and add db/schema filtering on lineage

* Add tests for stored procedure lineage processing

* Fix tests and py_format

* Fix the filters and log stored proc query count info
2025-07-18 12:32:22 +05:30
Suman Maharana
9838278ac4
Add: Schema and Database Mark Deletion (#22088)
* Added Schema and Database Mark Deletion

* removed unnecessary changes

* fixed marked deleted databases

* Added to all db connectors

* Added generated types

* Added tests
2025-07-15 16:26:46 +02:00
Ayush Shah
fe2caf7a5d
MINOR: Enhance patch request handling by adding 'skip_on_failure' parameter (#22142)
* Enhance patch request handling by adding 'skip_on_failure' parameter

* Introduced 'skip_on_failure' option in build_patch and OMetaPatchMixin methods to control behavior on patch operation failures.
* Updated documentation to reflect the new parameter and its default value.
* Improved error handling to log warnings instead of raising exceptions when 'skip_on_failure' is set to True.

* fix: add tests for patch request with skip on failure

* refactor: streamline mock patching and improve test readability in patch request tests

* Consolidated import statements for unittest mock.
* Enhanced readability by reducing line breaks and simplifying mock patching syntax.
* Ensured consistent use of commas in function calls for clarity.
* Updated tests to maintain functionality while improving code style.

* fix: improve error handling in patch operations

* Enhanced logging for patch operation failures in both build_patch and OMetaPatchMixin methods.
* Added detailed entity information in warning and error messages to aid in debugging.
* Ensured consistent behavior when 'skip_on_failure' is set, providing clearer feedback on operation outcomes.

* fix: clean up whitespace in patch request error handling

* Removed unnecessary whitespace in the build_patch function to improve code readability.
* Ensured consistent formatting in warning and error messages for better clarity during logging.

* fix: enhance error handling and improve test assertions in patch request

* Updated the condition for checking 'changeDescription' in the _remove_change_description function for better clarity.
* Modified exception handling in tests to raise RuntimeError instead of a generic Exception, providing more specific error feedback.
* Improved assertions in tests to check for the presence of error messages, enhancing the robustness of error handling verification.
* Adjusted test cases to reflect changes in expected patch operation counts and ensure accurate validation of patch operations.

* fix: enhance patch operation with skip_on_failure handling

* Added 'skip_on_failure' parameter to OMetaPatchMixin methods to control behavior on patch failures.
* Improved error handling to log warnings and provide detailed feedback when patch operations are skipped.
* Updated tests to verify the new behavior of skipping failures and improved assertions for clarity.
2025-07-14 12:33:17 +05:30
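Note on fe2caf7a5d (#22142): the commit message describes a 'skip_on_failure' option that logs a warning instead of raising when a patch operation fails. A minimal sketch of that behaviour follows. It is not the actual build_patch or OMetaPatchMixin code: the function name send_patch, the send callable, the default value of True, and the choice of RuntimeError for the non-skipping path are all assumptions made for illustration.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger(__name__)


def send_patch(
    send: Callable[[str, list], dict],  # hypothetical transport: (entity_id, operations) -> response
    entity_id: str,
    operations: list,
    skip_on_failure: bool = True,  # assumed default; the commit does not state the real one
) -> Optional[dict]:
    """Apply a patch; on failure either log and return None, or raise."""
    try:
        return send(entity_id, operations)
    except Exception as exc:
        if skip_on_failure:
            # Keep going: report the failing entity and swallow the error.
            logger.warning("Skipping failed patch for entity %s: %s", entity_id, exc)
            return None
        # Strict mode: surface the failure with the entity in the message.
        raise RuntimeError(f"Failed to patch entity {entity_id}: {exc}") from exc
```

With skip_on_failure=True a caller can continue a bulk run and treat a None result as "not patched"; with False the first failure stops the run.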
Mayur Singal
47b20a5f2d
MINOR: Fix databricks default schema issue (#22254) 2025-07-09 11:50:50 -07:00
Mohit Tilala
a6c0261728
Add lineage stored procedure and view filter pattern support (#22223)
* Add lineage stored procedure and view filter pattern support

* Update generated TypeScript types

* Add tests for lineage filter pattern

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sriharsha Chintalapani <harshach@users.noreply.github.com>
2025-07-08 16:32:25 -07:00
Mohit Tilala
ebfce5ba7b
Handle quoted entity names in masking queries (#22174)
Co-authored-by: Sriharsha Chintalapani <harshach@users.noreply.github.com>
2025-07-08 15:09:00 -07:00
Keshav Mohta
6e40f976e7
Fix #20145: Implemented Prefix For Dashboard Service (#21585)
* feat: implemented microstrategy lineage & dbServicePrefix

* feat: added dbServicePrefixes support in other dashboards

* fix: test_metabase and powerbi extra code remove

* fix: python checkstyle

* refactor: added prefix support for other connectors - superset, tableau, etc

* refactor: added migration for prefix change and fix dbServicePrefixes field description

* refactor: added prefix changes in superset db source

* doc: add prefix in tableau doc

* fix: typescript files and postgres migration for prefix

* fix: moved migration in 1.8.2

---------

Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2025-07-08 18:54:35 +02:00
Mayur Singal
573e3bfc21
MINOR: Improve Array Sampler for UC & DBX (#22155)
* MINOR: Improve Array Sampler for UC & DBX

* make log debug

* address comments
2025-07-08 14:25:20 +02:00
Mayur Singal
2fcf3281d8
MINOR: Fix snowflake map key type error (#22205) 2025-07-08 14:44:05 +05:30
IceS2
1260a0600a
MINOR: Update Snowflake Connection (#22167)
* Update Snowflake Connection

* Extracting needed methods
2025-07-07 16:38:22 +02:00
mgorsk1
3f01be6756
feat: derive downstream lineage from DBT exposures (#21992)
* 🎉 Init

* replace TARGET with EXPOSURE

* refactor and document

* add docs

* handle missing type/entity not matching

* linter

* update docs

* refactor for using label for communicating FQN as name field cannot contain special characters other than underscore. Storing dots in the name works for now but there is a deprecation warning and it will fail in the future.

* improve docs

* improve docs

* improve logging

* refactor for usage of meta.open_metadata_fqn

* linting

* update docs

* update docs

* fix docs

* 🎉 Add tests
2025-07-07 16:34:33 +02:00
Ferjani Nasraoui
b0e1a136cf
Fixes #21106: Support owner extraction from serialized Airflow DAGs (#22071)
* fix(airflow): correctly extract owners from serialized Airflow DAGs

Airflow serialization format wraps tasks under `__var` and `__type`.
Previously, the OpenMetadata Airflow connector failed to extract task owners properly in this format.

This patch:
- Flattens `__var` when parsing task owners
- Falls back to `default_args["owner"]` if no task-level owner is explicitly present
- Ensures correct DAG owner is picked as the most common task owner
- Handles compatibility with older Airflow versions

Fixes: #21106

* test(airflow): add tests for owner extraction from serialized Airflow DAGs

Adds new test cases to validate owner extraction logic:
- Owners from serialized task format (`__var`)
- Fallback to `default_args['owner']` if task owners are missing
- Resolution of most common owner
- Compatibility with unstructured or missing owners

* remove test version specific comment

* simplify comments and warnings

* fix return statement

* fixing formatting

* adding handling of default args

* fixing and adding more tests
2025-07-03 14:21:36 +05:30
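Note on b0e1a136cf (#22071): the commit message lists three steps for owner extraction from serialized DAGs: flatten the `__var` wrapper, fall back to `default_args["owner"]`, and pick the most common task owner. A minimal sketch of those steps follows. It is not the connector's code: the helper names unwrap and most_common_owner and the exact shape of the input dictionaries are assumptions made for illustration.

```python
from collections import Counter
from typing import Any, Optional


def unwrap(node: Any) -> Any:
    """Return the payload of a node serialized as {"__var": ..., "__type": ...}."""
    if isinstance(node, dict) and "__var" in node:
        return node["__var"]
    return node


def most_common_owner(tasks: list, default_args: Optional[dict] = None) -> Optional[str]:
    """Pick the DAG owner as the most frequent task owner (hypothetical helper)."""
    defaults = unwrap(default_args or {})
    default_owner = defaults.get("owner") if isinstance(defaults, dict) else None

    owners = []
    for task in tasks:
        payload = unwrap(task)
        if not isinstance(payload, dict):
            continue  # tolerate unstructured task entries
        # Task-level owner wins; otherwise fall back to default_args["owner"].
        owner = payload.get("owner") or default_owner
        if owner:
            owners.append(owner)

    if not owners:
        return None
    return Counter(owners).most_common(1)[0][0]


tasks = [
    {"__var": {"owner": "alice"}, "__type": "operator"},
    {"__var": {"owner": "alice"}, "__type": "operator"},
    {"owner": "bob"},                     # older, unwrapped format
    {"__var": {}, "__type": "operator"},  # no owner: uses the default
]
dag_owner = most_common_owner(tasks, {"owner": "carol"})
```

Here dag_owner is "alice": two tasks name her, one names "bob", and the ownerless task falls back to "carol".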
Teddy
29450d1104
feat: add support for DBX system metrics (#22044)
* feat: add support for DBX system metrics

* feat: add support for DBX system metrics

* fix: added WRITE back

* fix: failing test cases

* fix: failing test
2025-07-02 08:54:16 +02:00
Suman Maharana
e36e5da26e
Added Databricks pipeline Lineage (#22014) 2025-06-30 10:41:22 +05:30
harshsoni2024
10b377590c
qlikcloud get script tables (#22022) 2025-06-30 10:36:57 +05:30