1234 Commits

Author SHA1 Message Date
Mohit Tilala
d1e60acd2a
[SAP HANA] Prevent exponential processing lineage parsing and use full name for filtering (#23484)
* Prevent exponential processing lineage parsing

* Use full name of views for filtering

* pylint fix - isort
2025-09-22 19:46:34 +05:30
Keshav Mohta
1a67e4fb7d
Feature: MariaDB Stored Procedures and Functions Support #23422 2025-09-18 17:59:39 +05:30
Akash Verma
da5dab7fef
Fixes #23388: Handle string and dict types for Metabase dataset_query field (#23417)
* Handle string and dict types for Metabase dataset_query field

* Added tests

---------

Co-authored-by: Akash Verma <akashverma@Mac.lan>
2025-09-16 16:57:08 -07:00
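Note on #23417 above: the commit handles a Metabase `dataset_query` field that can arrive either as a dict or as a string. A minimal sketch of that kind of normalization follows, assuming the string form is JSON-encoded. The helper name `normalize_dataset_query` is hypothetical and is not taken from the OpenMetadata code base.

```python
import json
from typing import Optional, Union


def normalize_dataset_query(raw: Union[str, dict, None]) -> Optional[dict]:
    """Return dataset_query as a dict, or None if it cannot be interpreted.

    Hypothetical helper: assumes a string value is a JSON-encoded object.
    """
    if raw is None:
        return None
    if isinstance(raw, dict):
        # Already structured; use as-is.
        return raw
    if isinstance(raw, str):
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            # Not valid JSON; treat as unusable rather than failing ingestion.
            return None
        # Only a JSON object is a meaningful query payload here.
        return parsed if isinstance(parsed, dict) else None
    return None
```

Returning `None` for unparseable input, instead of raising, is a design choice of this sketch; the actual fix may behave differently.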
Sriharsha Chintalapani
cf7931ee3b
Add logging endpoint into S3 (#22533)
* Add logging endpoint into S3

* Update generated TypeScript types

* Stream Ingestion logs to S3

* Update generated TypeScript types

* Address comments

* Update generated TypeScript types

* create logs mixin, use clients to stream logs

* centralize logs sending into mixin

* use StreamableLogHandlerManager instead global handler

* improve condition

* remove example workflow file

* formatting changes

* fix tests and format

* tests, checkstyle fix

* minor changes

* reformat code

* tests fix

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Aniket Katkar <aniketkatkar97@gmail.com>
Co-authored-by: harshsoni2024 <harshsoni2024@gmail.com>
Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2025-09-15 07:22:25 -07:00
Suman Maharana
39cb165164
Feat: show dbt project name (#23044)
* Feat: show dbt project name

* Update generated TypeScript types

* added dbtSourceProject in data asset header properties

* Added tests

* Addressed comments

* Update generated TypeScript types

* move from dataAssetHeader to the dbt tab itself

* added unit test for added code

* test name change

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Ashish Gupta <ashish@getcollate.io>
2025-09-10 11:23:28 +02:00
Suman Maharana
000aaa63f1
Fix tableau e2e count error (#23287) 2025-09-10 01:52:34 +05:30
IceS2
8177e529bc
FIXES #23220: Add cardinality metric for string and enum (#23052)
* Implement Cardinality Metric for String and Enum

* Add Unit Tests

* Update generated TypeScript types

* Update ingestion/src/metadata/profiler/metrics/hybrid/cardinality_distribution.py

Co-authored-by: Teddy <teddy.crepineau@gmail.com>

* Fix CTE to simplify it to work with sqlite

* Fix CTE to simplify it to work with sqlite

* Update generated TypeScript types

* Update generated TypeScript types

* Add 'cardinalityDistribution' metric to profiler configuration

* Update generated TypeScript types

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Teddy <teddy.crepineau@gmail.com>
Co-authored-by: Shailesh Parmar <shailesh.parmar.webdev@gmail.com>
2025-09-09 16:38:53 +02:00
Teddy
1ef191a2aa
ISSUE #1534 - Profiler Refactor for Metadata Extraction Application (#23200)
* feat: added exporter app config

* refactor: added entityprofile resource & added backward compatibility to existing API

* feat: added tests to get_profile_data_by_type

* feat: remove non supported event types

* chore: added migrations to 1.9.7

* chore: added application creation readme

* chore: move migrations to 1.9.8

* fix: failing java test

* style: ran java linting
2025-09-05 13:07:04 +02:00
Keshav Mohta
103857f90c
Fixes #23010 #: BigQuery Project Selection In Profiler & AutoClassification Workflow (#23233)
* fix: added code for separate engine and session for each project in profiler and classification and refactor billing project approach

* fix: added entity.database check, bigquery sampling tests

* fix: system metrics logic when bigquery billing project is provided
2025-09-05 14:09:14 +05:30
Mohit Tilala
9b2b4d2452
[Lineage] Fix cross services lineage changes of service_names to missed methods (#23240)
* Fix cross db changes of service_names to missed methods

* Handle string value passed to service_names
2025-09-04 20:38:05 +05:30
Mohit Tilala
f2fd8a9107
Fixes #22452: [Snowflake] Add custom host support for View in Snowflake source url (#23209) 2025-09-03 14:13:03 +05:30
Ram Narayan Balaji
5cb33ce78a
Implementation of Adding Entity Status and Reviewers to assets (#22904)
* Initial Implementation of Adding Status and Reviewers to assets for workflows

* Update generated TypeScript types

* Copilot Review Comments Addressed

* Removed DataProduct Reviewer Inheritance as it is irrelevant

* Commit: Classification has status and reviewers; DataContract uses the same status enums; changed the logic to use APPROVED instead of Active; DataContract can have a null status, as seen in tests; changed Workflow to use workflowStatus instead of status because it conflicted with the approval status; fixed tests

* Default for reviewers is null

* Default for reviewers is createSchema

* Addressed CoPilots comments

* Update generated TypeScript types

* Workflow status to workflowStatus in db and migrations

* Revert "Workflow status to workflowStatus in db and migrations"

This reverts commit 676e8789358654bc6f980f855c372f33c22fc40b.

* Changed status to entityStatus in the schema files

* Java Implementation of Default Status, Search Client improvements and Test fixes and new tests

* Adding entityStatus and reviewers in the searchIndex mappings and common attributes

* Data Migration scripts to change the glossaryTerm and dataContract structure

* Update generated TypeScript types

* Fixed zh/spreadsheet index json error

* Fix Postgres migration script

* Changed the entityStatus.json to status.json
Removed the duplicates of entityStatus in the indexMapping
Modified the sample data to take in EntityStatus.Approved instead of ContractStatus.Active

* Update generated TypeScript types

* dummy commit

* Fix UI Build Issues with the New EntityStatus
Fix py tests

* Migrations for all the entities that need entityStatus

* Update generated TypeScript types

* Removed Post Migration scripts

* Fix UI and py for entityStatus

* Update generated TypeScript types

* Fix: DataContractResourceTest

* Fix UI and py for importing entityStatus

* UI to show and fetch Reviewers

* cleanup

* Removed Overridden SetDefaultStatus in GlossaryTermRepository

* Removed unnecessary validation

* Added entityStatus in search_entity_index_mapping.json

* Fixed DataContractResourceTest

* mvn spotless apply and fix migration scripts

* fix tests

* fix type error

* fix advanced search tests

* Status comparison using enums and supportsStatus to supportsEntityStatus

* mvn spotless apply

* fix merge conflict

* update entity status

* fix tests

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Karan Hotchandani <33024356+karanh37@users.noreply.github.com>
Co-authored-by: karanh37 <karanh37@gmail.com>
2025-09-03 12:49:45 +05:30
Mohit Tilala
04a3639e47
Fixes #21895 #22363 #22369: Lineage improvements with multiprocessing, stored procedure level temp table processing and lineage filtering with db & schema (#22371)
* MINOR: Improve UDF Lineage Processing & Better Logging Time & MultiProcessing (#20848)

* Fix multiprocessing with better memory management and Airflow 2+ compatibility

* Add support for both multiprocessing and multithreading for relevant platforms

* Handle conflicting cross-db lineage changes of service_name parameter change

* Handle stored proc queries without caching all and increase the thread timeout times to cover 100% lineage

* Fix `get_table_query` inheritance and pylint

* Remove mocks from db_utils tests

* Better db_utils test and fix the service_names parameter in case of schema_fallback

---------

Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>
2025-09-03 11:26:14 +05:30
Suman Maharana
20e18d4f9f
Add ssl support to hive (#22831)
* Add ssl support to hive

* Added missing ts files

* Added version to pure transport

* Added Tests

* fix tests add missing files
2025-09-02 20:13:30 +05:30
Mayur Singal
08ee62a198
MINOR: Add Unstructured Formats Support to GCS Storage Connector (#23158) 2025-09-02 18:22:39 +05:30
Suman Maharana
30bceee580
Fixes #22204 - Add support for sources key metadata fetch in dbt (#23003)
* Added support for sources key metadata fetch in dbt

* address comments

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Fixes

* fixed tests

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-09-02 10:22:15 +05:30
Pere Miquel Brull
abcdc4e3d6
MINOR - Domain Independent DP Rule (#23067)
* MINOR - Domain Independent DP Rule

* handle DP

* Handle DP

* add migration

* improve rule mgmt

* improve rule mgmt

* add test for bulk op

* fix test

* handle in bulk

---------

Co-authored-by: sonika-shah <58761340+sonika-shah@users.noreply.github.com>
2025-08-29 17:28:29 +02:00
Ayush Shah
29bf750bde
Minor: Fix Databricks & Unity Catalog, Table or View Not found (#23014) 2025-08-29 07:46:23 +05:30
IceS2
a696fe0111
FIXES #20807: Fix Oracle DataDiff and Change Oracle Connection to BaseConnection (#23020)
* Fix Oracle DataDiff and Change Oracle Connection to BaseConnection

* Add small unittest

* Fix Test

* Fix logic to avoid other engines denormalizing table/schema names
2025-08-26 11:03:40 +02:00
Mohit Tilala
744494968e
Fixes #22238: [SAP HANA] Add calculated view columns' formula parsing logic (#23017)
* Add calculated view columns' formula parsing logic with correct source reference

* Handle top level column formula parsing and pass formula expression in column lineage detail

---------

Co-authored-by: Suman Maharana <sumanmaharana786@gmail.com>
2025-08-26 07:19:11 +05:30
Keshav Mohta
2f655daedc
Fix #18491: ingestion fails for Iceberg tables with nested partition column (#23031)
* fix: ingestion fails for Iceberg tables with nested partition column

* test: added test to cover nested partition column for iceberg

* refactor: used if-else in tablePartition check

* fix: partition_column_name & column_partition_type typo
2025-08-22 17:25:59 +05:30
Suman Maharana
582d5eb7d7
Update: Tableau e2e Tests (#23026) 2025-08-21 09:00:23 +05:30
Ayush Shah
f19c0be59e
Fixes #21677: Refactor and enhance the entity name transformation logic (#22695) 2025-08-21 08:43:33 +05:30
Mohit Tilala
26fedbaf0e
Fixes #22112: Snowflake schema tags inheritance (#22979)
* Add schema-level tags and tag inheritance support for snowflake

* Add tests for schema tag inheritance

* Lint fixes
2025-08-20 09:52:44 +05:30
Mohit Tilala
cc4b357444
Fixes #22238: [SAP HANA] Correction of physical schema mapping and column lookup at each layer of calculation view (#22952) 2025-08-19 18:45:06 +05:30
Teddy
d58b8a63d6
ISSUE #1753 - Add Row Count to Custom SQL Test (#22697)
* feat: add count rows support for custom SQL

* style: ran python linting

* feat: added logic for partitioned custom sql row count

* migration: partitionExpression parameter

* chore: resolve conflicts
2025-08-19 06:40:49 +02:00
harshsoni2024
a7afad466a
e2e tableau use pat creds (#22834) 2025-08-15 11:34:45 +05:30
Copilot
8cc9d2af71
Add OpenAPI YAML format support for REST API ingestion (#22304)
* Initial plan

* Implement OpenAPI YAML support with backward JSON compatibility

Co-authored-by: harshach <38649+harshach@users.noreply.github.com>

* fix tests & lint

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: harshach <38649+harshach@users.noreply.github.com>
Co-authored-by: ulixius9 <mayursingal9@gmail.com>
2025-08-12 18:22:41 +05:30
Copilot
42bfd65a15
Fix typos in OpenMetadata documentation (#22899) 2025-08-12 17:27:40 +05:30
Mayur Singal
dc4c75ded4
FIX #21180: Implement Cross Service Lineage (#22665) 2025-08-12 16:54:40 +05:30
harshsoni2024
fceb9f3c00
issue-22425: PowerBI parse expression along with measure (#22870) 2025-08-12 10:17:37 +05:30
Sriharsha Chintalapani
15b92735b9
Fix #1093: Add Grafana Support (#22571)
* Fix #1093: Add Grafana Support

* Update generated TypeScript types

* Grafana test fix

* Update

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Akash Verma <akashverma@Mac.lan>
2025-08-11 19:39:39 +05:30
harshsoni2024
42572747ed
MINOR: e2e fixes (#22787) 2025-08-06 21:11:02 +00:00
Ariel Schulz
d31e2d8ba0
Feature/1 fix and add lineage to exasol connector (#21399)
* Add lineage to Exasol connector

* Update test_connection to return TestConnectionResult

* Add exasol tests & dependencies to tests in setup.py

* Opensearch is required for testing, so add it there

* Modify metadata

* Update documentation for lineage

* Apply formatting changes to code

* Apply make py_format
2025-08-06 23:49:38 +05:30
Mayur Singal
00b6da5b84
MINOR: Improve Databricks Profiler & Test Connection (#22732) 2025-08-06 00:41:11 +05:30
harshsoni2024
1deb5adeb5
MINOR: fix e2e tests (#22723) 2025-08-04 19:00:56 +05:30
Mayur Singal
fe28faa13f
MINOR: Add support for csv.gz in datalake (#22666)
* MINOR: Add support for csv.gz in datalake

* fileformat change

* Update generated TypeScript types

* pyformat
2025-08-01 17:39:19 +05:30
IceS2
7f8298d49e
Update DataLake and PostgreSQL connection (#22682) 2025-08-01 11:08:43 +02:00
MarkSaf17
58078f582c
[WIP] Issue 22627 (#22647)
* [ISSUE-22627] fix(dbt): Support documented owner configuration with backward compatibility

* typo

* added tests

---------

Co-authored-by: marsafronov <marsafronov@ecom.tech>
Co-authored-by: SumanMaharana <sumanmaharana786@gmail.com>
2025-07-31 11:02:18 +00:00
Pere Miquel Brull
dfe3fd6357
MINOR - Data Contract Validation (#22541) 2025-07-30 23:01:27 +02:00
Suman Maharana
3a90b38a26
Fix: Tableau ca cert auth (#22041)
* Fix: Tableau ca cert auth

* py_format

* Added ssl tests

* fix lint errors
2025-07-30 09:38:47 +05:30
harshsoni2024
50428e2e7b
feat-22574: Datalake ingestion fix for larger files (#22575) 2025-07-29 20:40:19 +05:30
Suman Maharana
670dc53b46
Minor: fix tableau handle none entities (#22630)
* Minor: fix tableau handle none entities

* added tests
2025-07-29 13:58:11 +02:00
Mayur Singal
199e3b981c
Fix #14830: Ignore non current columns for iceberg tables for glue & athena (#22564) 2025-07-29 16:19:09 +05:30
Mayur Singal
cc9506db20
MINOR: Postgres Implement schema fallback (#21858)
* MINOR: Postgres Implement schema fallback

* missing sql_lineage file
2025-07-29 14:45:21 +05:30
Suman Maharana
54dcdc7d82
Fix #20689: Trino Column validation errors for highly complex fields (#22421)
* Fix: Trino Column validation errors for highly complex fields

* addressed copilot comments

* fixed tests

* fixed tests and addressed comments

* missed file
2025-07-28 11:11:44 +05:30
IceS2
bad772db39
FIX #22099: enable 'Column values to be in set' test case for boolean columns (#22491)
* fix(dq): enable 'Column values to be in set' test case for boolean columns

Add BOOLEAN to supportedDataTypes array in columnValuesToBeInSet.json
to allow boolean column validation with predefined allowed values.

This enables users to enforce strict true/false validation on boolean
columns directly at the column level, resolving issue #22099.

Co-authored-by: IceS2 <IceS2@users.noreply.github.com>

* Add tests to the new feature

* Add migrations and columnValuesToBeNotInSet

---------

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
Co-authored-by: IceS2 <IceS2@users.noreply.github.com>
2025-07-25 15:17:38 +02:00
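Note on #22491 above: the commit body describes letting boolean columns be validated against a predefined set of allowed values. The sketch below only illustrates what such a check computes. The function `count_in_set` is hypothetical and is not the OpenMetadata validator; skipping NULLs is also an assumption of the sketch.

```python
from typing import Iterable, Optional, Set


def count_in_set(values: Iterable[Optional[bool]], allowed: Set[bool]) -> int:
    """Count non-null boolean values that belong to the allowed set.

    Hypothetical illustration: NULLs (None) are skipped rather than
    counted as matches or failures.
    """
    return sum(1 for v in values if v is not None and v in allowed)


column = [True, False, None, True]
only_true = count_in_set(column, {True})          # counts the two True values
any_bool = count_in_set(column, {True, False})    # counts every non-null value
```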
Ayush Shah
1e8e38f2ca
MINOR: Custom properties Data types fix (#22342) 2025-07-25 18:39:53 +05:30
Mayur Singal
37b10a102f
MINOR: Improve ometa logging (#22586) 2025-07-25 18:26:44 +05:30
Mayur Singal
b8db86bc4f
MINOR: Fix airflow ingestion for older version (#22581) 2025-07-25 18:22:33 +05:30