Sriharsha Chintalapani
fc7412f6dd
Add Timescale Connector ( #23665 )
...
* Add Timescale Connector
* Update generated TypeScript types
* Add UI changes for the Timescale
* lineage, usage and java
* Add beta tag
* update logo
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Aniket Katkar <aniketkatkar97@gmail.com>
Co-authored-by: Akash Verma <akashverma@Mac.lan>
2025-10-03 19:00:59 -07:00
Mohit Tilala
b15dc8fe42
Add better handling of no columns found/permission issue exceptions ( #23695 )
2025-10-03 21:07:16 +05:30
Keshav Mohta
3d49b6689d
Fixes #23356 : Databricks & UnityCatalog OAuth and Azure AD Auth ( #23561 )
...
* feat: databricks oauth and azure ad auth setup
* refactor: add auth type changes in databricks.md
* fix: test after oauth changes
* refactor: unity catalog connection to databricks connection code
* feat: added oauth and azure ad for unity catalog
* fix: unitycatalog tests, doc & required type in connection.json
* fix: generated tx files
* fix: exporter databricksConnection file
* refactor: unitycatalog example file
* fix: usage example files
* fix: unity catalog sqlalchemy connection
* fix: unity catalog client headers
* refactor: make common auth.py for dbx and unitycatalog
* fix: auth functions import
* fix: test unity catalog tags as None
* fix: type hinting and sql migration
* fix: migration for postgres
2025-10-03 19:53:19 +05:30
harshsoni2024
ea54b6b883
MINOR: datalake column subfields fix ( #23576 )
2025-10-03 16:13:10 +05:30
Akash Verma
06453a925d
Fix #21093 : Update test connection improvements ( #23516 )
...
* Update test connection improvements
* Update queries
* checkstyle
* fix test failure
---------
Co-authored-by: Akash Verma <akashverma@Akashs-MacBook-Pro-2.local>
2025-10-03 13:50:46 +05:30
Akash Verma
5bb2924a6a
Fix #16081 : Add support for SQL Server hierarchyid, geography, and geometry types ( #23527 )
2025-10-03 11:46:01 +05:30
Akash Verma
4d68fe7a10
feat: Add ML model lineage support ( #23494 )
2025-10-03 11:38:41 +05:30
Suman Maharana
c8055576ba
Fixes #21686 : Add missing includeOwners check in dashboard services ( #22514 )
2025-10-03 10:53:25 +05:30
Keshav Mohta
48ff77c917
Fixes: MF4 Import Error ( #23659 )
...
* fix: asammdf and avro import error
* fix: mf4 import only
* test: fix mf4 test
2025-10-01 20:08:45 +05:30
Eugenio
5da2d32b34
Use recognizer in classification ( #23628 )
...
* Refactor presidio utils
Extract the spacy model functionality from the analyzer building function
* Added a new `TagClassifier`
This classifier uses tags to dynamically build presidio `RecognizerRegistry`s
* Added a new `TagProcessor`
This processor uses `TagClassifier` to label a column based on the tags' recognizers
* Create `TagProcessor` based on workflow configuration
* Create decorator to apply threshold to recognizers
This is so that we can apply thresholds on recognizer results without subclassing or having to keep a map between the presidio recognizer and the recognizer configuration
* Fix broken test
2025-10-01 14:43:28 +02:00
Eugenio
dff2b394d5
Fix classification scoring ( #23523 )
...
* Add `reason` property to `TagLabel`
This is to understand what score was used for selecting the entity
* Build `TagLabel`s with `reason`
* Increase `PIIProcessor._tolerance`
This is so we correctly filter out low scores from classifiers while still maintaining the normalization that filters out confusing outcomes.
e.g., an output with scores 0.3, 0.7 and 0.75 would first filter out the 0.3 and then discard the other two because they are both relatively high results.
* Make database and DAO changes needed to persist `TagLabel.reason`
* Update generated TypeScript types
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-10-01 12:11:14 +00:00
Keshav Mohta
6b7262a8ea
Feature: MF4 File Reader ( #23308 )
...
* feat: mf4 file reader
* refactor: removed schema_from_data implementation
* test: added tests for mf4 files
2025-10-01 11:19:00 +02:00
Pere Miquel Brull
375e001dd9
MINOR - Fix S3 logging from ingestion pipelines ( #23590 )
...
* MINOR - Fix S3 logging from ingestion pipelines
* Update generated TypeScript types
* config
* update s3 configurations for streamable logs
* Update generated TypeScript types
* update s3 configurations for streamable logs
* update s3 configurations for streamable logs
* update s3 configurations for streamable logs
* SSE off by default
* Update log retrieval to use s3 if ingestion runner has streamable logs enabled
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Pablo Takara <pjt1991@gmail.com>
2025-10-01 09:44:17 +02:00
Ayush Shah
dd99ab5678
feat: Add Unity Catalog data diff module to use DBX connection instead of workspaceclient ( #23404 )
2025-09-30 20:56:54 +05:30
Sriharsha Chintalapani
18677afd39
Add support for Tags customizable rules, capturing feedback ( #23289 )
...
* Add support for translations in multi lang
* Add Tag Feedback System
* Update generated TypeScript types
* Fix typing issues and add tests to recognizer factory
* Updated `TagResourceTest.assertFieldChange` to fix broken test
This is because change description values had been serialized into strings and, for some reason, the keys ended up in a different order. So instead of performing a string comparison, we compare the values as JSON.
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Eugenio Doñaque <eugenio.donaque@getcollate.io>
2025-09-30 07:17:18 +02:00
Sriharsha Chintalapani
bb1395fc72
Implement Modern Fluent API Pattern for OpenMetadata Java Client ( #23239 )
...
* Implement Modern Fluent API Pattern for OpenMetadata Java Client
* Add Lineage, Bulk, Search static methods
* Add all API support for Java & Python SDKs
* Add Python SDKs and mock tests
* Add Fluent APIs for sdks
* Add Fluent APIs for sdks
* Add Fluent APIs for sdks, support async import/export
* Remove unnecessary scripts
* fix py checkstyle
* fix tests with new plural form sdks
* Fix tests
* remove examples from python sdk
* remove examples from python sdk
* Fix type check
* Fix pyformat check
* Fix pyformat check
* fix python integration tests
* fix pycheck and pytests
* fix search api pycheck
* fix pycheck
* fix pycheck
* fix pycheck
* Fix test_sdk_integration
* Improvements to SDK
* Remove SDK coverage for Python 3.9
* Remove SDK coverage for Python 3.9
* Remove SDK coverage for Python 3.9
2025-09-29 16:07:02 -07:00
Mohit Tilala
22a0925cd2
Fix correct snowflake object types in source url ( #23612 )
2025-09-29 15:31:10 +00:00
Keshav Mohta
4528c0c1c4
Fixes #23416 : Option To Opt Out of BigQuery Policy Tags Ingestion ( #23532 )
...
* fix: added includePolicyTags flag
* feat: added includePolicyTags
2025-09-29 18:24:10 +05:30
Mayur Singal
b489112bdd
MINOR: Fix import error log ( #23578 )
2025-09-29 12:47:19 +05:30
Keshav Mohta
94104e0806
fix: lineage flow and improved logging for databricks pipeline ( #23586 )
2025-09-26 18:22:01 +05:30
Eugenio
bb50514a00
Fixes #16983 : can't sample data from trino tables with complex types ( #23478 )
...
* Update test data for `tests.integration.trino`
This is to create tables with complex data types.
Using raw SQL because creating tables with pandas didn't get the right types for the structs
* Update tests to reproduce the issue
Also included the new tables in the other tests to make sure complex data types do not break anything else
Reference: [issue 16983](https://github.com/open-metadata/OpenMetadata/issues/16983)
* Added `TypeDecorator`s to handle `trino.types.NamedRowTuple`
This is because pydantic couldn't figure out how to create python objects when receiving `NamedRowTuple`s, which broke the sampling process.
This makes sure the data we receive from the trino interface is compatible with Pydantic
2025-09-26 08:13:28 +02:00
Suman Maharana
be51d53464
Fix - Hive Metastore None issue ( #23520 )
2025-09-24 10:11:29 +05:30
Mayur Singal
933802a354
MINOR: Support metabase API Key Auth ( #23436 )
2025-09-23 22:01:10 +05:30
Keshav Mohta
cb26c91442
Revert "Fixes #23356 : Databricks OAuth & Azure AD Auth ( #23482 )" ( #23530 )
...
This reverts commit f1afe8f5f114ee58090168fd7ae5d66b38a01ab0.
2025-09-23 17:44:16 +02:00
Teddy
57c5a50d20
ISSUE #23435 - Fix pass / fail count for custom SQL ( #23506 )
...
* fix: added logic to compute pass/fail for sql queries with cte, nested queries, and joins
* added logic to correctly compute pass / fail rows
* style: ran python linting
* fix: failing tests
* style: fix linting error
* fix: flawed count logic
* fix: handle case where we don't compute row count
2025-09-23 16:53:51 +02:00
Suman Maharana
79fde4ab02
Minor: improved dbt debug logs ( #23509 )
2025-09-23 19:52:58 +05:30
Ayush Shah
d94b39f6f5
fix(ssl): Update SSLManager to use dynamic schema registry paths ( #23505 )
2025-09-23 18:10:18 +05:30
Keshav Mohta
f1afe8f5f1
Fixes #23356 : Databricks OAuth & Azure AD Auth ( #23482 )
...
* feat: databricks oauth and azure ad auth setup
* refactor: add auth type changes in databricks.md
* fix: test after oauth changes
* refactor: unity catalog connection to databricks connection code
2025-09-23 15:22:50 +05:30
Suman Maharana
1c710ef5e3
Fix Stream logger url ( #23491 )
2025-09-23 14:35:14 +05:30
Keshav Mohta
9262040381
fix: handle database native types for create table request during openlineage lineage ( #23513 )
2025-09-23 10:11:39 +02:00
Suman Maharana
e2b903532e
Fixes - Kafkaconnect lineage & descriptions ( #23234 )
...
* Fix Kafkaconnect lineage & descriptions
* fix typos
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* address comments
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* address comms
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-09-23 10:08:37 +02:00
Pere Miquel Brull
49bdf1a112
MINOR - Report status for tests that blow up ( #23326 )
...
* MINOR - Report status for tests that blow up
* format
2025-09-22 16:34:36 +02:00
Mohit Tilala
d1e60acd2a
[SAP HANA] Prevent exponential processing lineage parsing and use full name for filtering ( #23484 )
...
* Prevent exponential processing lineage parsing
* Use full name of views for filtering
* pylint fix - isort
2025-09-22 19:46:34 +05:30
Keshav Mohta
1a67e4fb7d
Feature: MariaDB Stored Procedures and Functions Support #23422
2025-09-18 17:59:39 +05:30
Akash Verma
da5dab7fef
Fixes #23388 : Handle string and dict types for Metabase dataset_query field ( #23417 )
...
* Handle string and dict types for Metabase dataset_query field
* Added tests
---------
Co-authored-by: Akash Verma <akashverma@Mac.lan>
2025-09-16 16:57:08 -07:00
Mohit Tilala
61ed53f7b2
Handled none procedure_name in stored proc lineage processing ( #23408 )
2025-09-16 12:39:15 +05:30
Sriharsha Chintalapani
cf7931ee3b
Add logging endpoint into S3 ( #22533 )
...
* Add logging endpoint into S3
* Update generated TypeScript types
* Stream Ingestion logs to S3
* Update generated TypeScript types
* Address comments
* Update generated TypeScript types
* create logs mixin, use clients to stream logs
* centralize logs sending into mixin
* use StreamableLogHandlerManager instead global handler
* improve condition
* remove example workflow file
* formatting changes
* fix tests and format
* tests, checkstyle fix
* minor changes
* reformat code
* tests fix
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Aniket Katkar <aniketkatkar97@gmail.com>
Co-authored-by: harshsoni2024 <harshsoni2024@gmail.com>
Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2025-09-15 07:22:25 -07:00
Keshav Mohta
11a719e611
Fixes: Oracle Stored Packages Test Connection Step #23370
2025-09-12 17:38:34 +00:00
Keshav Mohta
1f379a8697
fix: added depth in json and pass in metadata entry ( #23332 )
2025-09-12 12:31:20 +05:30
Mohit Tilala
f9e866cd50
Fix incomplete trino view definition extraction ( #23349 )
2025-09-12 11:43:50 +05:30
Mayur Singal
38c707b0bc
MINOR: Fix column comment getting overridden in glue ( #23329 )
2025-09-11 17:29:23 +05:30
Mayur Singal
d705fffc1d
Fix #1968 : Query Runner Schema ( #23077 )
2025-09-11 10:41:11 +05:30
Teddy
f3cb001d2b
ISSUE #2033-C - Support For DBX Exporter + Minor Fix to Status ( #23313 )
...
* feat: added config support for databricks
* fix: allow incrementing record count directly without storing element
* Update generated TypeScript types
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-09-10 12:04:46 +02:00
Suman Maharana
39cb165164
Feat: show dbt project name ( #23044 )
...
* Feat: show dbt project name
* Update generated TypeScript types
* added dbtSourceProject in data asset header properties
* Added tests
* Addressed comments
* Update generated TypeScript types
* move from dataAssetHeader to the dbt tab itself
* added unit test for added code
* test name change
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Ashish Gupta <ashish@getcollate.io>
2025-09-10 11:23:28 +02:00
IceS2
8177e529bc
FIXES #23220 : Add cardinality metric for string and enum ( #23052 )
...
* Implement Cardinality Metric for String and Enum
* Add Unit Tests
* Update generated TypeScript types
* Update ingestion/src/metadata/profiler/metrics/hybrid/cardinality_distribution.py
Co-authored-by: Teddy <teddy.crepineau@gmail.com>
* Fix CTE to simplify it to work with sqlite
* Fix CTE to simplify it to work with sqlite
* Update generated TypeScript types
* Update generated TypeScript types
* Add 'cardinalityDistribution' metric to profiler configuration
* Update generated TypeScript types
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Teddy <teddy.crepineau@gmail.com>
Co-authored-by: Shailesh Parmar <shailesh.parmar.webdev@gmail.com>
2025-09-09 16:38:53 +02:00
NadezhdaNovotortseva
3c9b3cac48
Lineage dialect postgres added to greenplum ( #23291 )
...
Co-authored-by: Надежда Коцюба <nadezhda.kotsyuba@uni.rest>
2025-09-08 15:26:23 +02:00
Teddy
1ef191a2aa
ISSUE #1534 - Profiler Refactor for Metadata Extraction Application ( #23200 )
...
* feat: added exporter app config
* refactor: added entityprofile resource & added backward compatibility to existing API
* feat: added tests to get_profile_data_by_type
* feat: remove non supported event types
* chore: added migrations to 1.9.7
* chore: added application creation readme
* chore: move migrations to 1.9.8
* fix: failing java test
* style: ran java linting
2025-09-05 13:07:04 +02:00
Keshav Mohta
103857f90c
Fixes #23010 : BigQuery Project Selection In Profiler & AutoClassification Workflow ( #23233 )
...
* fix: added code for separate engine and session for each project in profiler and classification and refactor billing project approach
* fix: added entity.database check, bigquery sampling tests
* fix: system metrics logic when bigquery billing project is provided
2025-09-05 14:09:14 +05:30
Mohit Tilala
d926ed9dad
[Snowflake] Handle cases when stream source is not retrievable ( #23245 )
2025-09-05 00:27:31 +05:30
Mohit Tilala
9b2b4d2452
[Lineage] Fix cross services lineage changes of service_names to missed methods ( #23240 )
...
* Fix cross db changes of service_names to missed methods
* Handle string value passed to service_names
2025-09-04 20:38:05 +05:30