OpenMetadata

mirror of https://github.com/open-metadata/OpenMetadata.git synced 2025-10-29 17:49:14 +00:00

Author	SHA1	Message	Date
mmigdiso	64d468188e	Fixes 23881: Added native query lineage extraction for powerbi-databricks (#23882 ) * Added native query lineage extraction for powerbi-databricks * improved error handling and logging * checkstyle fix --------- Co-authored-by: m.migdisoglu <m.migdisoglu@criteo.com> Co-authored-by: Sriharsha Chintalapani <harshach@users.noreply.github.com>	2025-10-16 15:21:55 +05:30
sonika-shah	303ee47d6f	Add assets API and deprecate inline assets field for Domain and Dataproduct (#23856 ) * Add assets API and deprecate inline assets field for Domain and Dataproduct * fix mvn test * fix py test and add new tests * fix py test * fix py test * fix timeout for workflow test * address pr feedback * Update generated TypeScript types * minor- remove unused function --------- Co-authored-by: Bhanu Agrawal <bhanuagrawal2018@gmail.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-10-16 05:23:05 +05:30
Mayur Singal	3c527ca83b	MINOR: Fix Databricks DLT Pipeline Lineage to Track Table (#23888 ) * MINOR: Fix Databricks DLT Pipeline Lineage to Track Table * fix tests * add support for s3 pipeline lineage as well	2025-10-15 10:54:01 +02:00
Akash Verma	9b16119ab5	feat: Add Hex dashboard connector support (#23246 ) * feat: Add Hex dashboard connector support * files * Added tests and UI image * fix tests --------- Co-authored-by: Akash Verma <akashverma@Mac.lan>	2025-10-15 11:05:42 +05:30
Mohit Tilala	09c851265e	[Redshift] Add better handling of incomplete redshift view definition (#23866 ) * Add better handling of incomplete redshift view definition * Match exact definitions in tests * Correct isort on tests	2025-10-14 12:51:07 +05:30
Keshav Mohta	50dbe6fe44	fix: view_names issue when incremental enabled (#23858 )	2025-10-13 19:21:07 +05:30
Mayur Singal	a638bdcfe0	MINOR: Fix databricks pipeline repeating tasks issue (#23851 )	2025-10-13 00:41:05 +05:30
Copilot	c8722faf47	Fix Grafana connector validation error for integer format fields (#23202 ) * Initial plan * Fix Grafana connector format field validation issue - Update GrafanaTarget.format field to accept both str and int types - Add field_validator to convert integer format codes to string equivalents - Add comprehensive tests for format field validation scenarios - Add test fixture with integer format fields that reproduces the original issue - Ensure backwards compatibility with existing string format values This resolves the issue where Grafana dashboards with integer format fields (e.g., format: 0 instead of format: "table") were causing validation errors and being skipped during ingestion. Co-authored-by: ulixius9 <39544459+ulixius9@users.noreply.github.com> * fix: GrafanaTarget model format type from str to Any --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: ulixius9 <39544459+ulixius9@users.noreply.github.com> Co-authored-by: Sriharsha Chintalapani <harshach@users.noreply.github.com> Co-authored-by: Keshav Mohta <keshavmohta09@gmail.com>	2025-10-12 23:14:16 +05:30
Sriharsha Chintalapani	ce3a9bd654	Kafka connect improvements (#23845 ) * Kafka Connect Lineage Improvements * Remove specific Kafka topic example from docstring Removed example from the documentation regarding the earnin.bank.dev topic. * fix: update comment to reflect accurate example for database server name handling * fix: improve expected FQN display in warning messages for missing Kafka topics * fix: update table entity retrieval method in KafkaconnectSource * fix: enhance lineage information checks and improve logging for missing configurations in KafkaconnectSource * Kafka Connect Lineage Improvements * address comments; work without the table.include.list --------- Co-authored-by: Ayush Shah <ayush@getcollate.io>	2025-10-11 22:26:14 +02:00
Sriharsha Chintalapani	5c638f5c8e	Databricks DLT pipelines parsing (#23848 )	2025-10-11 22:25:43 +02:00
Ayush Shah	a90cacc93b	MINOR: fix Kafka connect CDC lineage (#23836 )	2025-10-11 15:40:03 +05:30
Teddy	1f8cf64dd4	chore: added python 3.12 to CI (#23835 ) * chore: added python 3.12 to CI * chore: changed py-test-skip to 3.12	2025-10-10 17:26:45 +02:00
Teddy	93e5ee8cb1	fix: url encode fqn when retrieving test case results in python sdk (#23834 )	2025-10-10 17:25:33 +02:00
Mayur Singal	88115e1218	MINOR: Fix trailing / issue in UC S3 lineage (#23816 )	2025-10-09 18:44:07 +02:00
Antoine Balliet	be3a91f7df	fix: logger level should work for deprecation warnings (#23784 ) * chore: implement logger levels tests for deprecation * fix: use METADATA_LOGGER instead of warnings * use unit test syntax * isort * black * fix test --------- Co-authored-by: Sriharsha Chintalapani <harshach@users.noreply.github.com>	2025-10-09 18:21:28 +02:00
Mayur Singal	05f064787f	Feat: Add kafka lineage support in databricks pipelines (#23813 ) * Add dlt pipeline support * Fix code style * Add variable parsing * Fix kafka lineage --------- Co-authored-by: Sriharsha Chintalapani <harsha@getcollate.io>	2025-10-09 16:42:08 +02:00
Sriharsha Chintalapani	454d7367b0	Kafka Connect: Support Confluent Cloud connectors (#23780 )	2025-10-09 01:28:27 +05:30
Mayur Singal	4708c2b64f	feat: Unity Catalog Lineage Enhancement: External Location Support (#23790 )	2025-10-08 20:26:39 +05:30
harshsoni2024	f2819ce4e4	Fix: PowerBI snowflake query lineage parsing (#23746 )	2025-10-08 18:32:25 +05:30
Eugenio	af0672e4cf	Fixes #22302 : add `table2.keyColumns` parameter for table diff validation (#23667 ) * Update `TableDiffParamsSetter` to move data at table level This means that `key_columns` and `extra_columns` will be defined per table instead of "globally", just like `data_diff` expects * Update `TableDiffValidator` to use table's `key_columns` Call `data_diff` and run validations using each table's `key_columns` * Create migration to update `tableDiff` test definition * Fix Playwright test	2025-10-08 09:32:00 +02:00
harshsoni2024	da7a2778f6	MINOR: iceberg load table retry backoff (#23579 )	2025-10-05 23:42:56 +05:30
Sriharsha Chintalapani	fc7412f6dd	Add Timescale Connector (#23665 ) * Add Timescale Connector * Update generated TypeScript types * Add UI changes for the Timescale * lineage, usage and java * Add beta tag * update logo --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Aniket Katkar <aniketkatkar97@gmail.com> Co-authored-by: Akash Verma <akashverma@Mac.lan>	2025-10-03 19:00:59 -07:00
Keshav Mohta	3d49b6689d	Fixes #23356 : Databricks & UnityCatalog OAuth and Azure AD Auth (#23561 ) * feat: databricks oauth and azure ad auth setup * refactor: add auth type changes in databricks.md * fix: test after oauth changes * refactor: unity catalog connection to databricks connection code * feat: added oauth and azure ad for unity catalog * fix: unitycatalog tests, doc & required type in connection.json * fix: generated tx files * fix: exporter databricksConnection file * refactor: unitycatalog example file * fix: usage example files * fix: unity catalog sqlalchemy connection * fix: unity catalog client headers * refactor: make common auth.py for dbx and unitycatalog * fix: auth functions import * fix: test unity catalog tags as None * fix: type hinting and sql migration * fix: migration for postgres	2025-10-03 19:53:19 +05:30
harshsoni2024	ea54b6b883	MINOR: datalake column subfields fix (#23576 )	2025-10-03 16:13:10 +05:30
Akash Verma	06453a925d	Fix #21093 : Update test connection improvements (#23516 ) * Update test connection improvements * Update queries * checkstyle * fix test failure --------- Co-authored-by: Akash Verma <akashverma@Akashs-MacBook-Pro-2.local>	2025-10-03 13:50:46 +05:30
Suman Maharana	c8055576ba	Fixes #21686 : Add missing includeOwners check in dashboard services (#22514 )	2025-10-03 10:53:25 +05:30
Keshav Mohta	48ff77c917	Fixes: MF4 Import Error (#23659 ) * fix: asammdf and avro import error * fix: mf4 import only * test: fix mf4 test	2025-10-01 20:08:45 +05:30
Eugenio	5da2d32b34	Use recognizer in classification (#23628 ) * Refactor presidio utils Extract the spacy model functionality from the analyzer building function * Added a new `TagClassifier` This classifier uses tags to dynamically build presidio `RecognizerRegistry`s * Added a new `TagProcessor` This processor uses `TagClassifier` to label a column based on the tags' recognizers * Create `TagProcessor` based on workflow configuration * Create decorator to apply threshold to recognizers This is so that we can apply thresholds on recognizer results without subclassing or having to keep a map between the presidio recognizer and the recognizer configuration * Fix broken test	2025-10-01 14:43:28 +02:00
Eugenio	dff2b394d5	Fix classification scoring (#23523 ) * Add `reason` property to `TagLabel` This is to understand what score was used for selecting the entity * Build `TagLabel`s with `reason` * Increase `PIIProcessor._tolerance` This is so we correctly filter out low scores from classifiers while still maintaining the normalization that filters out confusing outcomes. e.g: an output with scores 0.3, 0.7 and 0.75, would initially filter the 0.3 and then discard the other two because they're both relatively high results. * Make database and DAO changes needed to persist `TagLabel.reason` * Update generated TypeScript types --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-10-01 12:11:14 +00:00
Keshav Mohta	6b7262a8ea	Feature: MF4 File Reader (#23308 ) * feat: mf4 file reader * refactor: removed schema_from_data implementation * test: added tests for mf4 files	2025-10-01 11:19:00 +02:00
Ayush Shah	dd99ab5678	feat: Add Unity Catalog data diff module to use DBX connection instead of workspaceclient (#23404 )	2025-09-30 20:56:54 +05:30
Sriharsha Chintalapani	18677afd39	Add support for Tags customizable rules, capturing feedback (#23289 ) * Add support for translations in multi lang * Add Tag Feedback System * Update generated TypeScript types * Fix typing issues and add tests to recognizer factory * Updated `TagResourceTest.assertFieldChange` to fix broken test This is because change description values had been serialized into strings and for some reason the keys ended up in a different order. So instead of performing String comparison, we do Json comparisons --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Eugenio Doñaque <eugenio.donaque@getcollate.io>	2025-09-30 07:17:18 +02:00
Sriharsha Chintalapani	bb1395fc72	Implement Modern Fluent API Pattern for OpenMetadata Java Client (#23239 ) * Implement Modern Fluent API Pattern for OpenMetadata Java Client * Add Lineage, Bulk, Search static methods * Add all API support for Java & Python SDKs * Add Python SDKs and mock tests * Add Fluent APIs for sdks * Add Fluent APIs for sdks * Add Fluent APIs for sdks, support async import/export * Remove unnecessary scripts * fix py checkstyle * fix tests with new plural form sdks * Fix tests * remove examples from python sdk * remove examples from python sdk * Fix type check * Fix pyformat check * Fix pyformat check * fix python integration tests * fix pycheck and pytests * fix search api pycheck * fix pycheck * fix pycheck * fix pycheck * Fix test_sdk_integration * Improvements to SDK * Remove SDK coverage for Python 3.9 * Remove SDK coverage for Python 3.9 * Remove SDK coverage for Python 3.9	2025-09-29 16:07:02 -07:00
Mohit Tilala	22a0925cd2	Fix correct snowflake object types in source url (#23612 )	2025-09-29 15:31:10 +00:00
Eugenio	bb50514a00	Fixes #16983 : can't sample data from trino tables with complex types (#23478 ) * Update test data for `tests.integration.trino` This is to create tables with complex data types. Using raw SQL because creating tables with pandas didn't get the right types for the structs * Update tests to reproduce the issue Also included the new tables in the other tests to make sure complex data types do not break anything else Reference: [issue 16983](https://github.com/open-metadata/OpenMetadata/issues/16983) * Added `TypeDecorator`s to handle `trino.types.NamedRowTuple` This is because pydantic couldn't figure out how to create python objects when receiving `NamedRowTuple`s, which broke the sampling process. This makes sure the data we receive from the trino interface is compatible with Pydantic	2025-09-26 08:13:28 +02:00
Keshav Mohta	cb26c91442	Revert "Fixes #23356 : Databricks OAuth & Azure AD Auth (#23482 )" (#23530 ) This reverts commit f1afe8f5f114ee58090168fd7ae5d66b38a01ab0.	2025-09-23 17:44:16 +02:00
Teddy	57c5a50d20	ISSUE #23435 - Fix pass / fail count for custom SQL (#23506 ) * fix: added logic to compute pass/fail for sql queries with cte, nested queries, and joins * added logic to correctly compute pass / fail rows * style: ran python linting * fix: failing tests * style: fix linting error * fix: flawed count logic * fix: handle case where we don't compute row count	2025-09-23 16:53:51 +02:00
Ayush Shah	d94b39f6f5	fix(ssl): Update SSLManager to use dynamic schema registry paths (#23505 )	2025-09-23 18:10:18 +05:30
Keshav Mohta	f1afe8f5f1	Fixes #23356 : Databricks OAuth & Azure AD Auth (#23482 ) * feat: databricks oauth and azure ad auth setup * refactor: add auth type changes in databricks.md * fix: test after oauth changes * refactor: unity catalog connection to databricks connection code	2025-09-23 15:22:50 +05:30
Keshav Mohta	9262040381	fix: handle database native types for create table request during openlineage lineage (#23513 )	2025-09-23 10:11:39 +02:00
Suman Maharana	e2b903532e	Fixes - Kafkaconnect lineage & descriptions (#23234 ) * Fix Kafkaconnect lineage & descriptions * fix typos Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * address comments Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * address comms --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-09-23 10:08:37 +02:00
Mohit Tilala	d1e60acd2a	[SAP HANA] Prevent exponential processing lineage parsing and use full name for filtering (#23484 ) * Prevent exponential processing lineage parsing * Use full name of views for filtering * pylint fix - isort	2025-09-22 19:46:34 +05:30
Keshav Mohta	1a67e4fb7d	Feature: MariaDB Stored Procedures and Functions Support #23422	2025-09-18 17:59:39 +05:30
Akash Verma	da5dab7fef	Fixes #23388 : Handle string and dict types for Metabase dataset_query field (#23417 ) * Handle string and dict types for Metabase dataset_query field * Added tests --------- Co-authored-by: Akash Verma <akashverma@Mac.lan>	2025-09-16 16:57:08 -07:00
Sriharsha Chintalapani	cf7931ee3b	Add logging endpoint into S3 (#22533 ) * Add logging endpoint into S3 * Update generated TypeScript types * Stream Ingestion logs to S3 * Update generated TypeScript types * Address comments * Update generated TypeScript types * create logs mixin, use clients to stream logs * centralize logs sending into mixin * use StreamableLogHandlerManager instead global handler * improve condition * remove example workflow file * formatting changes * fix tests and format * tests, checkstyle fix * minor changes * reformat code * tests fix --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Aniket Katkar <aniketkatkar97@gmail.com> Co-authored-by: harshsoni2024 <harshsoni2024@gmail.com> Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>	2025-09-15 07:22:25 -07:00
Suman Maharana	39cb165164	Feat: show dbt project name (#23044 ) * Feat: show dbt project name * Update generated TypeScript types * added dbtSourceProject in data asset header properties * Added tests * Addressed comments * Update generated TypeScript types * move from dataAssetHeader to the dbt tab itself * added unit test for added code * test name change --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Ashish Gupta <ashish@getcollate.io>	2025-09-10 11:23:28 +02:00
Suman Maharana	000aaa63f1	Fix tableau e2e count error (#23287 )	2025-09-10 01:52:34 +05:30
IceS2	8177e529bc	FIXES #23220 : Add cardinality metric for string and enum (#23052 ) * Implement Cardinality Metric for String and Enum * Add Unit Tests * Update generated TypeScript types * Update ingestion/src/metadata/profiler/metrics/hybrid/cardinality_distribution.py Co-authored-by: Teddy <teddy.crepineau@gmail.com> * Fix CTE to simplify it to work with sqlite * Fix CTE to simplify it to work with sqlite * Update generated TypeScript types * Update generated TypeScript types * Add 'cardinalityDistribution' metric to profiler configuration * Update generated TypeScript types --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Teddy <teddy.crepineau@gmail.com> Co-authored-by: Shailesh Parmar <shailesh.parmar.webdev@gmail.com>	2025-09-09 16:38:53 +02:00
Teddy	1ef191a2aa	ISSUE #1534 - Profiler Refactor for Metadata Extraction Application (#23200 ) * feat: added exporter app config * refactor: added entityprofile resource & added backward compatibility to existing API * feat: added tests to get_profile_data_by_type * feat: remove non supported event types * chore: added migrations to 1.9.7 * chore: added application creation readme * chore: move migrations to 1.9.8 * fix: failing java test * style: ran java linting	2025-09-05 13:07:04 +02:00
Keshav Mohta	103857f90c	Fixes #23010 : BigQuery Project Selection In Profiler & AutoClassification Workflow (#23233 ) * fix: added code for separate engine and session for each project in profiler and classification and refactor billing project approach * fix: added entity.database check, bigquery sampling tests * fix: system metrics logic when bigquery billing project is provided	2025-09-05 14:09:14 +05:30


1275 Commits