* Add assets API and deprecate inline assets field for Domain and DataProduct
* fix mvn test
* fix py test and add new tests
* fix py test
* fix py test
* fix timeout for workflow test
* address pr feedback
* Update generated TypeScript types
* minor: remove unused function
---------
Co-authored-by: Bhanu Agrawal <bhanuagrawal2018@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* Initial plan
* Fix Grafana connector format field validation issue
- Update GrafanaTarget.format field to accept both str and int types
- Add field_validator to convert integer format codes to string equivalents
- Add comprehensive tests for format field validation scenarios
- Add test fixture with integer format fields that reproduces the original issue
- Ensure backwards compatibility with existing string format values
This resolves the issue where Grafana dashboards with integer format fields
(e.g., format: 0 instead of format: "table") were causing validation errors
and being skipped during ingestion.
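
The normalization described above can be sketched roughly as follows. This is a standard-library stand-in for the Pydantic `field_validator` named in the commit, not the connector's code: the class name `TargetSketch` is hypothetical, and converting integers with `str()` is an assumption, since the actual mapping from integer codes to format names is not given here.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class TargetSketch:
    """Hypothetical stand-in for a Grafana target model."""

    # Accept both string and integer values for `format`.
    format: Optional[Union[str, int]] = None

    def __post_init__(self) -> None:
        # Assumption: integers are simply stringified. bool is a subclass
        # of int, so it is excluded explicitly.
        if isinstance(self.format, int) and not isinstance(self.format, bool):
            self.format = str(self.format)
```

Existing string values pass through unchanged, which is the backwards-compatibility property the commit calls out.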
Co-authored-by: ulixius9 <39544459+ulixius9@users.noreply.github.com>
* fix: GrafanaTarget model format type from str to Any
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: ulixius9 <39544459+ulixius9@users.noreply.github.com>
Co-authored-by: Sriharsha Chintalapani <harshach@users.noreply.github.com>
Co-authored-by: Keshav Mohta <keshavmohta09@gmail.com>
* Kafka Connect Lineage Improvements
* Remove specific Kafka topic example from docstring
Removed example from the documentation regarding the earnin.bank.dev topic.
* fix: update comment to reflect accurate example for database server name handling
* fix: improve expected FQN display in warning messages for missing Kafka topics
* fix: update table entity retrieval method in KafkaconnectSource
* fix: enhance lineage information checks and improve logging for missing configurations in KafkaconnectSource
* Kafka Connect Lineage Improvements
* address comments; work without the table.include.list
---------
Co-authored-by: Ayush Shah <ayush@getcollate.io>
* chore: implement logger levels tests for deprecation
* fix: use METADATA_LOGGER instead of warnings
* use unit test syntax
* isort
* black
* fix test
---------
Co-authored-by: Sriharsha Chintalapani <harshach@users.noreply.github.com>
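
A minimal sketch of the idea in the deprecation commits above: emit the deprecation notice through a named logger instead of the `warnings` module, and assert on its level with `unittest`. The logger name `metadata_sketch` and the helper `deprecation_warning_sketch` are hypothetical and only stand in for whatever `METADATA_LOGGER` and the real helper look like.

```python
import logging
import unittest

# Assumption: illustrative logger name, standing in for METADATA_LOGGER.
SKETCH_LOGGER = "metadata_sketch"


def deprecation_warning_sketch(message: str, release: str) -> None:
    """Log a deprecation notice at WARNING level on the named logger."""
    logging.getLogger(SKETCH_LOGGER).warning("[%s] %s", release, message)


class DeprecationLogTest(unittest.TestCase):
    def test_logs_at_warning_level(self) -> None:
        # assertLogs fails the test if nothing is logged at WARNING or above.
        with self.assertLogs(SKETCH_LOGGER, level="WARNING") as captured:
            deprecation_warning_sketch("old_option is deprecated", "1.0")
        self.assertEqual(len(captured.records), 1)
        self.assertEqual(captured.records[0].levelname, "WARNING")
        self.assertIn("old_option", captured.output[0])
```

Logging through a named logger means the notice follows the workflow's configured log level, which is the behavior the level tests are meant to pin down.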
* Update `TableDiffParamsSetter` to move data at table level
This means that `key_columns` and `extra_columns` will be defined per table instead of "globally", just like `data_diff` expects
* Update `TableDiffValidator` to use table's `key_columns`
Call `data_diff` and run validations using each table's `key_columns`
* Create migration to update `tableDiff` test definition
* Fix Playwright test
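
The per-table layout introduced by the `TableDiffParamsSetter` change above might look roughly like this. All class and function names are hypothetical, and the positional pairing of key columns is an assumption made for illustration, not the validator's actual logic.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class TableParamsSketch:
    """Hypothetical per-table parameters: each table carries its own columns."""

    path: str
    key_columns: List[str]
    extra_columns: List[str] = field(default_factory=list)


@dataclass
class DiffParamsSketch:
    """Hypothetical diff parameters holding one entry per compared table."""

    table1: TableParamsSketch
    table2: TableParamsSketch


def key_column_pairs(params: DiffParamsSketch) -> List[Tuple[str, str]]:
    """Pair key columns positionally (assumption) and reject mismatched counts."""
    left, right = params.table1.key_columns, params.table2.key_columns
    if len(left) != len(right):
        raise ValueError("both tables need the same number of key columns")
    return list(zip(left, right))
```

Keeping `key_columns` on each table lets the two sides use different column names for the same key, which a single global list cannot express.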
* Refactor presidio utils
Extract the spacy model functionality from the analyzer building function
* Added a new `TagClassifier`
This classifier uses tags to dynamically build presidio `RecognizerRegistry`s
* Added a new `TagProcessor`
This processor uses `TagClassifier` to label a column based on the tags' recognizers
* Create `TagProcessor` based on workflow configuration
* Create decorator to apply threshold to recognizers
This is so that we can apply thresholds on recognizer results without subclassing or having to keep a map between the presidio recognizer and the recognizer configuration
* Fix broken test
* Add `reason` property to `TagLabel`
This is to understand what score was used for selecting the entity
* Build `TagLabel`s with `reason`
* Increase `PIIProcessor._tolerance`
This is so we correctly filter out low scores from classifiers while still maintaining the normalization that filters out confusing outcomes.
e.g., an output with scores 0.3, 0.7 and 0.75 would initially filter out the 0.3 and then discard the other two because they're both relatively high results.
* Make database and DAO changes needed to persist `TagLabel.reason`
* Update generated TypeScript types
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
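
The threshold decorator mentioned in this change set could be sketched along these lines. This is an illustration only: it decorates a plain function rather than a presidio recognizer, the names `ResultSketch`, `with_threshold` and `analyze_sketch` are hypothetical, and whether the original wraps a class or a function is not stated here.

```python
from dataclasses import dataclass
from functools import wraps
from typing import Callable, List


@dataclass(frozen=True)
class ResultSketch:
    """Hypothetical recognizer result: an entity type and a confidence score."""

    entity_type: str
    score: float


def with_threshold(threshold: float) -> Callable:
    """Wrap an analyze function so results scoring below `threshold` are dropped."""

    def decorator(analyze: Callable[..., List[ResultSketch]]) -> Callable:
        @wraps(analyze)
        def wrapper(*args, **kwargs) -> List[ResultSketch]:
            return [r for r in analyze(*args, **kwargs) if r.score >= threshold]

        return wrapper

    return decorator


@with_threshold(0.6)
def analyze_sketch(text: str) -> List[ResultSketch]:
    # Fixed results, for illustration only.
    return [ResultSketch("EMAIL", 0.9), ResultSketch("PHONE", 0.4)]
```

Because the threshold travels with the wrapped callable, no subclass and no separate lookup from recognizer to configuration is needed.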
* Add support for translations in multi lang
* Add Tag Feedback System
* Update generated TypeScript types
* Fix typing issues and add tests to recognizer factory
* Updated `TagResourceTest.assertFieldChange` to fix broken test
This is because change description values had been serialized into strings and, for some reason, the keys ended up in a different order. So instead of performing a string comparison, we compare them as JSON.
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Eugenio Doñaque <eugenio.donaque@getcollate.io>
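
The reason for the `TagResourceTest.assertFieldChange` fix above can be shown with a small sketch, written in Python here rather than the Java of the actual test. The helper name `json_equal` and the sample payloads are hypothetical.

```python
import json


def json_equal(left: str, right: str) -> bool:
    """Compare two serialized JSON values by content, ignoring object key order."""
    return json.loads(left) == json.loads(right)


# Same content, different key order: unequal as strings, equal as JSON.
a = '{"name": "tag", "score": 1}'
b = '{"score": 1, "name": "tag"}'
```

With these values `a == b` is false while `json_equal(a, b)` is true, which is why comparing parsed JSON makes the assertion insensitive to key order.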
* Update test data for `tests.integration.trino`
This is to create tables with complex data types.
Using raw SQL because creating tables with pandas didn't get the right types for the structs
* Update tests to reproduce the issue
Also included the new tables in the other tests to make sure complex data types do not break anything else
Reference: [issue 16983](https://github.com/open-metadata/OpenMetadata/issues/16983)
* Added `TypeDecorator`s to handle `trino.types.NamedRowTuple`
This is because pydantic couldn't figure out how to create python objects when receiving `NamedRowTuple`s, which broke the sampling process.
This makes sure the data we receive from the trino interface is compatible with Pydantic
* feat: databricks oauth and azure ad auth setup
* refactor: add auth type changes in databricks.md
* fix: test after oauth changes
* refactor: unity catalog connection to databricks connection code
* Feat: show dbt project name
* Update generated TypeScript types
* added dbtSourceProject in data asset header properties
* Added tests
* Addressed comments
* Update generated TypeScript types
* move from dataAssetHeader to the dbt tab itself
* added unit test for added code
* test name change
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Ashish Gupta <ashish@getcollate.io>
* fix: added code for separate engine and session for each project in profiler and classification and refactor billing project approach
* fix: added entity.database check, bigquery sampling tests
* fix: system metrics logic when bigquery billing project is provided