3502 Commits

Author SHA1 Message Date
Mohit Tilala
04a3639e47
Fixes #21895 #22363 #22369: Lineage improvements with multiprocessing, stored procedure level temp table processing and lineage filtering with db & schema (#22371)
* MINOR: Improve UDF Lineage Processing & Better Logging Time & MultiProcessing (#20848)

* Fix multiprocessing with better memory management and Airflow 2+ compatibility

* Add support for both multiprocessing and multithreading for relevant platforms

* Handle conflicting cross-db lineage changes of service_name parameter change

* Handle stored proc queries without caching all and increase the thread timeout times to cover 100% lineage

* Fix `get_table_query` inheritance and pylint

* Remove mocks from db_utils tests

* Better db_utils test and fix the service_names parameter in case of schema_fallback

---------

Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>
2025-09-03 11:26:14 +05:30
Suman Maharana
20e18d4f9f
Add ssl support to hive (#22831)
* Add ssl support to hive

* Added missing ts files

* Added version to pure transport

* Added Tests

* fix tests add missing files
2025-09-02 20:13:30 +05:30
Mayur Singal
08ee62a198
MINOR: Add Unstructured Formats Support to GCS Storage Connector (#23158) 2025-09-02 18:22:39 +05:30
Suman Maharana
30bceee580
Fixes #22204 - Add support for sources key metadata fetch in dbt (#23003)
* Added support for sources key metadata fetch in dbt

* address comments

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Fixes

* fixed tests

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-09-02 10:22:15 +05:30
Mayur Singal
df6bf20179
MINOR: Add support for clone queries in bigquery (#23155) 2025-09-01 10:20:09 +05:30
Pere Miquel Brull
abcdc4e3d6
MINOR - Domain Independent DP Rule (#23067)
* MINOR - Domain Independent DP Rule

* handle DP

* Handle DP

* add migration

* improve rule mgmt

* improve rule mgmt

* add test for bulk op

* fix test

* handle in bulk

---------

Co-authored-by: sonika-shah <58761340+sonika-shah@users.noreply.github.com>
2025-08-29 17:28:29 +02:00
Oliver Schlüter
4f6ba7b010
fix: pass additional_client_config_arguments to OpenMetadata correctly (#23143)
* fix: pass additional_client_config_arguments to OpenMetadata correctly

* added missing import

* fixed checkstyle errors
2025-08-29 11:53:36 +02:00
Ayush Shah
29bf750bde
Minor: Fix Databricks & Unity Catalog, Table or View Not found (#23014) 2025-08-29 07:46:23 +05:30
Roman Sheludko
3d2e86e102
Fixes 22994 (PR): Add List import to avoid errors (#23144) 2025-08-28 20:28:48 +02:00
IceS2
a696fe0111
FIXES #20807: Fix Oracle DataDiff and Change Oracle Connection to BaseConnection (#23020)
* Fix Oracle DataDiff and Change Oracle Connection to BaseConnection

* Add small unittest

* Fix Test

* Fix logic to avoid other engines denormalizing table/schema names
2025-08-26 11:03:40 +02:00
Mohit Tilala
744494968e
Fixes #22238: [SAP HANA] Add calculated view columns' formula parsing logic (#23017)
* Add calculated view columns' formula parsing logic with correct source reference

* Handle top level column formula parsing and pass formula expression in column lineage detail

---------

Co-authored-by: Suman Maharana <sumanmaharana786@gmail.com>
2025-08-26 07:19:11 +05:30
Mayur Singal
7a6d5cd2fb
MINOR: Improve UC lineage exception handling (#23081) 2025-08-25 10:15:51 +00:00
NadezhdaNovotortseva
24e5155ffd
fix: airflow ingest get_pipelines_list order_by added (#22909)
* fix: airflow ingest get_pipelines_list order_by added

* code reformatted

---------

Co-authored-by: Надежда Коцюба <nadezhda.kotsyuba@uni.rest>
Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>
2025-08-24 19:49:09 +00:00
Oliver Schlüter
b29deea3de
Follow-up to #22474: Expose retry_codes parameter in Airflow lineage backend configuration (#22994) 2025-08-24 19:44:27 +00:00
Keshav Mohta
2f655daedc
Fix #18491: ingestion fails for Iceberg tables with nested partition column (#23031)
* fix: ingestion fails for Iceberg tables with nested partition column

* test: added test to cover nested partition column for iceberg

* refactor: used if-else in tablePartition check

* fix: partition_column_name & column_partition_type typo
2025-08-22 17:25:59 +05:30
Keshav Mohta
697b318f75
Fixes: Snowflake Tags Ingestion (#23040)
* fix: snowflake tag ingestion error handling

* fix: python checkstyle
2025-08-22 17:25:39 +05:30
Mohit Yadav
c0d7a574d7
chore(release): Prepare Branch for 1.10.0-SNAPSHOT (#23034)
Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
2025-08-21 21:43:01 +05:30
Suman Maharana
582d5eb7d7
Update: Tableau e2e Tests (#23026) 2025-08-21 09:00:23 +05:30
Ayush Shah
f19c0be59e
Fixes #21677: Refactor and enhance the entity name transformation logic (#22695) 2025-08-21 08:43:33 +05:30
Ayush Shah
726fa89c80
Refactor: remove doc changes from OM repo (#22019) 2025-08-20 14:28:48 +05:30
Mohit Tilala
26fedbaf0e
Fixes #22112: Snowflake schema tags inheritance (#22979)
* Add schema-level tags and tag inheritance support for snowflake

* Add tests for schema tag inheritance

* Lint fixes
2025-08-20 09:52:44 +05:30
Mohit Tilala
cc4b357444
Fixes #22238: [SAP HANA] Correction of physical schema mapping and column lookup at each layer of calculation view (#22952) 2025-08-19 18:45:06 +05:30
Keshav Mohta
f39f57ddcd
Fix #22340: Execution Time Support for NiFi Connector (#22981)
* feat: added nifi execution history

* doc: added pipeline status in available features
2025-08-19 08:25:02 +00:00
Teddy
d58b8a63d6
ISSUE #1753 - Add Row Count to Custom SQL Test (#22697)
* feat: add count rows support for custom SQL

* style: ran python linting

* feat: added logic for partitioned custom sql row count

* migration: partitionExpression parameter

* chore: resolve conflicts
2025-08-19 06:40:49 +02:00
Ayush Shah
cda93b1af5
Fix formatting in OpenMetadata class docstring for clarity on field overwrite limitations (#22965) 2025-08-18 16:56:01 +05:30
Copilot
1d7c03ac9b
Add comprehensive documentation for entity field update limitations in OpenMetadata Python SDK (#22865) 2025-08-18 14:10:59 +05:30
harshsoni2024
a7afad466a
e2e tableau use pat creds (#22834) 2025-08-15 11:34:45 +05:30
Mayur Singal
e74de5df81
MINOR: Fix usage failure after cross lineage (#22926) 2025-08-13 16:30:29 +05:30
harshsoni2024
2ceffa3e5c
precede source table name before pbi table name (#22902) 2025-08-13 09:59:08 +05:30
Mayur Singal
23959bd21a
Fix #22164: Fix airflow compatibility issue with version 2.2.5 (#22897) 2025-08-12 18:30:43 +05:30
Copilot
8cc9d2af71
Add OpenAPI YAML format support for REST API ingestion (#22304)
* Initial plan

* Implement OpenAPI YAML support with backward JSON compatibility

Co-authored-by: harshach <38649+harshach@users.noreply.github.com>

* fix tests & lint

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: harshach <38649+harshach@users.noreply.github.com>
Co-authored-by: ulixius9 <mayursingal9@gmail.com>
2025-08-12 18:22:41 +05:30
Copilot
42bfd65a15
Fix typos in OpenMetadata documentation (#22899) 2025-08-12 17:27:40 +05:30
Mayur Singal
dc4c75ded4
FIX #21180: Implement Cross Service Lineage (#22665) 2025-08-12 16:54:40 +05:30
Dog
63bc1c5f5d
Fixes #18195: fix deprecated parameter (#18196)
* fix kafka deprecated module

* fix elasticsearch deprecated parameter

* fix trusted_certificates

* fix workflow.print_status

* fix workflow print status

* fix datetime to timestamp

* fix type of trusted_certificates

* style py format

* fix import path

* revert datetime to timestamp

* format

* Fix small issues

* Fix Avro Deserializer

---------

Co-authored-by: IceS2 <pjt1991@gmail.com>
2025-08-12 08:37:00 +02:00
harshsoni2024
fceb9f3c00
issue-22425: PowerBI parse expression along with measure (#22870) 2025-08-12 10:17:37 +05:30
Ayush Shah
4f82ab0557
MINOR: Enhance Sample Data with Owner and Descriptions (#22872)
* Enhance Sample Data Generation: Update table and column limits, add description and owner fields to table creation requests in sample_data.py

* Refactor SampleDataSource: Improve readability by adjusting conditional formatting for owner checks in sample_data.py

* Reduced number of tables per schema to 10

* Update sample_data.py: Reduce the maximum number of columns per table from 2000 to 200 for improved data generation efficiency
2025-08-12 10:10:01 +05:30
Sriharsha Chintalapani
15b92735b9
Fix #1093: Add Grafana Support (#22571)
* Fix #1093: Add Grafana Support

* Update generated TypeScript types

* Grafana test fix

* Update

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Akash Verma <akashverma@Mac.lan>
2025-08-11 19:39:39 +05:30
Ayush Shah
e0e7cf21d3
MINOR: Refactor BigQuery client type hinting in bigquery_utils.py (#22873)
* Refactor BigQuery client type hinting in bigquery_utils.py

- Updated type hint for the BigQuery client to use forward declaration for better compatibility with type checking.
- Moved import statement for google.cloud.bigquery inside TYPE_CHECKING block to optimize imports during runtime.

* Refactor BigQuery client import structure in bigquery_utils.py

- Moved the import statement for google.cloud.bigquery inside the TYPE_CHECKING block to enhance type hinting compatibility.
- Adjusted the import location for better runtime performance and adherence to best practices.
2025-08-11 17:54:08 +05:30
Sriharsha Chintalapani
c4d395d14d
fix logger level in sample_data.yaml (#22867) 2025-08-11 12:25:20 +05:30
harshsoni2024
42572747ed
MINOR: e2e fixes (#22787) 2025-08-06 21:11:02 +00:00
Aleksey Stefonyak
65928c149a
MINOR - fix(elasticsearch): add None value filter (#18711)
* fix(elasticsearch.py) - add None value filter

Sometimes elasticsearch returns lists with None values in them.
To fix this issue we need to filter them out first, before returning the most relevant to the endpoint.

* fix tests

---------

Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>
Co-authored-by: ulixius9 <mayursingal9@gmail.com>
2025-08-06 23:52:01 +05:30
mgorsk1
956a13f3f0
feat: Enable ES Search for Databases (#19041)
* 🎉 Init

* linter

* bring back removed code for api collection

* remove comment

* fix type hint

---------

Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>
2025-08-06 23:51:09 +05:30
Ariel Schulz
d31e2d8ba0
Feature/1 fix and add lineage to exasol connector (#21399)
* Add lineage to Exasol connector

* Update test_connection to return TestConnectionResult

* Add exasol tests & dependencies to tests in setup.py

* Opensearch is required for testing, so add it there

* Modify metadata

* Update documentation for lineage

* Apply formatting changes to code

* Apply make py_format
2025-08-06 23:49:38 +05:30
Akash Verma
1cc1abcd27
feature: googlesheet connector (#22464)
* feature: googlesheet connector

* updates

* minor change

* java checkstyle

* googledrive files

* fix

* make the google sheets work

* fix directory and files

* remove ui

* update generated types

---------

Co-authored-by: Akash Verma <akashverma@Akashs-MacBook-Pro-2.local>
Co-authored-by: sonika-shah <58761340+sonika-shah@users.noreply.github.com>
Co-authored-by: Akash Verma <akashverma@Mac.lan>
Co-authored-by: ulixius9 <mayursingal9@gmail.com>
2025-08-06 22:01:41 +05:30
Mayur Singal
00b6da5b84
MINOR: Improve Databricks Profiler & Test Connection (#22732) 2025-08-06 00:41:11 +05:30
Ayush Shah
c68ea8c83f
Enhance Ingestion Framework: Add Drive Service support and improve logging for User Profiles (#22733) 2025-08-05 18:17:55 +05:30
Mohit Yadav
b92e9d0e06
chore(release): Prepare Branch for 1.9.0-SNAPSHOT (#22742)
Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
2025-08-04 20:00:25 +05:30
harshsoni2024
1deb5adeb5
MINOR: fix e2e tests (#22723) 2025-08-04 19:00:56 +05:30
Mayur Singal
b74e181d52
MINOR: Improve Unity Catalog Usage (#22721) 2025-08-04 11:04:10 +05:30
Mayur Singal
fe28faa13f
MINOR: Add support for csv.gz in datalake (#22666)
* MINOR: Add support for csv.gz in datalake

* fileformat change

* Update generated TypeScript types

* pyformat
2025-08-01 17:39:19 +05:30