284 Commits

Author SHA1 Message Date
Teddy
141ceb4c8d
MINOR add common test elements to _openmetadata_testutils module (#16758)
* fix: add common test to _testutils module

* fix: renamed _testutils to _openmetadata_testutils
2024-06-21 15:11:34 +02:00
Suman Maharana
5bd48fcc34
Fixes #14065 : Added DBT Cloud connector (#16705) 2024-06-21 17:16:47 +05:30
Ayush Shah
c9a017d8db
#16720: Add Support for Salesforce SSL (#16719) 2024-06-20 12:10:41 +05:30
IceS2
f0049853ec
FIXES 14885: Initial deltalake implementation for s3 (#16665)
* Initial deltalake implementation for s3

* Fix styles

* Fix test_amundsen

* Fix UnitTests

* Fix Checkstyle

* Fix integration tests due to datalake client refactor

* Fix unit tests

* Fix tests

* Fix Integration DeltaLake Storage test

* Skip delta storage integration test for python 3.8

* DeltaLake JSONSchema changes migrations

* Update import name

* Add some comments based on sonarcloud suggestions

* Update DeltaLake documentation

* Resolve some comments
2024-06-20 12:08:21 +05:30
Mayur Singal
57e51df05f
MINOR: Fix superset cypress error (#16689) 2024-06-18 11:36:51 +05:30
Pere Miquel Brull
cb72a22b59
Fix - e2e tests for pydantic V2 (#16551)
* Fix - e2e tests for pydantic V2

* add correct default

* add correct default

* revert datetime aware

* revert datetime aware

* revert datetime aware

* revert datetime aware

* revert datetime aware

* revert datetime aware

* revert datetime aware

* revert datetime aware

* fix apis

* format
2024-06-06 19:36:17 -07:00
Pere Miquel Brull
d8e2187980
#15243 - Pydantic V2 & Airflow 2.9 (#16480)
* pydantic v2

* pydanticv2

* fix parser

* fix annotated

* fix model dumping

* mysql ingestion

* clean root models

* clean root models

* bump airflow

* bump airflow

* bump airflow

* optionals

* optionals

* optionals

* jdk

* airflow migrate

* fab provider

* fab provider

* fab provider

* some more fixes

* fixing tests and imports

* model_dump and model_validate

* model_dump and model_validate

* model_dump and model_validate

* union

* pylint

* pylint

* integration tests

* fix CostAnalysisReportData

* integration tests

* tests

* missing defaults

* missing defaults
2024-06-05 21:18:37 +02:00
Suman Maharana
bd3f47a563
MINOR - Added quicksight pydantic models (#16269)
* Added quicksight pydantic models

* pyformat

* resolved type hints

* Renamed sheet -> chart in models
2024-05-17 08:40:20 +02:00
harshsoni2024
d6046c811c
MINOR: superset pytest fix (#16299)
* superset testcontainer image tag fix

* update chart data acc. to runtime
2024-05-17 07:27:36 +02:00
Pere Miquel Brull
53185fd30b
MINOR - Add Integration Test for S3 Storage (#16277)
* MINOR - Add Integration Test for S3 Storage

* MINOR - Add Integration Test for S3 Storage

* MINOR - Add Integration Test for S3 Storage

* format

* format
2024-05-16 10:03:27 +02:00
Suman Maharana
0e2736ee74
MINOR: Removed supportsDDL from json schemas (#16171) 2024-05-10 17:40:12 +05:30
Suman Maharana
488078da8a
Add DDL query ingest (#15860) 2024-05-06 18:03:50 +05:30
harshsoni2024
68e036418c
Fix #15719: Improve unit test to increase coverage. (#15905)
* issue-15719: unit test for superset db source

* issue-15719: use testcontainers for superset_api client test

* issue-15719: superset-api yield data changes

* fix failed test cases due to testcontainer version

* issue-15719: postgres container version fix

* issue-15719: setup & teardown with testcontainers

* issue-15719: remove more patch code
2024-04-29 08:00:39 +02:00
Ayush Shah
3621407642
Fixes #15732: Modify Reference for Tags to EntityName (#15938) 2024-04-25 11:53:46 +05:30
IceS2
cb801dedb4
FIXES 13209: Add Sagemaker Model Storage (#15986)
* Add Sagemaker Model Storage

* Fix checkstyle

* Sagemaker unittest

* Small refactor to be less verbose
2024-04-22 16:53:25 +02:00
Imri Paran
0a1018648c
Fixes #15566: add dynamodb row count (#15204)
* feat(nosql-profiler): row count

1. Implemented the NoSQLProfilerInterface as an entrypoint for the nosql profiler.
2. Added the NoSQLMetric as an abstract class.
3. Implemented the interface for the MongoDB database source.
4. Implemented an e2e test using testcontainers.

* added profiler support for mongodb connection

* doc

* use int_admin_ometa in test setup

* - fixed linting issue in gx
- removed unused inheritance

* moved the nosql function into the metric class

* feat(profiler): add dynamodb row count

* feat(profiler): add dynamodb row count

* formatting

* fixed import

* format

* added dynamodb row count

* format

* removed unused factory file

* removed "validate"

* migrations

* removed validations

* format

* linting

* fixed: test_amundsen.py

* Update schemaChanges.sql
2024-04-22 09:14:52 +02:00
Ayush Shah
d5b1465406
Fixes #14113 - Allow SSL file uploads (#15828) 2024-04-19 11:38:27 +05:30
Suman Maharana
16eaf925e9
FIX #13553 Added option to exclude drafts: superset ingestion (#15770)
* Added option to exclude drafts: superset ingestion

* Updated superset yaml docs

* Added tests for exclude draft dashboards

* Added tests for exclude draft dashboards

* Formatted queries.py
2024-04-03 17:07:02 +05:30
harshsoni2024
feb33a0cc2
Fix #12964: Qlik Sense & Qlik Cloud filter draft dashboards (#15726)
* Fix #12964: filter draft dashboards from config

* Fix #12964: add unit test for qlik_sense

* Fix #12964: added UI and doc code

* Fix #12964: move includedraftdashboard flag from source_connection to source_config

* Fix #12964: filter draft dashboards in qlikcloud

* Fix #12964: add unit test for qlik cloud

* Fix #12964: remove unnecessary comments, code clean

* Fix #12964: pylint changes
2024-04-02 14:30:33 +02:00
Mayur Singal
6b90c245d4
MINOR: Add support for json schema parsing for datalake & s3 (#15615) 2024-03-26 10:03:21 +05:30
IceS2
e7c9d6aa7f
FIXES 15215: Implement initial Multithreading approach for the Metadata Ingestion on Databases (#15130)
* Implement Initial MultiThread suggestion

* Update all the ingestion sources to use the new ContextManager

* Fix missing wraps on decorator

* Fix Unittests

* Fix linters

* Fix linters

* Fix BigQuery UnitTests

* Add UnitTests to the newly created code

* Fix unittest

* change the threads from table to schemas

* Update README.md

* Small change suggested by Sonar

* Slight change to test a different way to multithread over tables

* Debug changes

* More multithread tests

* Remove unneeded wait time

* Testing

* refactor code based on removal of time.sleep

* Fix wrong paste

* Improve ExecutionTimeContextManager

* Fix missing .get() and unit tests

* Fix conflicting changes

* Update Multithread logic with the incremental extraction

* Fix linters

* Fix unittest

* Remove commented code

* Fix Unittests

* Fix checkstyle

* Change default to threads = 1
2024-03-25 18:20:40 +01:00
Ayush Shah
1bb7d893ac
Fix 15419: Improve fetching Oracle Queries for SP (#15621) 2024-03-20 15:58:06 +05:30
IceS2
51e3d7a466
FIXES 15215: First draft implementation on extracting metadata incrementally. Done for Snowflake, BigQuery and Redshift (#15201)
* Initial incremental implementation for snowflake

* Initial unit test refactor for snowflake

* Fix linter complaints

* Propagate change on abstract create method

* Add missing argument to create

* Polish Snowflake incremental extraction

* Fix linters and make enabled required

* Initial proposal for incremental bigquery extraction

* BigQuery incremental tests

* Remove debugging override

* Fix linters

* Remove unused query

* Initial Redshift Incremental Extraction

* Add Incremental Extraction documentation

* Move the default to False

* Improve code based on sonarcloud input

* Apply suggestions

* Fix wrong path

* Change timestamp to be time aware as per sonar

* Move documentation to 1.4

* Move documentation to 1.4

* Fix linters

---------

Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2024-03-15 14:00:49 +01:00
Mayur Singal
88ab7475e7
MINOR: Restructure dbServiceName field in dashboard and pipeline (#15548) 2024-03-15 12:42:47 +05:30
Mayur Singal
b643206bba
Fix #11905: Automated lineage between external table and container snowflake (#15537) 2024-03-15 00:52:41 +05:30
mgorsk1
98850ab5cc
feat: OpenLineage integration (#15317)
* 🎉 Init OpenLineage connector

Co-authored-by: dechoma <dominik.choma@gmail.com>

* MLH - make linter happy

* review fixes

* 🐛 Fix path for ol event in tests

* 🐛 Fix path for ol event in tests

* Update ingestion/setup.py

Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>

* Update ingestion/src/metadata/ingestion/source/pipeline/openlineage/metadata.py

Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>

* Update ingestion/src/metadata/ingestion/source/pipeline/openlineage/models.py

Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>

* review fixes 2

* linter

* review

* review

* make linter happy

* fix test_yield_pipeline_lineage_details test

* make linter happy

* fix tests

* fix tests 2

---------

Co-authored-by: dechoma <dominik.choma@gmail.com>
Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>
2024-03-12 08:39:25 +01:00
IceS2
418e281daa
Fixes 15375: Metabase metadata extraction fix (#15376) 2024-02-28 13:23:53 +05:30
Teddy
056e6368d0
Issue #14765 - Preparatory Work (#15312)
* refactor!: change partition metadata structure for table entities

* refactor!: updated json schema for TypeScript code gen

* chore: migration of partition for table entities

* style: python & java linting

* updated ui side change for table partitioned key

* minor fix

* addressing comments

* fixed ci error

---------

Co-authored-by: Shailesh Parmar <shailesh.parmar.webdev@gmail.com>
2024-02-28 07:11:00 +01:00
Imri Paran
aeb5fbe303
fixes #12591: add BigTable (#15122)
* feat(connector): add BigTable

* bigtable work

1. docstrings
2. tests
3. created a Row BaseModel
4. implemented a ClassConverter

* docs moved to separate PR

* format files

* minor cosmetic

- removed TODO
- changed headers' year to 2024 for new files
- fixed typos

* format

* formatting and comments

1. added missing docstrings.
2. abstracted the _find_instance method.
3. aliased the IDs used in the BigTable connection

* added comment regarding private key

* added comments regarding column families

* enclose get_schema_name_list in `try/except/else`

* format

* streamlined get_schema_name_list to include all logic in the try block
2024-02-13 08:28:01 +01:00
NiharDoshi99
2b56e34b19
#14930 bigquery support for pk, fk and column view description (#15042) 2024-02-07 16:49:27 +05:30
Onkar Ravgan
edb9c21bfd
Added /view to tableau dashboard url (#15031) 2024-02-05 20:18:02 +05:30
Teddy
9a4a9df836
Fix #14895 - Get Metadata from Parquet Schema (#14956)
* linting: fix python linting

* fix: get column types from parquet schema for parquet files

* style: python linting

* fix: remove displayType check in test as variation depending on OS
2024-02-01 09:02:52 +01:00
IceS2
373cafcda2
Fixes #5448: Implement initial Iceberg Connector using PyIceberg (#14825)
* Create the iceberg connection schema

* Link the IcebergConnection configuration with the forms on the UI

* Add the pyiceberg dependency on the ingestion package

* Create the get_connection and test_connection functions

* First iteration on the iceberg ingestion logic

* Add a more comprehensive implementation of the Iceberg Source

* Add UnitTests

* Update icebergConnection definition

* Update the iceberg source code based on new schema

* Updated icebergConnection schema for simplicity and to be able to configure Converters

* Updated setup dependencies to be more flexible

* Updated get_owner_ref logic

* Fix formatting

* Changed the icebergConnection json schema structure to enable the ClassConverters

* Add the IcebergCatalog and IcebergFileSystem ClassConverters

* Refactor the code to take into account the new jsonSchema structure

* Fix formatting

* Add Documentation for the Iceberg Connector

* Fix Menu order for Iceberg

* ui: add Iceberg service icon and constant

* Fix DynamoDb Catalog issue due to how PyIceberg instantiates it

* Changed uri title to URI

* Fix ClassConverter for Iceberg

* Fix GetSecretValue for password types

* Fix formatting

* Fix formatting

* Add Iceberg Connector Images for the docs

* Add pylint disable for Hacky super() call

* Add Iceberg.md for the UI docs

* Fix pylint complaint

* Fix pylint complaint

* Fix UnitTests

* fix type error and unit tests

* update pipeline type checks

* Fix Sonar Cloud complaints

---------

Co-authored-by: Sachin Chaurasiya <sachinchaurasiyachotey87@gmail.com>
2024-01-29 06:32:58 +01:00
NiharDoshi99
c1d62186df
MINOR - metadata tag extraction for Databricks (#14874)
* metadata tag extraction for databricks

* fix python test

* changes as per comment

* fix python test

* fix python checkstyle
2024-01-26 07:09:24 +01:00
Ayush Shah
1552aeb2de
Fix #13149: Multiple Project Id for Datalake GCS (#14846)
* Fix Multiple Project Id for datalake gcs

* Optimize logic

* Fix Tests

* Add Datalake GCS Tests

* Add multiple project id gcs test
2024-01-25 10:52:16 +01:00
Pere Miquel Brull
85e2058979
MINOR - Fix & Organize topology context (#14838)
* MINOR - Fix & Organize topology context

* Handle missing context charts
2024-01-25 08:22:07 +01:00
Onkar Ravgan
80fff72949
Fix #14794: Refactored and cleaned owner processing in sources (#14817)
* refactor owner processing

* Add exception handling and fix pytest

* review comments addressed

* looker tests

* fixed pycheckstyle
2024-01-25 06:46:22 +01:00
Shiyang Xiao
9f5a70bd71
MINOR - update docs & added unit test for SAS Connector (#14743)
Co-authored-by: Shiyang Xiao <Shiyang.Xiao@sas.com>
2024-01-23 14:55:29 -08:00
Pere Miquel Brull
337796d612
MINOR - Fix SP topology context & Looker usage context (#14816)
* MINOR - Fix SP topology context & Looker usage context

* MINOR - Fix SP topology context & Looker usage context

* Fix tests
2024-01-23 07:02:39 +01:00
NiharDoshi99
3f78e072e1
#13429 support for struct data type in hive (#14785) 2024-01-19 18:26:53 +05:30
Onkar Ravgan
f2219a10f3
Fixed oracle tests (#14738) 2024-01-16 17:39:10 +01:00
Onkar Ravgan
64a4e1afce
Fix 12180, 14158: Added LF tags to Athena (#14718)
* Added LF tags to athena

* fixed pytests

* Added docs
2024-01-16 14:24:31 +05:30
NiharDoshi99
54d34934c1
#14630 added oracle stored procedures (#14641) 2024-01-15 18:28:27 +05:30
Pere Miquel Brull
24643a397a
#14492 - Fix Snowflake SP parsing with empty signature (#14623) 2024-01-08 11:16:35 -08:00
Mayur Singal
a789fc86d6
Fix #13053: Remove Connection URI config MongoDB (#14584)
* Fix #13053: Remove Connection URI config MongoDB

* pyformat & test fixes
2024-01-05 10:51:12 -08:00
Pere Miquel Brull
f4bbca3f72
MINOR - Clean topology & add tests (#14527)
* Clean topo

* Format

* Add tests

* Fix tests

* Merge main
2023-12-29 17:00:59 +01:00
Pere Miquel Brull
b84ce33b80
#11799 - Fix Airflow ownership & add pipeline tasks (#14510)
* Fix airflow owner and add tasks

* Add pipeline tasks ownership

* MINOR - Fix py CI

* Add pipeline tasks ownership

* Add pipeline tasks ownership

* MINOR - Fix py CI

* MINOR - Fix py CI

* Add pipeline tasks ownership

* patch team

* patch team

* Format
2023-12-28 10:25:00 -08:00
Pere Miquel Brull
a83a5ba3a3
MINOR - Skip delta tests for 3.11 (#14398)
* MINOR - Bump delta for 3.11

* Update flags

* MINOR - Bump delta for 3.11

* Update tests regex

* Update version

* Deprecations

* Format

* Version

* Try delta spark

* Skip delta tests for 3.11

* Update ingestion/tests/unit/topology/pipeline/test_airflow.py
2023-12-18 17:01:57 +01:00
Lucas Garcia
fe06b5cbb2
#14235: adding dialect based on connection type to LineageParser (#14249)
* Fix #14235: adding dialect based on connection type to LineageParser

* Fix: formatting changes

* Update ingestion/src/metadata/ingestion/source/dashboard/metabase/metadata.py

Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>

* style: fix indentation errors

* Fix pytest

---------

Co-authored-by: LucasGarcia07 <lucas.junqueira@hurb.com>
Co-authored-by: ulixius9 <mayursingal9@gmail.com>
2023-12-08 19:49:59 +05:30
NiharDoshi99
8d925c46a5
#13696: add support for dot in schema name to fetch tables (#14246) 2023-12-08 12:04:28 +05:30