755 Commits

Author SHA1 Message Date
Pere Miquel Brull
39eed12f32
MINOR - Version match logic update & Airflow docs (#16157)
* airflow docs

* update version validation

* MINOR - docs and version match
2024-05-08 07:37:14 +02:00
Suman Maharana
488078da8a
Add DDL query ingest (#15860) 2024-05-06 18:03:50 +05:30
Onkar Ravgan
87c8254c38
Fix #15454: Added protobuf parser complex schema support (#16071)
* Added protobuf parser complex schema support

* Added options keyword in proto testing
2024-04-30 17:59:27 +05:30
harshsoni2024
68e036418c
Fix #15719: Improve unit test to increase coverage. (#15905)
* issue-15719: unit test for superset db source

* issue-15719: use testcontainers for superset_api client test

* issue-15719: superset-api yield data changes

* fix failed test cases due to testcontainer version

* issue-15719: postgres container version fix

* issue-15719: setup & teardown with testcontainers

* issue-15719: remove more patch code
2024-04-29 08:00:39 +02:00
Ayush Shah
3621407642
Fixes #15732: Modify Reference for Tags to EntityName (#15938) 2024-04-25 11:53:46 +05:30
Ayush Shah
a15da7ec98
Issue #14812: Add support for empty string as missing count (#16017) 2024-04-25 09:45:26 +05:30
Teddy
449a5f2de3
FIX #11951 - ingestion logic for global profiler config (#15948)
* feat: add global metric configuration for the profiler

* style: ran java linting

* fix: renamed disable to disabled

* style: ran java linting

* feat: ometa sdk for profiler setting

* test: ingestion profiler global config tests

* fix: update metric name to use MetricType Enum

* fix: allow bot to retrieve settings

* fix: exclude GX artifacts

* feat: implement global profiler setting logic for ingestion side

* fix: exclude metrics if Metric is empty

* style: ran python linting

* style: ran python linting

* fix: skip empty metrics

* style: ran python linting

* fix: moved GET profiler config to seperate endpoint in system resource

* fix: moved compute metric filter to MetricFilter + renamed container

* fix: test failures

* fix: profiler test case
2024-04-22 22:35:37 +02:00
IceS2
cb801dedb4
FIXES 13209: Add Sagemaker Model Storage (#15986)
* Add Sagemaker Model Storage

* Fix checkstyle

* Sagemaker unittest

* Small refactor to be less verbose
2024-04-22 16:53:25 +02:00
Imri Paran
0a1018648c
Fixes #15566: add dynamodb row count (#15204)
* feat(nosql-profiler): row count

1. Implemented the NoSQLProfilerInterface as an entrypoint for the nosql profiler.
2. Added the NoSQLMetric as an abstract class.
3. Implemented the interface for the MongoDB database source.
4. Implemented an e2e test using testcontainers.

* added profiler support for mongodb connection

* doc

* use int_admin_ometa in test setup

* - fixed linting issue in gx
- removed unused inheritance

* moved the nosql function into the metric class

* feat(profiler): add dynamodb row count

* feat(profiler): add dynamodb row count

* formatting

* fixed import

* format

* dded dynamodb row count

* format

* removed unused factory file

* removed "validate"

* migrations

* removed validations

* format

* linting

* fixed: test_amundsen.py

* Update schemaChanges.sql
2024-04-22 09:14:52 +02:00
Ayush Shah
d5b1465406
Fixes #14113 - Allow SSL file uploads (#15828) 2024-04-19 11:38:27 +05:30
Ayush Shah
0c3e580592
MINOR: Fix Entity Link Error (#15864) 2024-04-11 18:41:59 +05:30
IceS2
12a4c578a2
MINOR: Fix jsonpatch operation order (#15680)
* Mantain the OperationType Order when considering the dividing groups

* Remove reordering the jsonpatch operations from the backend

* Fix checkstyle

* Fix UnitTests to comply with no reordering

* Initial idea on how to fix our current jsonpatch builder from python

* fix(JsonUtils): Change JSONPatch library used

When creating a JSONPatch by using the 'createDiff' method, the library
we are using is not returning a correct JSONPatch when removing multiple
items from an array.

Since the library doesn't provide good ways to override this behavior
and fix it, we decided to move away from it and use the json-patch
library only for this specific operation.

* Fix linters

* Add docstrings

* Refactor patch updated on ingestion framework

* Add UnitTests

* Fix linters
2024-04-05 15:52:01 +02:00
Suman Maharana
16eaf925e9
FIX #13553 Added option to exclude drafts: superset ingestion (#15770)
* Added option to exclude drafts: superset ingestion

* Updated supserset yaml docs

* Added tests for exlcude draft dashboards

* Added tests for exlcude draft dashboards

* Formatted queries.py
2024-04-03 17:07:02 +05:30
Ayush Shah
b79e5c064b
Fix 15576 - Eval Data Type issue fix (#15702) 2024-04-03 15:51:19 +05:30
Teddy
205850be79
[MINOR] fix antlr parser definition for entity link (#15758)
* fix: update antlr regex for entity fqn

* fix: update antlr rule to allow single character

* style: ran python linting

* fix: updated antlr token for NAME_OR_FQN
2024-04-03 08:34:43 +00:00
harshsoni2024
feb33a0cc2
Fix #12964: Qlik Sense & Qlik Cloud filter draft dashboards (#15726)
* Fix #12964: filter draft dashboards from config

* Fix #12964: add unit test for qlik_sense

* Fix #12964: added UI and doc code

* Fix #12964: move includedraftdashboard flag from source_connection to source_config

* Fix #12964: filter draft dashboards in qlikcloud

* Fix #12964: add unit test for qlik cloud

* Fix #12964: remove unnecessary comments, code clean

* Fix #12964: pylint changes
2024-04-02 14:30:33 +02:00
Pere Miquel Brull
9d7bfa363e
MINOR - Clean metadata CLI (#15631)
* Docs

* MINOR - Clean metadata CLI

* remove tests
2024-03-26 16:36:47 +01:00
Mayur Singal
6b90c245d4
MINOR: Add support for json schema parsing for datalake & s3 (#15615) 2024-03-26 10:03:21 +05:30
IceS2
e7c9d6aa7f
FIXES 15215: Implement initial Multithreading approach for the Metadata Ingestion on Databases (#15130)
* Implement Initial MultiThread suggestion

* Update all the ingestion sources to use the new ContextManager

* Fix missing wraps on decorator

* Fix Unittests

* Fix linters

* Fix linters

* Fix BigQuery UnitTests

* Add UnitTests to the newly created code

* Fix unittest

* change the threads from table to schemas

* Update README.md

* Small change suggested by Sonar

* Slight change to test a different way to multithread over tables

* Debug changes

* More multithread tests

* Remove uneeded wait time

* Testing

* refactor code based on removal of time.sleep

* Fix wrong paste

* Improve ExecutionTimeContextManager

* Fix missing .get() and unit tests

* Fix conflicting changes

* Update Multithread logic with the incremental extraction

* Fix linters

* Fix unittest

* Remove commented code

* Fix Unittests

* Fix checkstyle

* Change default to threads = 1
2024-03-25 18:20:40 +01:00
Ayush Shah
00677a1e1b
Fix External Account Json Schema Issue (#15671) 2024-03-23 16:47:55 +05:30
Ayush Shah
8b880bbf91
Fixes 14370: Add Azure Client, support Default Creds (#15554)
* Add Azure Client, support Default Creds
2024-03-22 14:28:42 +05:30
Ayush Shah
1bb7d893ac
Fix 15419: Improve fetching Oracle Queries for SP (#15621) 2024-03-20 15:58:06 +05:30
Ayush Shah
e06e5c1bdd
Fixes 15544: Histogram not working for more than 15 units (#15617) 2024-03-20 11:35:52 +05:30
Trs
4db9b775ea
#14169: Support external_account type for GCP Auth (#14166) 2024-03-16 19:59:02 +05:30
IceS2
51e3d7a466
FIXES 15215: First draft implementation on extracting metadata incrementally. Done for Snowflake, BigQuery and Redshift (#15201)
* Initial incremental implementation for snowflake

* Initial unit test refactor for snowflake

* Fix linter complaints

* Propagate change on abstract create method

* Add missing argument to create

* Polish Snowflake incremental extraction

* Fix linters and make enabled required

* Initial proposal for incremental bigquery extraction

* BigQuery incremental tests

* Remove debugging override

* Fix linters

* Remove unused query

* Initial Redshift Incremental Extraction

* Add Incremental Extraction documentation

* Move the default to False

* Improve code based on sonarcloud input

* Apply suggestions

* Fix wrong path

* Change timestamp to be time aware as per sonar

* Move documentation to 1.4

* Move documentation to 1.4

* Fix linters

---------

Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2024-03-15 14:00:49 +01:00
Mayur Singal
88ab7475e7
MINOR: Restructure dbServiceName field in dashboard and pipeline (#15548) 2024-03-15 12:42:47 +05:30
Mayur Singal
b643206bba
Fix #11905: Automated lineage between external table and container snowflake (#15537) 2024-03-15 00:52:41 +05:30
Mayur Singal
80123b3c0a
Fix #15533: Fix name & display name for kafka json schema parser (#15534) 2024-03-12 23:02:41 +05:30
mgorsk1
98850ab5cc
feat: OpenLineage integration (#15317)
* 🎉 Init OpenLineage connector

Co-authored-by: dechoma <dominik.choma@gmail.com>

* MLH - make linter happy

* review fixes

* 🐛 Fix path for ol event in tests

* 🐛 Fix path for ol event in tests

* Update ingestion/setup.py

Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>

* Update ingestion/src/metadata/ingestion/source/pipeline/openlineage/metadata.py

Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>

* Update ingestion/src/metadata/ingestion/source/pipeline/openlineage/models.py

Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>

* review fixes 2

* linter

* review

* review

* make linter happy

* fix test_yield_pipeline_lineage_details test

* make linter happy

* fix tests

* fix tests 2

---------

Co-authored-by: dechoma <dominik.choma@gmail.com>
Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>
2024-03-12 08:39:25 +01:00
Onkar Ravgan
1fc2c7f974
MINOR: Part 1 of #15090: dbt JSON Schema & Parsing Improvements (#15297) 2024-02-29 10:41:21 +05:30
Teddy
3e83bdac3d
ISSUE #14765 - Implement Athena Injected Partition Check (#15318)
* refactor!: change partition metadata structure for table entities

* refactor!: updated json schema for TypeScript code gen

* chore: migration of partition for table entities

* style: python & java linting

* fix: catch injected partition table in Athena

* style: ran python linting
2024-02-28 14:20:59 +00:00
IceS2
418e281daa
Fixes 15375: Metabase metadata extraction fix (#15376) 2024-02-28 13:23:53 +05:30
Teddy
056e6368d0
Issue #14765 - Preparatory Work (#15312)
* refactor!: change partition metadata structure for table entities

* refactor!: updated json schema for TypeScript code gen

* chore: migration of partition for table entities

* style: python & java linting

* updated ui side change for table partitioned key

* miner fix

* addressing comments

* fixed ci error

---------

Co-authored-by: Shailesh Parmar <shailesh.parmar.webdev@gmail.com>
2024-02-28 07:11:00 +01:00
Imri Paran
4967e091e6
MINOR: add test for sqla compiler (#15275)
* add test for sqla compiler
2024-02-22 14:45:47 +01:00
Onkar Ravgan
dfc7662449
Fix #15247: Fixed ingesting dbt owners with dot in name (#15261)
* Fixed dbt owners with dot

* fixed pytests

* Fixed pytest2

* rmv type
2024-02-20 16:06:54 +05:30
Pere Miquel Brull
62c0cc7563
#13985 - Azure KV Secrets Manager (#15192)
* #13985 - Azure KV Secrets Manager

* Format

* #13985 - Azure KV Secrets Manager

* #13985 - Azure KV Secrets Manager

* Simplify credentials loading

* Simplify credentials loading

* Simplify credentials loading
2024-02-20 07:18:35 +01:00
Imri Paran
aeb5fbe303
fixes #12591: add BigTable (#15122)
* feat(connector): add BigTable

* bigtable work

1. docstrings
2. tests
3. created a Row BaseModel
4. implemented a ClassConverter

* docs moved to separate PR

* format files

* minor cosmetic

- removed TODO
- changed headers' year to 2024 for new files
- fixed typos

* format

* formatting and comments

1. added missing docstrings.
2. abstracted the _find_instance method.
3. aliased the IDs used in the BigTable connection

* added comment regarding private key

* added comments regarding column families

* enclose get_schema_name_list in `try/except/else`

* format

* streamlined get_schema_name_list to include all logic in the try block
2024-02-13 08:28:01 +01:00
NiharDoshi99
2b56e34b19
#14930 bigquery support for pk, fk and column view description (#15042) 2024-02-07 16:49:27 +05:30
Onkar Ravgan
edb9c21bfd
Added /view to tableau dashboard url (#15031) 2024-02-05 20:18:02 +05:30
Mayur Singal
a9fc51ec8b
MINOR: Change sqllineage import to collate_sqllineage (#14870) 2024-02-05 19:44:08 +05:30
Teddy
9a4a9df836
Fix #14895 - Get Metadata from Parquet Schema (#14956)
* linting: fix python linting

* fix: get column types from parquet schema for parquet files

* style: python linting

* fix: remove displayType check in test as variation depending on OS
2024-02-01 09:02:52 +01:00
Sriharsha Chintalapani
2e95fcb98d
Fix #14786: Suggestions API (#14821)
* Fix #14786: Suggestions API

* Handle suggestions in ometa

* Minor: Optimise Databricks Client (#14776)

* MINOR - Fix SP topology context & Looker usage context (#14816)

* MINOR - Fix SP topology context & Looker usage context

* MINOR - Fix SP topology context & Looker usage context

* Fix tests

* Fixes #14598: Fix Tags / Labels ingestion on includeTags as False (#14782)

* fix(ui): password error message for char limits (#14808)

* fix(ui): password error message for char limits

* fix java side code

* Fixes #13556: Support for Salesforce table description ingestion (#14733)

* ISSUE-13556: Add suport for Salesforce table description ingestion

* ISSUE-13556: Remove unnecessary blank line

* ISSUE-13556: Fix to get description for each table

---------

Co-authored-by: Teddy <teddy.crepineau@gmail.com>

* MINOR - Better handling of Ingestion Pipeline Status (#14792)

* MINOR - Better handling of Ingestion Pipeline Status

* format

* format

* MINOR: Added table validation for cost analysis data (#14793)

* Added validation for cost analysis source

* centralized life cycle logic

* CYPRESS: simplify side navigation click in cypress (#14818)

* simplify side navigation click in cypress

* make sidbar item uses common enum

* fix cypress failure of outside import

* fix(#14326): tier dropdown is not working in advance search (#14780)

* improvement in advance search based on custom property

* fix a reading undefined property issue

* wip: advance search based on tier

* some code cleanup and improvement

* some fixes

* fix: ui flicker when advanceSearched is apply and refresh the page

* some cleanup

* no need to call customproperty api call, if entity not suppport customProperties

* minor change

* fix: autocomplete not working in tier search option in advance search modal

* added unit test for advance search provider component

* some cleanup

* added testcase for open modal

* added testcase for resetAllFilters method

* removed unwanted code

* added e2e test for testing tier advance search

* fix: e2e search flow for single field

* fix: string field not working after giving listValues in TierSearch

* fix: group query e2e test fix

* used asyncFetch way to get the tierOptions synchronously

* some cleanup

* remove unwanted lines

* some cleanup

* fix: selected option show option value instead of option title

* fix(minor): update skip icon for executions (#14809)

* Fixes #14803: ignore capitalization when confirming deletes  (#14804)

* ignore case when confirming deletes

* Test confirmation of deletes works when case differs 

Added test case for 'delete' as the confirmation text.

* minor(config): update openmetadata-ui code reviewers (#14823)

* Add Tests

* Add list/accept/reject apis

* initial ui changes

* localisation

* show suggestion for empty description

* ui feedbacks

* Fix permission check for entities without owner

* Fix entityLink and add tests

* Add update suggestion WIP

* Fix test

* Fix PUT and Pagination

* Fix styling

* update test

* Update status

* add OM server connection in apps

* add permissions check

* Fix CI

* Remove TODO

* Fix feedResourceTest

* fix unit tests

* add private configs for apps

* add private configs for apps

* fix update application icons

* minor center align icon

* add private configs for apps

* Format

* Fix pydantic gen

* Remove token

* Update name

* Rework private conf

* Fix apps

* Fix apps

* Format

* Format

* show metapilot only if its installed

---------

Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>
Co-authored-by: Ayush Shah <ayush@getcollate.io>
Co-authored-by: Chirag Madlani <12962843+chirag-madlani@users.noreply.github.com>
Co-authored-by: kwgdaig <18678754+kwgdaig@users.noreply.github.com>
Co-authored-by: Teddy <teddy.crepineau@gmail.com>
Co-authored-by: Onkar Ravgan <onkar.10r@gmail.com>
Co-authored-by: Ashish Gupta <ashish@getcollate.io>
Co-authored-by: Abhishek Porwal <80886271+Abhishek332@users.noreply.github.com>
Co-authored-by: Carlo Q <carlo@machina.bio>
Co-authored-by: karanh37 <karanh37@gmail.com>
2024-01-31 18:51:09 -08:00
IceS2
373cafcda2
Fixes #5448: Implement initial Iceberg Connector using PyIceberg (#14825)
* Create the iceberg connection schema

* Link the IcebergConnection configuration with the forms on the UI

* Add the pyiceberg dependency on the ingestion package

* Create the get_connection and test_connection functions

* First iteration on the iceberg ingestion logic

* Add A more comprehensive implementation of the Iceberg Source

* Add UnitTests

* Update icebergConnection definition

* Update the iceberg souce code based on new schema

* Updated icebergConnecgtion schema for simplicity and to be able to configure Converters

* Updated setup dependencies to be more flexible

* Updated get_owner_ref logic

* Fix formatting

* Changed the icebergConnection json schema structure to enable the ClassConverters

* Add the IcebergCatalog and IcebergFileSystem ClassConverters

* Refactor the code to take into account the new jsonSchema structure

* Fix formatting

* Add Documentation for the Iceberg Connector

* Fix Menu order for Iceberg

* ui: add Iceberg service icon and constant

* Fix DynamoDb Catalog issue due to how PyIceberg instantes it

* Changed uri title to URI

* Fix ClassConverter for Iceberg

* Fix GetSecretValue for password types

* Fix formatting

* Fix formatting

* Add Iceberg Connector Images for the docs

* Add pylint disable for Hacky super() call

* Add Iceberg.md for the UI docs

* Fix pylint complaint

* Fix pylint complaint

* Fix UnitTests

* fix type error and unit tests

* update pipeline type checks

* Fix Sonar Cloud complaints

---------

Co-authored-by: Sachin Chaurasiya <sachinchaurasiyachotey87@gmail.com>
2024-01-29 06:32:58 +01:00
Pere Miquel Brull
db985fda57
MINOR - Snowflake system queries to work with ES & IDENTIFIER (#14864) 2024-01-26 18:41:16 +05:30
NiharDoshi99
c1d62186df
MINOR - metadata tag extraction for Databricks (#14874)
* metadata tag extraction for databaricks

* fix python test

* changes as per comment

* fix python test

* fix python checkstyle
2024-01-26 07:09:24 +01:00
Ayush Shah
1552aeb2de
Fix #13149: Multiple Project Id for Datalake GCS (#14846)
* Fix Multiple Project Id for datalake gcs

* Optimize logic

* Fix Tests

* Add Datalake GCS Tests

* Add multiple project id gcs test
2024-01-25 10:52:16 +01:00
Pere Miquel Brull
85e2058979
MINOR - Fix & Organize topology context (#14838)
* MINOR - Fix & Organize topology context

* Handle missing context charts
2024-01-25 08:22:07 +01:00
Onkar Ravgan
80fff72949
Fix #14794: Refactored and cleaned owner processing in sources (#14817)
* refactor owner processing

* Add exception handling and fix pytest

* review comments addressed

* looker tests

* fixed pycheckstyle
2024-01-25 06:46:22 +01:00
Mayur Singal
17fb2cabca
MINOR: Lineage handle copy queries being skipped (#14855) 2024-01-25 10:15:32 +05:30
NiharDoshi99
2efa0c9e28
#13974 handle for hyphen in schema and median function (#14834) 2024-01-24 15:57:36 +05:30