1118 Commits

Author SHA1 Message Date
Imri Paran
c277233ef1
MINOR: use archive instead of volume for postgres test (#16245)
* using archive instead of volume for postgres test

* format

* remove usage of request
2024-05-14 09:11:16 +00:00
Suman Maharana
0e2736ee74
MINOR: Removed supportsDDL from json schemas (#16171) 2024-05-10 17:40:12 +05:30
Pere Miquel Brull
39eed12f32
MINOR - Version match logic update & Airflow docs (#16157)
* airflow docs

* update version validation

* MINOR - docs and version match
2024-05-08 07:37:14 +02:00
Onkar Ravgan
4a6849a05d
MINOR: Added custom property EntityReference support to python sdk (#16132)
* Added cust prop entityref to python sdk

* Added name and displayName fields to entityref
2024-05-07 17:35:39 +05:30
Mohit Yadav
0769d71ee7
Fixes Test Suite Reference in Table Schema (#16129)
* Fixes Test Suite Reference in Table Schema

* fix: fix test suite to interact with entity reference

---------

Co-authored-by: Teddy Crepineau <teddy.crepineau@gmail.com>
2024-05-06 19:03:23 +05:30
Suman Maharana
488078da8a
Add DDL query ingest (#15860) 2024-05-06 18:03:50 +05:30
IceS2
795879d776
MINOR: Fix issue with SQLAlchemy types not being correctly mapped to OM Type on the profiler (#16122)
* Fix issue with SQLAlchemy types not being correctly mapped to OM Types on the profiler

* Fix checkstyle
2024-05-03 17:05:52 +02:00
Onkar Ravgan
3c083cdb68
updated pbi e2e counts (#16109) 2024-05-03 06:57:15 +02:00
Onkar Ravgan
87c8254c38
Fix #15454: Added protobuf parser complex schema support (#16071)
* Added protobuf parser complex schema support

* Added options keyword in proto testing
2024-04-30 17:59:27 +05:30
Onkar Ravgan
ceaa9d3e8a
Fix #15611 Parse PowerBI Dax files for lineage (#15975) 2024-04-29 14:55:06 +05:30
harshsoni2024
68e036418c
Fix #15719: Improve unit test to increase coverage. (#15905)
* issue-15719: unit test for superset db source

* issue-15719: use testcontainers for superset_api client test

* issue-15719: superset-api yield data changes

* fix failed test cases due to testcontainer version

* issue-15719: postgres container version fix

* issue-15719: setup & teardown with testcontainers

* issue-15719: remove more patch code
2024-04-29 08:00:39 +02:00
Teddy
4ed87a4d08
Fix #15341 - Test Case reference as inherited field for Test Case Incident (#16027)
* fix: unique test computation to scalar_subquery

* fix: make test case reference an inherited field

* style: ran java linting

* fix: added test case resolution migration

* style: ran java linting
2024-04-25 17:31:11 +02:00
Onkar Ravgan
828e9abc97
Added enum support in custom prop python sdk (#16026) 2024-04-25 14:46:55 +05:30
Ayush Shah
3621407642
Fixes #15732: Modify Reference for Tags to EntityName (#15938) 2024-04-25 11:53:46 +05:30
Ayush Shah
a15da7ec98
Issue #14812: Add support for empty string as missing count (#16017) 2024-04-25 09:45:26 +05:30
Ayush Shah
0963a111fe
Fixes #12127: Add Support for Complex types of Databricks & UnityCatalog in profiler (#15976) 2024-04-23 15:54:36 +05:30
Pere Miquel Brull
df5d5e1866
MINOR - Fix datamodel lineage call (#15991)
* MINOR - Fix datamodel lineage call

* amend merge
2024-04-23 09:56:24 +02:00
Mayur Singal
85b6983eee
Fix #15062 & #14810: Fix Column level lineage overwrites pipeline Lineage & manual col lineage (#15897) 2024-04-23 09:37:43 +05:30
Teddy
449a5f2de3
FIX #11951 - ingestion logic for global profiler config (#15948)
* feat: add global metric configuration for the profiler

* style: ran java linting

* fix: renamed disable to disabled

* style: ran java linting

* feat: ometa sdk for profiler setting

* test: ingestion profiler global config tests

* fix: update metric name to use MetricType Enum

* fix: allow bot to retrieve settings

* fix: exclude GX artifacts

* feat: implement global profiler setting logic for ingestion side

* fix: exclude metrics if Metric is empty

* style: ran python linting

* style: ran python linting

* fix: skip empty metrics

* style: ran python linting

* fix: moved GET profiler config to separate endpoint in system resource

* fix: moved compute metric filter to MetricFilter + renamed container

* fix: test failures

* fix: profiler test case
2024-04-22 22:35:37 +02:00
Imri Paran
93ec391f5c
MINOR: Dynamodb sample data (#15264)
* feat(nosql-profiler): row count

1. Implemented the NoSQLProfilerInterface as an entrypoint for the nosql profiler.
2. Added the NoSQLMetric as an abstract class.
3. Implemented the interface for the MongoDB database source.
4. Implemented an e2e test using testcontainers.

* added profiler support for mongodb connection

* doc

* use int_admin_ometa in test setup

* - fixed linting issue in gx
- removed unused inheritance

* moved the nosql function into the metric class

* feat(profiler): add dynamodb row count

* feat(profiler): add dynamodb row count

* formatting

* validate_compose: raise exception for bad status code.

* fixed import

* format

* feat(nosql-profiler): added sample data

1. Implemented the NoSQL sampler.
2. Some naming changes to the NoSQL adaptor to avoid fixing names with the profiler interface.
3. Tests.

* added default sample limit

* formatting

* fixed import

* feat(profiler): dynamodb sample data

* tests for dynamo db sample data

* format

* format

* use service connection for nosql adaptor factory

* fixed tests

* format

* fixed after merge
2024-04-22 17:46:40 +02:00
IceS2
cb801dedb4
FIXES 13209: Add Sagemaker Model Storage (#15986)
* Add Sagemaker Model Storage

* Fix checkstyle

* Sagemaker unittest

* Small refactor to be less verbose
2024-04-22 16:53:25 +02:00
IceS2
08c114c340
FIXES 15626: Fix issue with not url model store (#15974)
* Changed the MLModelStore storage type to string

* fix checkstyle

* remove unused files

* Update requirements

* fix checkstyle

* Skipping MLFlow integration on python 3.8

* Hack to allow pytest to parse the mlflow integrations test on python 3.8

* Fix checkstyle
2024-04-22 15:50:44 +02:00
IceS2
19fa15f010
fix e2e (#15981) 2024-04-22 09:57:06 +02:00
Imri Paran
0a1018648c
Fixes #15566: add dynamodb row count (#15204)
* feat(nosql-profiler): row count

1. Implemented the NoSQLProfilerInterface as an entrypoint for the nosql profiler.
2. Added the NoSQLMetric as an abstract class.
3. Implemented the interface for the MongoDB database source.
4. Implemented an e2e test using testcontainers.

* added profiler support for mongodb connection

* doc

* use int_admin_ometa in test setup

* - fixed linting issue in gx
- removed unused inheritance

* moved the nosql function into the metric class

* feat(profiler): add dynamodb row count

* feat(profiler): add dynamodb row count

* formatting

* fixed import

* format

* added dynamodb row count

* format

* removed unused factory file

* removed "validate"

* migrations

* removed validations

* format

* linting

* fixed: test_amundsen.py

* Update schemaChanges.sql
2024-04-22 09:14:52 +02:00
Imri Paran
d8781bbef2
MINOR: postgres integration test (#15929)
* implemented postgres-integration-tests

* format

* format

* - disable ryuk
- disabled verbose sqlfluff logging

* query usage assertion
2024-04-19 10:00:37 -07:00
Ayush Shah
d5b1465406
Fixes #14113 - Allow SSL file uploads (#15828) 2024-04-19 11:38:27 +05:30
Imri Paran
47f0d99333
MINOR: add raise_from_status for sql_server test (#15931)
* Update test_metadata_ingestion.py

* Update test_metadata_ingestion.py

* fixed import
2024-04-17 14:52:10 +02:00
Imri Paran
29cd58b628
MINOR: added integration test for SQL SERVER (#15919)
* adventure works mssql test case

* adventure works mssql test case

* fixed tests

* fixed tests

* fixed tests

* fixed tests
2024-04-17 12:19:37 +02:00
Ayush Shah
0c3e580592
MINOR: Fix Entity Link Error (#15864) 2024-04-11 18:41:59 +05:30
Pere Miquel Brull
a1404e6b4a
MINOR - Clean ingestion dependencies (#15679)
* WIP - MINOR - Clean ingestion dependencies

* test

* test

* Clean imports

* add pyiceberg for test

* Revert "add pyiceberg for test"

This reverts commit ab26942736586f089a57a644ffd727aca200db62.

* add pyiceberg for test

* Remove docker dep

* clean local docker sh

* MINOR - AKS Airflow troubleshooting docs

* Fix action

* clean local docker sh
2024-04-11 14:30:40 +02:00
harshsoni2024
c671e64f69
[MINOR] Fix tableau e2e (#15824) 2024-04-08 21:17:37 +05:30
IceS2
c909ff8857
MINOR: Fix e2e tests (#15829)
* Update values

* Update values

* Fix checkstyle
2024-04-08 15:58:32 +02:00
IceS2
12a4c578a2
MINOR: Fix jsonpatch operation order (#15680)
* Maintain the OperationType Order when considering the dividing groups

* Remove reordering the jsonpatch operations from the backend

* Fix checkstyle

* Fix UnitTests to comply with no reordering

* Initial idea on how to fix our current jsonpatch builder from python

* fix(JsonUtils): Change JSONPatch library used

When creating a JSONPatch by using the 'createDiff' method, the library
we are using is not returning a correct JSONPatch when removing multiple
items from an array.

Since the library doesn't provide good ways to override this behavior
and fix it, we decided to move away from it and use the json-patch
library only for this specific operation.

* Fix linters

* Add docstrings

* Refactor patch updated on ingestion framework

* Add UnitTests

* Fix linters
2024-04-05 15:52:01 +02:00
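The commit above concerns the order of JSON Patch operations when several items are removed from an array. The sketch below is not the project's code. It is a hand-rolled illustration, under the assumption that the issue relates to index shifting: RFC 6902 applies operations sequentially, so each `remove` changes the indices seen by the next one. The helper `apply_removes` and the `tags` field are hypothetical names used only for this example.

```python
import copy


def apply_removes(doc, ops):
    """Apply RFC 6902 'remove' operations to a top-level list field, in order.

    Hypothetical helper for illustration; it only handles paths of the
    form "/<field>/<index>".
    """
    result = copy.deepcopy(doc)
    for op in ops:
        if op["op"] != "remove":
            raise ValueError("this sketch only supports 'remove'")
        # "/tags/1" splits into ["", "tags", "1"].
        _, field, index = op["path"].split("/")
        del result[field][int(index)]
    return result


doc = {"tags": ["a", "b", "c", "d"]}

# Goal: drop "b" and "c", which sit at indices 1 and 2 in the original array.
ascending = [
    {"op": "remove", "path": "/tags/1"},
    {"op": "remove", "path": "/tags/2"},
]
descending = [
    {"op": "remove", "path": "/tags/2"},
    {"op": "remove", "path": "/tags/1"},
]

# Ascending order removes "b", then index 2 now points at "d".
wrong = apply_removes(doc, ascending)
# Descending order removes "c" first, so index 1 still points at "b".
right = apply_removes(doc, descending)

print(wrong)
print(right)
```

With original indices, only the descending order leaves `["a", "d"]`, which is why a backend that reorders operations, or a diff library that emits them in the wrong order, can produce a different document than intended.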
Suman Maharana
16eaf925e9
FIX #13553 Added option to exclude drafts: superset ingestion (#15770)
* Added option to exclude drafts: superset ingestion

* Updated superset yaml docs

* Added tests for exclude draft dashboards

* Added tests for exclude draft dashboards

* Formatted queries.py
2024-04-03 17:07:02 +05:30
Ayush Shah
b79e5c064b
Fix 15576 - Eval Data Type issue fix (#15702) 2024-04-03 15:51:19 +05:30
Teddy
205850be79
[MINOR] fix antlr parser definition for entity link (#15758)
* fix: update antlr regex for entity fqn

* fix: update antlr rule to allow single character

* style: ran python linting

* fix: updated antlr token for NAME_OR_FQN
2024-04-03 08:34:43 +00:00
harshsoni2024
feb33a0cc2
Fix #12964: Qlik Sense & Qlik Cloud filter draft dashboards (#15726)
* Fix #12964: filter draft dashboards from config

* Fix #12964: add unit test for qlik_sense

* Fix #12964: added UI and doc code

* Fix #12964: move includedraftdashboard flag from source_connection to source_config

* Fix #12964: filter draft dashboards in qlikcloud

* Fix #12964: add unit test for qlik cloud

* Fix #12964: remove unnecessary comments, code clean

* Fix #12964: pylint changes
2024-04-02 14:30:33 +02:00
Pere Miquel Brull
890820ed92
MINOR - App routes & datamodel (#15722)
* MINOR - App routes & datamodel

* fix future annotations

* fix future annotations
2024-03-27 19:12:24 +01:00
Pere Miquel Brull
9d7bfa363e
MINOR - Clean metadata CLI (#15631)
* Docs

* MINOR - Clean metadata CLI

* remove tests
2024-03-26 16:36:47 +01:00
Mayur Singal
6b90c245d4
MINOR: Add support for json schema parsing for datalake & s3 (#15615) 2024-03-26 10:03:21 +05:30
IceS2
e7c9d6aa7f
FIXES 15215: Implement initial Multithreading approach for the Metadata Ingestion on Databases (#15130)
* Implement Initial MultiThread suggestion

* Update all the ingestion sources to use the new ContextManager

* Fix missing wraps on decorator

* Fix Unittests

* Fix linters

* Fix linters

* Fix BigQuery UnitTests

* Add UnitTests to the newly created code

* Fix unittest

* change the threads from table to schemas

* Update README.md

* Small change suggested by Sonar

* Slight change to test a different way to multithread over tables

* Debug changes

* More multithread tests

* Remove unneeded wait time

* Testing

* refactor code based on removal of time.sleep

* Fix wrong paste

* Improve ExecutionTimeContextManager

* Fix missing .get() and unit tests

* Fix conflicting changes

* Update Multithread logic with the incremental extraction

* Fix linters

* Fix unittest

* Remove commented code

* Fix Unittests

* Fix checkstyle

* Change default to threads = 1
2024-03-25 18:20:40 +01:00
Ayush Shah
00677a1e1b
Fix External Account Json Schema Issue (#15671) 2024-03-23 16:47:55 +05:30
Ayush Shah
8b880bbf91
Fixes 14370: Add Azure Client, support Default Creds (#15554)
* Add Azure Client, support Default Creds
2024-03-22 14:28:42 +05:30
Ayush Shah
1bb7d893ac
Fix 15419: Improve fetching Oracle Queries for SP (#15621) 2024-03-20 15:58:06 +05:30
Ayush Shah
e06e5c1bdd
Fixes 15544: Histogram not working for more than 15 units (#15617) 2024-03-20 11:35:52 +05:30
Trs
4db9b775ea
#14169: Support external_account type for GCP Auth (#14166) 2024-03-16 19:59:02 +05:30
IceS2
51e3d7a466
FIXES 15215: First draft implementation on extracting metadata incrementally. Done for Snowflake, BigQuery and Redshift (#15201)
* Initial incremental implementation for snowflake

* Initial unit test refactor for snowflake

* Fix linter complaints

* Propagate change on abstract create method

* Add missing argument to create

* Polish Snowflake incremental extraction

* Fix linters and make enabled required

* Initial proposal for incremental bigquery extraction

* BigQuery incremental tests

* Remove debugging override

* Fix linters

* Remove unused query

* Initial Redshift Incremental Extraction

* Add Incremental Extraction documentation

* Move the default to False

* Improve code based on sonarcloud input

* Apply suggestions

* Fix wrong path

* Change timestamp to be time aware as per sonar

* Move documentation to 1.4

* Move documentation to 1.4

* Fix linters

---------

Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2024-03-15 14:00:49 +01:00
Onkar Ravgan
46954dc848
Fix #15563: Fixed incorrect col ordering after patch request from ingestion (#15577)
* fixed patch col order

* Added excp handling

* changed logs to warning

* rmv excp
2024-03-15 13:08:33 +05:30
Mayur Singal
88ab7475e7
MINOR: Restructure dbServiceName field in dashboard and pipeline (#15548) 2024-03-15 12:42:47 +05:30
Mayur Singal
b643206bba
Fix #11905: Automated lineage between external table and container snowflake (#15537) 2024-03-15 00:52:41 +05:30