843 Commits

Author SHA1 Message Date
Ayush Shah
d5b1465406
Fixes #14113 - Allow SSL file uploads (#15828) 2024-04-19 11:38:27 +05:30
Imri Paran
47f0d99333
MINOR: add raise_from_status for sql_server test (#15931)
* Update test_metadata_ingestion.py

* Update test_metadata_ingestion.py

* fixed import
2024-04-17 14:52:10 +02:00
Imri Paran
29cd58b628
MINOR: added integration test for SQL SERVER (#15919)
* adventure works mssql test case

* adventure works mssql test case

* fixed tests

* fixed tests

* fixed tests

* fixed tests
2024-04-17 12:19:37 +02:00
Ayush Shah
0c3e580592
MINOR: Fix Entity Link Error (#15864) 2024-04-11 18:41:59 +05:30
Pere Miquel Brull
a1404e6b4a
MINOR - Clean ingestion dependencies (#15679)
* WIP - MINOR - Clean ingestion dependencies

* test

* test

* Clean imports

* add pyiceberg for test

* Revert "add pyiceberg for test"

This reverts commit ab26942736586f089a57a644ffd727aca200db62.

* add pyiceberg for test

* Remove docker dep

* clean local docker sh

* MINOR - AKS Airflow troubleshooting docs

* Fix action

* clean local docker sh
2024-04-11 14:30:40 +02:00
harshsoni2024
c671e64f69
[MINOR] Fix tableau e2e (#15824) 2024-04-08 21:17:37 +05:30
IceS2
c909ff8857
MINOR: Fix e2e tests (#15829)
* Update values

* Update values

* Fix checkstyle
2024-04-08 15:58:32 +02:00
IceS2
12a4c578a2
MINOR: Fix jsonpatch operation order (#15680)
* Maintain the OperationType Order when considering the dividing groups

* Remove reordering the jsonpatch operations from the backend

* Fix checkstyle

* Fix UnitTests to comply with no reordering

* Initial idea on how to fix our current jsonpatch builder from python

* fix(JsonUtils): Change JSONPatch library used

When creating a JSONPatch by using the 'createDiff' method, the library
we are using is not returning a correct JSONPatch when removing multiple
items from an array.

Since the library doesn't provide good ways to override this behavior
and fix it, we decided to move away from it and use the json-patch
library only for this specific operation.

* Fix linters

* Add docstrings

* Refactor patch updated on ingestion framework

* Add UnitTests

* Fix linters
2024-04-05 15:52:01 +02:00
Suman Maharana
16eaf925e9
FIX #13553 Added option to exclude drafts: superset ingestion (#15770)
* Added option to exclude drafts: superset ingestion

* Updated superset yaml docs

* Added tests for exclude draft dashboards

* Added tests for exclude draft dashboards

* Formatted queries.py
2024-04-03 17:07:02 +05:30
Ayush Shah
b79e5c064b
Fix 15576 - Eval Data Type issue fix (#15702) 2024-04-03 15:51:19 +05:30
Teddy
205850be79
[MINOR] fix antlr parser definition for entity link (#15758)
* fix: update antlr regex for entity fqn

* fix: update antlr rule to allow single character

* style: ran python linting

* fix: updated antlr token for NAME_OR_FQN
2024-04-03 08:34:43 +00:00
harshsoni2024
feb33a0cc2
Fix #12964: Qlik Sense & Qlik Cloud filter draft dashboards (#15726)
* Fix #12964: filter draft dashboards from config

* Fix #12964: add unit test for qlik_sense

* Fix #12964: added UI and doc code

* Fix #12964: move includedraftdashboard flag from source_connection to source_config

* Fix #12964: filter draft dashboards in qlikcloud

* Fix #12964: add unit test for qlik cloud

* Fix #12964: remove unnecessary comments, code clean

* Fix #12964: pylint changes
2024-04-02 14:30:33 +02:00
Pere Miquel Brull
890820ed92
MINOR - App routes & datamodel (#15722)
* MINOR - App routes & datamodel

* fix future annotations

* fix future annotations
2024-03-27 19:12:24 +01:00
Pere Miquel Brull
9d7bfa363e
MINOR - Clean metadata CLI (#15631)
* Docs

* MINOR - Clean metadata CLI

* remove tests
2024-03-26 16:36:47 +01:00
Mayur Singal
6b90c245d4
MINOR: Add support for json schema parsing for datalake & s3 (#15615) 2024-03-26 10:03:21 +05:30
IceS2
e7c9d6aa7f
FIXES 15215: Implement initial Multithreading approach for the Metadata Ingestion on Databases (#15130)
* Implement Initial MultiThread suggestion

* Update all the ingestion sources to use the new ContextManager

* Fix missing wraps on decorator

* Fix Unittests

* Fix linters

* Fix linters

* Fix BigQuery UnitTests

* Add UnitTests to the newly created code

* Fix unittest

* change the threads from table to schemas

* Update README.md

* Small change suggested by Sonar

* Slight change to test a different way to multithread over tables

* Debug changes

* More multithread tests

* Remove unneeded wait time

* Testing

* refactor code based on removal of time.sleep

* Fix wrong paste

* Improve ExecutionTimeContextManager

* Fix missing .get() and unit tests

* Fix conflicting changes

* Update Multithread logic with the incremental extraction

* Fix linters

* Fix unittest

* Remove commented code

* Fix Unittests

* Fix checkstyle

* Change default to threads = 1
2024-03-25 18:20:40 +01:00
Ayush Shah
00677a1e1b
Fix External Account Json Schema Issue (#15671) 2024-03-23 16:47:55 +05:30
Ayush Shah
8b880bbf91
Fixes 14370: Add Azure Client, support Default Creds (#15554)
* Add Azure Client, support Default Creds
2024-03-22 14:28:42 +05:30
Ayush Shah
1bb7d893ac
Fix 15419: Improve fetching Oracle Queries for SP (#15621) 2024-03-20 15:58:06 +05:30
Ayush Shah
e06e5c1bdd
Fixes 15544: Histogram not working for more than 15 units (#15617) 2024-03-20 11:35:52 +05:30
Trs
4db9b775ea
#14169: Support external_account type for GCP Auth (#14166) 2024-03-16 19:59:02 +05:30
IceS2
51e3d7a466
FIXES 15215: First draft implementation on extracting metadata incrementally. Done for Snowflake, BigQuery and Redshift (#15201)
* Initial incremental implementation for snowflake

* Initial unit test refactor for snowflake

* Fix linter complaints

* Propagate change on abstract create method

* Add missing argument to create

* Polish Snowflake incremental extraction

* Fix linters and make enabled required

* Initial proposal for incremental bigquery extraction

* BigQuery incremental tests

* Remove debugging override

* Fix linters

* Remove unused query

* Initial Redshift Incremental Extraction

* Add Incremental Extraction documentation

* Move the default to False

* Improve code based on sonarcloud input

* Apply suggestions

* Fix wrong path

* Change timestamp to be time aware as per sonar

* Move documentation to 1.4

* Move documentation to 1.4

* Fix linters

---------

Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2024-03-15 14:00:49 +01:00
Onkar Ravgan
46954dc848
Fix #15563: Fixed incorrect col ordering after patch request from ingestion (#15577)
* fixed patch col order

* Added excp handling

* changed logs to warning

* rmv excp
2024-03-15 13:08:33 +05:30
Mayur Singal
88ab7475e7
MINOR: Restructure dbServiceName field in dashboard and pipeline (#15548) 2024-03-15 12:42:47 +05:30
Mayur Singal
b643206bba
Fix #11905: Automated lineage between external table and container snowflake (#15537) 2024-03-15 00:52:41 +05:30
Sriharsha Chintalapani
d0efaac877
Fix #11868: Duplicated queries cannot be created (#15519)
* Fix #11868: Duplicate query should throw an error of entityExists

* Fix #11868: Duplicate query should throw an error of entityExists

* fix test

* fix test

* Fix unique constraint for checksum in Postgres

---------

Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2024-03-13 13:02:26 +01:00
Mayur Singal
80123b3c0a
Fix #15533: Fix name & display name for kafka json schema parser (#15534) 2024-03-12 23:02:41 +05:30
mgorsk1
98850ab5cc
feat: OpenLineage integration (#15317)
* 🎉 Init OpenLineage connector

Co-authored-by: dechoma <dominik.choma@gmail.com>

* MLH - make linter happy

* review fixes

* 🐛 Fix path for ol event in tests

* 🐛 Fix path for ol event in tests

* Update ingestion/setup.py

Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>

* Update ingestion/src/metadata/ingestion/source/pipeline/openlineage/metadata.py

Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>

* Update ingestion/src/metadata/ingestion/source/pipeline/openlineage/models.py

Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>

* review fixes 2

* linter

* review

* review

* make linter happy

* fix test_yield_pipeline_lineage_details test

* make linter happy

* fix tests

* fix tests 2

---------

Co-authored-by: dechoma <dominik.choma@gmail.com>
Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>
2024-03-12 08:39:25 +01:00
IceS2
7805a0b609
MINOR: Fix athena e2e tests (#15486)
* Comment side effects

* Update assert to match clauses better

* Improve input

* Improve input

* Update assert to match clauses better

* Fix Athena E2E Values

* Uncomment needed steps

* Fix linters
2024-03-08 09:31:06 +01:00
Teddy
ceaf205f59
Fix #15299 - Handle Table metrics & test cases for Empty Tables (#15469)
* fix: add cli support for computePassedFailedRowCount

* fix: div zero error and improve empty table message

* doc: updated test case page

* style: ran python linting
2024-03-07 07:15:22 +01:00
IceS2
86a2930cfa
Minor: Fix E2E Ingestion Tests (#15462)
* Fix E2E Tests

* Fix E2E Tests

* Update mysql count, schema changes

* Addition to vertica e2e

* Temporary Github Action modification to test

* Fix Redshift round issue post 10 digits

* modify e2e gh file

* fix gh error

* fix matrix syntax

* Fix Redash counts

* Update py-cli-e2e-tests.yml

* Fix Redshift referenced before assignment error

* Revert Py tests e2e

* Modify Elasticsearch configuration

* Modify Elasticsearch configuration

* Update docker-compose.yml

* Test only running the python tests as e2e

* Comment side effects

* Test

* Test

* Fix name

* Add missing shell property

* Add bigquery to e2e

* Uncomment needed step

* test

* test

* test

* test

* Add control ci pipeline

* Add new e2e tests

* test

* fix

* fix

* fix

* Uncomment needed steps

---------

Co-authored-by: Ayush Shah <ayush@getcollate.io>
2024-03-05 16:00:22 +01:00
Sriharsha Chintalapani
cecbf80a2d
Add Custom Property Config to store format, enum values, entity types (#15302)
* Add Custom Property Config to store format, enum values, entity types

* Fix import statements and remove unused code

* Add Custom Property Config to store format, enum values, entity types

* Add support for enum field type in custom properties

* update name in customPropertyConfigTypeValueField

* add custom property config column in custom property table

* Update padding-left in block-editor.less

* Add enum value translation for multiple languages

* update placeholder of config

* fixed python sdk

* add enum type in property value

* add unit tests

* Add Custom Property Config to store format, enum values, entity types

* update ui to handle the enum config and validation

* Fix enum value handling in EditCustomPropertyModal and PropertyValue

* Update CustomProperty.md with enum values and multi-select option

* add cypress test

* add cypress for multiselect enum value

* Add tests for enum props

* add cypress for editing the enum property

* Add validations to enum

* Fix dependency issue

---------

Co-authored-by: Sachin Chaurasiya <sachinchaurasiyachotey87@gmail.com>
Co-authored-by: Onkar Ravgan <onkar.10r@gmail.com>
2024-02-29 14:36:24 +05:30
Onkar Ravgan
1fc2c7f974
MINOR: Part 1 of #15090: dbt JSON Schema & Parsing Improvements (#15297) 2024-02-29 10:41:21 +05:30
Teddy
3e83bdac3d
ISSUE #14765 - Implement Athena Injected Partition Check (#15318)
* refactor!: change partition metadata structure for table entities

* refactor!: updated json schema for TypeScript code gen

* chore: migration of partition for table entities

* style: python & java linting

* fix: catch injected partition table in Athena

* style: ran python linting
2024-02-28 14:20:59 +00:00
IceS2
418e281daa
Fixes 15375: Metabase metadata extraction fix (#15376) 2024-02-28 13:23:53 +05:30
Teddy
056e6368d0
Issue #14765 - Preparatory Work (#15312)
* refactor!: change partition metadata structure for table entities

* refactor!: updated json schema for TypeScript code gen

* chore: migration of partition for table entities

* style: python & java linting

* updated ui side change for table partitioned key

* minor fix

* addressing comments

* fixed ci error

---------

Co-authored-by: Shailesh Parmar <shailesh.parmar.webdev@gmail.com>
2024-02-28 07:11:00 +01:00
Imri Paran
50b2709e94
MINOR: Mongodb column profile (#15252)
* feat(nosql-profiler): row count

1. Implemented the NoSQLProfilerInterface as an entrypoint for the nosql profiler.
2. Added the NoSQLMetric as an abstract class.
3. Implemented the interface for the MongoDB database source.
4. Implemented an e2e test using testcontainers.
2024-02-26 07:38:38 +01:00
Imri Paran
bdf27458e5
MINOR: modified nosql factory to not use pymongo (#15316) 2024-02-23 16:48:59 +05:30
Imri Paran
ff2ecc56f2
MINOR: add MongoDB sample data (#15237)
* feat(nosql-profiler): row count

1. Implemented the NoSQLProfilerInterface as an entrypoint for the nosql profiler.
2. Added the NoSQLMetric as an abstract class.
3. Implemented the interface for the MongoDB database source.
4. Implemented an e2e test using testcontainers.

* added profiler support for mongodb connection

* doc

* use int_admin_ometa in test setup

* - fixed linting issue in gx
- removed unused inheritance

* moved the nosql function into the metric class

* formatting

* validate_compose: raise exception for bad status code.

* fixed import

* format

* feat(nosql-profiler): added sample data

1. Implemented the NoSQL sampler.
2. Some naming changes to the NoSQL adaptor to avoid fixing names with the profiler interface.
3. Tests.

* added default sample limit
2024-02-22 16:31:58 +01:00
Imri Paran
4967e091e6
MINOR: add test for sqla compiler (#15275)
* add test for sqla compiler
2024-02-22 14:45:47 +01:00
Imri Paran
18c22c4178
Fixes #10013: Implement first stage of NoSQL profiler (#15189)
* feat(nosql-profiler): row count

1. Implemented the NoSQLProfilerInterface as an entrypoint for the nosql profiler.
2. Added the NoSQLMetric as an abstract class.
3. Implemented the interface for the MongoDB database source.
4. Implemented an e2e test using testcontainers.

* added profiler support for mongodb connection

* doc

* use int_admin_ometa in test setup

* - fixed linting issue in gx
- removed unused inheritance

* moved the nosql function into the metric class

* formatting

* validate_compose: raise exception for bad status code.

* fixed import

* format
2024-02-22 11:46:19 +01:00
Onkar Ravgan
dfc7662449
Fix #15247: Fixed ingesting dbt owners with dot in name (#15261)
* Fixed dbt owners with dot

* fixed pytests

* Fixed pytest2

* rmv type
2024-02-20 16:06:54 +05:30
Pere Miquel Brull
62c0cc7563
#13985 - Azure KV Secrets Manager (#15192)
* #13985 - Azure KV Secrets Manager

* Format

* #13985 - Azure KV Secrets Manager

* #13985 - Azure KV Secrets Manager

* Simplify credentials loading

* Simplify credentials loading

* Simplify credentials loading
2024-02-20 07:18:35 +01:00
Mayur Singal
dbb888d962
MINOR: Fix CLI E2E Tests (#15253) 2024-02-19 23:04:44 +05:30
Onkar Ravgan
cdbcea11f6
fixed e2e counts (#15171) 2024-02-14 06:33:17 +01:00
Imri Paran
aeb5fbe303
fixes #12591: add BigTable (#15122)
* feat(connector): add BigTable

* bigtable work

1. docstrings
2. tests
3. created a Row BaseModel
4. implemented a ClassConverter

* docs moved to separate PR

* format files

* minor cosmetic

- removed TODO
- changed headers' year to 2024 for new files
- fixed typos

* format

* formatting and comments

1. added missing docstrings.
2. abstracted the _find_instance method.
3. aliased the IDs used in the BigTable connection

* added comment regarding private key

* added comments regarding column families

* enclose get_schema_name_list in `try/except/else`

* format

* streamlined get_schema_name_list to include all logic in the try block
2024-02-13 08:28:01 +01:00
NiharDoshi99
2b56e34b19
#14930 bigquery support for pk, fk and column view description (#15042) 2024-02-07 16:49:27 +05:30
Mayur Singal
331c687625
MINOR: Fix mysql e2e count (#15064) 2024-02-06 18:08:12 +00:00
Onkar Ravgan
edb9c21bfd
Added /view to tableau dashboard url (#15031) 2024-02-05 20:18:02 +05:30
Mayur Singal
a9fc51ec8b
MINOR: Change sqllineage import to collate_sqllineage (#14870) 2024-02-05 19:44:08 +05:30