2826 Commits

Author SHA1 Message Date
Mayur Singal
0532bb1226
MINOR: Fix external table lineage processing (#15713) 2024-03-27 12:37:20 +05:30
Mayur Singal
8073a80989
Fix #14285: Add column lineage support for tableau datamodels (#15646) 2024-03-27 11:03:40 +05:30
Pere Miquel Brull
9d7bfa363e
MINOR - Clean metadata CLI (#15631)
* Docs

* MINOR - Clean metadata CLI

* remove tests
2024-03-26 16:36:47 +01:00
Ayush Shah
6039fe9462
Fix TypeError Missing arg (#15698) 2024-03-26 18:32:32 +05:30
Mayur Singal
6b90c245d4
MINOR: Add support for json schema parsing for datalake & s3 (#15615) 2024-03-26 10:03:21 +05:30
IceS2
e7c9d6aa7f
FIXES 15215: Implement initial Multithreading approach for the Metadata Ingestion on Databases (#15130)
* Implement Initial MultiThread suggestion

* Update all the ingestion sources to use the new ContextManager

* Fix missing wraps on decorator

* Fix Unittests

* Fix linters

* Fix linters

* Fix BigQuery UnitTests

* Add UnitTests to the newly created code

* Fix unittest

* change the threads from table to schemas

* Update README.md

* Small change suggested by Sonar

* Slight change to test a different way to multithread over tables

* Debug changes

* More multithread tests

* Remove unneeded wait time

* Testing

* refactor code based on removal of time.sleep

* Fix wrong paste

* Improve ExecutionTimeContextManager

* Fix missing .get() and unit tests

* Fix conflicting changes

* Update Multithread logic with the incremental extraction

* Fix linters

* Fix unittest

* Remove commented code

* Fix Unittests

* Fix checkstyle

* Change default to threads = 1
2024-03-25 18:20:40 +01:00
Ayush Shah
00677a1e1b
Fix External Account Json Schema Issue (#15671) 2024-03-23 16:47:55 +05:30
Pere Miquel Brull
a79e79ef3d
#15662 - List All test cases from a table in DQ (#15665)
* #15662 - List All test cases from a table in DQ

* #15662 - List All test cases from a table in DQ

* #15662 - List All test cases from a table in DQ
2024-03-22 11:30:02 +01:00
Ayush Shah
8b880bbf91
Fixes 14370: Add Azure Client, support Default Creds (#15554)
* Add Azure Client, support Default Creds
2024-03-22 14:28:42 +05:30
Mayur Singal
ad28af4f4f
MINOR: Fix sample data upload - binary data error (#15659) 2024-03-22 12:13:26 +05:30
Mayur Singal
2208662886
MINOR: Move external table lineage to post processing (#15633) 2024-03-22 11:46:14 +05:30
Pere Miquel Brull
b778bc7968
#14943 - Check tags before PII processor (#15622) 2024-03-21 14:15:28 +05:30
Imri Paran
7eeb0e45d2
1. add profiler support for GEOMETRY type in redshift. (#15628)
2. Add GEOMETRY to values not to compute.
2024-03-20 13:42:46 +01:00
Ayush Shah
1bb7d893ac
Fix 15419: Improve fetching Oracle Queries for SP (#15621) 2024-03-20 15:58:06 +05:30
Ayush Shah
e06e5c1bdd
Fixes 15544: Histogram not working for more than 15 units (#15617) 2024-03-20 11:35:52 +05:30
Onkar Ravgan
2dd912ab8a
fixed dagster tasks status (#15605) 2024-03-19 18:03:58 +05:30
Mayur Singal
ef61bdc3a8
MINOR: Looker fix bitbucket protocol (#15604) 2024-03-19 10:06:24 +01:00
Matias Puerta
7036a7bb25
Fix typo in Bitbucket URL (#15602) 2024-03-18 16:24:11 -07:00
Mayur Singal
b5fb57f7c6
Fix #15118: Handle exception while processing usage comparisons (#15597) 2024-03-18 17:26:32 +05:30
Mayur Singal
4696e14f6c
MINOR: Add support for databricks external table lineage (#15585) 2024-03-18 16:23:53 +05:30
Pere Miquel Brull
0eb18ad891
MINOR - Clean MSSQL lineage & usage (#15571) 2024-03-18 10:55:29 +01:00
Mayur Singal
104b41cfd2
MINOR: Fix Looker clone repo failure for bitbucket (#15590)
* MINOR: Fix Looker clone repo failure for bitbucket

* pyformat
2024-03-16 15:44:14 +01:00
Trs
4db9b775ea
#14169: Support external_account type for GCP Auth (#14166) 2024-03-16 19:59:02 +05:30
IceS2
51e3d7a466
FIXES 15215: First draft implementation on extracting metadata incrementally. Done for Snowflake, BigQuery and Redshift (#15201)
* Initial incremental implementation for snowflake

* Initial unit test refactor for snowflake

* Fix linter complaints

* Propagate change on abstract create method

* Add missing argument to create

* Polish Snowflake incremental extraction

* Fix linters and make enabled required

* Initial proposal for incremental bigquery extraction

* BigQuery incremental tests

* Remove debugging override

* Fix linters

* Remove unused query

* Initial Redshift Incremental Extraction

* Add Incremental Extraction documentation

* Move the default to False

* Improve code based on sonarcloud input

* Apply suggestions

* Fix wrong path

* Change timestamp to be time aware as per sonar

* Move documentation to 1.4

* Move documentation to 1.4

* Fix linters

---------

Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2024-03-15 14:00:49 +01:00
Onkar Ravgan
46954dc848
Fix #15563: Fixed incorrect col ordering after patch request from ingestion (#15577)
* fixed patch col order

* Added excp handling

* changed logs to warning

* rmv excp
2024-03-15 13:08:33 +05:30
Mayur Singal
88ab7475e7
MINOR: Restructure dbServiceName field in dashboard and pipeline (#15548) 2024-03-15 12:42:47 +05:30
Mayur Singal
ed41f25f18
MINOR: Fix multiline insert query stored procedure lineage (#15578) 2024-03-15 12:40:59 +05:30
Mayur Singal
b643206bba
Fix #11905: Automated lineage between external table and container snowflake (#15537) 2024-03-15 00:52:41 +05:30
Pere Miquel Brull
fd403bae9a
MINOR - Review query performance (#15553)
* MINOR - Review query performance

* MINOR - Review query performance

* MINOR - Review query performance

* MINOR - Review query performance
2024-03-14 06:37:38 +01:00
Mayur Singal
658526e02c
MINOR: Skip source hash generation for service (#15516) 2024-03-13 22:13:19 +05:30
Sriharsha Chintalapani
d0efaac877
Fix #11868: Duplicated queries cannot be created (#15519)
* Fix #11868: Duplicate query should throw an error of entityExists

* Fix #11868: Duplicate query should throw an error of entityExists

* fix test

* fix test

* Fix unique constraint for checksum in Postgres

---------

Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2024-03-13 13:02:26 +01:00
Teddy
93a4453a6f
[MINOR] GX logging hierarchy (#15542)
* fix: GX module logging hierarchy

* style: ran python linting
2024-03-13 07:37:57 +00:00
Mayur Singal
80123b3c0a
Fix #15533: Fix name & display name for kafka json schema parser (#15534) 2024-03-12 23:02:41 +05:30
Sriharsha Chintalapani
8af194193a
Add stack trace while throwing an error to debug (#15522) 2024-03-12 10:07:46 -07:00
Kent Chenery
34b727f6c9
Fixes ISSUE-13473: Ensure MSSQL columns query filters to tables and views (#15530)
* Update queries.py and utils.py to ingest table and view descriptions

* Ensure MSSQL columns query filters to tables and views

* Fix linters

---------

Co-authored-by: Pablo Takara <pjt1991@gmail.com>
2024-03-12 13:56:30 +01:00
Mayur Singal
8a2ee00fe3
Fix #15432: make sample data external storage path configurable (#15478) 2024-03-12 15:02:28 +05:30
mgorsk1
98850ab5cc
feat: OpenLineage integration (#15317)
* 🎉 Init OpenLineage connector

Co-authored-by: dechoma <dominik.choma@gmail.com>

* MLH - make linter happy

* review fixes

* 🐛 Fix path for ol event in tests

* 🐛 Fix path for ol event in tests

* Update ingestion/setup.py

Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>

* Update ingestion/src/metadata/ingestion/source/pipeline/openlineage/metadata.py

Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>

* Update ingestion/src/metadata/ingestion/source/pipeline/openlineage/models.py

Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>

* review fixes 2

* linter

* review

* review

* make linter happy

* fix test_yield_pipeline_lineage_details test

* make linter happy

* fix tests

* fix tests 2

---------

Co-authored-by: dechoma <dominik.choma@gmail.com>
Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>
2024-03-12 08:39:25 +01:00
Teddy
36327c2ee9
FIX #15436 - Add disconnect method for databricks client (#15514)
* fix: implement is_disconnect for databricks client

* style: ran python linting
2024-03-12 07:42:04 +01:00
Imri Paran
aade838020
Fixes #15388: Use native backup tools (#15393)
* feat: use native backup tools

1. added mysqldump 8.3 to the ingestion container.
2. documented how to use native tools to back up and restore.
3. added deprecated message on the cli backup and restore.

* added deprecation notice for 1.3 backup

* removed 1.3.x deprecation notice

* added another backup page in 1.3 introducing SQL dump tools

* added --set-gtid-purged=OFF to the mysql dump process
2024-03-12 06:23:05 +01:00
IceS2
7805a0b609
MINOR: Fix athena e2e tests (#15486)
* Comment side effects

* Update assert to match clauses better

* Improve input

* Improve input

* Update assert to match clauses better

* Fix Athena E2E Values

* Uncomment needed steps

* Fix linters
2024-03-08 09:31:06 +01:00
Imri Paran
f4932ee420
added log levels to all example ingestion configs (#15488)
Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2024-03-07 16:54:10 +00:00
Teddy
ceaf205f59
Fix #15299 - Handle Table metrics & test cases for Empty Tables (#15469)
* fix: add cli support for computePassedFailedRowCount

* fix: div zero error and improve empty table message

* doc: updated test case page

* style: ran python linting
2024-03-07 07:15:22 +01:00
Italo Batista
647287951d
Update test suite workflow example in ingestion folder (#15476) 2024-03-07 07:13:53 +01:00
IceS2
86a2930cfa
Minor: Fix E2E Ingestion Tests (#15462)
* Fix E2E Tests

* Fix E2E Tests

* Update mysql count, schema changes

* Addition to vertica e2e

* Temporary Github Action modification to test

* Fix Redshift round issue post 10 digits

* modify e2e gh file

* fix gh error

* fix matrix syntax

* Fix Redash counts

* Update py-cli-e2e-tests.yml

* Fix Redshift referenced before assignment error

* Revert Py tests e2e

* Modify Elasticsearch configuration

* Modify Elasticsearch configuration

* Update docker-compose.yml

* Test only running the python tests as e2e

* Comment side effects

* Test

* Test

* Fix name

* Add missing shell property

* Add bigquery to e2e

* Uncomment needed step

* test

* test

* test

* test

* Add control ci pipeline

* Add new e2e tests

* test

* fix

* fix

* fix

* Uncomment needed steps

---------

Co-authored-by: Ayush Shah <ayush@getcollate.io>
2024-03-05 16:00:22 +01:00
Antoine Balliet
296d2c9351
fix: get_connection import when using customDashboard (#15447)
* fix: get_connection import when using customDashboard

* linting

* black
2024-03-05 06:36:35 +01:00
Sriharsha Chintalapani
cecbf80a2d
Add Custom Property Config to store format, enum values, entity types (#15302)
* Add Custom Property Config to store format, enum values, entity types

* Fix import statements and remove unused code

* Add Custom Property Config to store format, enum values, entity types

* Add support for enum field type in custom properties

* update name in customPropertyConfigTypeValueField

* add custom property config column in custom property table

* Update padding-left in block-editor.less

* Add enum value translation for multiple languages

* update placeholder of config

* fixed python sdk

* add enum type in property value

* add unit tests

* Add Custom Property Config to store format, enum values, entity types

* update ui to handle the enum config and validation

* Fix enum value handling in EditCustomPropertyModal and PropertyValue

* Update CustomProperty.md with enum values and multi-select option

* add cypress test

* add cypress for multiselect enum value

* Add tests for enum props

* add cypress for editing the enum property

* Add validations to enum

* Fix dependency issue

---------

Co-authored-by: Sachin Chaurasiya <sachinchaurasiyachotey87@gmail.com>
Co-authored-by: Onkar Ravgan <onkar.10r@gmail.com>
2024-02-29 14:36:24 +05:30
Onkar Ravgan
1fc2c7f974
MINOR: Part 1 of #15090: dbt JSON Schema & Parsing Improvements (#15297) 2024-02-29 10:41:21 +05:30
Ayush Shah
35679e5234
Fix bigint out of range issue (#15395) 2024-02-28 22:37:17 +05:30
Ayush Shah
c31bb98e64
Fixes #15355: fix KeyError issue if not present (#15387) 2024-02-28 19:55:03 +05:30
Teddy
3e83bdac3d
ISSUE #14765 - Implement Athena Injected Partition Check (#15318)
* refactor!: change partition metadata structure for table entities

* refactor!: updated json schema for TypeScript code gen

* chore: migration of partition for table entities

* style: python & java linting

* fix: catch injected partition table in Athena

* style: ran python linting
2024-02-28 14:20:59 +00:00