2916 Commits

Author SHA1 Message Date
Mayur Singal
2208662886
MINOR: Move external table lineage to post processing (#15633) 2024-03-22 11:46:14 +05:30
Pere Miquel Brull
b778bc7968
#14943 - Check tags before PII processor (#15622) 2024-03-21 14:15:28 +05:30
Imri Paran
7eeb0e45d2
1. add profiler support for GEOMETRY type in redshift. (#15628)
2. Add GEOMETRY to values not to compute.
2024-03-20 13:42:46 +01:00
Ayush Shah
1bb7d893ac
Fix 15419: Improve fetching Oracle Queries for SP (#15621) 2024-03-20 15:58:06 +05:30
Ayush Shah
e06e5c1bdd
Fixes 15544: Histogram not working for more than 15 units (#15617) 2024-03-20 11:35:52 +05:30
Onkar Ravgan
2dd912ab8a
fixed dagster tasks status (#15605) 2024-03-19 18:03:58 +05:30
Mayur Singal
ef61bdc3a8
MINOR: Looker fix bitbucket protocol (#15604) 2024-03-19 10:06:24 +01:00
Matias Puerta
7036a7bb25
Fix typo in Bitbucket URL (#15602) 2024-03-18 16:24:11 -07:00
Mayur Singal
b5fb57f7c6
Fix #15118: Handle exception while processing usage comparisons (#15597) 2024-03-18 17:26:32 +05:30
Mayur Singal
4696e14f6c
MINOR: Add support for databricks external table lineage (#15585) 2024-03-18 16:23:53 +05:30
Pere Miquel Brull
0eb18ad891
MINOR - Clean MSSQL lineage & usage (#15571) 2024-03-18 10:55:29 +01:00
Mayur Singal
104b41cfd2
MINOR: Fix Looker clone repo failure for bitbucket (#15590)
* MINOR: Fix Looker clone repo failure for bitbucket

* pyformat
2024-03-16 15:44:14 +01:00
Trs
4db9b775ea
#14169: Support external_account type for GCP Auth (#14166) 2024-03-16 19:59:02 +05:30
IceS2
51e3d7a466
FIXES 15215: First draft implementation on extracting metadata incrementally. Done for Snowflake, BigQuery and Redshift (#15201)
* Initial incremental implementation for snowflake

* Initial unit test refactor for snowflake

* Fix linter complaints

* Propagate change on abstract create method

* Add missing argument to create

* Polish Snowflake incremental extraction

* Fix linters and make enabled required

* Initial proposal for incremental bigquery extraction

* BigQuery incremental tests

* Remove debugging override

* Fix linters

* Remove unused query

* Initial Redshift Incremental Extraction

* Add Incremental Extraction documentation

* Move the default to False

* Improve code based on sonarcloud input

* Apply suggestions

* Fix wrong path

* Change timestamp to be time aware as per sonar

* Move documentation to 1.4

* Move documentation to 1.4

* Fix linters

---------

Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2024-03-15 14:00:49 +01:00
Onkar Ravgan
46954dc848
Fix #15563: Fixed incorrect col ordering after patch request from ingestion (#15577)
* fixed patch col order

* Added excp handling

* changed logs to warning

* rmv excp
2024-03-15 13:08:33 +05:30
Mayur Singal
88ab7475e7
MINOR: Restructure dbServiceName field in dashboard and pipeline (#15548) 2024-03-15 12:42:47 +05:30
Mayur Singal
ed41f25f18
MINOR: Fix multiline insert query stored procedure lineage (#15578) 2024-03-15 12:40:59 +05:30
Mayur Singal
b643206bba
Fix #11905: Automated lineage between external table and container snowflake (#15537) 2024-03-15 00:52:41 +05:30
Pere Miquel Brull
fd403bae9a
MINOR - Review query performance (#15553)
* MINOR - Review query performance

* MINOR - Review query performance

* MINOR - Review query performance

* MINOR - Review query performance
2024-03-14 06:37:38 +01:00
Mayur Singal
658526e02c
MINOR: Skip source hash generation for service (#15516) 2024-03-13 22:13:19 +05:30
Sriharsha Chintalapani
d0efaac877
Fix #11868: Duplicated queries cannot be created (#15519)
* Fix #11868: Duplicate query should throw an error of entityExists

* Fix #11868: Duplicate query should throw an error of entityExists

* fix test

* fix test

* Fix unique constraint for checksum in Postgres

---------

Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2024-03-13 13:02:26 +01:00
Teddy
93a4453a6f
[MINOR] GX logging hierarchy (#15542)
* fix: GX module logging hierarchy

* style: ran python linting
2024-03-13 07:37:57 +00:00
Mayur Singal
80123b3c0a
Fix #15533: Fix name & display name for kafka json schema parser (#15534) 2024-03-12 23:02:41 +05:30
Sriharsha Chintalapani
8af194193a
Add stack trace while throwing an error to debug (#15522) 2024-03-12 10:07:46 -07:00
Kent Chenery
34b727f6c9
Fixes ISSUE-13473: Ensure MSSQL columns query filters to tables and views (#15530)
* Update queries.py and utils.py to ingest table and view descriptions

* Ensure MSSQL columns query filters to tables and views

* Fix linters

---------

Co-authored-by: Pablo Takara <pjt1991@gmail.com>
2024-03-12 13:56:30 +01:00
Mayur Singal
8a2ee00fe3
Fix #15432: make sample data external storage path configurable (#15478) 2024-03-12 15:02:28 +05:30
mgorsk1
98850ab5cc
feat: OpenLineage integration (#15317)
* 🎉 Init OpenLineage connector

Co-authored-by: dechoma <dominik.choma@gmail.com>

* MLH - make linter happy

* review fixes

* 🐛 Fix path for ol event in tests

* 🐛 Fix path for ol event in tests

* Update ingestion/setup.py

Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>

* Update ingestion/src/metadata/ingestion/source/pipeline/openlineage/metadata.py

Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>

* Update ingestion/src/metadata/ingestion/source/pipeline/openlineage/models.py

Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>

* review fixes 2

* linter

* review

* review

* make linter happy

* fix test_yield_pipeline_lineage_details test

* make linter happy

* fix tests

* fix tests 2

---------

Co-authored-by: dechoma <dominik.choma@gmail.com>
Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>
2024-03-12 08:39:25 +01:00
Teddy
36327c2ee9
FIX #15436 - Add disconnect method for databricks client (#15514)
* fix: implement is_disconnect for databricks client

* style: ran python linting
2024-03-12 07:42:04 +01:00
Imri Paran
aade838020
Fixes #15388: Use native backup tools (#15393)
* feat: use native backup tools

1. added mysqldump 8.3 to the ingestion container.
2. documented how to use native tools to back up and restore.
3. added deprecated message on the cli backup and restore.

* added deprecation notice for 1.3 backup

* removed 1.3.x deprecation notice

* added another backup page in 1.3 introducing SQL dump tools

* added --set-gtid-purged=OFF to the mysql dump process
2024-03-12 06:23:05 +01:00
IceS2
7805a0b609
MINOR: Fix athena e2e tests (#15486)
* Comment side effects

* Update assert to match clauses better

* Improve input

* Improve input

* Update assert to match clauses better

* Fix Athena E2E Values

* Uncomment needed steps

* Fix linters
2024-03-08 09:31:06 +01:00
Imri Paran
f4932ee420
added log levels to all example ingestion configs (#15488)
Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2024-03-07 16:54:10 +00:00
Teddy
ceaf205f59
Fix #15299 - Handle Table metrics & test cases for Empty Tables (#15469)
* fix: add cli support for computePassedFailedRowCount

* fix: div zero error and improve empty table message

* doc: updated test case page

* style: ran python linting
2024-03-07 07:15:22 +01:00
Italo Batista
647287951d
Update test suite workflow example in ingestion folder (#15476) 2024-03-07 07:13:53 +01:00
IceS2
86a2930cfa
Minor: Fix E2E Ingestion Tests (#15462)
* Fix E2E Tests

* Fix E2E Tests

* Update mysql count, schema changes

* Addition to vertica e2e

* Temporary Github Action modification to test

* Fix Redshift round issue post 10 digits

* modify e2e gh file

* fix gh error

* fix matrix syntax

* Fix Redash counts

* Update py-cli-e2e-tests.yml

* Fix Redshift referenced before assignment error

* Revert Py tests e2e

* Modify Elasticsearch configuration

* Modify Elasticsearch configuration

* Update docker-compose.yml

* Test only running the python tests as e2e

* Comment side effects

* Test

* Test

* Fix name

* Add missing shell property

* Add bigquery to e2e

* Uncomment needed step

* test

* test

* test

* test

* Add control ci pipeline

* Add new e2e tests

* test

* fix

* fix

* fix

* Uncomment needed steps

---------

Co-authored-by: Ayush Shah <ayush@getcollate.io>
2024-03-05 16:00:22 +01:00
Antoine Balliet
296d2c9351
fix: get_connection import when using customDashboard (#15447)
* fix: get_connection import when using customDashboard

* linting

* black
2024-03-05 06:36:35 +01:00
Sriharsha Chintalapani
cecbf80a2d
Add Custom Property Config to store format, enum values, entity types (#15302)
* Add Custom Property Config to store format, enum values, entity types

* Fix import statements and remove unused code

* Add Custom Property Config to store format, enum values, entity types

* Add support for enum field type in custom properties

* update name in customPropertyConfigTypeValueField

* add custom property config column in custom property table

* Update padding-left in block-editor.less

* Add enum value translation for multiple languages

* update placeholder of config

* fixed python sdk

* add enum type in property value

* add unit tests

* Add Custom Property Config to store format, enum values, entity types

* update ui to handle the enum config and validation

* Fix enum value handling in EditCustomPropertyModal and PropertyValue

* Update CustomProperty.md with enum values and multi-select option

* add cypress test

* add cypress for multiselect enum value

* Add tests for enum props

* add cypress for editing the enum property

* Add validations to enum

* Fix dependency issue

---------

Co-authored-by: Sachin Chaurasiya <sachinchaurasiyachotey87@gmail.com>
Co-authored-by: Onkar Ravgan <onkar.10r@gmail.com>
2024-02-29 14:36:24 +05:30
Onkar Ravgan
1fc2c7f974
MINOR: Part 1 of #15090: dbt JSON Schema & Parsing Improvements (#15297) 2024-02-29 10:41:21 +05:30
Ayush Shah
35679e5234
Fix bigint out of range issue (#15395) 2024-02-28 22:37:17 +05:30
Ayush Shah
c31bb98e64
Fixes #15355: fix KeyError issue if not present (#15387) 2024-02-28 19:55:03 +05:30
Teddy
3e83bdac3d
ISSUE #14765 - Implement Athena Injected Partition Check (#15318)
* refactor!: change partition metadata structure for table entities

* refactor!: updated json schema for TypeScript code gen

* chore: migration of partition for table entities

* style: python & java linting

* fix: catch injected partition table in Athena

* style: ran python linting
2024-02-28 14:20:59 +00:00
Imri Paran
c61ab94ff4
Update conn_test.py (#15385) 2024-02-28 14:21:30 +01:00
IceS2
418e281daa
Fixes 15375: Metabase metadata extraction fix (#15376) 2024-02-28 13:23:53 +05:30
Mayur Singal
8571ab87e8
MINOR: Fix mongodb profiler imports (#15383) 2024-02-28 13:05:01 +05:30
Teddy
056e6368d0
Issue #14765 - Preparatory Work (#15312)
* refactor!: change partition metadata structure for table entities

* refactor!: updated json schema for TypeScript code gen

* chore: migration of partition for table entities

* style: python & java linting

* updated ui side change for table partitioned key

* minor fix

* addressing comments

* fixed ci error

---------

Co-authored-by: Shailesh Parmar <shailesh.parmar.webdev@gmail.com>
2024-02-28 07:11:00 +01:00
Mayur Singal
1987a161f3
MINOR: Fix oracle lineage query (#15372) 2024-02-27 15:34:34 +00:00
Imri Paran
f6e1f0d9c0
Fixes #15366: add profiler_data_time_series to TABLES_DUMP_ALL (#15369)
* add profiler_data_time_series to TABLES_DUMP_ALL
2024-02-27 15:52:53 +01:00
Mayur Singal
d2879ae232
MINOR: Improve Databricks Tags Ingestion (#15248) 2024-02-27 16:57:11 +05:30
Imri Paran
50b2709e94
MINOR: Mongodb column profile (#15252)
* feat(nosql-profiler): row count

1. Implemented the NoSQLProfilerInterface as an entrypoint for the nosql profiler.
2. Added the NoSQLMetric as an abstract class.
3. Implemented the interface for the MongoDB database source.
4. Implemented an e2e test using testcontainers.
2024-02-26 07:38:38 +01:00
Teddy
16fdc249b7
fix: pin pandas version to 2.1.x (#15333) 2024-02-24 23:12:22 +05:30
Teddy
ba8208222e
MINOR - Fix column to match set test (#15186)
* fix: column value test for SQA types

* style: ran python linting
2024-02-23 16:35:58 +01:00