984 Commits

Author SHA1 Message Date
Teddy
06735fe8db
Fixes Issue #11863 - Add Status logic for test case results (#11881)
* feat: added entityReference field in testSuite to link testSuite to an entity when the testSuite is executable.

* feat: added `executableEntityReference` as an entity reference for executable test suite to their entity

* feat: add status object to test case results

* feat: ran python linting
2023-06-06 09:45:49 +02:00
Ayush Shah
65f370e4aa
Rename GCS to GCP (#11812) 2023-06-06 11:57:00 +05:30
Teddy
d0cffdcd66
Fixes Issue #11438 - Implement threshold and strategy for custom SQL (#11847)
* feat: Add threshold and strategy logic on the custom SQL object test

* feat: ran python linting

* feat: added safety checks for custom sql query

* feat: ran python linting
2023-06-02 09:41:31 +02:00
Teddy
c98a15ca19
Fixes #11705 - Update ingestion and backend to match new DQ flow (#11836)
* feat: refactor ingestion flow logic

* feat: ran python linting

* feat: update tests to match new workflow

* feat: ran python linting

* feat: update sample data test suite name

* feat: Added backend logic to support logical and executable test suites

* feat: clean up java and json code

* feat: added sample data for logical and executable test suites

* feat: remove executable from CreateTestSuite

* feat: ran python and java linting

* feat: added README info for data quality structure

* skipping cypress to keep main green

* fixed typescript type issue

---------

Co-authored-by: Shailesh Parmar <shailesh.parmar.webdev@gmail.com>
2023-06-01 23:19:13 -07:00
Pere Miquel Brull
fdeea71671
Fix Looker explore git link & Add BitBucket reader (#11837)
* Add looker test connection step

* Add looker test connection step

* Update Credentials

* Fix explore link and add bitbucket reader

* Format

* Fix test

* Fix spline linting

* Fix import
2023-06-02 07:19:32 +02:00
Pere Miquel Brull
0a8b28ab7e
Fix redshift e2e with new lineage info (#11843)
* Fix redshift e2e

* Format

* Format
2023-06-01 09:44:27 +02:00
Pere Miquel Brull
3966238703
Remove e2e system metrics test (#11838) 2023-05-31 18:02:17 +02:00
Mayur Singal
b57bbf833f
Fix #11572: Glue Support Partition Columns & Use Pydantic Models (#11776) 2023-05-31 12:03:34 +00:00
Pere Miquel Brull
158250353b
Fix hive e2e tests (#11815)
* Fix hive e2e tests

* Fix hive e2e tests
2023-05-30 17:15:55 +02:00
Pere Miquel Brull
a00fc7fef3
Increase wait time for system metrics e2e test (#11804) 2023-05-30 10:45:14 +02:00
Chirag Madlani
7adc291364
fix(ui): circular deps for entityReference.json (#11760)
* fix(ui): circular deps for entityReference.json

* Fix circular Dependency python

* Cap Delta Spark version

---------

Co-authored-by: Ayush Shah <ayush@getcollate.io>
2023-05-26 18:02:21 +05:30
Sriharsha Chintalapani
6509a3670a
Fix #11664: Refactor patch_mixin to use jsonpatch lib (#11696)
* Fix #11664: Refactor patch_mixin to use jsonpatch lib

* Migrate to jsonpatch

* Fix nested cols

* Format

* Update patch_description

* Table constraints

* tag

* owner

* column tag

* column desc

* Format

* Format

* Fix log

* Update dbt patch

* Update column fqn

* Fix test

* Fix tests

---------

Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2023-05-23 15:47:11 +02:00
Onkar Ravgan
efb37fa7af
Fixed tableau e2e (#11716) 2023-05-23 15:36:08 +05:30
Teddy
9f2a02c718
fix: increase sleep delay for system tables (#11704) 2023-05-22 13:43:50 +00:00
Teddy
8c50d1af52
Fixes #4565 - Fetch Metrics from System tables (#11645)
* feat: fetch metrics from system tables

* feat: add permission doc for fetching metrics from system tables

* feat: fix E2E tests to reflect full table row count after table metric update

* feat: ran linting

* feat: fix doc string engine name + function typing

* feat: ran python linting
2023-05-22 09:04:18 +02:00
Teddy
ddbc7fe14d
Fixes #11570 - Add support for BQ Multi-project Profiler (#11692)
* fix: extracted profiler object from workflow and implemented factory to allow service base logic

* fix: ran python linting

* fix: renamed `base` to `base_profiler_source`

* fix: add logic to set correct database for BQ multi project ID connections

* fix: ran python linting
2023-05-20 14:22:53 -07:00
Pere Miquel Brull
0eb2201f94
Restructure NER Scanner internals (#11690)
* Simplify col name scanner

* Restructure NER Scanner internals
2023-05-19 18:21:01 +02:00
Ayush Shah
ad7258e7be
Fixes 10949: return Chunks for file formats & Centralize logic for different auth configs (#11639)
* Centralize Auth and File formats datalake
2023-05-19 18:54:28 +05:30
Pere Miquel Brull
d52d773707
Send encrypted automation workflow (#11681) 2023-05-19 15:04:42 +02:00
Mayur Singal
e9992a52a8
Fix #1604: Add Spline Pipeline Connector (#11562)
* Fix #1604: Add Spline Connector

* Add tests & grammar validation

* Spline UI Changes & Docs

* fix pipeline workflow doc

* chore: use common field for dbService name

* chore: use const for beta services

* chore: add service icon

* Update ingestion/src/metadata/ingestion/source/pipeline/spline/metadata.py

Co-authored-by: Onkar Ravgan <onkar.10r@gmail.com>

---------

Co-authored-by: Sachin Chaurasiya <sachinchaurasiyachotey87@gmail.com>
Co-authored-by: Sriharsha Chintalapani <harshach@users.noreply.github.com>
Co-authored-by: Onkar Ravgan <onkar.10r@gmail.com>
2023-05-19 14:46:32 +05:30
Pere Miquel Brull
50ad38ea0f
Fix #11548 - Secrets Managers comms with OMeta (#11602)
* Remove secretsManagerCredentials from backend

* Remove secretsManagerCredentials from backend

* Add secrets manager loader

* Load SM in the ometa client

* Fix tests

* Fix tests

* Fix Lint

* Mock AWS region

---------

Co-authored-by: Ayush Shah <ayush@getcollate.io>
2023-05-19 09:43:11 +02:00
Pere Miquel Brull
4626363fd8
Fix parsing for Storage (#11663) 2023-05-19 09:36:44 +02:00
Pere Miquel Brull
8795337f88
Clean NER Scanner imports (#11653) 2023-05-18 12:53:22 +02:00
Mayur Singal
e4997c3749
Fix #11571: Support custom database name for glue (#11631) 2023-05-18 14:16:56 +05:30
Pere Miquel Brull
1b90badd0e
Restructure PII processor (#11640)
* Restructure PII processor

* Restructure PII processor

* Format
2023-05-17 15:58:17 +02:00
Mayur Singal
b53a362772
Increase delay for system metric in e2e tests (#11607) 2023-05-16 06:57:36 +00:00
Onkar Ravgan
3d9d4416b7
Fixed incompatible column name for Postgres version 11.6 (#11536)
* postgres col name on version

* Added dependency

* Added parenthesis validation

* review comments and tests
2023-05-15 11:48:03 +05:30
Mayur Singal
e61bb3cf0d
Fix snowflake E2E: add system metric sleep time (#11569) 2023-05-12 14:09:57 +05:30
Onkar Ravgan
cff403a05a
Validate if tags are created before attaching them to CreateRequest (#11554)
* Added tags validation

* typo fixed
2023-05-11 16:04:55 +00:00
Pere Miquel Brull
f22d604c54
Remove old tests (#11505)
Co-authored-by: Shailesh Parmar <shailesh.parmar.webdev@gmail.com>
2023-05-11 10:29:30 +02:00
Mayur Singal
f7a0d3f5f2
Fix E2E Vertica & System Metric (#11525) 2023-05-10 17:35:55 +05:30
Teddy
60de33d7cf
Fixes #11384 - Implement mem. optimization for sys. metrics (#11460)
* fix: optimize system metrics retrieval for memory

* fix: ran python linting

* fix: logic to retrieve unique system metrics operations

* fix: added logic to clean up query before parsing it

* fix: added E2E tests for rds, bq, snflk system metrics

* fix: ran python linting

* fix: fix postgres query + add default byte size to env var

* fix: ran python linting
2023-05-09 12:05:35 +02:00
Ayush Shah
2c9ba537eb
Fix min max on rowversion/timestamp mssql (#11455) 2023-05-08 14:52:53 +05:30
Keith Sirmons
65c5b44eaa
Impala Connection Profiler is_nan rollback; Histogram fix. (#11388) 2023-05-05 21:45:30 +02:00
Nahuel
3ba29e7f0e
Fix: Redshift E2E tests (#11396) 2023-05-03 08:12:32 +05:30
Teddy
f8c667b504
Fix median for concatenable types (#11382)
* fix: median/fq/tq for concatenable types

* fix: ran linting
2023-05-02 10:45:26 +00:00
Keith Sirmons
ad9b5a0cb5
Impalaconnection 0.2.1 + string datatypes enabled in profile (#11364)
* updated metadata to work with the impala query engine.
Uses the describe function to grab column names, data types, and comments.

* added the ordinalPosition data point into the Column constructor.

* renamed variable to better describe its usage.

* updated profile errors.
Hive connections now comment columns by default.

* removed print statements

* Cleaned up code by pulling check into its own function

* Updated median function to return null when it is being used for first and third quartiles.

* removed print statements and ran make py_format

* updated to fix some pylint errors.
imported Dialects to remove string compare to "impala" engine

* moved huge comment into function docstring.
This comment shows us the sql to get quartiles in Impala

* added cast to decimal for column when running average in mean.py

* fixed lint error

* fixed ui ordering of precision and scale.
Precision should be ordered in front of scale since the precision is set first in decimal data types

* Fixed overflow error when converting large numbers to bigint

Fixed error for CHAR datatype missing.

* Fixed NaN issues with Impala Profile

* py formatting

* Fixed warnings from SqlAlchemy
  The GenericFunction 'max' is already registered and is going to be overridden.
  The GenericFunction 'min' is already registered and is going to be overridden.

Updated Min/Max to handle strings by getting their length.

* Updated profiler to handle strings by using the string length as the parameter to compute the profile

* py_format updates

* fix: ran linting

* fix: Mysql hardcoded table alias

---------

Co-authored-by: Chirag Madlani <12962843+chirag-madlani@users.noreply.github.com>
Co-authored-by: Teddy Crepineau <teddy.crepineau@gmail.com>
2023-04-30 10:03:56 +02:00
Ayush Shah
f7168db8ea
Add Quicksight AWS support (#11294) 2023-04-27 11:39:47 +05:30
Teddy
0930bc307a
fix: change in entityLink to string in CreateTestCaseRequest (#11291) 2023-04-26 10:52:09 +00:00
Nahuel
bcdab5e30a
Fix: Tableau E2E wrong expected values (#11290) 2023-04-26 13:44:17 +05:30
Ayush Shah
dd509681be
Fixes tableau, add quicksight e2e (#11177) 2023-04-26 10:22:08 +05:30
Teddy
afce5fa61b
Fix E2E tests (#11267)
* fix: profile only include schema

* tests: add logic to handle existing views and table for Hive

* fix: python linting
2023-04-25 16:05:49 +02:00
Ayush Shah
efd82113ec
Fix E2E tests (#11226) 2023-04-25 10:11:06 +05:30
Teddy
017fbc6a32
fix: logic for number of profiled tables (#11222)
* fix: logic for number of profiled tables

* fix: python linting
2023-04-24 08:00:25 +02:00
Pere Miquel Brull
d3d523e96d
Ingestion md docs review (#11219)
* Update workflow docs

* Remove duplicate key

* Update Custom connector docs

* Update Domo connector docs

* Dashboard docs updates

* Some databases docs updates

* Finish db docs updates

* Remove Pulsar

* Messaging docs

* Metadata docs

* ML docs

* S3 docs

* Fix rendering

* Update title and description of the databaseSchema

* Pipeline Service docs

* remove pulsar from tests

* Format

* Fix test

* Remove pulsar

* Remove pulsar
2023-04-23 18:43:46 +02:00
Mayur Singal
da2f03ca50
Salesforce docs & remove unnecessary fields (#11207) 2023-04-22 18:32:32 +02:00
Nahuel
ed1388827e
Doc: Add ElasticsearchReindex and Data Insight docs in UI (#11201) 2023-04-21 11:34:55 -07:00
Teddy
6e129c1e65
Issue 10805 Added Hive e2e (#11197)
* tests: Added E2E test for Hive + fix minor bug

* tests: ran python linting
2023-04-21 15:45:12 +00:00
Ayush Shah
a50c31539b
Fix HexByteString Issue, revert datatype change (#11145)
* Fix HexByteString Issue, revert datatype change

* Add E2E MSSQL Bit type
2023-04-21 10:08:27 +02:00
Keith Sirmons
97b58c65f5
Impalaconnection (#11151)
* updated metadata to work with the impala query engine.
Uses the describe function to grab column names, data types, and comments.

* added the ordinalPosition data point into the Column constructor.

* renamed variable to better describe its usage.

* updated profile errors.
Hive connections now comment columns by default.

* removed print statements

* Cleaned up code by pulling check into its own function

* Updated median function to return null when it is being used for first and third quartiles.

* removed print statements and ran make py_format

* updated to fix some pylint errors.
imported Dialects to remove string compare to "impala" engine

* moved huge comment into function docstring.
This comment shows us the sql to get quartiles in Impala

* added cast to decimal for column when running average in mean.py

* fixed lint error

* fixed ui ordering of precision and scale.
Precision should be ordered in front of scale since the precision is set first in decimal data types

* first pass for impala connector

* updated default auth_mechanism to be one of the enum values.

* updated UI documentation to match fields for the impalaConnection.

refined impalaConnection to bring use_ssl to a boolean instead of relying on an extra connection option being manually added.

Removed reference to hive for type mapping

added impala to the pip setup

* py_format updates

* removed print statement

* Lints and fixes

* Updated database documentation to follow new style

* Flag as BETA

* Remove tests

---------

Co-authored-by: Chirag Madlani <12962843+chirag-madlani@users.noreply.github.com>
Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2023-04-21 09:57:13 +02:00