346 Commits

Author SHA1 Message Date
Teddy
c98a15ca19
Fixes #11705 - Update ingestion and backend to match new DQ flow (#11836)
* feat: refactor ingestion flow logic

* feat: ran python linting

* feat: update tests to match new workflow

* feat: ran python linting

* feat: update sample data test suite name

* feat: Added backend logic to support logical and executable test suites

* feat: clean up java and json code

* feat: added sample data for logical and executable test suites

* feat: remove executable from CreateTestSuite

* feat: ran python and java linting

* feat: added README info for data quality structure

* skipping cypress to keep main green

* fixed typescript type issue

---------

Co-authored-by: Shailesh Parmar <shailesh.parmar.webdev@gmail.com>
2023-06-01 23:19:13 -07:00
Sriharsha Chintalapani
6509a3670a
Fix #11664: Refactor patch_mixin to use jsonpatch lib (#11696)
* Fix #11664: Refactor patch_mixin to use jsonpatch lib

* Migrate to jsonpatch

* Fix nested cols

* Format

* Update patch_description

* Table constraints

* tag

* owner

* column tag

* column desc

* Format

* Format

* Fix log

* Update dbt patch

* Update column fqn

* Fix test

* Fix tests

---------

Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2023-05-23 15:47:11 +02:00
Teddy
8c50d1af52
Fixes #4565 - Fetch Metrics from System tables (#11645)
* feat: fetch metrics from system tables

* feat: add permission doc for fetching metrics from system tables

* feat: fix E2E tests to reflect full table row count after table metric update

* feat: ran linting

* feat: fix doc string engine name + function typing

* feat: ran python linting
2023-05-22 09:04:18 +02:00
Pere Miquel Brull
d52d773707
Send encrypted automation workflow (#11681) 2023-05-19 15:04:42 +02:00
Pere Miquel Brull
50ad38ea0f
Fix #11548 - Secrets Managers comms with OMeta (#11602)
* Remove secretsManagerCredentials from backend

* Remove secretsManagerCredentials from backend

* Add secrets manager loader

* Load SM in the ometa client

* Fix tests

* Fix tests

* Fix Lint

* Mock AWS region

---------

Co-authored-by: Ayush Shah <ayush@getcollate.io>
2023-05-19 09:43:11 +02:00
Pere Miquel Brull
1b90badd0e
Restructure PII processor (#11640)
* Restructure PII processor

* Restructure PII processor

* Format
2023-05-17 15:58:17 +02:00
Pere Miquel Brull
f22d604c54
Remove old tests (#11505)
Co-authored-by: Shailesh Parmar <shailesh.parmar.webdev@gmail.com>
2023-05-11 10:29:30 +02:00
Ayush Shah
2c9ba537eb
Fix min max on rowversion/timestamp mssql (#11455) 2023-05-08 14:52:53 +05:30
Teddy
0930bc307a
fix: change in entityLink to string in CreateTestCaseRequest (#11291) 2023-04-26 10:52:09 +00:00
Pere Miquel Brull
d3d523e96d
Ingestion md docs review (#11219)
* Update workflow docs

* Remove duplicate key

* Update Custom connector docs

* Update Domo connector docs

* Dashboard docs updates

* Some databases docs updates

* Finish db docs updates

* Remove Pulsar

* Messaging docs

* Metadata docs

* ML docs

* S3 docs

* Fix rendering

* Update title and description of the databaseSchema

* Pipeline Service docs

* remove pulsar from tests

* Format

* Fix test

* Remove pulsar

* Remove pulsar
2023-04-23 18:43:46 +02:00
Pere Miquel Brull
5152db488d
Add partition columns details (#11062) 2023-04-14 13:06:56 +02:00
Onkar Ravgan
bc6ce22a2b
Added oneof selection for tableau auth types (#11049)
* Added tableau oneof fields

* Fixed pytests

* fixed field in test

* Handle tableau auth converter

* Fixed java tests and imports

---------

Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2023-04-14 13:49:36 +05:30
Teddy
77b94f9ebb
fix: rename tests endpoint to dataQuality/<specificity> (#10970) 2023-04-14 00:14:49 -07:00
Pere Miquel Brull
b5cb1d464a
Deprecate location and old storage service (#11004)
* Deprecate location and old storage service

* Format

* Fix test

* Refactor

* Clean location

* Rename object store to storage

* Rename object store to storage

* Rename object store to storage

* Format

* Format

* Refactor object store for storage

* Refactor object store for storage

* Rename object store to storage

* Fix test

* Fix test

* Format

* chore(ui): change Objectstore to Storage

* Fixes

* Fix test

* Remove storage service from Glue cypress

---------

Co-authored-by: Sachin Chaurasiya <sachinchaurasiyachotey87@gmail.com>
2023-04-12 11:44:46 +02:00
Ayush Shah
9d11029ec8
Fixes 10351: Fixes Metrics Computation, Sampling, test suites and partitioning (#10603)
Co-authored-by: Teddy Crepineau <teddy.crepineau@gmail.com>
2023-04-11 20:58:31 +05:30
Teddy
9b4e9132ae
fixed #9656 - Add support for date type to column values to be between (#10890)
* fix: renamed  to  submodule

* fix: linting

* fix: columnValuesToBeBetween test for date column type
2023-04-04 17:16:44 +02:00
Suresh Srinivas
c8b640674b
10041 part2 - Refactor and cleanup APIs (#10900)
* Use @Tag annotation to group APIs in the swagger documentation.

* Hide internal APIs

* Change API path events/subscription to events/subscriptions

* Change API path from automations/workflow to automations/workflows

* Change API path v1/testCase to v1/testCases

* Change API path v1/testDefinition to v1/testDefinitions

* Change API path v1/testSuite to v1/testSuites

* Rename Kpi and kpi in the documentation to KPI

* Change API path v1/testConnectionDefinition to v1/testConnectionDefinitions

* Update API section in the API documentation

* Fix test failures

* Correctly capitalize Test Cases and Test Suites in API docs
2023-04-03 13:03:48 -07:00
Teddy
ecffd5ffc7
Fixes #10727 (& other minor improvements) (#10856)
* fix: logic for test suite config workflow

* fix: added caching for system metrics (snflk and bq)

* fix: linting

* fix: added tearDown logic for tests suite/case
2023-03-31 16:57:53 +02:00
Schlameel
6d24455738
Fixes 10343: Add methods to update Glossary and GlossaryTerm in Python SDK (#10810)
* ISSUE 10343: Python SDK Glossary and GlossaryTerms
- Added methods to glossary_mixin to PATCH Glossary and GlossaryTerm
- Created in patch_mixin_utils a super class for mixins that PATCH entities
- Moved common Patch enums from patch.py to patch_mixin_utils.py
- Updated imports and super classes for mixins that PATCH entities
- Added tests for Glossary and GlossaryTerm mixins

* ISSUE #10343: Python SDK extensions for Glossary and GlossaryTerms
- Fixed an import
- Fixed two method signatures

* Issue #10343 - Fixed formatting
2023-03-31 16:55:22 +02:00
NiharDoshi99
46afe69811
improvement in pii tagging (#10696)
* improvement in pii tagging

* fix conflict and changes as per comment

* Added confidence field

* changes as per comments

* Apply suggestions from code review

Co-authored-by: Teddy <teddy.crepineau@gmail.com>

---------

Co-authored-by: Ashish Gupta <ashish@getcollate.io>
Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
Co-authored-by: Teddy <teddy.crepineau@gmail.com>
2023-03-28 19:37:48 +05:30
Pere Miquel Brull
78d7dd8789
[WIP] - Test Connection - Prepare the new test connection ingestion+UI logic (#10660)
* Prepare the new test connection ingestion logic

* Update test assert

* Update Test Connection for SQA Sources

* Correct return type and method doc

* Handle decryption

* Non SQA Database Sources

* Add the run_automation script in ingestion-base

* Dashboard Test Connection Changes

* Pipeline, Messaging, MlModel & Metadata Sources

* ui: test connect flow-1

* Unmask connection parameters before sending to Airflow

* ui: test connect flow-2

* Address review comments and pylint

* pytest fix

* ui: test connect flow-3 (refactoring and style fix)

* ui: test connect flow-4 (fix test connection status logic)

* sync local file

* ui: test connect flow-5 (fix lowercase issue and styling)

* ui: test connect flow-5 (show toast notifications)

* test: add unit test

* ui: test connect flow-5 (update service page test connection button)

* Databricks fix & pytest fix

* pylint

* Update test

* Fix merge

* S3 Test connection

* add style for mandatory step

* sync locales

* chore: add service name in workflow request

* Unmask using original service connection parameters

* Fix test connection unmasking

* Wrap inspector function to eliminate error outside test conn

* Fix linting

* fix:cy test

* Fix linting

* address comment

* refactor and fix connection type casing issue

---------

Co-authored-by: ulixius9 <mayursingal9@gmail.com>
Co-authored-by: Sachin Chaurasiya <sachinchaurasiyachotey87@gmail.com>
Co-authored-by: Nahuel Verdugo Revigliono <nahuel@getcollate.io>
Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>
2023-03-28 06:29:13 +02:00
Pere Miquel Brull
e2a2bcc8da
Fix search by email index keyword (#10698)
* Fix search by email index keyword

* Fix search by email index keyword
2023-03-21 20:50:47 -07:00
Pere Miquel Brull
4dbe5e4f5c
Simplify Data Insight workflow builder (#10688) 2023-03-21 14:12:20 +01:00
Schlameel
df855ad8c3
Issue #3809: Add python client for Roles and Policies (#10531)
* Issue #3809: Add python client for Roles and Policies
Includes Tests

* #3809: Add python client for Roles and Policies
- Moved constants to enums in client_utils.py
- Updated all patch methods to utilize new enums
- includes tests

* #3809: Add python client for Roles and Policies
- includes tests
- merged upstream updates and updated to use new enums
2023-03-20 08:42:01 +01:00
Mohit Yadav
b982d3fe2b
Query as entity (#10449)
* added query as an entity

* changed name of the variables and methods

* Added Resource Descriptors

* testcase bug fix

* addressing comments

* added script for table query migration

* added script for table query migration postgresql

* bug fix

* db change for script test

* added current timestamp

* change db config from postgresql to mysql

* added extension to use function gen_random_uuid()

* solving maven ci

* added queryUsage and change is migration script

* addressing comments

* addressing comments

* added queryUsage relation and testcase

* added api to insert queries in bulk

* .

* fix a test case which was failing due to latest changes

* Ingestion Changes for Query as Entity

* move query changes to latest sqls

* added tags and owner

* update PR for Query as Entity

* update type

* fixed pagination

* fix path param

* fix TestCases

* add validation criteria

* removed existing query apis

* checkstyle fix

* remove vote from put

* remove vote from put

* Query As Entity Ingestion Changes

* Remove unused func

* update Review Comments

* update Review Comments

* remove previous changes for Query and Update Tests

* moved Checksum to Query Util Class

* update python api

* fix python checkstyle

* Fixed Tests

* Fix pytest

* remove space changes

* remove space changes

* Fixed put_addFollowerDeleteEntity_200

* Fix usage ingestion

* Update Python SDK and tests

* pylint fix

---------

Co-authored-by: Himank Mehta <himankmehta@Himanks-MacBook-Air.local>
Co-authored-by: ulixius9 <mayursingal9@gmail.com>
Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>
2023-03-15 20:55:30 -07:00
Teddy
2f4a92a17b
fix: exclude owner from page view traffic in DI (#10574)
* fix: exclude owner from page view traffic in DI

* fix: uncomment KPI creation in setup
2023-03-14 11:45:46 +00:00
Pere Miquel Brull
81dec813a0
Don't store the OM connection in the Ingestion Pipeline or Workflow (#10448)
* Do not store OM connection

* Migration to remove the server connection

* Update tests

* Add workflow masking and secrets manager

* Fix failing test

---------

Co-authored-by: Nahuel Verdugo Revigliono <nahuel@getcollate.io>
2023-03-09 17:32:40 +01:00
Suresh Srinivas
4c6d184ef5
Fixes #10480 Glossary rename results in rename of Classification with… (#10486)
* Fixes #10480 Glossary rename results in rename of Classification with the same name

* Rename TagSource Tag to Classification
2023-03-09 00:30:36 -08:00
Nahuel
f2e1a87b5a
Fix #10377: service connection not overwritten as expected (#10445) 2023-03-06 16:32:10 +01:00
Schlameel
fb7b12842b
#9544: Added patch owner to Python SDK. Includes tests. (#10403)
Co-authored-by: Nahuel <nahuel@getcollate.io>
2023-03-06 14:32:58 +00:00
Nahuel
ef1812a09d
Fix: Stop displaying authorization values in debug logs (#10443) 2023-03-06 14:56:29 +01:00
Pere Miquel Brull
050da1e2d1
Add service type to container (#10441) 2023-03-06 14:44:30 +01:00
Pere Miquel Brull
477a5223eb
Fix #10401 - Add Automations Workflow Resource & PUT service test connection result (#10437)
2023-03-06 14:44:16 +01:00
Nahuel
247016307d
Fix #8648: Mask sensitive info from API responses (#10307)
* Mask sensitive info from API responses

* Rename converter classes

* Add missing Java classes from JSON schemas and class converters

* Update test service connection schema

* Update datalakeConnection JSON schema and fix some tests

* Fix AlertsRuleEvaluatorResourceTest and minor error in run_local_docker.sh

* Fix Pipeline and Database service tests

* Minor refactor

* Fix CsvUtilTest

* Fix EventMonitorFactoryTest

* Fix CloudWatchEventMonitorTest

* Update datalake metadata

* Update bigquery metadata

* Fix test connection functionality

* Fix OMeta service api test

* Update gcsValues title and revert changes in GH actions

* Mask sensitive enabled by default for local docker

* Add missing tests

* Address PR comments

* Address PR comments

* fix ui breaks on gcsValues.json

* Address PR comments

* Minor refactor

---------

Co-authored-by: Chirag Madlani <12962843+chirag-madlani@users.noreply.github.com>
2023-03-03 18:10:01 +00:00
NiharDoshi99
1ff76f5e65
pii tagging using spacy (#10256)
* WIP: pii tagging using spacy

* added test cases and changes as per comment

* fix python checkstyle

* fix python checkstyle

* added score, test_cases and docs update

* solved merge conflict

* fix python checkstyle

* remove pii tagging using regex

* fix python test

* lib changes and added some test case

* changed as per comment

* fix: python test

* fix: changes to get source_config

* fix: changes as per comment
2023-03-03 18:33:18 +05:30
Teddy
775ca75e87
fix #10173 handle cases where entity would be deleted from OM (#10364)
* fix(dataInsight): handle cases where entity would be deleted from OM

* Update ingestion/src/metadata/data_insight/processor/web_analytic_report_data_processor.py

Added explanation in code comments

Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>

* fix(dataInsight): tests failure

---------

Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2023-03-03 12:30:15 +01:00
Teddy
754074f1be
Fixes #7758 - Added Column value and Integer Range Partitioning (#10350)
* feat(profiler): renamed  module to

* feat(profiler): added dbt-artifacts-parser to test setup.py

* feat(profiler): refactor workflow and interface

* feat(profiler): linting

* feat(profiler): removed old profiler modules

* feat(profiler): added support for value and integer range partition

* feat(profiler): fixed linting

* feat(profiler): added partitioning support for datalake profiler

* feat(profiler): removed `ProfilerInterfaceArgs` class

* feat(profiler): address comments

* feat(profiler): Added `OTHER` as an `IntervalType` for UI type generation
2023-03-01 08:20:38 +01:00
Teddy
83be5d933b
Fixes #9301 - Refactor TestSuite and Remove Pandas from Base Requirements (#10244)
* feat(testSuite): extracted out column test for SQA type

* refactor(testSuite): extracted SQA column and table tests into their own classes

* refactor(testSuite): Added pkgutil namespace package style for test suite classes

* refactor(testSuite): added dynamic importer function for test cases

* refactor(testSuite): black formatting

* refactor(testSuite): fixed linting issues

* refactor(testSuite): refactor metrics for dataframe

* refactor(testSuite): Added Mixins and base methods

* refactor(testSuite): extracted out get bound for floats

* refactor(testSuite): Added pandas column test cases

* refactor(testSuite): Deleted old column tests

* refactor(testSuite): Added table tests for datalake

* refactor(testSuite): Removed old tests definition

* refactor(testSuite): changed registry to dynamic class import

* refactor(testSuite): renamed dl_fn to df_fn

* refactor(testSuite): updated registry unit test

* refactor(testSuite): updated import path to sqa like column

* refactor(testSuite): cleaned up imports in old files

* refactor(testSuite): harmonized SQALikeColumn object to replicate SQA Column object

* refactor(testSuite): linting

* refactor(testSuite): linting

* refactor(testSuite): raise exception on DQ exception

* refactor(testSuite): linting

* refactor(testSuite): removed pandas from base requirements

* refactor(testSuite): Added __future__ for py3.7 type hint

* refactor(testSuite): added `df` to good-names

* refactor(testSuite): renamed Handler to Validator

* refactor(testSuite): Added test inheritance for column tests

* refactor(testSuite): cleaned up column type check

* refactor(testSuite): cleaned up typo

* refactor(testSuite): extracted main table test logic into parent class

* refactor(testSuite): linting

* refactor(testSuite): linting fixes

* refactor(testSuite): address doc string and linting issues
2023-02-22 09:42:34 +01:00
Ayush Shah
785142d86a
Add policy tags from Bigquery (#10189) 2023-02-20 19:13:45 +00:00
Suresh Srinivas
afad0a4769
Fixes #10123 - Change entityReference in createRequests to fullyQualifiedName (#10124)
* Change entityReference to entity name or fullyQualifiedName

* Change backend code and tests to use FQN

* UI change for using fqns instead of EntityReference

* Ingestion framework changes for using fqns instead of EntityReference

* Fix test failures

* Fixed python tests and sample data new

* fix: minor ui changes for fqn

* Fixed python integration tests

* Fixed superset tests

* fix UI tests

* fix type issue

* fix cypress

* fix name for testcase

---------

Co-authored-by: Onkar Ravgan <onkar.10r@gmail.com>
Co-authored-by: karanh37 <karanh37@gmail.com>
Co-authored-by: Chirag Madlani <12962843+chirag-madlani@users.noreply.github.com>
2023-02-13 13:38:55 +05:30
NiharDoshi99
34a0cc147e
Fix: Added changes for Pii sensitive (#10119)
* Fix: added changes for pii sensitive

* Fix: removed comments

* Fix: python checkstyle

* differentiate between sensitive and non sensitive tag

* fix: python test

* fix: added tests

* fix: maven CI
2023-02-08 16:00:47 +00:00
Pere Miquel Brull
fb15c896b3
Handle XLets in groups for AirflowLineageRunner (#10114)
* Handle XLets in groups

* Linting

* Linting
2023-02-07 06:49:46 +01:00
Pere Miquel Brull
f2fb0521c2
Update airflow loggers and rename ometa loggers (#9868)
* Update airflow loggers and rename ometa loggers

* ANSI print to logger

* Remove colored logging from tests

* Merge ometa_logger into the one used in loggers class

* linting

* linting

Co-authored-by: Nahuel Verdugo Revigliono <nahuel@getcollate.io>
2023-01-23 16:28:17 +01:00
Teddy
dcf220f867
fix: pytest error (#9824)
* fix: pytest error

* fix: linting

* increased verbosity

* empty commit to re-run tests

* print registry and test definition set

* renamed columnValuesToBeUnique fqn

* removed print statements + verbosity
2023-01-20 10:45:11 +01:00
Pere Miquel Brull
7f21a7bced
Fix #8088 - Restructure source connections & clients (#9545) 2023-01-02 13:52:27 +01:00
Suresh Srinivas
758c976cba
Fixes #9259 Change Tags APIs to conform with rest of the APIs (#9260) 2022-12-26 12:32:17 -08:00
Ayush Shah
2bf5eb9051
fix 7995: profileSample % and row number (#9104) 2022-12-20 14:55:11 +05:30
Pere Miquel Brull
3b7ae73473
Airflow e2e integration test (#9363)
* Prep airflow operator integration tests

* Add integration test to Makefile
2022-12-16 19:52:12 -08:00
Teddy
d1a739ec55
Fixes #9025 -- Added deletion of WebAnalytics events in dataInsight Workflow (#9208) 2022-12-13 11:43:29 +01:00
Pere Miquel Brull
c75ba751b7
Fix #9116 & #8284 - Clean tableau source, fix ownership, add description and SSL verification (#9241)
2022-12-13 06:36:55 +01:00