26 Commits

Author SHA1 Message Date
Teddy
ac77f33b08
Fixes #7447 -- Add freshness metrics to profiler (#9159)
* refactor(profiler): integrated getter func.

Removed metric getter functions from their own file.
Added metric getters to their own interface class.
Created dispatch-by-value method to dispatch metric getter func.

* feature(profiler): added systemProfiler schema

* feat(profiler): freshness workflow & Snowflake impl.

* feat(profiler): freshness endpoint for put and get

* feat(profiler): added system metrics for redshift

* feat(profiler): freshness metrics for bigquery

* fix(profiler): keyword not found in func

* feat(profiler): Added sample data for freshness

* fix(profiler): fetch previous day for BQ

* fix(profiler): sonar + data fetching logic

* fix: typo in SystemMetric Class

* fix: linting

* fix: extracted out EntityList class into models.py
2022-12-07 14:33:30 +01:00
Ayush Shah
5be0f8ee76
Dl Profiler (#8694)
* DQ commit

* Add DL Profiler

* Fix Ingestion and Profiling pylint checks

* Fix Tests

* PyFormat files

* Fix Tests

* Resolve Comments

* Fix Tests and Format Files

* Resolve Comments

* Fix Pylint and Code smells

* Resolve Comments

* Fix S3 parquet

* Fix Metrics Code Smell
2022-11-15 16:01:10 +01:00
Teddy
f883863b8a
Fixes #7490 - Split Profiler and TestSuite Interface (#8032)
* Clean up test suite workflow and interface

* Fixed tests

* Split profiler and testSuite interfaces

* Cleaned up workflows and runners

* Fixed code formatting

* - remove old code
- remove `table` attribute used for testing and used mock instead

* Fixed execution bugs from refactor

* Fixed static type checking for profiler/api/workflow.py

* Fixed linting

* Added __init__ files
2022-10-11 15:57:25 +02:00
Teddy
ce578e73d4
Fixes #5831 by implementing testSuite workflow logic (#6911)
* Added database filter in workflow

* Removed association between profiler and data quality

* fixed tests with removed association

* Fixed sonar code smells and bugs

* Updated profiler workflow to:
- support only running profiler (removed test run)
- support column inclusion and exclusion
- added back support for partitioned table and sample

* moved status to workflow

* Fixed tests

* removed test logic from profiler sink

* Added logic to return sample from workflow sample value

* Added profiler examples

* Updated documentation for profiler

* Fixed code smells

* committed changes to profiler

* initial commit of the revamp workflow

* Fixed python formatting

* cleaned up profiler submodule by removing test related files and functions

* Added airflow DAG logic for testSuite workflow

* Fixed code smells + added airflow ingestion tests + fixed comments
2022-08-25 10:01:28 +02:00
Teddy
78b5f8c8e2
Part 1 of #5831 -- Profiler workflow implementation (#6809)
* Added database filter in workflow

* Removed association between profiler and data quality

* fixed tests with removed association

* Fixed sonar code smells and bugs

* Updated profiler workflow to:
- support only running profiler (removed test run)
- support column inclusion and exclusion
- added back support for partitioned table and sample

* moved status to workflow

* Fixed tests

* removed test logic from profiler sink

* Added logic to return sample from workflow sample value

* Added profiler examples

* Updated documentation for profiler

* Fixed code smells
2022-08-19 10:52:08 +02:00
Teddy
abaf8a84e9
Fixes #5661 by removing association between profiler and data quality (#6715)
* Added database filter in workflow

* Removed association between profiler and data quality

* fixed tests with removed association

* Fixed sonar code smells and bugs
2022-08-17 12:53:16 +02:00
Sriharsha Chintalapani
1a42428e42
Add time series extension (#6416)
Co-authored-by: Vivek Ratnavel Subramanian <vivekratnavel90@gmail.com>
Co-authored-by: Teddy <teddy.crepineau@gmail.com>
Co-authored-by: Shailesh Parmar <shailesh.parmar.webdev@gmail.com>
2022-08-04 07:22:47 -07:00
Teddy
a920a4c17d
Fixed SQA Warning (#6442)
2022-07-30 09:39:36 -07:00
Teddy
6397b6a0b1
Fixes #6325 -- Implement multithreading for metrics computation (#6406)
* Added tests for multithreading SQA interface

* Added multithread support for metric computation

* Added thread ID to log debugger

* Cleaned up tests

* Fixed python formatting issues

* Added non-blocking result processing + threadCount in config file to set number of threads

* Added frontend input field to set number of threads

* Fixed code smell, bug and comments from reviewer
2022-07-29 10:41:53 +02:00
Teddy
e1fac99353
Fixes #5723 and implements interface processor logic (#6219)
* Added datetime for min/max

* Added profiler interface

* Update core.py to work with profiler_interface

* Implement interface logic for orm_profiler object

* Fix unique_ratio logic

* removed changes to table.json

* Added Protocol for type hint

* Changed protocol to abc + fixed sonar code smell

* Fixed py_format
2022-07-20 17:54:10 +02:00
nna077
baa5295cc2
Add Date/Time Metrics In Profiler Tab (#5821)
2022-07-13 21:23:03 +02:00
Teddy
3a7c11424b
Fixes #3133 -- Adding Additional Column Tests (#5867)
* Added additional table + test coverage

* Added logic for front end input fields

* Added comment for median metric

* skipping `Update owner and check description` cypress test

Co-authored-by: Shailesh Parmar <shailesh.parmar.webdev@gmail.com>
2022-07-06 10:12:29 +02:00
Mayur Singal
db0e34c709
Fixing Test Connection for Dynamo & Glue (#4316)
* Fixing Test Connection for Dynamo

* Fixed Glue Connector

* renamed engine to connection

* Fixed the return signature

* Added dataclass
2022-04-22 11:30:59 +05:30
Pere Miquel Brull
b3087d08b9
Fix #3522 - Add timeout to profiler (#3707)
2022-03-30 08:54:27 +02:00
Pere Miquel Brull
eb906589fd
Fix #3525 - Profiler breaks on Postgres data (#3583)
2022-03-22 15:55:44 +01:00
Pere Miquel Brull
16e82d45de
Fix #3371 - Run Profiler and Tests on a % of the data (#3424)
2022-03-16 06:05:59 +01:00
Pere Miquel Brull
94d7500216
Fix #3248 & #3251 - Update metrics and column profile (#3262)
2022-03-08 11:44:39 +01:00
Pere Miquel Brull
4a752e3ab2
Fix #3151 - Ingestion profiler should use ORM Profiler (#3192)
2022-03-06 15:43:43 -08:00
Pere Miquel Brull
4116233697
Fix #3105 - ColumnValuesToMatchRegex & other fixes (#3149)
2022-03-04 18:11:49 +01:00
Pere Miquel Brull
e96ac838ff
Fix #3084 - Implement missing tests (#3117)
2022-03-04 06:59:47 +01:00
Pere Miquel Brull
71207de362
Fix #2875 - Profiler API Sink (#3011)
2022-03-02 16:46:28 +01:00
Alberto Miorin
fe5618c8f1
Fix #3037: metadata --version doesn't work (#3038)
2022-03-01 12:19:36 +01:00
Pere Miquel Brull
990608522a
Fix #2981 - Update Profile to match TableProfile (#2982)
2022-02-25 09:26:30 -08:00
Pere Miquel Brull
1fb0e7c489
Fix #2878 & #2877 - Implement Metrics and Validate Composed Metrics (#2926)
2022-02-24 07:08:39 +01:00
Pere Miquel Brull
1224d20a36
Fix #2894 - Profiler Processor & Metrics (#2900)
2022-02-22 08:09:02 +01:00
Pere Miquel Brull
f304d290b4
Fix #2751 - Init ORM Profiler (#2831)
* ORM Profiler skeleton

* Fix table name within service

* Add license

* Prepare custom types

* Fix converter

* Compute stddev only on numeric

* Prepare smart registries

* Update tests

* Update results retrieval

* Fix composed metrics result

* Format

* Add missing type

* Add _label decorator

* clean readme

* clean readme

* Filter types when profiler runs not allowed metric types

* Fix null ratio

* Add proper type

* RuleMetric skeleton

* Prepare table metrics

* Update simple profiler

* Format

* Define test expression grammar and node visiting

* Unify metric registry

* Prepare validation core

* Add grammar lib

* Add safe get

* Format

* Allow decimals in grammar

* Test validation conversion

* Fix validation conversion and test

* Rename to row_number

* Update READMEs

* Format

* Row number naming

* Fix rename
2022-02-18 07:48:38 +01:00