Teddy
c7ac28f2c2
Fixes #11357 - Implement profiler custom metric processing (#14021)
...
* feat: add backend support for custom metrics
* feat: fix python test
* feat: support custom metrics computation
* feat: updated tests for custom metrics
* feat: added dl support for min max of datetime
* feat: added is safe query check for query sampler
* feat: added support for custom metric computation in dl
* feat: added explicit addProper for pydantic model import for Extra
* feat: added custom metric to returned obj
* feat: wrapped trino import in __init__
* feat: fix python linting
* feat: fix typing in 3.8
2023-11-17 17:51:39 +01:00
Mayur Singal
a8145a82fa
Fix #13603: Configurable Sample Data Rows for Profiler (#13807)
...
* Fix #13603: Configurable Sample Data Rows
* Fix #13603: Configurable Sample Data Rows for Profiler
* fix table config
* support configurable overwriting of sample data
* add support for schema and database profiler configuration
* chore(ui): put sampleDataStorageConfig under advanced config
* fix tests
* py format
* chore(ui): add sampleDataCount in table profiler config
* fix tests
* pylint & tests
* feat(ui): add profiler settings tab in database and database schema page
* chore(ui): show different inputs for profile sample type
* schema changes to make default storage config null
* add unit test
* schema changes to fix api
* update profiler setting schema
* move profiler settings to manage button
* sync locals
* fix(ui): unit tests
* fix tests
* py format
* fix lint
* minor improvements
* chore(ui): update profiler settings schema
* resolve review comments
* pytest
---------
Co-authored-by: Sachin Chaurasiya <sachinchaurasiyachotey87@gmail.com>
2023-11-09 18:49:42 +05:30
Ayush Shah
ec6184d2da
Fix Trino Dialect Import issue (#13869)
2023-11-07 12:10:59 +05:30
Teddy
452a33b1a0
Fixes Druid Profiler failures (#13700)
...
* fix: updated playwright test structure
* fix: druid profiler queries
* fix: python linting
* fix: python linting
* fix: do not compute random sample if profile sample is 100
* fix: updated workflow to test on push
* fix: move connector config to category folder
* fix: updated imports
* fix: added pytest-dependency package
* fix: updated readme.md
* fix: python linting
* fix: updated profile doc for Druid sampling
* fix: empty commit for CI
* fix: added workflow constraint back
* fix: sonar code smell
* fix: added secrets to container
* Update openmetadata-docs/content/v1.2.x-SNAPSHOT/connectors/ingestion/workflows/profiler/index.md
Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
* Update openmetadata-docs/content/v1.2.x-SNAPSHOT/connectors/ingestion/workflows/profiler/index.md
Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
* Update ingestion/tests/e2e/entity/database/test_redshift.py
* fix: ran pylint
* fix: updated redshift env var.
* fix: import linting
---------
Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2023-10-25 20:47:51 +02:00
Teddy
1cbdfb3ae7
Fixes #12601 - column filter for profiler workflow (#13535)
...
* fix: sample data ingestion to match entity profiler column setting
* fix: python linting
* fix: updated fn call
* fix: added logic to handle json field in datalake connector
* fix: handle NA values in parsing
* fix: reverted sampler changes from #13338
* fix: reverted metric changes from #13338
* fix: added datalake profiler ingestion test
* fix: python linting
* fix: removed normalization of json blob in NoSQL db
2023-10-12 14:51:38 +02:00
Ayush Shah
97f4f8fbf3
Fixes 12922: Trino NaN issue + TrinoUserError (#13244)
...
* Fix Trino NaN issue + TrinoUserError
2023-10-04 18:39:39 +05:30
Teddy
d4593e9caa
fix: implement percentile computation logic for SingleStore (#13170)
2023-09-13 16:32:55 +02:00
Pere Miquel Brull
a3bfd4e696
Part of #11968 - Restructure Profiler Workflow and PII Processor (#13059)
...
* Structure PII
* Restructure Profiler Workflow
* Update signature for abc
* remove profiler sink
* Fix tests
* Fix lint
* Fix test
* Fix test
2023-09-04 11:02:57 +02:00
Teddy
101cd0ebac
Issue 8930 - Update profiler timestamp from seconds to milliseconds (#12948)
2023-08-25 08:47:16 +02:00
Teddy
54fbe250a1
fix: import error + BQ E2E CLI (#12420)
2023-07-13 13:35:37 +02:00
Teddy
b89cf64f14
Clean up profiler (#12369)
...
* ref: implemented interface for profiler components + removed struct logic
* ref: ran python linting
* ref: added UML diagram to readme.md
* ref: empty commit for labeler check
* ref: remove multiple context manager for 3.7 3.8 compatibility
* ref: remove
2023-07-12 17:02:32 +02:00
Hung Duong
64f147c517
Fixes #12129: enhance bigquery sample data query (#12130)
...
* upgrade BigQuery Sampler
* beautify code
* revert old way of profiler & data quality, keep fetch new way sample
* Update profiler_source.py
* Update profiler_source.py
---------
Co-authored-by: hung.duong <hung.duong@be.com.vn>
Co-authored-by: Teddy <teddy.crepineau@gmail.com>
2023-07-04 08:37:15 +02:00
Ayush Shah
cb6e42941a
Fix 12025: Clickhouse NaN issue (#12079)
2023-06-22 12:51:56 +05:30