984 Commits

Author SHA1 Message Date
Teddy
f3da919329
Feat: Backend Support for Custom Metrics (#13965)
* feat: add backend support for custom metrics

* feat: fix python test
2023-11-17 19:16:35 +05:30
Suresh Srinivas
5f34bd02d3
Fixes #13595 Consolidate changes by a user in a single session to a single change (#13617)
* getChangeDescription to use entity and update type to decide previous version

* Fixes #13595 - Consolidate changes by a user in a single session to a single change
2023-11-16 18:14:27 -08:00
Mohit Yadav
10d8ec84fe
Logs added for Search Indexing and Stats issue fixed (#13956)
* Logs added for Search Indexing and Stats issue fixed

* Fix uninstall error

* Add error handling

* fix lint

* Push Job Level Exception on top

* disable flaky tests

* Fix Logs not visible in Search

---------

Co-authored-by: ulixius9 <mayursingal9@gmail.com>
2023-11-13 23:39:56 +05:30
Pere Miquel Brull
c742835766
Auto Tagger Application - Preparing the Ingestion Framework (#13862)
* Prepare the skeleton for generic app registration

* Prepare the skeleton for generic app registration

* Handle app runner

* Prepare the skeleton for generic app registration

* Prepare the skeleton for generic app registration

* Allow deployment

* Fix PII APP

* Fix lint

* Fix PII APP

* Fix PII APP

* Prepare config-based external apps

* Prepare config-based external apps

* Fix lint

* Prepare config-based external apps

* Fix DI errors

* Amend comments
2023-11-13 08:58:38 +01:00
Mayur Singal
367bac9064
Fix #13787: Add support for ES data types (#13916)
* Fix #13787: Add support for ES data types

* fixed tests

---------

Co-authored-by: Onkar Ravgan <onkar.10r@gmail.com>
2023-11-10 20:14:42 +05:30
Pere Miquel Brull
0eacc829a4
Fix #13794 - Add domain support to the Python SDK (#13931) 2023-11-10 11:00:06 +01:00
Pere Miquel Brull
7c06116b53
Add deprecation warnings (#13927) 2023-11-10 15:17:07 +05:30
Pere Miquel Brull
8891a9a410
Fix #13906 - Fix add_mlmodel_lineage description field (#13920) 2023-11-10 15:16:09 +05:30
Pere Miquel Brull
b250cd8808
Fix #13699 - Add separator for Storage Container manifest (#13924)
* Fix #13699 - Add separator for Storage Container manifest

* Fix #13906 - Fix add_mlmodel_lineage description field

* Add separator

* Add separator
2023-11-10 10:44:47 +01:00
Mayur Singal
4b625f7ba5
Move pandas top level import (#13926) 2023-11-10 14:15:14 +05:30
Mayur Singal
a8145a82fa
Fix #13603: Configurable Sample Data Rows for Profiler (#13807)
* Fix #13603: Configurable Sample Data Rows

* Fix #13603: Configurable Sample Data Rows for Profiler

* fix table config

* support configurable overwriting of sample data

* add support for schema and database profiler configuration

* chore(ui): put sampleDataStorageConfig under advanced config

* fix tests

* py format

* chore(ui): add sampleDataCount in table profiler config

* fix tests

* pylint & tests

* feat(ui): add profiler settings tab in database and database schema page

* chore(ui): show different inputs for profile sample type

* schema changes to make default storage config null

* add unit test

* schema changes to fix api

* update profiler setting schema

* move profiler settings to manage button

* sync locales

* fix(ui): unit tests

* fix tests

* py format

* fix lint

* minor improvements

* chore(ui): update profiler settings schema

* resolve review comments

* pytest

---------

Co-authored-by: Sachin Chaurasiya <sachinchaurasiyachotey87@gmail.com>
2023-11-09 18:49:42 +05:30
Suresh Srinivas
a89e317a2b
Fixes #13863 - Show inherited relationships of an entity (#13864)
* Fixes #13863 - Show inherited relationships of an entity

* Test failure fixes

* Commenting out invalid python test
2023-11-07 09:11:06 -08:00
Onkar Ravgan
c7834e74cc
fixed avro recursive record (#13856) 2023-11-06 16:27:06 +05:30
Teddy
d025e217d6
fix: catch not Either type in workflow and return explicit error message (#13796) 2023-11-02 13:02:26 +01:00
Ayush Shah
0a04ce85bb
Add Multilingual Support in EntityLink (#13826) 2023-11-02 14:35:22 +05:30
Teddy
10904049e4
fix: handle lower and upper case name (#13778) 2023-10-31 09:51:13 +01:00
Teddy
452a33b1a0
Fixes Druid Profiler failures (#13700)
* fix: updated playwright test structure

* fix: druid profiler queries

* fix: python linting

* fix: python linting

* fix: do not compute random sample if profile sample is 100

* fix: updated workflow to test on push

* fix: move connector config to category folder

* fix: updated imports

* fix: added pytest-dependency package

* fix: updated readme.md

* fix: python linting

* fix: updated profile doc for Druid sampling

* fix: empty commit for CI

* fix: added workflow constraint back

* fix: sonar code smell

* fix: added secrets to container

* Update openmetadata-docs/content/v1.2.x-SNAPSHOT/connectors/ingestion/workflows/profiler/index.md

Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>

* Update openmetadata-docs/content/v1.2.x-SNAPSHOT/connectors/ingestion/workflows/profiler/index.md

Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>

* Update ingestion/tests/e2e/entity/database/test_redshift.py

* fix: ran pylint

* fix: updated redshift env var.

* fix: import linting

---------

Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2023-10-25 20:47:51 +02:00
Ayush Shah
bfb361dc85
Fix Bigquery lineage Pytests (#13695) 2023-10-25 11:15:41 +05:30
Ayush Shah
57cb72c26f
Fix Checkstyle (#13683) 2023-10-23 15:51:40 +05:30
Iaroslav Frolikov
420da29841
Fixes #13607: BigQuery lineage ingestion fails when using GcpCredentialsPath authentication config (#13608) 2023-10-23 15:42:06 +05:30
Onkar Ravgan
0f0bccdd45
Converted and fixed pipelinestatus timestamps to milliseconds (#13670)
* fixed pipelinestatus timestamps in millis

* Added migrations
2023-10-20 09:39:24 -07:00
Pere Miquel Brull
8cf8720a9d
Clean Airflow Lineage Backend and migrate status to millis (#13666)
* Clean Airflow Lineage Backend and migrate status to millis

* Format

* chore(ui): update executions startTs and endTs to millis

* Remove lineage providers

---------

Co-authored-by: Sachin Chaurasiya <sachinchaurasiyachotey87@gmail.com>
2023-10-20 15:42:38 +02:00
Pere Miquel Brull
660bf01a5b
Fix Stored Procedures Lineage for multi-db processes (#13655) 2023-10-20 09:14:08 +02:00
Ayush Shah
ad86d8969f
Fix E2E failures (#13648) 2023-10-19 17:49:02 +05:30
Pere Miquel Brull
255bfb95b1
Remove duplicates from entity_extension_time_series and add the const… (#13626)
* Remove duplicates from entity_extension_time_series and add the constraint if missing

* Add sort buffer and work mem

* Revert "Add sort buffer and work mem"

This reverts commit fcfff5feb60c9212bb7c1cad34b524dc8c03bfc5.
2023-10-19 12:15:02 +02:00
Ayush Shah
f94e2dbb47
Fix Hive Bytes issue, add athena yaml, fix bigquery multiple project id token issue (#13640) 2023-10-18 23:48:21 +05:30
Ayush Shah
ac9e8c9e89
Add E2E - Oracle, Athena. Remove Duplicated code (#13563) 2023-10-18 16:57:06 +05:30
Teddy
31d2595e4f
fix: pass rnd table bound columns to sample query (#13561) 2023-10-13 14:57:28 +05:30
Teddy
1cbdfb3ae7
Fixes #12601 - column filter for profiler workflow (#13535)
* fix: sample data ingestion to match entity profiler column setting

* fix: python linting

* fix: updated fn call

* fix: added logic to handle json field in datalake connector

* fix: handle NA values in parsing

* fix: reverted sampler changes from #13338

* fix: reverted metric changes from #13338

* fix: added datalake profiler ingestion test

* fix: python linting

* fix: removed normalization of json blob in NoSQL db
2023-10-12 14:51:38 +02:00
Mayur Singal
f63881b8b6
Fix mysql E2E test count (#13529) 2023-10-12 11:25:14 +05:30
Onkar Ravgan
6e013246a7
dbt fixed null sql updates and source descriptions (#13467) 2023-10-12 11:07:58 +05:30
Teddy
e57849b732
Fixes #12298 - Update report data type to camel case (#13505)
* fix: updated DI to camelCase

* fix: ran linting

* fix: added migration

* fix: remove extra parenthesis in migration file

* fix: psql migration query

* fix: OS compose host

* fix: removed commented code block
2023-10-11 08:14:21 +02:00
Mayur Singal
f69cd9f54a
Fix hive e2e test count (#13497) 2023-10-10 00:21:23 -07:00
Pere Miquel Brull
d3da2d1b9f
Register Ingestion pipelines just from YAML (#13501)
* Register Ingestion pipelines just from YAML

* Format
2023-10-10 07:04:04 +02:00
Pere Miquel Brull
f5e10c4a5f
Fix #7272 - BaseWorkflow docs and cleanup (#13471)
* DQ BaseWorkflow

* Test suite runner

* test Suite workflow

* Refactor DQ for BaseWorkflow

* Lint

* Fix source

* Fix source

* Fix source

* Fix source

* Fix test

* Prepare docs

* Clean sink

* Clean legacy classes

* typo

* ProcessorStatus
2023-10-09 07:05:05 +02:00
Ayush Shah
08d7ee6d55
Fixes #13052: Datalake Nested Columns Sample Data ingestion (#13338) 2023-10-08 20:08:51 +05:30
Pere Miquel Brull
aed9e3875f
DQ base workflow (#13454)
* DQ BaseWorkflow

* Test suite runner

* test Suite workflow

* Refactor DQ for BaseWorkflow

* Lint

* Fix source

* Fix source

* Fix source

* Fix source

* Fix test

* Fix test

* Fix test
2023-10-06 18:29:18 +02:00
Mayur Singal
0090286924
Fix Bigquery Test connection for multiproject (#13380)
Co-authored-by: Ayush Shah <ayush@getcollate.io>
2023-10-05 14:50:42 +05:30
Teddy
ddae3d8143
Refactor Data Insight aggregators Classes (#13433)
* fix: removed legacy OS and ES aggregator classes

* fix: centralized aggregator business logic

* fix: implemented client specific aggregator

* fix: updated client instantiation to use client specific aggregator

* fix: clean up json schema

* fix: updated DI index names

* fix: added searchIndex + storedProcedure

* fix: ran linting

* fix: updated python test to include new entity types
2023-10-05 09:31:27 +02:00
Pere Miquel Brull
0282574bdd
Create ometa client once and pass it around & improve pycln config (#13310)
* Create ometa client once and pass it around & improve pycln config

* Fix

* Fix

* Fix tests

* Fix maven ci

* Fix tests

* Fix tests

* Fix tests

* Format

* Fix DI
2023-10-04 09:14:03 +02:00
Pere Miquel Brull
31b827585b
Allow ometa to create services without storing the connection (#13400)
* Allow ometa to create services without storing the connection

* Allow ometa to create services without storing the connection

* Fix backend tests with null connection
2023-10-04 07:48:49 +02:00
Mayur Singal
4f4d1c725c
Fix failing E2Es (#13419) 2023-10-04 10:56:34 +05:30
Pere Miquel Brull
b5596a4640
Batch PII tagging (#13385)
* Batch PII tagging

* Batch PII tagging

* Fix tests

* Fix tests
2023-10-02 14:44:41 +02:00
Pere Miquel Brull
d915254fac
Prepare Storage Connector for ADLS & Docs (#13376)
* Prepare Storage Connector for ADLS & Docs

* Format

* Fix test
2023-10-02 12:15:09 +02:00
Teddy
6ca71ae323
Issue 12679 - Handle Entity Object Instantiation Error + Refactor Workflow (#13384)
* feat: updated DI workflow to inherit from BaseWorkflow + split processor and producer classes

* feat: __init__.py files creation

* feat: updated workflow import classes in code and doc

* feat: moved kpi runner from runner to processor folder

* fix: skip failure on list entities

* feat: deleted unused files

* feat: updated status reporter

* feat: ran linting

* feat: fix test error with typing and fqn

* feat: updated test dependencies

* feat: ran linting

* feat: move execution order up

* feat: updated cost analysis report to align with new workflow

* feat: fix entity already exists for pipeline entity status

* feat: ran python linting

* feat: move skip_on_failure to method

* feat: ran linting

---------

Co-authored-by: Sriharsha Chintalapani <harshach@users.noreply.github.com>
2023-10-02 12:05:30 +02:00
Cristian Calugaru
5d8457b597
Fixes ISSUE-10587: global manifest option for storage services (#12017)
* global manifest option for storage services

* added a no metadata config source option for global manifest s3 services option

* merge fixes

* more merge fixes.

* black stuff

* test fixes

* formatting

---------

Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2023-09-28 07:55:40 +02:00
Ayush Shah
04760177f6
Fixes 13321: Fix Test Connection timeout (#13323) 2023-09-25 15:17:38 +05:30
Mayur Singal
65f65137e6
Fix: Bigquery query log not picked up for multiproject (#13313) 2023-09-25 08:07:48 +05:30
Teddy
a7dd7012ea
fix: python test to remove database race condition (#13307) 2023-09-22 15:05:57 +02:00
Teddy
e9ef7b5e81
Issue-12857: Remove ES Dependency from DI Workflow (#13303)
* feat: move elasticsearch indexing to backend + introduced EntityTimeSeries interface for timeseries type object

* feat: make reportData.json inherit from EntityTimeSeriesInterface

* feat: updated type to Object

* feat: deleted elasticsearch dependencies

* feat: removed elasticsearch indexing from workflow

* feat: added data insight sample data

* feat: cleaned up tests
2023-09-21 16:17:47 -07:00