167 Commits

Author SHA1 Message Date
sonika-shah
5d733b490c
Minor Fix : query_cost_record_search_index Search exception for elasticsearch instance (#21985)
* Fix : query_cost_record_search_index Search exception for elasticsearch instance

* add sample query to cover test scenarios

* update mapping and fix test
2025-06-28 11:22:34 +05:30
Ayush Shah
11ac56356b
MINOR: Modify Sample data (#21599) 2025-06-24 17:16:13 +05:30
Sriharsha Chintalapani
802438f0ea
Fix default boost score, improve fqn parsing (#21854)
* Fix explain turned by default, use dfs_query_then_fetch in cases of sharding of search cluster

* Add exact match configs

* Add exact match configs

* Update Logic to build search source builder with exact match priority

* Revert "Update Logic to build search source builder with exact match priority"

This reverts commit 175a2e9c6b67ee90d4b2a35af89bb035e8c45131.

* Revert "Add exact match configs"

This reverts commit 3fd52606610bbb97a676170004cab6d7adc31a0d.

* revert display name change

* make boost mode as sum by default

* add more fqnparts for schema and database

* revert DFS_QUERY_THEN_FETCH since sharding wasn't the issue

* use fqn split

* refactor fqn parsing

---------

Co-authored-by: mohitdeuex <mohit.y@deuexsolutions.com>
2025-06-18 18:56:11 -07:00
Sriharsha Chintalapani
8adda4955c
Revert "Issues in Search Relevancy (#21841)" (#21853)
This reverts commit f388e570c1dac5b9eee31364870fb66e42715f18.
2025-06-18 16:43:34 -07:00
Mohit Yadav
f388e570c1
Issues in Search Relevancy (#21841)
* Fix explain turned by default, use dfs_query_then_fetch in cases of sharding of search cluster

* Add exact match configs

* Add exact match configs

* Update Logic to build search source builder with exact match priority

* Revert "Update Logic to build search source builder with exact match priority"

This reverts commit 175a2e9c6b67ee90d4b2a35af89bb035e8c45131.

* Revert "Add exact match configs"

This reverts commit 3fd52606610bbb97a676170004cab6d7adc31a0d.

* revert display name change

* make boost mode as sum by default

* add more fqnparts for schema and database

* revert DFS_QUERY_THEN_FETCH since sharding wasn't the issue

* use fqn split

* Refactor FQN Parts

---------

Co-authored-by: Sriharsha Chintalapani <harsha@getcollate.io>
Co-authored-by: Sriharsha Chintalapani <harshach@users.noreply.github.com>
2025-06-18 16:33:46 -07:00
Sriharsha Chintalapani
c90138501f
Fix #21822: OpenSearch by default limits the number of characters it will analyze for highlighting to 1,000,000 characters. If your description field is very large (e.g. Markdown docs, embedded HTML, or verbose documentation), this limit gets exceeded. (#21821)
* Add sample data

* Fix index mappings to optimize the highlighter for OpenSearch
2025-06-17 14:22:11 -07:00
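
The 1,000,000-character highlighting limit described in the commit above can be illustrated with a minimal sketch. It assumes the limit corresponds to the `index.highlight.max_analyzed_offset` index setting; the helper names below are hypothetical, and this is not necessarily how the commit itself changed the index mappings.

```python
# Default highlight analysis limit, as stated in the commit message above.
DEFAULT_MAX_ANALYZED_OFFSET = 1_000_000


def exceeds_highlight_limit(description: str,
                            limit: int = DEFAULT_MAX_ANALYZED_OFFSET) -> bool:
    """Return True if the field text is longer than the highlight limit."""
    return len(description) > limit


def highlight_settings(limit: int) -> dict:
    """Build an index settings body that sets the highlight limit.

    Assumption: the limit is controlled by `index.highlight.max_analyzed_offset`.
    """
    return {"index": {"highlight": {"max_analyzed_offset": limit}}}


if __name__ == "__main__":
    big_description = "x" * 1_500_000  # e.g. a very large Markdown description
    print(exceeds_highlight_limit(big_description))
    print(highlight_settings(2_000_000))
```

Raising the limit is only one option; changing how the field is indexed for highlighting, as the commit's second bullet suggests, avoids analyzing very large fields at query time.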
Sriharsha Chintalapani
074329418f
Fix #17244: Pagination for columns in UI (#21508) 2025-06-15 21:30:31 +05:30
Teddy
7ab6755beb
ISSUE #21101 - Implement BQ Partitioned Tests (#21348)
* feat: add query logger as an event listener in debug mode

* fix: added ingestion.src plugin to pylint

* minor: add partition sampled table

* test: added test for partitioned BQ table

* Remove log_query function from logger.py

* style: ran python linting
2025-05-22 17:22:05 +02:00
Ayush Shah
653c878497
MINOR: Transform Reserved keywords like quotes to OM compatible (#20459) 2025-03-27 13:02:07 +05:30
Mayur Singal
d30fd90096
Minor: Query Cost Table Aggregation Endpoint (#20270) 2025-03-17 11:33:50 +05:30
Mayur Singal
b727f76ce4
MINOR: Add source url in sample data (#20133) 2025-03-14 11:42:51 +05:30
Mayur Singal
f291e5cb1d
MINOR: Add cost in sample usage queries (#20211) 2025-03-13 19:27:03 +01:00
harshsoni2024
de72e87bdf
mysql sample data createTable order fix (#20019) 2025-02-28 18:51:13 +05:30
Sriharsha Chintalapani
799e49e391
Search: improve relevancy for plural/singular words, partial matches,… (#20000)
* Search: improve relevancy for plural/singular words, partial matches, exact matches

* apply to all indexes

* Fix other query patterns

* Revert changes of database and databaseSchema fields in TableIndex.getFields() and table index mapping

* add missing boost query builder in es

* fix ci

* add max_ngram_diff setting in di-assets index

* fix TestCaseResourceTest mvn test failure

---------

Co-authored-by: sonikashah <sonikashah94@gmail.com>
Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2025-02-27 16:47:08 +01:00
Ayush Shah
17ffdf9850
fix: modify fqn to allow quotes with dots (#18719) 2024-11-22 09:33:50 +05:30
Ayush Shah
6f1df37ba1
Fixes GEN-1260: Add Validators while creating table to escape special characters (#18456) 2024-11-18 15:02:57 +05:30
Onkar Ravgan
4a0c8406e9
[ER Diagrams] Add ER diagram APIs and sample data (#18021)
* Add ER diag APIs and sample data

* fix pylint

* formatting fixes2

* fixed es client return

* fixed os client return

* supported TableDetailPage tabs as classBase for supporting collate only tabs

* Added schema Apis

* change the base class to .ts and move the component in the util files

* beautify function arguments

* Added optimizations

* Ingestion changes

* svg dimension change

* supported class base tab in databaseSchema

* supported classBase action button in schema table name column

* added further keys data for constraint modal

* fix sonar issue

* remove old method to override edit action on column and shifted to DisplayNameModal for fields

* supported table right panel component to further extends on collate side

* minor fix around duplicate constraint

* added support to update table constraints and column constraints in the UI

* code optimization and minor fixes

* review comments and multi col fix

* added queryFilter option in NodeSuggestion and tableConstraints to fetch and use only in service tables

---------

Co-authored-by: Ashish Gupta <ashish@getcollate.io>
2024-10-28 20:26:19 +05:30
Sachin Chaurasiya
457f3d919a
GEN-1322: API Entity - Remove Beta (#17967)
* GEN-1322: API Entity - Remove Beta

* minor: add doc for the metadata pipeline

* api service refactor

* api service refactor backend changes

* add apiconnection in test service connection

* pytest fix

* fix java file formatting

* Fix casing of REST in ApiServiceRest.spec.ts

* Refactor REST to Rest in API classes

* minor change

* minor change

* minor change

* fix casing for API to Api

* add playwright test for api service ingestion

* fix: playwright test

---------

Co-authored-by: harshsoni2024 <harshsoni2024@gmail.com>
2024-10-08 14:39:55 +05:30
Ayush Shah
23c6aa3ca5
fix: add children to array of struct (#17850) 2024-09-16 18:02:17 +05:30
Pere Miquel Brull
a098c20c7c
MINOR - LKML sample data (#17359) 2024-08-10 18:01:00 +02:00
Teddy
c336d86a44
chore: add "learning" sample data for dynamic assertion (#17155) 2024-07-25 08:24:15 +02:00
harshsoni2024
d5af2ba6b6
sample data for open metadata user & table apis (#16981) 2024-07-12 10:32:21 +05:30
harshsoni2024
0caa8a8018
API service sample data (#16971)
* add sample data for api service

* pylint fix

* correct service connection json

* fix sample data

---------

Co-authored-by: ulixius9 <mayursingal9@gmail.com>
Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>
2024-07-09 22:06:47 -07:00
Teddy
c361305902
MINOR - fix sample data ingestion (#16949)
* fix: sample data ingestion

* style: ran python linting
2024-07-09 10:18:55 +02:00
Imri Paran
2f0a8efceb
MINOR: added table diff sample results (#16844)
* chore(sample-data): table diff

added table diff results sample

* format
2024-07-01 15:28:14 +00:00
Teddy
38fe061227
MINOR -- Add Test Definition Dimension (#16769)
* feat: added test definition dimension + sample data for bounds

* chore: added migration for definition dimension

* style: ran python linting

* fix: rename dimension to dataQualityDimension

* fix: test definition dimension key
2024-06-24 15:01:12 +00:00
Pere Miquel Brull
d8e2187980
#15243 - Pydantic V2 & Airflow 2.9 (#16480)
* pydantic v2

* pydanticv2

* fix parser

* fix annotated

* fix model dumping

* mysql ingestion

* clean root models

* clean root models

* bump airflow

* bump airflow

* bump airflow

* optionals

* optionals

* optionals

* jdk

* airflow migrate

* fab provider

* fab provider

* fab provider

* some more fixes

* fixing tests and imports

* model_dump and model_validate

* model_dump and model_validate

* model_dump and model_validate

* union

* pylint

* pylint

* integration tests

* fix CostAnalysisReportData

* integration tests

* tests

* missing defaults

* missing defaults
2024-06-05 21:18:37 +02:00
Suman Maharana
488078da8a
Add DDL query ingest (#15860) 2024-05-06 18:03:50 +05:30
Imri Paran
4ac5912d4c
MINOR: added TestCase inspection query to backend and sample data (#16003)
* added TestCase inspection query to backend and sample data

* format

* format
2024-04-26 11:49:08 +02:00
Imri Paran
706d1ab97e
fixed ingestion of sample data for failed sample rows (#15879) 2024-04-15 07:59:27 +02:00
Imri Paran
b2ce491ff1
MINOR: Add failed rows sample to test case (#15682)
* add failed sample data

* format

* fixed masking pii data in test failed rows sample

* format

* failedRowsSamples -> failedRowsSample

* failedRowsSamples -> failedRowsSample

* fixed tests

* format

* wip

* added computePassedFailedRowCount to python client

* comment for loggerLevel

* format

* fixed tests

* tests for putting / deleting failed samples

* format

* format

* added test case for pii test

* changed method name to deleteTestCaseFailedRowsSample

* added getComputePassedFailedRowCount
2024-04-10 17:00:00 +02:00
Mayur Singal
07ea5e97c8
Minor: Postgres cypress move to query log file usage (#14750) 2024-01-17 14:26:57 +05:30
Pere Miquel Brull
0255171218
MINOR - Create Test Case Resolution ts entry & delete resolution when… (#14541)
* MINOR - Create Test Case Resolution ts entry & delete resolution when Test Case is deleted
2024-01-05 09:15:49 +01:00
Teddy
3bbf55fcda
FIXES #14049 - Split test case resolution status from test case result (#14204)
* refactor: entityFQN as ListFilter condition

* feat: implement resolution entity timeseries

* fix: rename to testCaseResolutionStatus

* ref: extracted ES query builder into private method

* ref: extract OS query builder in its own method

* ref: remove ingestion logic for test case resolution

* fix: reorganize json schemas to fix circular import in Python

* ref: object names in typescript code

* feat: added indexing of test case resolution

* feat: added test case resolution sample data

* fix: test case resolution api logic

* fix: audit logger for entityTimeSeriesInterface

* fix: DDL generation

* style: python linting

* fix: skip UI test case resolution tests

* fix: remove extension field

* fix: renamed testCaseFailureStatus to testCaseResolutionStatus

* fix: remove reviewer

* fix: rename sequenceId to stateId

* fix: re adjust search weights

* fix: removed InReview status

* style: ran python linting
2023-12-04 23:18:01 -08:00
Ashish Gupta
42463ff40b
#14134: supported retention period in table entity (#14163)
* supported retention period in table entity

* Add retention period updates

* supported unit test

* added CRUD operation for retention center

* minor changes

* fix modal issues and added validation

* added unit test for retention period

* fix code smell

* fix sonarcloud

* minor changes

* Fix java code styling

* added hours in retention period

* changes as per comments

* fix sonar

* remove localization keys

---------

Co-authored-by: Sriharsha Chintalapani <harsha@getcollate.io>
2023-12-05 10:42:37 +05:30
Teddy
8ff70e31fe
MINOR: fixes ingestion of sample data for custom metrics (#14170)
* fix: updated sample data ingestion for custom metrics

* style: fix python linting

* style: fix java linting
2023-11-29 17:28:19 +01:00
Teddy
131eea32f8
Add Custom metrics sample data (#14069)
* feat: add sample data for custom metrics

* feat: update project.scripts to `metadata`
2023-11-22 11:57:19 +01:00
Onkar Ravgan
9d58b56a1c
Added stored procedures sample data (#13838)
* Added stored proc sample data

* Added sp lineage
2023-11-03 14:05:02 +01:00
07Himank
6ffe79f793
fixed ES Indexing for very large S3 Storage Service buckets fails (#13507) 2023-10-13 10:22:53 +05:30
Teddy
e57849b732
Fixes #12298 - Update report data type to camel case (#13505)
* fix: updated DI to camelCase

* fix: ran linting

* fix: added migration

* fix: remove extra parenthesis in migration file

* fix: psql migration query

* fix: OS compose host

* fix: removed commented code block
2023-10-11 08:14:21 +02:00
Teddy
eefce68015
fix: updated DI cost analysis aggregated report (#13498) 2023-10-10 07:04:40 +02:00
Onkar Ravgan
1e48d2ecff
Added sd changes (#13446)
Co-authored-by: Ayush Shah <ayush@getcollate.io>
2023-10-05 12:24:32 +02:00
Teddy
9ef3ff7a58
Cost analysis agg (#13408)
* feat: updated DI workflow to inherit from BaseWorkflow + split processor and producer classes

* feat: __init__.py files creation

* feat: updated workflow import classes in code and doc

* feat: moved kpi runner from runner to processor folder

* fix: skip failure on list entities

* feat: deleted unused files

* feat: updated status reporter

* feat: ran linting

* feat: fix test error with typing and fqn

* feat: updated test dependencies

* feat: ran linting

* feat: move execution order up

* feat: updated cost analysis report to align with new workflow

* feat: fix entity already exists for pipeline entity status

* feat: ran python linting

* feat: move skip_on_failure to method

* feat: added unusedReport to DI

* feat: added aggregated unused report

* feat: ran linting

* feat: reverted compose file changes

---------

Co-authored-by: Sriharsha Chintalapani <harshach@users.noreply.github.com>
2023-10-03 09:27:18 +02:00
07Himank
74e29a9f16
Generic search changes. (#13326)
* working on new search changes

* working on new search changes

* working

* working

* owner propagation done

* working on propagation

* done

* change in storageservice index

* Merge conflict fix

* Draft changes

* working on making updates generic

* added code to opensearchClientImp

* renamed suppportsSearchIndex to supportsSearch

* checkstyle

* added generic code for deleted as well

* fix tests

* fix all tests

* addressing comments

* fixed test case failure

* Fix lifecycle validation error name typo

* fix related domain propagation

---------

Co-authored-by: Sriharsha Chintalapani <harsha@getcollate.io>
Co-authored-by: Ayush Shah <ayush@getcollate.io>
Co-authored-by: Sriharsha Chintalapani <harshach@users.noreply.github.com>
2023-09-27 10:48:33 -07:00
Teddy
e9ef7b5e81
Issue-12857: Remove ES Dependency from DI Workflow (#13303)
* feat: move elasticsearch indexing to backend + introduced EntityTimeSeries interface for timeseries type object

* feat: make reportData.json inherit from EntityTimeSeriesInterface

* feat: updated type to Object

* feat: deleted elasticsearch dependencies

* feat: removed elasticsearch indexing from workflow

* feat: added data insight sample data

* feat: cleaned up tests
2023-09-21 16:17:47 -07:00
Onkar Ravgan
8acebbb892
Update ingestion logic to use PATCH API for lifeCycle info (#13283) 2023-09-21 16:40:09 +05:30
Sriharsha Chintalapani
e0ecf49585
Fix #11970: Search by FQN; Refactor Search Indexing, Add API to searc… (#13271)
* Fix #11970: Search by FQN; Refactor Search Indexing, Add API to search for specific field

* Fix #11970: Search by FQN; Refactor Search Indexing, Add API to search for specific field

* Fix #11970: Search by FQN; Refactor Search Indexing, Add API to search for specific field

* Fix #11970: Search by FQN; Refactor Search Indexing, Add API to search for specific field

* Fix #11970: Search by FQN; Refactor Search Indexing, Add API to search for specific field

* Fix #11970: Search by FQN; Refactor Search Indexing, Add API to search for specific field

* Add wildcard support

* Fix GlossaryTerm Patch
2023-09-20 14:40:10 +05:30
Sriharsha Chintalapani
c2ed4f422f
Fix LifeCycle inconsistencies in Schema, make it common entity field (#13252)
* Fix LifeCycle inconsistencies in Schema; Add DELETE api

* set autocommit to true for non transactional

* make lifecycle common field for entities

* Add LifeCycle as common entity field

* Fix python life cycle code

* Fix search indexes

* remove unnecessary constant

* Add test back to entity resource test

* Fix lint

* Fix lint

* Fix lint

* Fix lint

* Add missing schema

---------

Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2023-09-19 14:03:57 +02:00
Onkar Ravgan
1e4d48a034
Added Life Cycle sample data and changed datetime to timestamp (#13141) 2023-09-13 10:59:19 +05:30
Mayur Singal
4e633877b3
Fix ElasticSearch Test Connection & Deploy (#13061) 2023-09-08 12:40:48 +05:30