60 Commits

Author SHA1 Message Date
Sriharsha Chintalapani
802438f0ea
Fix default boost score, improve fqn parsing (#21854)
* Fix explain turned on by default, use dfs_query_then_fetch in cases of sharding of search cluster

* Add exact match configs

* Add exact match configs

* Update Logic to build search source builder with exact match priority

* Revert "Update Logic to build search source builder with exact match priority"

This reverts commit 175a2e9c6b67ee90d4b2a35af89bb035e8c45131.

* Revert "Add exact match configs"

This reverts commit 3fd52606610bbb97a676170004cab6d7adc31a0d.

* revert display name change

* make boost mode as sum by default

* add more fqnparts for schema and database

* revert DFS_QUERY_THEN_FETCH since sharding wasn't the issue

* use fqn split

* refactor fqn parsing

---------

Co-authored-by: mohitdeuex <mohit.y@deuexsolutions.com>
2025-06-18 18:56:11 -07:00
Sriharsha Chintalapani
8adda4955c
Revert "Issues in Search Relevancy (#21841)" (#21853)
This reverts commit f388e570c1dac5b9eee31364870fb66e42715f18.
2025-06-18 16:43:34 -07:00
Mohit Yadav
f388e570c1
Issues in Search Relevancy (#21841)
* Fix explain turned on by default, use dfs_query_then_fetch in cases of sharding of search cluster

* Add exact match configs

* Add exact match configs

* Update Logic to build search source builder with exact match priority

* Revert "Update Logic to build search source builder with exact match priority"

This reverts commit 175a2e9c6b67ee90d4b2a35af89bb035e8c45131.

* Revert "Add exact match configs"

This reverts commit 3fd52606610bbb97a676170004cab6d7adc31a0d.

* revert display name change

* make boost mode as sum by default

* add more fqnparts for schema and database

* revert DFS_QUERY_THEN_FETCH since sharding wasn't the issue

* use fqn split

* Refactor FQN Parts

---------

Co-authored-by: Sriharsha Chintalapani <harsha@getcollate.io>
Co-authored-by: Sriharsha Chintalapani <harshach@users.noreply.github.com>
2025-06-18 16:33:46 -07:00
Sriharsha Chintalapani
c90138501f
Fix #21822: OpenSearch by default limits the number of characters it will analyze for highlighting to 1,000,000 characters. If your description field is very large (e.g. Markdown docs, embedded HTML, or verbose documentation), this limit gets exceeded. (#21821)
* Add sample data

* Fix index mappings to optimize the highlighter for OpenSearch
2025-06-17 14:22:11 -07:00
Sriharsha Chintalapani
074329418f
Fix #17244: Pagination for columns in UI (#21508) 2025-06-15 21:30:31 +05:30
Teddy
7ab6755beb
ISSUE #21101 - Implement BQ Partitioned Tests (#21348)
* feat: add query logger as an event listener in debug mode

* fix: added ingestion.src plugin to pylint

* minor: add partition sampled table

* test: added test for partitioned BQ table

* Remove log_query function from logger.py

* style: ran python linting
2025-05-22 17:22:05 +02:00
Mayur Singal
d30fd90096
Minor: Query Cost Table Aggregation Endpoint (#20270) 2025-03-17 11:33:50 +05:30
Mayur Singal
b727f76ce4
MINOR: Add source url in sample data (#20133) 2025-03-14 11:42:51 +05:30
Mayur Singal
f291e5cb1d
MINOR: Add cost in sample usage queries (#20211) 2025-03-13 19:27:03 +01:00
Sriharsha Chintalapani
799e49e391
Search: improve relevancy for plural/singular words, partial matches,… (#20000)
* Search: improve relevancy for plural/singular words, partial matches, exact matches

* apply to all indexes

* Fix other query patterns

* Revert changes of database and databaseSchema fields in TableIndex.getFields() and table index mapping

* add missing boost query builder in es

* fix ci

* add max_ngram_diff setting in di-assets index

* fix TestCaseResourceTest mvn test failure

---------

Co-authored-by: sonikashah <sonikashah94@gmail.com>
Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2025-02-27 16:47:08 +01:00
Ayush Shah
6f1df37ba1
Fixes GEN-1260: Add Validators while creating table to escape special characters (#18456) 2024-11-18 15:02:57 +05:30
Ayush Shah
23c6aa3ca5
fix: add children to array of struct (#17850) 2024-09-16 18:02:17 +05:30
Pere Miquel Brull
d8e2187980
#15243 - Pydantic V2 & Airflow 2.9 (#16480)
* pydantic v2

* pydanticv2

* fix parser

* fix annotated

* fix model dumping

* mysql ingestion

* clean root models

* clean root models

* bump airflow

* bump airflow

* bump airflow

* optionals

* optionals

* optionals

* jdk

* airflow migrate

* fab provider

* fab provider

* fab provider

* some more fixes

* fixing tests and imports

* model_dump and model_validate

* model_dump and model_validate

* model_dump and model_validate

* union

* pylint

* pylint

* integration tests

* fix CostAnalysisReportData

* integration tests

* tests

* missing defaults

* missing defaults
2024-06-05 21:18:37 +02:00
Suman Maharana
488078da8a
Add DDL query ingest (#15860) 2024-05-06 18:03:50 +05:30
Ashish Gupta
42463ff40b
#14134: supported retention period in table entity (#14163)
* supported retention period in table entity

* Add retention period updates

* supported unit test

* added CRUD operation for retention center

* minor changes

* fix modal issues and added validation

* added unit test for retention period

* fix code smell

* fix sonarcloud

* minor changes

* Fix java code styling

* added hours in retention period

* changes as per comments

* fix sonar

* remove localization keys

---------

Co-authored-by: Sriharsha Chintalapani <harsha@getcollate.io>
2023-12-05 10:42:37 +05:30
Teddy
8ff70e31fe
MINOR: fixes ingestion of sample data for custom metrics (#14170)
* fix: updated sample data ingestion for custom metrics

* style: fix python linting

* style: fix java linting
2023-11-29 17:28:19 +01:00
Teddy
131eea32f8
Add Custom metrics sample data (#14069)
* feat: add sample data for custom metrics

* feat: update project.scripts to `metadata`
2023-11-22 11:57:19 +01:00
Onkar Ravgan
9d58b56a1c
Added stored procedures sample data (#13838)
* Added stored proc sample data

* Added sp lineage
2023-11-03 14:05:02 +01:00
07Himank
74e29a9f16
Generic search changes. (#13326)
* working on new search changes

* working on new search changes

* working

* working

* owner propagation done

* working on propagation

* done

* change in storageservice index

* Merge conflict fix

* Draft changes

* working on making updates generic

* added code to opensearchClientImp

* renamed suppportsSearchIndex to supportsSearch

* checkstyle

* added generic code for deleted as well

* fix tests

* fix all tests

* addressing comments

* fixed test case failure

* Fix lifecycle validation error name typo

* fix related domain propagation

---------

Co-authored-by: Sriharsha Chintalapani <harsha@getcollate.io>
Co-authored-by: Ayush Shah <ayush@getcollate.io>
Co-authored-by: Sriharsha Chintalapani <harshach@users.noreply.github.com>
2023-09-27 10:48:33 -07:00
Sriharsha Chintalapani
e0ecf49585
Fix #11970: Search by FQN; Refactor Search Indexing, Add API to searc… (#13271)
* Fix #11970: Search by FQN; Refactor Search Indexing, Add API to search for specific field

* Fix #11970: Search by FQN; Refactor Search Indexing, Add API to search for specific field

* Fix #11970: Search by FQN; Refactor Search Indexing, Add API to search for specific field

* Fix #11970: Search by FQN; Refactor Search Indexing, Add API to search for specific field

* Fix #11970: Search by FQN; Refactor Search Indexing, Add API to search for specific field

* Fix #11970: Search by FQN; Refactor Search Indexing, Add API to search for specific field

* Add wildcard support

* Fix GlossaryTerm Patch
2023-09-20 14:40:10 +05:30
Sriharsha Chintalapani
19b5c946a7
Fix #12167: Support for Stored Procedures as another entity under Database Schema (#12999)
* Add Stored Procedure Entity

* Stored Procedure repository

* Stored Procedure repository

* Fix #12998: Support for Stored Procedures as another entity under Database Schema

* Fix #12998: Support for Stored Procedures as another entity under Database Schema
2023-08-25 08:14:30 +02:00
Pere Miquel Brull
6773541d15
[1.1.1] - Bump size for FQN (#12092)
* Bump size for FQN

* Bump table entityName size

* Bump table entityName size

* Fix table resource tests

* Remove pattern from fqn

* Remove pattern from fqn

* Remove pattern from fqn

* Generalize get_by_name in ometa client

* Generalize get_by_name in ometa client

* Format

* Fix test suite

* Remove limit from FQN max size

* Remove limit from FQN max size

* Add sample data

* Update lint names

* Add more sample data

* Bump column name size

* 1024 max FQN length

* 1024 max FQN length

* 1024 max FQN length

* Bump FQN
2023-07-26 12:36:42 -07:00
Ayush Shah
65f370e4aa
Rename GCS to GCP (#11812) 2023-06-06 11:57:00 +05:30
Pere Miquel Brull
019014b8d3
11579 Task - Add () in sample data (#11591) 2023-05-15 10:09:57 +02:00
Sriharsha Chintalapani
d9e4fbdebb
Fix #10454: Improve Search Relevancy, by adding functional scoring an… (#10455)
* Fix #10454: Improve Search Relevancy, by adding functional scoring and add ngram analyzer; Fix #10452: Enable Table and Column search by BOTH name and displayName

* fix stylecheck

* Undo changes in table example names

* remove ngram from teams & users

* Fix topic tags

* Fix #10429: Kafka Sample data improvements and adding support for JSONSchema and Protobuf (#10430)

* Fix #10429: Kafka Sample data improvements and adding support for JSONSchema and Protobuf

* Fix #10429: Kafka Sample data improvements and adding support for JSONSchema and Protobuf

* Fix #10429: Kafka Sample data improvements and adding support for JSONSchema and Protobuf

* Fix #10429: Kafka Sample data improvements and adding support for JSONSchema and Protobuf

* Added top level parsing and unit tests

* fix(ui): show schemaText and fields both

* fix no data placeholder for fields & schema text

* addressing comments

* fixed py checkstyle

---------

Co-authored-by: Onkar Ravgan <onkar.10r@gmail.com>
Co-authored-by: Chirag Madlani <12962843+chirag-madlani@users.noreply.github.com>

* revert common_broker_source changes

* revert common_broker_source changes

* remove changes to user & team indexes

* fix team index

* fix glossary & tag index

* Fix to TopicIndex

* fix advance search pre-requisites cypress failure

* fix group advance search cy failures

---------

Co-authored-by: Nahuel <nahuel@getcollate.io>
Co-authored-by: Onkar Ravgan <onkar.10r@gmail.com>
Co-authored-by: Chirag Madlani <12962843+chirag-madlani@users.noreply.github.com>
2023-03-09 21:37:08 +05:30
NiharDoshi99
03d4011a17
Fix: Changes in BigQuery for project-id (#8708) 2022-11-17 14:26:37 +05:30
Sriharsha Chintalapani
1a42428e42
Add time series extension (#6416)
Co-authored-by: Vivek Ratnavel Subramanian <vivekratnavel90@gmail.com>
Co-authored-by: Teddy <teddy.crepineau@gmail.com>
Co-authored-by: Shailesh Parmar <shailesh.parmar.webdev@gmail.com>
2022-08-04 07:22:47 -07:00
Milan Bariya
45e6c2c7d7
Sample Data Ingestion use Create request (#6546)
* Sample Data Ingestion use Create request

* Fix: Code smell

* Fix: make_pyformat

* Fix: Changed based on comments
2022-08-04 11:13:48 +02:00
Pere Miquel Brull
9ca7a75e2d
Add col lineage in sample usage (#5790) 2022-06-30 12:16:06 +02:00
Pere Miquel Brull
0ecc9f0da6
Fix #5459 - Remove sql-metadata in favor of sqllineage (#5494)
2022-06-21 18:02:50 +02:00
Pere Miquel Brull
8e9d0a73f6
Fix #3573 - Sample Data refactor & ORM converter improvements (#5265)
2022-06-08 16:10:40 +02:00
Mayur Singal
f9bb54ed91
Fix #5162: Sample usage fixed (#5163)
* Fix #5162: Sample usage fixed

* Test Fix
2022-05-26 15:15:41 +02:00
Vivek Ratnavel Subramanian
1c68748618
Fix #4542. Tables with a slash (/) in their name aren't appearing (#4717) 2022-05-04 23:30:39 -07:00
Mayur Singal
450fb2b132
SampleData Usage Fix (#4398)
* SampleData Test Connection & Usage Fix

* Fixed Pytest
2022-05-04 16:45:49 +02:00
Pere Miquel Brull
256b16d877
Fix #4032 - Bigquery properties & GCS Credentials (#4202)
2022-04-19 12:31:34 +02:00
Pere Miquel Brull
63415952e3
Fix sample data (#4200)
2022-04-18 19:00:36 +02:00
Pere Miquel Brull
6a6507e754
Fix #3962 - Profiler uses DatabaseSchema & Sample Data fix (#4056) 2022-04-12 13:40:59 +05:30
Sriharsha Chintalapani
2e870669e3 Fix #4042: Ingestion: Sample data ingestion is failing 2022-04-11 16:12:01 -07:00
Mayur Singal
7292695bd3
Sample Data Fix (#3888)
* Sample Data Fix
2022-04-06 18:26:54 +05:30
codingwithabhi
dbfe488a9e
Schema support fix (#3856)
Co-authored-by: ulixius9 <mayursingal9@gmail.com>
2022-04-05 10:51:42 -07:00
Sriharsha Chintalapani
b14c8dc2c4
Issue-3685: Variable based separator used for fullyQualifiedName instead of hardcoded . for Python and make the separator to : (#3778)
* Issue-3685: Variable based separator used for fullyQualifiedName instead of hardcoded . for Python and make the separator to :

* Fix failing test

* Use colon for run_local_docker validation

* Update tests FQDN

* Update tests FQDN

Co-authored-by: Sachin-chaurasiya <sachinchaurasiyachotey87@gmail.com>
Co-authored-by: pmbrull <peremiquelbrull@gmail.com>
2022-03-31 19:20:27 +05:30
Ayush Shah
a87ed206c0
Fix Redshift Usage (#3624) 2022-03-23 22:07:25 -07:00
Pere Miquel Brull
94d7500216
Fix #3248 & #3251 - Update metrics and column profile (#3262)
2022-03-08 11:44:39 +01:00
Pere Miquel Brull
bd7b91b448
Fix #3112 - col profile safety & sample data (#3142)
2022-03-04 13:14:11 +01:00
Pere Miquel Brull
71207de362
Fix #2875 - Profiler API Sink (#3011)
Fix #2875 - Profiler API Sink
2022-03-02 16:46:28 +01:00
Ayush Shah
d2c64007cb
Table Constraints Added - Ingestion (#2854) 2022-02-19 09:16:15 -08:00
Mayur Singal
44cca4d020
fix #2663: added table queries in sample data (#2763) 2022-02-15 15:45:58 +05:30
Ayush Shah
7013b070d2
Minor Data Type Display modification (#2654)
Characters modified from unicode for e.g. &lt; to <
2022-02-07 18:07:08 +05:30
Sriharsha Chintalapani
0d3ded0742
Data Profiler Integration (#2235)
* Fix 2234: Data profiler integration
2022-01-18 20:25:43 -08:00
Ayush Shah
7263510124
Query Log fixed (#1615)
* Query Log fixed

* Wrong ColumnName fixed

* Added Service Name in usage configs
2021-12-08 07:36:45 -08:00