feat(search): enable search initial customization (#7901)

david-leifker 2023-05-01 13:18:19 -05:00 committed by GitHub
parent b71baac0b8
commit ebb2af637f
29 changed files with 1135 additions and 126 deletions


@ -182,6 +182,153 @@ for integrations and programmatic use-cases.
### DataHub Blog
* [Using DataHub for Search & Discovery](https://blog.datahubproject.io/using-datahub-for-search-discovery-fa309089be22)
## Customizing Search
Search ranking, filtering, and queries can be fully customized using a search configuration YAML file.
This no-code solution can extend or replace the Elasticsearch-based search functionality. The
only limitation is that the information used in the query, ranking, or filtering must be present in the entity's document.
This includes `customProperties`, `tags`, `terms`, `domain`, and many additional fields.
Multiple customizations can also be defined, each applying to different query strings. A regex is applied to the search
query to determine which customized search profile to use. This means one query/ranking/filtering profile can be applied
to a `select all`/`*` query and a different one to a query that contains actual search terms.
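The profile selection described above can be sketched in plain Java. This is a minimal illustration, not the DataHub implementation: `ProfileSelector` and the profile names `select-all` and `default` are hypothetical, and the sketch assumes a profile is chosen only when its regex matches the entire query string, with profiles tried in the order they are listed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
class ProfileSelector {
    // Profiles are kept in declaration order so that the first match wins.
    private final List<Map.Entry<Pattern, String>> profiles = new ArrayList<>();

    ProfileSelector add(String queryRegex, String profileName) {
        profiles.add(Map.entry(Pattern.compile(queryRegex), profileName));
        return this;
    }

    // Returns the first profile whose regex matches the whole query, if any.
    Optional<String> select(String query) {
        return profiles.stream()
            .filter(e -> e.getKey().matcher(query).matches())
            .map(Map.Entry::getValue)
            .findFirst();
    }

    public static void main(String[] args) {
        ProfileSelector selector = new ProfileSelector()
            .add("\\*", "select-all") // matches only the literal query "*"
            .add(".*", "default");    // catch-all for everything else

        System.out.println(selector.select("*").orElse("none"));      // select-all
        System.out.println(selector.select("orders").orElse("none")); // default
    }
}
```

Because the first match wins, a catch-all `.*` profile placed before a more specific one would always take precedence over it.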
Search results (excluding select `*`) are a balance between relevancy and the scoring function. In
general, when trying to improve relevancy, focus on changing the query in the `boolQuery` section and rely on the
`functionScore` for surfacing the *importance* in the case of a relevancy tie. Consider the scenario
where a dataset named `orders` exists in multiple places. The relevancy of a dataset with the **name** `orders` and of one
with the **term** `orders` is the same; however, one location may be more important, and the function score can be used to prefer it.
**Note:** The customized query is passed through to Elasticsearch and must comply with its API; syntax errors are possible.
Knowledge of the Elasticsearch query language is required, and customized queries should be tested prior to
production deployment.
### Enable Custom Search
The following environment variables on GMS control whether a search configuration is enabled and the location of the
configuration file.
Enable Custom Search:
```shell
ELASTICSEARCH_QUERY_CUSTOM_CONFIG_ENABLED=true
```
Custom Search File Location:
```shell
ELASTICSEARCH_QUERY_CUSTOM_CONFIG_FILE=search_config.yml
```
The location of the configuration file can be on the Java classpath or the local filesystem. A default configuration
file is included with the GMS jar with the name `search_config.yml`.
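The lookup order (classpath first, then the local filesystem) can be sketched with JDK classes only. This is an illustration under stated assumptions rather than the GMS code: `ConfigLocator` is a hypothetical helper, and the sketch returns a raw stream where GMS goes on to parse the YAML.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration only.
class ConfigLocator {

    // Opens the location from the classpath if present, otherwise from the local filesystem.
    static InputStream open(String location) throws IOException {
        InputStream fromClasspath = ClassLoader.getSystemResourceAsStream(location);
        if (fromClasspath != null) {
            return fromClasspath;
        }
        Path path = Path.of(location);
        if (!Files.isRegularFile(path)) {
            throw new FileNotFoundException("Not found on classpath or filesystem: " + location);
        }
        return Files.newInputStream(path);
    }

    public static void main(String[] args) throws IOException {
        // Write a throwaway file so the filesystem fallback has something to find.
        Path file = Files.createTempFile("search_config", ".yml");
        Files.writeString(file, "queryConfigurations: []\n");
        try (InputStream in = open(file.toString())) {
            System.out.print(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

With this ordering, a file bundled on the classpath under the same name takes precedence over one on disk, so using a distinct name or an absolute path for a custom file avoids ambiguity.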
### Search Configuration
The search configuration yaml contains a simple list of configuration profiles selected using the `queryRegex`. If a
single profile is desired, a catch-all regex of `.*` can be used.
Each search configuration can be grouped into 4 general sections.
1. `queryRegex` - Selects the search customization by [regex matching](https://www.w3schools.com/java/java_regex.asp) against the search query string.
   *The first match is applied.*
2. Built-in query booleans - There are 3 built-in queries which can be individually enabled or disabled: the
   `simple query string`[[1]](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/query-dsl-simple-query-string-query.html),
   `match phrase prefix`[[2]](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/query-dsl-match-query-phrase-prefix.html), and
   `exact match`[[3]](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/query-dsl-term-query.html) queries,
   controlled by the booleans `simpleQuery`, `prefixMatchQuery`, and `exactMatchQuery` respectively.
3. `boolQuery` - The base Elasticsearch `boolean query`[[4]](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/query-dsl-bool-query.html).
   Any built-in queries enabled in item 2 above are added to the `should` section of this `boolean query`.
4. `functionScore` - The Elasticsearch `function score`[[5]](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/query-dsl-function-score-query.html#score-functions) section of the overall query.
### Examples
For simplicity, these examples use a match-all `queryRegex` of `.*` so that they apply to every search query.
#### Example 1: Ranking By Tags/Terms
Boost entities tagged `primary` or `gold`, as well as entities with an example glossary term (referenced by its UUID).
```yaml
queryConfigurations:
  - queryRegex: .*

    simpleQuery: true
    prefixMatchQuery: true
    exactMatchQuery: true

    functionScore:
      functions:

        - filter:
            terms:
              tags.keyword:
                - urn:li:tag:primary
                - urn:li:tag:gold
          weight: 3.0

        - filter:
            terms:
              glossaryTerms.keyword:
                - urn:li:glossaryTerm:9afa9a59-93b2-47cb-9094-aa342eec24ad
          weight: 3.0

      score_mode: multiply
      boost_mode: multiply
```
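To make the effect of `score_mode: multiply` and `boost_mode: multiply` concrete, the arithmetic can be sketched as follows. This is a simplified model, not Elasticsearch code: `FunctionScoreSketch` and `WeightedFilter` are hypothetical names, the query score of `2.0` is an invented input, and the sketch assumes that functions whose filter does not match leave the score unchanged.

```java
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical model for illustration only.
class FunctionScoreSketch {

    // A weight that applies when the entity carries at least one of the listed URNs.
    static final class WeightedFilter {
        final Set<String> urns;
        final double weight;

        WeightedFilter(Set<String> urns, double weight) {
            this.urns = urns;
            this.weight = weight;
        }

        boolean matches(Set<String> entityUrns) {
            return !Collections.disjoint(urns, entityUrns);
        }
    }

    static double finalScore(double queryScore, Set<String> entityUrns, List<WeightedFilter> functions) {
        double functionScore = 1.0; // neutral value when no function matches (assumption)
        for (WeightedFilter f : functions) {
            if (f.matches(entityUrns)) {
                functionScore *= f.weight; // score_mode: multiply
            }
        }
        return queryScore * functionScore; // boost_mode: multiply
    }

    public static void main(String[] args) {
        String term = "urn:li:glossaryTerm:9afa9a59-93b2-47cb-9094-aa342eec24ad";
        List<WeightedFilter> functions = List.of(
            new WeightedFilter(Set.of("urn:li:tag:primary", "urn:li:tag:gold"), 3.0),
            new WeightedFilter(Set.of(term), 3.0));

        System.out.println(finalScore(2.0, Set.of("urn:li:tag:gold", term), functions)); // 18.0
        System.out.println(finalScore(2.0, Set.of("urn:li:tag:primary"), functions));    // 6.0
        System.out.println(finalScore(2.0, Set.of("urn:li:tag:other"), functions));      // 2.0
    }
}
```

Under these assumptions, an entity matching both filters is boosted ninefold over its query score, while an entity matching neither keeps its original relevancy score.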
#### Example 2: Preferred Data Platform
Boost the `urn:li:dataPlatform:hive` platform.
```yaml
queryConfigurations:
  - queryRegex: .*

    simpleQuery: true
    prefixMatchQuery: true
    exactMatchQuery: true

    functionScore:
      functions:
        - filter:
            terms:
              platform.keyword:
                - urn:li:dataPlatform:hive
          weight: 3.0
      score_mode: multiply
      boost_mode: multiply
```
#### Example 3: Exclusion & Bury
This configuration extends the 3 built-in queries with a rule that excludes `deprecated` entities from search results,
since they are generally not relevant, and reduces the score of `materialized` entities.
```yaml
queryConfigurations:
  - queryRegex: .*

    simpleQuery: true
    prefixMatchQuery: true
    exactMatchQuery: true

    boolQuery:
      must_not:
        term:
          deprecated:
            value: true

    functionScore:
      functions:
        - filter:
            term:
              materialized:
                value: true
          weight: 0.5
      score_mode: multiply
      boost_mode: multiply
```
## FAQ and Troubleshooting
**How are the results ordered?**


@ -0,0 +1,47 @@
package com.linkedin.metadata.config.search;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.linkedin.metadata.config.search.custom.CustomSearchConfiguration;
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.extern.slf4j.Slf4j;
import org.springframework.core.io.ClassPathResource;
import org.springframework.core.io.FileSystemResource;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
@Data
@AllArgsConstructor
@Slf4j
public class CustomConfiguration {
private boolean configEnabled;
private String configFile;
/**
* Materialize the search configuration from a location external to main application.yml
* @param mapper yaml enabled jackson mapper
* @return search configuration class
* @throws IOException if the configuration file cannot be read or parsed
*/
public CustomSearchConfiguration customSearchConfiguration(ObjectMapper mapper) throws IOException {
if (configEnabled) {
log.info("Custom search configuration enabled.");
try (InputStream stream = new ClassPathResource(configFile).getInputStream()) {
log.info("Custom search configuration found in classpath: {}", configFile);
return mapper.readValue(stream, CustomSearchConfiguration.class);
} catch (FileNotFoundException e) {
try (InputStream stream = new FileSystemResource(configFile).getInputStream()) {
log.info("Custom search configuration found in filesystem: {}", configFile);
return mapper.readValue(stream, CustomSearchConfiguration.class);
}
}
} else {
log.info("Custom search configuration disabled.");
return null;
}
}
}


@ -9,5 +9,6 @@ public class SearchConfiguration {
private int maxTermBucketSize;
private ExactMatchConfiguration exactMatch;
private PartialConfiguration partial;
private CustomConfiguration custom;
private GraphQueryConfiguration graph;
}


@ -0,0 +1,27 @@
package com.linkedin.metadata.config.search.custom;
import com.fasterxml.jackson.databind.annotation.JsonDeserialize;
import com.fasterxml.jackson.databind.annotation.JsonPOJOBuilder;
import lombok.Builder;
import lombok.EqualsAndHashCode;
import lombok.Getter;
import lombok.ToString;
@Builder(toBuilder = true)
@Getter
@ToString
@EqualsAndHashCode
@JsonDeserialize(builder = BoolQueryConfiguration.BoolQueryConfigurationBuilder.class)
public class BoolQueryConfiguration {
private Object must;
private Object should;
//CHECKSTYLE:OFF
private Object must_not;
//CHECKSTYLE:ON
private Object filter;
@JsonPOJOBuilder(withPrefix = "")
public static class BoolQueryConfigurationBuilder {
}
}


@ -0,0 +1,23 @@
package com.linkedin.metadata.config.search.custom;
import com.fasterxml.jackson.databind.annotation.JsonDeserialize;
import com.fasterxml.jackson.databind.annotation.JsonPOJOBuilder;
import lombok.Builder;
import lombok.EqualsAndHashCode;
import lombok.Getter;
import java.util.List;
@Builder(toBuilder = true)
@Getter
@EqualsAndHashCode
@JsonDeserialize(builder = CustomSearchConfiguration.CustomSearchConfigurationBuilder.class)
public class CustomSearchConfiguration {
private List<QueryConfiguration> queryConfigurations;
@JsonPOJOBuilder(withPrefix = "")
public static class CustomSearchConfigurationBuilder {
}
}


@ -0,0 +1,103 @@
package com.linkedin.metadata.config.search.custom;
import com.fasterxml.jackson.annotation.JsonInclude;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.annotation.JsonDeserialize;
import com.fasterxml.jackson.databind.annotation.JsonPOJOBuilder;
import lombok.Builder;
import lombok.EqualsAndHashCode;
import lombok.Getter;
import lombok.ToString;
import lombok.extern.slf4j.Slf4j;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.xcontent.LoggingDeprecationHandler;
import org.elasticsearch.common.xcontent.NamedXContentRegistry;
import org.elasticsearch.common.xcontent.XContentParser;
import org.elasticsearch.common.xcontent.XContentType;
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.functionscore.FunctionScoreQueryBuilder;
import org.elasticsearch.search.SearchModule;
import java.io.IOException;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
@Slf4j
@Builder(toBuilder = true)
@Getter
@ToString
@EqualsAndHashCode
@JsonDeserialize(builder = QueryConfiguration.QueryConfigurationBuilder.class)
public class QueryConfiguration {
private static final ObjectMapper OBJECT_MAPPER = new ObjectMapper();
static {
OBJECT_MAPPER.setSerializationInclusion(JsonInclude.Include.NON_NULL);
}
private static final NamedXContentRegistry X_CONTENT_REGISTRY;
static {
SearchModule searchModule = new SearchModule(Settings.EMPTY, false, Collections.emptyList());
X_CONTENT_REGISTRY = new NamedXContentRegistry(searchModule.getNamedXContents());
}
private String queryRegex;
@Builder.Default
private boolean simpleQuery = true;
@Builder.Default
private boolean exactMatchQuery = true;
@Builder.Default
private boolean prefixMatchQuery = true;
private BoolQueryConfiguration boolQuery;
private Map<String, Object> functionScore;
public FunctionScoreQueryBuilder functionScoreQueryBuilder(QueryBuilder queryBuilder) {
return toFunctionScoreQueryBuilder(queryBuilder, functionScore);
}
public Optional<BoolQueryBuilder> boolQueryBuilder(String query) {
if (boolQuery != null) {
log.debug("Using custom query configuration queryRegex: {}", queryRegex);
}
return Optional.ofNullable(boolQuery).map(bq -> toBoolQueryBuilder(query, bq));
}
@JsonPOJOBuilder(withPrefix = "")
public static class QueryConfigurationBuilder {
}
private static BoolQueryBuilder toBoolQueryBuilder(String query, BoolQueryConfiguration boolQuery) {
try {
String jsonFragment = OBJECT_MAPPER.writeValueAsString(boolQuery)
.replace("\"{{query_string}}\"", OBJECT_MAPPER.writeValueAsString(query));
XContentParser parser = XContentType.JSON.xContent().createParser(X_CONTENT_REGISTRY,
LoggingDeprecationHandler.INSTANCE, jsonFragment);
return BoolQueryBuilder.fromXContent(parser);
} catch (IOException e) {
throw new RuntimeException(e);
}
}
private static FunctionScoreQueryBuilder toFunctionScoreQueryBuilder(QueryBuilder queryBuilder,
Map<String, Object> params) {
try {
HashMap<String, Object> body = new HashMap<>(params);
if (!body.isEmpty()) {
log.debug("Using custom scoring functions: {}", body);
}
body.put("query", OBJECT_MAPPER.readValue(queryBuilder.toString(), Map.class));
String jsonFragment = OBJECT_MAPPER.writeValueAsString(Map.of(
"function_score", body
));
XContentParser parser = XContentType.JSON.xContent().createParser(X_CONTENT_REGISTRY,
LoggingDeprecationHandler.INSTANCE, jsonFragment);
return (FunctionScoreQueryBuilder) FunctionScoreQueryBuilder.parseInnerQueryBuilder(parser);
} catch (IOException e) {
throw new RuntimeException(e);
}
}
}


@ -4,6 +4,7 @@ import com.codahale.metrics.Timer;
import com.datahub.util.exception.ESQueryException;
import com.fasterxml.jackson.core.type.TypeReference;
import com.linkedin.metadata.config.search.SearchConfiguration;
import com.linkedin.metadata.config.search.custom.CustomSearchConfiguration;
import com.linkedin.metadata.models.EntitySpec;
import com.linkedin.metadata.models.registry.EntityRegistry;
import com.linkedin.metadata.query.AutoCompleteResult;
@ -53,6 +54,8 @@ public class ESSearchDAO {
private final String elasticSearchImplementation;
@Nonnull
private final SearchConfiguration searchConfiguration;
@Nullable
private final CustomSearchConfiguration customSearchConfiguration;
public long docCount(@Nonnull String entityName) {
EntitySpec entitySpec = entityRegistry.getEntitySpec(entityName);
@ -75,7 +78,9 @@ public class ESSearchDAO {
log.debug("Executing request {}: {}", id, searchRequest);
final SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
// extract results, validated against document model as well
return SearchRequestHandler.getBuilder(entitySpec, searchConfiguration).extractResult(searchResponse, filter, from, size);
return SearchRequestHandler
.getBuilder(entitySpec, searchConfiguration, customSearchConfiguration)
.extractResult(searchResponse, filter, from, size);
} catch (Exception e) {
log.error("Search query failed", e);
throw new ESQueryException("Search query failed:", e);
@ -91,7 +96,9 @@ public class ESSearchDAO {
try (Timer.Context ignored = MetricUtils.timer(this.getClass(), "executeAndExtract_scroll").time()) {
final SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
// extract results, validated against document model as well
return SearchRequestHandler.getBuilder(entitySpecs, searchConfiguration).extractScrollResult(searchResponse,
return SearchRequestHandler
.getBuilder(entitySpecs, searchConfiguration, customSearchConfiguration)
.extractScrollResult(searchResponse,
filter, scrollId, keepAlive, size, supportsPointInTime());
} catch (Exception e) {
if (e instanceof ElasticsearchStatusException) {
@ -126,8 +133,9 @@ public class ESSearchDAO {
Timer.Context searchRequestTimer = MetricUtils.timer(this.getClass(), "searchRequest").time();
EntitySpec entitySpec = entityRegistry.getEntitySpec(entityName);
// Step 1: construct the query
final SearchRequest searchRequest = SearchRequestHandler.getBuilder(entitySpec, searchConfiguration)
.getSearchRequest(finalInput, postFilters, sortCriterion, from, size, searchFlags);
final SearchRequest searchRequest = SearchRequestHandler
.getBuilder(entitySpec, searchConfiguration, customSearchConfiguration)
.getSearchRequest(finalInput, postFilters, sortCriterion, from, size, searchFlags);
searchRequest.indices(indexConvention.getIndexName(entitySpec));
searchRequestTimer.stop();
// Step 2: execute the query and extract results, validated against document model as well
@ -148,7 +156,9 @@ public class ESSearchDAO {
@Nullable SortCriterion sortCriterion, int from, int size) {
EntitySpec entitySpec = entityRegistry.getEntitySpec(entityName);
final SearchRequest searchRequest =
SearchRequestHandler.getBuilder(entitySpec, searchConfiguration).getFilterRequest(filters, sortCriterion, from, size);
SearchRequestHandler
.getBuilder(entitySpec, searchConfiguration, customSearchConfiguration)
.getFilterRequest(filters, sortCriterion, from, size);
searchRequest.indices(indexConvention.getIndexName(entitySpec));
return executeAndExtract(entitySpec, searchRequest, filters, from, size);
}
@ -252,8 +262,9 @@ public class ESSearchDAO {
}
// Step 1: construct the query
final SearchRequest searchRequest = SearchRequestHandler.getBuilder(entitySpecs, searchConfiguration)
.getSearchRequest(finalInput, postFilters, sortCriterion, sort, pitId, keepAlive, size, searchFlags);
final SearchRequest searchRequest = SearchRequestHandler
.getBuilder(entitySpecs, searchConfiguration, customSearchConfiguration)
.getSearchRequest(finalInput, postFilters, sortCriterion, sort, pitId, keepAlive, size, searchFlags);
// PIT specifies indices in creation so it doesn't support specifying indices on the request, so we only specify if not using PIT
if (!supportsPointInTime()) {


@ -0,0 +1,43 @@
package com.linkedin.metadata.search.elasticsearch.query.request;
import com.linkedin.metadata.config.search.custom.QueryConfiguration;
import com.linkedin.metadata.config.search.custom.CustomSearchConfiguration;
import lombok.Builder;
import lombok.Getter;
import lombok.extern.slf4j.Slf4j;
import javax.annotation.Nullable;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
@Slf4j
@Builder(builderMethodName = "hiddenBuilder")
@Getter
public class CustomizedQueryHandler {
private CustomSearchConfiguration customSearchConfiguration;
@Builder.Default
private List<Map.Entry<Pattern, QueryConfiguration>> queryConfigurations = List.of();
public Optional<QueryConfiguration> lookupQueryConfig(String query) {
return queryConfigurations.stream()
.filter(e -> e.getKey().matcher(query).matches())
.map(Map.Entry::getValue)
.findFirst();
}
public static CustomizedQueryHandlerBuilder builder(@Nullable CustomSearchConfiguration customSearchConfiguration) {
CustomizedQueryHandlerBuilder builder = hiddenBuilder()
.customSearchConfiguration(customSearchConfiguration);
if (customSearchConfiguration != null) {
builder.queryConfigurations(customSearchConfiguration.getQueryConfigurations().stream()
.map(cfg -> Map.entry(Pattern.compile(cfg.getQueryRegex()), cfg))
.collect(Collectors.toList()));
}
return builder;
}
}


@ -4,6 +4,7 @@ import com.linkedin.metadata.models.SearchableFieldSpec;
import com.linkedin.metadata.models.annotation.SearchableAnnotation;
import lombok.Builder;
import lombok.Getter;
import lombok.experimental.Accessors;
import javax.annotation.Nonnull;
@ -16,6 +17,7 @@ import static com.linkedin.metadata.search.elasticsearch.indexbuilder.SettingsBu
@Builder
@Getter
@Accessors(fluent = true)
public class SearchFieldConfig {
public static final float DEFAULT_BOOST = 1.0f;
@ -61,41 +63,47 @@ public class SearchFieldConfig {
@Nonnull
private final String fieldName;
@Nonnull
private final String shortName;
@Builder.Default
private final Float boost = DEFAULT_BOOST;
private final String analyzer;
private boolean hasKeywordSubfield;
private boolean hasDelimitedSubfield;
private boolean isQueryByDefault;
private boolean isDelimitedSubfield;
private boolean isKeywordSubfield;
public static SearchFieldConfig detectSubFieldType(@Nonnull SearchableFieldSpec fieldSpec) {
final String fieldName = fieldSpec.getSearchableAnnotation().getFieldName();
final float boost = (float) fieldSpec.getSearchableAnnotation().getBoostScore();
final SearchableAnnotation.FieldType fieldType = fieldSpec.getSearchableAnnotation().getFieldType();
return detectSubFieldType(fieldName, boost, fieldType);
final SearchableAnnotation searchableAnnotation = fieldSpec.getSearchableAnnotation();
final String fieldName = searchableAnnotation.getFieldName();
final float boost = (float) searchableAnnotation.getBoostScore();
final SearchableAnnotation.FieldType fieldType = searchableAnnotation.getFieldType();
return detectSubFieldType(fieldName, boost, fieldType, searchableAnnotation.isQueryByDefault());
}
public static SearchFieldConfig detectSubFieldType(String fieldName,
SearchableAnnotation.FieldType fieldType) {
return detectSubFieldType(fieldName, DEFAULT_BOOST, fieldType);
SearchableAnnotation.FieldType fieldType,
boolean isQueryByDefault) {
return detectSubFieldType(fieldName, DEFAULT_BOOST, fieldType, isQueryByDefault);
}
public static SearchFieldConfig detectSubFieldType(String fieldName, float boost,
SearchableAnnotation.FieldType fieldType) {
public static SearchFieldConfig detectSubFieldType(String fieldName,
float boost,
SearchableAnnotation.FieldType fieldType,
boolean isQueryByDefault) {
return SearchFieldConfig.builder()
.fieldName(fieldName)
.boost(boost)
.analyzer(getAnalyzer(fieldName, fieldType))
.hasKeywordSubfield(hasKeywordSubfield(fieldName, fieldType))
.hasDelimitedSubfield(hasDelimitedSubfield(fieldName, fieldType))
.isQueryByDefault(isQueryByDefault)
.build();
}
public boolean hasDelimitedSubfield() {
return isHasDelimitedSubfield();
}
public boolean hasKeywordSubfield() {
return isHasKeywordSubfield();
public boolean isKeyword() {
return KEYWORD_ANALYZER.equals(analyzer()) || isKeyword(fieldName());
}
private static boolean hasDelimitedSubfield(String fieldName, SearchableAnnotation.FieldType fieldType) {
@ -108,8 +116,8 @@ public class SearchFieldConfig {
&& (TYPES_WITH_DELIMITED_SUBFIELD.contains(fieldType) // if delimited then also has keyword
|| TYPES_WITH_KEYWORD_SUBFIELD.contains(fieldType));
}
private static boolean isKeyword(String fieldName, SearchableAnnotation.FieldType fieldType) {
return fieldName.equals(".keyword")
private static boolean isKeyword(String fieldName) {
return fieldName.endsWith(".keyword")
|| KEYWORD_FIELDS.contains(fieldName);
}
@ -118,7 +126,7 @@ public class SearchFieldConfig {
if (TYPES_WITH_BROWSE_PATH.contains(fieldType)) {
return BROWSE_PATH_HIERARCHY_ANALYZER;
// sub fields
} else if (isKeyword(fieldName, fieldType)) {
} else if (isKeyword(fieldName)) {
return KEYWORD_ANALYZER;
} else if (fieldName.endsWith(".delimited")) {
return TEXT_SEARCH_ANALYZER;
@ -131,4 +139,14 @@ public class SearchFieldConfig {
throw new IllegalStateException(String.format("Unknown analyzer for fieldName: %s, fieldType: %s", fieldName, fieldType));
}
}
public static class SearchFieldConfigBuilder {
public SearchFieldConfigBuilder fieldName(@Nonnull String fieldName) {
this.fieldName = fieldName;
isDelimitedSubfield(fieldName.endsWith(".delimited"));
isKeywordSubfield(fieldName.endsWith(".keyword"));
shortName(fieldName.split("[.]")[0]);
return this;
}
}
}


@ -3,6 +3,8 @@ package com.linkedin.metadata.search.elasticsearch.query.request;
import com.linkedin.metadata.config.search.ExactMatchConfiguration;
import com.linkedin.metadata.config.search.PartialConfiguration;
import com.linkedin.metadata.config.search.SearchConfiguration;
import com.linkedin.metadata.config.search.custom.CustomSearchConfiguration;
import com.linkedin.metadata.config.search.custom.QueryConfiguration;
import com.linkedin.metadata.models.EntitySpec;
import com.linkedin.metadata.models.SearchableFieldSpec;
import com.linkedin.metadata.models.annotation.SearchScoreAnnotation;
@ -17,6 +19,7 @@ import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import javax.annotation.Nonnull;
import javax.annotation.Nullable;
import com.linkedin.metadata.search.utils.ESUtils;
import org.elasticsearch.common.lucene.search.function.CombineFunction;
@ -41,33 +44,40 @@ public class SearchQueryBuilder {
private final ExactMatchConfiguration exactMatchConfiguration;
private final PartialConfiguration partialConfiguration;
public SearchQueryBuilder(@Nonnull SearchConfiguration searchConfiguration) {
private final CustomizedQueryHandler customizedQueryHandler;
public SearchQueryBuilder(@Nonnull SearchConfiguration searchConfiguration,
@Nullable CustomSearchConfiguration customSearchConfiguration) {
this.exactMatchConfiguration = searchConfiguration.getExactMatch();
this.partialConfiguration = searchConfiguration.getPartial();
this.customizedQueryHandler = CustomizedQueryHandler.builder(customSearchConfiguration).build();
}
public QueryBuilder buildQuery(@Nonnull List<EntitySpec> entitySpecs, @Nonnull String query, boolean fulltext) {
final QueryBuilder queryBuilder = buildInternalQuery(entitySpecs, query, fulltext);
return QueryBuilders.functionScoreQuery(queryBuilder, buildScoreFunctions(entitySpecs))
.scoreMode(FunctionScoreQuery.ScoreMode.AVG) // Average score functions
.boostMode(CombineFunction.MULTIPLY); // Multiply score function with the score from query
QueryConfiguration customQueryConfig = customizedQueryHandler.lookupQueryConfig(query).orElse(null);
final QueryBuilder queryBuilder = buildInternalQuery(customQueryConfig, entitySpecs, query, fulltext);
return buildScoreFunctions(customQueryConfig, entitySpecs, queryBuilder);
}
/**
* Constructs the search query.
* @param customQueryConfig custom configuration
* @param entitySpecs entities being searched
* @param query search string
* @param fulltext use fulltext queries
* @return query builder
*/
private QueryBuilder buildInternalQuery(@Nonnull List<EntitySpec> entitySpecs, @Nonnull String query, boolean fulltext) {
BoolQueryBuilder finalQuery = QueryBuilders.boolQuery();
private QueryBuilder buildInternalQuery(@Nullable QueryConfiguration customQueryConfig, @Nonnull List<EntitySpec> entitySpecs,
@Nonnull String query, boolean fulltext) {
final String sanitizedQuery = query.replaceFirst("^:+", "");
final BoolQueryBuilder finalQuery = Optional.ofNullable(customQueryConfig)
.flatMap(cqc -> cqc.boolQueryBuilder(sanitizedQuery))
.orElse(QueryBuilders.boolQuery());
if (fulltext && !query.startsWith(STRUCTURED_QUERY_PREFIX)) {
final String sanitizedQuery = query.replaceFirst("^:+", "");
getSimpleQuery(entitySpecs, sanitizedQuery).ifPresent(finalQuery::should);
getPrefixAndExactMatchQuery(entitySpecs, sanitizedQuery).ifPresent(finalQuery::should);
getSimpleQuery(customQueryConfig, entitySpecs, sanitizedQuery).ifPresent(finalQuery::should);
getPrefixAndExactMatchQuery(customQueryConfig, entitySpecs, sanitizedQuery).ifPresent(finalQuery::should);
} else {
final String withoutQueryPrefix = query.startsWith(STRUCTURED_QUERY_PREFIX) ? query.substring(STRUCTURED_QUERY_PREFIX.length()) : query;
@ -77,10 +87,10 @@ public class SearchQueryBuilder {
.map(this::getStandardFields)
.flatMap(Set::stream)
.distinct()
.forEach(cfg -> queryBuilder.field(cfg.getFieldName(), cfg.getBoost()));
.forEach(cfg -> queryBuilder.field(cfg.fieldName(), cfg.boost()));
finalQuery.should(queryBuilder);
if (exactMatchConfiguration.isEnableStructured()) {
getPrefixAndExactMatchQuery(entitySpecs, withoutQueryPrefix).ifPresent(finalQuery::should);
getPrefixAndExactMatchQuery(null, entitySpecs, withoutQueryPrefix).ifPresent(finalQuery::should);
}
}
@ -93,9 +103,9 @@ public class SearchQueryBuilder {
// Always present
final float urnBoost = Float.parseFloat((String) PRIMARY_URN_SEARCH_PROPERTIES.get("boostScore"));
fields.add(SearchFieldConfig.detectSubFieldType("urn", urnBoost, SearchableAnnotation.FieldType.URN));
fields.add(SearchFieldConfig.detectSubFieldType("urn", urnBoost, SearchableAnnotation.FieldType.URN, true));
fields.add(SearchFieldConfig.detectSubFieldType("urn.delimited", urnBoost * partialConfiguration.getUrnFactor(),
SearchableAnnotation.FieldType.URN));
SearchableAnnotation.FieldType.URN, true));
List<SearchableFieldSpec> searchableFieldSpecs = entitySpec.getSearchableFieldSpecs();
for (SearchableFieldSpec fieldSpec : searchableFieldSpecs) {
@ -107,9 +117,11 @@ public class SearchQueryBuilder {
fields.add(searchFieldConfig);
if (SearchFieldConfig.detectSubFieldType(fieldSpec).hasDelimitedSubfield()) {
fields.add(SearchFieldConfig.detectSubFieldType(searchFieldConfig.getFieldName() + ".delimited",
searchFieldConfig.getBoost() * partialConfiguration.getFactor(),
fieldSpec.getSearchableAnnotation().getFieldType()));
final SearchableAnnotation searchableAnnotation = fieldSpec.getSearchableAnnotation();
fields.add(SearchFieldConfig.detectSubFieldType(searchFieldConfig.fieldName() + ".delimited",
searchFieldConfig.boost() * partialConfiguration.getFactor(),
searchableAnnotation.getFieldType(), searchableAnnotation.isQueryByDefault()));
}
}
@ -124,24 +136,34 @@ public class SearchQueryBuilder {
return Stream.of("\"", "'").anyMatch(query::contains);
}
private Optional<QueryBuilder> getSimpleQuery(List<EntitySpec> entitySpecs, String sanitizedQuery) {
private Optional<QueryBuilder> getSimpleQuery(@Nullable QueryConfiguration customQueryConfig,
List<EntitySpec> entitySpecs,
String sanitizedQuery) {
Optional<QueryBuilder> result = Optional.empty();
if (!isQuoted(sanitizedQuery) || !exactMatchConfiguration.isExclusive()) {
final boolean executeSimpleQuery;
if (customQueryConfig != null) {
executeSimpleQuery = customQueryConfig.isSimpleQuery();
} else {
executeSimpleQuery = !isQuoted(sanitizedQuery) || !exactMatchConfiguration.isExclusive();
}
if (executeSimpleQuery) {
BoolQueryBuilder simplePerField = QueryBuilders.boolQuery();
// Simple query string does not use per field analyzers
// Group the fields by analyzer
Map<String, List<SearchFieldConfig>> analyzerGroup = entitySpecs.stream()
.map(this::getStandardFields)
.flatMap(Set::stream)
.collect(Collectors.groupingBy(SearchFieldConfig::getAnalyzer));
.filter(SearchFieldConfig::isQueryByDefault)
.collect(Collectors.groupingBy(SearchFieldConfig::analyzer));
analyzerGroup.keySet().stream().sorted().forEach(analyzer -> {
List<SearchFieldConfig> fieldConfigs = analyzerGroup.get(analyzer);
SimpleQueryStringBuilder simpleBuilder = QueryBuilders.simpleQueryStringQuery(sanitizedQuery);
simpleBuilder.analyzer(analyzer);
simpleBuilder.defaultOperator(Operator.AND);
fieldConfigs.forEach(cfg -> simpleBuilder.field(cfg.getFieldName(), cfg.getBoost()));
fieldConfigs.forEach(cfg -> simpleBuilder.field(cfg.fieldName(), cfg.boost()));
simplePerField.should(simpleBuilder);
});
@ -151,62 +173,77 @@ public class SearchQueryBuilder {
return result;
}
private Optional<QueryBuilder> getPrefixAndExactMatchQuery(@Nonnull List<EntitySpec> entitySpecs, String query) {
private Optional<QueryBuilder> getPrefixAndExactMatchQuery(@Nullable QueryConfiguration customQueryConfig,
@Nonnull List<EntitySpec> entitySpecs,
String query) {
final boolean isPrefixQuery = customQueryConfig == null ? exactMatchConfiguration.isWithPrefix() : customQueryConfig.isPrefixMatchQuery();
final boolean isExactQuery = customQueryConfig == null || customQueryConfig.isExactMatchQuery();
BoolQueryBuilder finalQuery = QueryBuilders.boolQuery();
String unquotedQuery = unquote(query);
// Exact match case-sensitive
finalQuery.should(QueryBuilders.termQuery("urn", unquotedQuery)
.boost(Float.parseFloat((String) PRIMARY_URN_SEARCH_PROPERTIES.get("boostScore"))
* exactMatchConfiguration.getExactFactor())
.queryName("urn"));
// Exact match case-insensitive
finalQuery.should(QueryBuilders.termQuery("urn", unquotedQuery)
.caseInsensitive(true)
.boost(Float.parseFloat((String) PRIMARY_URN_SEARCH_PROPERTIES.get("boostScore"))
* exactMatchConfiguration.getExactFactor()
* exactMatchConfiguration.getCaseSensitivityFactor())
.queryName("urn"));
entitySpecs.stream()
.map(EntitySpec::getSearchableFieldSpecs)
.flatMap(List::stream)
.map(SearchableFieldSpec::getSearchableAnnotation)
.filter(SearchableAnnotation::isQueryByDefault)
.filter(SearchableAnnotation::isEnableAutocomplete) // Proxy for identifying likely exact match fields
.forEach(srchAnnotation -> {
boolean hasDelimited = SearchFieldConfig.detectSubFieldType(srchAnnotation.getFieldName(),
srchAnnotation.getFieldType()).hasDelimitedSubfield();
.map(this::getStandardFields)
.flatMap(Set::stream)
.filter(SearchFieldConfig::isQueryByDefault)
.forEach(searchFieldConfig -> {
if (hasDelimited && exactMatchConfiguration.isWithPrefix()) {
finalQuery.should(QueryBuilders.matchPhrasePrefixQuery(srchAnnotation.getFieldName() + ".delimited", query)
.boost((float) srchAnnotation.getBoostScore() * exactMatchConfiguration.getCaseSensitivityFactor())
.queryName(srchAnnotation.getFieldName())); // less than exact
if (searchFieldConfig.isDelimitedSubfield() && isPrefixQuery) {
finalQuery.should(QueryBuilders.matchPhrasePrefixQuery(searchFieldConfig.fieldName(), query)
.boost(searchFieldConfig.boost()
* exactMatchConfiguration.getPrefixFactor()
* exactMatchConfiguration.getCaseSensitivityFactor())
.queryName(searchFieldConfig.shortName())); // less than exact
}
// Exact match case-sensitive
finalQuery.should(QueryBuilders
.termQuery(ESUtils.toKeywordField(srchAnnotation.getFieldName(), false), unquotedQuery)
.boost((float) srchAnnotation.getBoostScore() * exactMatchConfiguration.getExactFactor())
.queryName(ESUtils.toKeywordField(srchAnnotation.getFieldName(), false)));
// Exact match case-insensitive
finalQuery.should(QueryBuilders
.termQuery(ESUtils.toKeywordField(srchAnnotation.getFieldName(), false), unquotedQuery)
.caseInsensitive(true)
.boost((float) srchAnnotation.getBoostScore()
* exactMatchConfiguration.getExactFactor()
* exactMatchConfiguration.getCaseSensitivityFactor())
.queryName(ESUtils.toKeywordField(srchAnnotation.getFieldName(), false)));
if (searchFieldConfig.isKeyword() && isExactQuery) {
// It is important to use the subfield .keyword (it uses a different normalizer)
// The non-.keyword field removes case information
// Exact match case-sensitive
finalQuery.should(QueryBuilders
.termQuery(ESUtils.toKeywordField(searchFieldConfig.fieldName(), false), unquotedQuery)
.caseInsensitive(false)
.boost(searchFieldConfig.boost()
* exactMatchConfiguration.getExactFactor())
.queryName(searchFieldConfig.shortName()));
// Exact match case-insensitive
finalQuery.should(QueryBuilders
.termQuery(ESUtils.toKeywordField(searchFieldConfig.fieldName(), false), unquotedQuery)
.caseInsensitive(true)
.boost(searchFieldConfig.boost()
* exactMatchConfiguration.getExactFactor()
* exactMatchConfiguration.getCaseSensitivityFactor())
.queryName(searchFieldConfig.fieldName()));
}
});
return finalQuery.should().size() > 0 ? Optional.of(finalQuery) : Optional.empty();
}
private static FunctionScoreQueryBuilder.FilterFunctionBuilder[] buildScoreFunctions(@Nonnull List<EntitySpec> entitySpecs) {
private FunctionScoreQueryBuilder buildScoreFunctions(@Nullable QueryConfiguration customQueryConfig,
@Nonnull List<EntitySpec> entitySpecs,
@Nonnull QueryBuilder queryBuilder) {
if (customQueryConfig != null) {
// Prefer configuration function scoring over annotation scoring
return customQueryConfig.functionScoreQueryBuilder(queryBuilder);
} else {
return QueryBuilders.functionScoreQuery(queryBuilder, buildAnnotationScoreFunctions(entitySpecs))
.scoreMode(FunctionScoreQuery.ScoreMode.AVG) // Average score functions
.boostMode(CombineFunction.MULTIPLY); // Multiply score function with the score from query;
}
}
private static FunctionScoreQueryBuilder.FilterFunctionBuilder[] buildAnnotationScoreFunctions(@Nonnull List<EntitySpec> entitySpecs) {
List<FunctionScoreQueryBuilder.FilterFunctionBuilder> finalScoreFunctions = new ArrayList<>();
// Add a default weight of 1.0 to make sure the score function is larger than 1
finalScoreFunctions.add(
new FunctionScoreQueryBuilder.FilterFunctionBuilder(ScoreFunctionBuilders.weightFactorFunction(1.0f)));
new FunctionScoreQueryBuilder.FilterFunctionBuilder(ScoreFunctionBuilders.weightFactorFunction(1.0f)));
entitySpecs.stream()
.map(EntitySpec::getSearchableFieldSpecs)
.flatMap(List::stream)


@ -7,6 +7,7 @@ import com.linkedin.common.urn.Urn;
import com.linkedin.data.template.DoubleMap;
import com.linkedin.data.template.LongMap;
import com.linkedin.metadata.config.search.SearchConfiguration;
import com.linkedin.metadata.config.search.custom.CustomSearchConfiguration;
import com.linkedin.metadata.models.EntitySpec;
import com.linkedin.metadata.models.SearchableFieldSpec;
import com.linkedin.metadata.models.annotation.SearchableAnnotation;
@ -94,11 +95,13 @@ public class SearchRequestHandler {
private final SearchQueryBuilder _searchQueryBuilder;
private SearchRequestHandler(@Nonnull EntitySpec entitySpec, @Nonnull SearchConfiguration configs) {
this(ImmutableList.of(entitySpec), configs);
private SearchRequestHandler(@Nonnull EntitySpec entitySpec, @Nonnull SearchConfiguration configs,
@Nullable CustomSearchConfiguration customSearchConfiguration) {
this(ImmutableList.of(entitySpec), configs, customSearchConfiguration);
}
private SearchRequestHandler(@Nonnull List<EntitySpec> entitySpecs, @Nonnull SearchConfiguration configs) {
private SearchRequestHandler(@Nonnull List<EntitySpec> entitySpecs, @Nonnull SearchConfiguration configs,
@Nullable CustomSearchConfiguration customSearchConfiguration) {
_entitySpecs = entitySpecs;
List<SearchableAnnotation> annotations = getSearchableAnnotations();
_facetFields = getFacetFields(annotations);
@ -107,16 +110,20 @@ public class SearchRequestHandler {
.filter(SearchableAnnotation::isAddToFilters)
.collect(Collectors.toMap(SearchableAnnotation::getFieldName, SearchableAnnotation::getFilterName, mapMerger()));
_highlights = getHighlights();
_searchQueryBuilder = new SearchQueryBuilder(configs);
_searchQueryBuilder = new SearchQueryBuilder(configs, customSearchConfiguration);
_configs = configs;
}
public static SearchRequestHandler getBuilder(@Nonnull EntitySpec entitySpec, @Nonnull SearchConfiguration configs) {
return REQUEST_HANDLER_BY_ENTITY_NAME.computeIfAbsent(ImmutableList.of(entitySpec), k -> new SearchRequestHandler(entitySpec, configs));
public static SearchRequestHandler getBuilder(@Nonnull EntitySpec entitySpec, @Nonnull SearchConfiguration configs,
@Nullable CustomSearchConfiguration customSearchConfiguration) {
return REQUEST_HANDLER_BY_ENTITY_NAME.computeIfAbsent(
ImmutableList.of(entitySpec), k -> new SearchRequestHandler(entitySpec, configs, customSearchConfiguration));
}
public static SearchRequestHandler getBuilder(@Nonnull List<EntitySpec> entitySpecs, @Nonnull SearchConfiguration configs) {
return REQUEST_HANDLER_BY_ENTITY_NAME.computeIfAbsent(ImmutableList.copyOf(entitySpecs), k -> new SearchRequestHandler(entitySpecs, configs));
public static SearchRequestHandler getBuilder(@Nonnull List<EntitySpec> entitySpecs, @Nonnull SearchConfiguration configs,
@Nullable CustomSearchConfiguration customSearchConfiguration) {
return REQUEST_HANDLER_BY_ENTITY_NAME.computeIfAbsent(
ImmutableList.copyOf(entitySpecs), k -> new SearchRequestHandler(entitySpecs, configs, customSearchConfiguration));
}
private List<SearchableAnnotation> getSearchableAnnotations() {


@ -1,9 +1,12 @@
package com.linkedin.metadata;
import com.fasterxml.jackson.dataformat.yaml.YAMLMapper;
import com.linkedin.entity.client.EntityClient;
import com.linkedin.metadata.client.JavaEntityClient;
import com.linkedin.metadata.config.search.CustomConfiguration;
import com.linkedin.metadata.config.search.ElasticSearchConfiguration;
import com.linkedin.metadata.config.search.SearchConfiguration;
import com.linkedin.metadata.config.search.custom.CustomSearchConfiguration;
import com.linkedin.metadata.entity.AspectDao;
import com.linkedin.metadata.entity.EntityAspect;
import com.linkedin.metadata.entity.EntityAspectIdentifier;
@ -95,9 +98,12 @@ public class ESSampleDataFixture {
@Qualifier("entityRegistry") EntityRegistry entityRegistry,
@Qualifier("sampleDataEntityIndexBuilders") EntityIndexBuilders indexBuilders,
@Qualifier("sampleDataIndexConvention") IndexConvention indexConvention
) {
) throws IOException {
CustomConfiguration customConfiguration = new CustomConfiguration(true, "search_config_fixture_test.yml");
CustomSearchConfiguration customSearchConfiguration = customConfiguration.customSearchConfiguration(new YAMLMapper());
ESSearchDAO searchDAO = new ESSearchDAO(entityRegistry, _searchClient, indexConvention, false,
ELASTICSEARCH_IMPLEMENTATION_ELASTICSEARCH, _searchConfiguration);
ELASTICSEARCH_IMPLEMENTATION_ELASTICSEARCH, _searchConfiguration, customSearchConfiguration);
ESBrowseDAO browseDAO = new ESBrowseDAO(entityRegistry, _searchClient, indexConvention);
ESWriteDAO writeDAO = new ESWriteDAO(entityRegistry, _searchClient, indexConvention, _bulkProcessor, 1);
return new ElasticSearchService(indexBuilders, searchDAO, browseDAO, writeDAO);


@ -106,7 +106,7 @@ public class ESSearchLineageFixture {
@Qualifier("searchLineageIndexConvention") IndexConvention indexConvention
) {
ESSearchDAO searchDAO = new ESSearchDAO(entityRegistry, _searchClient, indexConvention, false,
ELASTICSEARCH_IMPLEMENTATION_ELASTICSEARCH, _searchConfiguration);
ELASTICSEARCH_IMPLEMENTATION_ELASTICSEARCH, _searchConfiguration, null);
ESBrowseDAO browseDAO = new ESBrowseDAO(entityRegistry, _searchClient, indexConvention);
ESWriteDAO writeDAO = new ESWriteDAO(entityRegistry, _searchClient, indexConvention, _bulkProcessor, 1);
return new ElasticSearchService(indexBuilders, searchDAO, browseDAO, writeDAO);


@ -156,7 +156,7 @@ public class LineageSearchServiceTest extends AbstractTestNGSpringContextTests {
new EntityIndexBuilders(_esIndexBuilder, _entityRegistry,
_indexConvention, _settingsBuilder);
ESSearchDAO searchDAO = new ESSearchDAO(_entityRegistry, _searchClient, _indexConvention, false,
ELASTICSEARCH_IMPLEMENTATION_ELASTICSEARCH, _searchConfiguration);
ELASTICSEARCH_IMPLEMENTATION_ELASTICSEARCH, _searchConfiguration, null);
ESBrowseDAO browseDAO = new ESBrowseDAO(_entityRegistry, _searchClient, _indexConvention);
ESWriteDAO writeDAO = new ESWriteDAO(_entityRegistry, _searchClient, _indexConvention, _bulkProcessor, 1);
return new ElasticSearchService(indexBuilders, searchDAO, browseDAO, writeDAO);


@ -117,7 +117,7 @@ public class SearchServiceTest extends AbstractTestNGSpringContextTests {
new EntityIndexBuilders(_esIndexBuilder, _entityRegistry,
_indexConvention, _settingsBuilder);
ESSearchDAO searchDAO = new ESSearchDAO(_entityRegistry, _searchClient, _indexConvention, false,
ELASTICSEARCH_IMPLEMENTATION_ELASTICSEARCH, _searchConfiguration);
ELASTICSEARCH_IMPLEMENTATION_ELASTICSEARCH, _searchConfiguration, null);
ESBrowseDAO browseDAO = new ESBrowseDAO(_entityRegistry, _searchClient, _indexConvention);
ESWriteDAO writeDAO = new ESWriteDAO(_entityRegistry, _searchClient, _indexConvention,
_bulkProcessor, 1);


@ -81,7 +81,7 @@ public class ElasticSearchServiceTest extends AbstractTestNGSpringContextTests {
EntityIndexBuilders indexBuilders =
new EntityIndexBuilders(_esIndexBuilder, _entityRegistry, _indexConvention, _settingsBuilder);
ESSearchDAO searchDAO = new ESSearchDAO(_entityRegistry, _searchClient, _indexConvention, false,
ELASTICSEARCH_IMPLEMENTATION_ELASTICSEARCH, _searchConfiguration);
ELASTICSEARCH_IMPLEMENTATION_ELASTICSEARCH, _searchConfiguration, null);
ESBrowseDAO browseDAO = new ESBrowseDAO(_entityRegistry, _searchClient, _indexConvention);
ESWriteDAO writeDAO =
new ESWriteDAO(_entityRegistry, _searchClient, _indexConvention, _bulkProcessor, 1);


@ -119,8 +119,8 @@ public class SampleDataFixtureTests extends AbstractTestNGSpringContextTests {
for (SearchableFieldSpec fieldSpec : entitySpec.getSearchableFieldSpecs()) {
SearchFieldConfig test = SearchFieldConfig.detectSubFieldType(fieldSpec);
if (!test.getFieldName().contains(".")) {
Map<String, Object> actual = mappings.get(test.getFieldName());
if (!test.fieldName().contains(".")) {
Map<String, Object> actual = mappings.get(test.fieldName());
final String expectedAnalyzer;
if (actual.get("search_analyzer") != null) {
@ -131,36 +131,36 @@ public class SampleDataFixtureTests extends AbstractTestNGSpringContextTests {
expectedAnalyzer = "keyword";
}
assertEquals(test.getAnalyzer(), expectedAnalyzer,
assertEquals(test.analyzer(), expectedAnalyzer,
String.format("Expected search analyzer to match for entity: `%s`field: `%s`",
entitySpec.getName(), test.getFieldName()));
entitySpec.getName(), test.fieldName()));
if (test.hasDelimitedSubfield()) {
assertTrue(((Map<String, Map<String, String>>) actual.get("fields")).containsKey("delimited"),
String.format("Expected entity: `%s` field to have .delimited subfield: `%s`",
entitySpec.getName(), test.getFieldName()));
entitySpec.getName(), test.fieldName()));
} else {
boolean nosubfield = !actual.containsKey("fields")
|| !((Map<String, Map<String, String>>) actual.get("fields")).containsKey("delimited");
assertTrue(nosubfield, String.format("Expected entity: `%s` field to NOT have .delimited subfield: `%s`",
entitySpec.getName(), test.getFieldName()));
entitySpec.getName(), test.fieldName()));
}
if (test.hasKeywordSubfield()) {
assertTrue(((Map<String, Map<String, String>>) actual.get("fields")).containsKey("keyword"),
String.format("Expected entity: `%s` field to have .keyword subfield: `%s`",
entitySpec.getName(), test.getFieldName()));
entitySpec.getName(), test.fieldName()));
} else {
boolean nosubfield = !actual.containsKey("fields")
|| !((Map<String, Map<String, String>>) actual.get("fields")).containsKey("keyword");
assertTrue(nosubfield, String.format("Expected entity: `%s` field to NOT have .keyword subfield: `%s`",
entitySpec.getName(), test.getFieldName()));
entitySpec.getName(), test.fieldName()));
}
} else {
// this is a subfield therefore cannot have a subfield
assertFalse(test.hasKeywordSubfield());
assertFalse(test.hasDelimitedSubfield());
String[] fieldAndSubfield = test.getFieldName().split("[.]", 2);
String[] fieldAndSubfield = test.fieldName().split("[.]", 2);
Map<String, Object> actualParent = mappings.get(fieldAndSubfield[0]);
Map<String, Object> actualSubfield = ((Map<String, Map<String, Object>>) actualParent.get("fields")).get(fieldAndSubfield[0]);
@ -168,8 +168,8 @@ public class SampleDataFixtureTests extends AbstractTestNGSpringContextTests {
String expectedAnalyzer = actualSubfield.get("search_analyzer") != null ? (String) actualSubfield.get("search_analyzer")
: "keyword";
assertEquals(test.getAnalyzer(), expectedAnalyzer,
String.format("Expected search analyzer to match for field `%s`", test.getFieldName()));
assertEquals(test.analyzer(), expectedAnalyzer,
String.format("Expected search analyzer to match for field `%s`", test.fieldName()));
}
}
}
@ -195,7 +195,7 @@ public class SampleDataFixtureTests extends AbstractTestNGSpringContextTests {
final SearchResult result = search(searchService, "test");
Map<String, Integer> expectedTypes = Map.of(
"dataset", 10,
"dataset", 13,
"chart", 0,
"container", 1,
"dashboard", 0,
@ -1132,6 +1132,7 @@ public class SampleDataFixtureTests extends AbstractTestNGSpringContextTests {
"Expected exact match and 1st position");
}
// Note: This test can fail if not using .keyword subfields (check for possible query builder regression)
@Test
public void testPrefixVsExactCaseSensitivity() {
List<String> insensitiveExactMatches = List.of("testExactMatchCase", "testexactmatchcase", "TESTEXACTMATCHCASE");


@ -0,0 +1,178 @@
package com.linkedin.metadata.search.elasticsearch.query.request;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.dataformat.yaml.YAMLMapper;
import com.linkedin.metadata.config.search.CustomConfiguration;
import com.linkedin.metadata.config.search.custom.BoolQueryConfiguration;
import com.linkedin.metadata.config.search.custom.QueryConfiguration;
import com.linkedin.metadata.config.search.custom.CustomSearchConfiguration;
import org.elasticsearch.common.lucene.search.function.CombineFunction;
import org.elasticsearch.common.lucene.search.function.FunctionScoreQuery;
import org.elasticsearch.index.query.MatchAllQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.functionscore.FunctionScoreQueryBuilder;
import org.elasticsearch.index.query.functionscore.ScoreFunctionBuilders;
import org.testng.annotations.Test;
import java.io.IOException;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import static org.testng.Assert.assertEquals;
import static org.testng.Assert.assertNotNull;
public class CustomizedQueryHandlerTest {
public static final ObjectMapper TEST_MAPPER = new YAMLMapper();
private static final CustomSearchConfiguration TEST_CONFIG;
static {
try {
TEST_CONFIG = new CustomConfiguration(true, "search_config_test.yml")
.customSearchConfiguration(TEST_MAPPER);
} catch (IOException e) {
throw new RuntimeException(e);
}
}
private static final List<QueryConfiguration> EXPECTED_CONFIGURATION = List.of(
QueryConfiguration.builder()
.queryRegex("[*]|")
.simpleQuery(false)
.exactMatchQuery(false)
.prefixMatchQuery(false)
.functionScore(Map.of("score_mode", "avg", "boost_mode", "multiply",
"functions", List.of(
Map.of(
"weight", 1,
"filter", Map.<String, Object>of("match_all", Map.<String, Object>of())),
Map.of(
"weight", 0.5,
"filter", Map.<String, Object>of("term", Map.of(
"materialized", Map.of("value", true)
))),
Map.of(
"weight", 0.5,
"filter", Map.<String, Object>of("term", Map.<String, Object>of(
"deprecated", Map.of("value", true)
)))
)))
.build(),
QueryConfiguration.builder()
.queryRegex(".*")
.simpleQuery(true)
.exactMatchQuery(true)
.prefixMatchQuery(true)
.boolQuery(BoolQueryConfiguration.builder()
.must(List.of(
Map.of("term", Map.of("name", "{{query_string}}"))
))
.build())
.functionScore(Map.of("score_mode", "avg", "boost_mode", "multiply",
"functions", List.of(
Map.of(
"weight", 1,
"filter", Map.<String, Object>of("match_all", Map.<String, Object>of())),
Map.of(
"weight", 0.5,
"filter", Map.<String, Object>of("term", Map.of(
"materialized", Map.of("value", true)
))),
Map.of(
"weight", 1.5,
"filter", Map.<String, Object>of("term", Map.<String, Object>of(
"deprecated", Map.of("value", false)
)))
)))
.build()
);
@Test
public void configParsingTest() {
assertNotNull(TEST_CONFIG);
assertEquals(TEST_CONFIG.getQueryConfigurations(), EXPECTED_CONFIGURATION);
}
@Test
public void customizedQueryHandlerInitTest() {
CustomizedQueryHandler test = CustomizedQueryHandler.builder(TEST_CONFIG).build();
assertEquals(test.getQueryConfigurations().stream().map(e -> e.getKey().toString()).collect(Collectors.toList()),
List.of("[*]|", ".*"));
assertEquals(test.getQueryConfigurations().stream()
.map(e -> Map.entry(e.getKey().toString(), e.getValue()))
.collect(Collectors.toList()),
EXPECTED_CONFIGURATION.stream()
.map(cfg -> Map.entry(cfg.getQueryRegex(), cfg))
.collect(Collectors.toList()));
}
@Test
public void patternMatchTest() {
CustomizedQueryHandler test = CustomizedQueryHandler.builder(TEST_CONFIG).build();
for (String selectAllQuery: List.of("*", "")) {
QueryConfiguration actual = test.lookupQueryConfig(selectAllQuery).get();
assertEquals(actual, EXPECTED_CONFIGURATION.get(0), String.format("Failed to match: `%s`", selectAllQuery));
}
for (String otherQuery: List.of("foo", "bar")) {
QueryConfiguration actual = test.lookupQueryConfig(otherQuery).get();
assertEquals(actual, EXPECTED_CONFIGURATION.get(1));
}
}
@Test
public void functionScoreQueryBuilderTest() {
CustomizedQueryHandler test = CustomizedQueryHandler.builder(TEST_CONFIG).build();
MatchAllQueryBuilder inputQuery = QueryBuilders.matchAllQuery();
/*
* Test select star
*/
FunctionScoreQueryBuilder selectStarTest = test.lookupQueryConfig("*").get().functionScoreQueryBuilder(inputQuery);
FunctionScoreQueryBuilder.FilterFunctionBuilder[] expectedSelectStarScoreFunctions = {
new FunctionScoreQueryBuilder.FilterFunctionBuilder(
ScoreFunctionBuilders.weightFactorFunction(1f)
),
new FunctionScoreQueryBuilder.FilterFunctionBuilder(
QueryBuilders.termQuery("materialized", true),
ScoreFunctionBuilders.weightFactorFunction(0.5f)
),
new FunctionScoreQueryBuilder.FilterFunctionBuilder(
QueryBuilders.termQuery("deprecated", true),
ScoreFunctionBuilders.weightFactorFunction(0.5f)
)
};
FunctionScoreQueryBuilder expectedSelectStar = new FunctionScoreQueryBuilder(expectedSelectStarScoreFunctions)
.scoreMode(FunctionScoreQuery.ScoreMode.AVG)
.boostMode(CombineFunction.MULTIPLY);
assertEquals(selectStarTest, expectedSelectStar);
/*
* Test default (non-select star)
*/
FunctionScoreQueryBuilder defaultTest = test.lookupQueryConfig("foobar").get().functionScoreQueryBuilder(inputQuery);
FunctionScoreQueryBuilder.FilterFunctionBuilder[] expectedDefaultScoreFunctions = {
new FunctionScoreQueryBuilder.FilterFunctionBuilder(
ScoreFunctionBuilders.weightFactorFunction(1f)
),
new FunctionScoreQueryBuilder.FilterFunctionBuilder(
QueryBuilders.termQuery("materialized", true),
ScoreFunctionBuilders.weightFactorFunction(0.5f)
),
new FunctionScoreQueryBuilder.FilterFunctionBuilder(
QueryBuilders.termQuery("deprecated", false),
ScoreFunctionBuilders.weightFactorFunction(1.5f)
)
};
FunctionScoreQueryBuilder expectedDefault = new FunctionScoreQueryBuilder(expectedDefaultScoreFunctions)
.scoreMode(FunctionScoreQuery.ScoreMode.AVG)
.boostMode(CombineFunction.MULTIPLY);
assertEquals(defaultTest, expectedDefault);
}
}
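
The pattern-matching test above implies that profiles are tried in declaration order and that a regex must match the whole query string: `[*]|` has to claim `*` and the empty string without also claiming `foo`. The following is a minimal, standalone sketch of that lookup rule. `QueryProfileLookup` and the profile names are hypothetical stand-ins for illustration and are not the `CustomizedQueryHandler` implementation; full-match semantics are an assumption inferred from the test expectations.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the CustomizedQueryHandler from this commit.
final class QueryProfileLookup {
    // Declaration order is preserved: the first pattern matching the whole query wins.
    private final List<Map.Entry<Pattern, String>> profiles = new ArrayList<>();

    QueryProfileLookup add(String regex, String profileName) {
        profiles.add(Map.entry(Pattern.compile(regex), profileName));
        return this;
    }

    Optional<String> lookup(String query) {
        return profiles.stream()
            // matches() requires the entire input to match (assumed semantics)
            .filter(e -> e.getKey().matcher(query).matches())
            .map(Map.Entry::getValue)
            .findFirst();
    }

    // Mirrors the two queryRegex values used by the test configuration.
    static QueryProfileLookup sample() {
        return new QueryProfileLookup()
            .add("[*]|", "select-all")
            .add(".*", "default");
    }

    public static void main(String[] args) {
        QueryProfileLookup lookup = sample();
        for (String query : List.of("*", "", "foo", "bar")) {
            System.out.println("`" + query + "` -> " + lookup.lookup(query).orElse("none"));
        }
    }
}
```

Under these assumptions the catch-all `.*` must be declared last; if it came first it would shadow every other profile.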


@ -1,16 +1,22 @@
package com.linkedin.metadata.search.elasticsearch.query.request;
import com.fasterxml.jackson.dataformat.yaml.YAMLMapper;
import com.google.common.collect.ImmutableList;
import com.linkedin.metadata.TestEntitySpecBuilder;
import java.io.IOException;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import com.linkedin.metadata.config.search.CustomConfiguration;
import com.linkedin.metadata.config.search.ExactMatchConfiguration;
import com.linkedin.metadata.config.search.PartialConfiguration;
import com.linkedin.metadata.config.search.SearchConfiguration;
import com.linkedin.metadata.config.search.custom.CustomSearchConfiguration;
import com.linkedin.util.Pair;
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.MatchAllQueryBuilder;
import org.elasticsearch.index.query.MatchPhrasePrefixQueryBuilder;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryStringQueryBuilder;
@ -47,7 +53,7 @@ public class SearchQueryBuilderTest {
testQueryConfig.setExactMatch(exactMatchConfiguration);
testQueryConfig.setPartial(partialConfiguration);
}
public static final SearchQueryBuilder TEST_BUILDER = new SearchQueryBuilder(testQueryConfig);
public static final SearchQueryBuilder TEST_BUILDER = new SearchQueryBuilder(testQueryConfig, null);
@Test
public void testQueryBuilderFulltext() {
@ -110,13 +116,15 @@ public class SearchQueryBuilderTest {
}
}).collect(Collectors.toList());
assertEquals(prefixFieldWeights, List.of(
assertEquals(prefixFieldWeights.size(), 22);
List.of(
Pair.of("urn", 100.0f),
Pair.of("urn", 70.0f),
Pair.of("keyPart1.delimited", 7.0f),
Pair.of("keyPart1.delimited", 16.8f),
Pair.of("keyPart1.keyword", 100.0f),
Pair.of("keyPart1.keyword", 70.0f)
));
).forEach(p -> assertTrue(prefixFieldWeights.contains(p), "Missing: " + p));
// Validate scorer
FunctionScoreQueryBuilder.FilterFunctionBuilder[] scoringFunctions = result.filterFunctionBuilders();
@ -147,4 +155,87 @@ public class SearchQueryBuilderTest {
FunctionScoreQueryBuilder.FilterFunctionBuilder[] scoringFunctions = result.filterFunctionBuilders();
assertEquals(scoringFunctions.length, 3);
}
private static final SearchQueryBuilder TEST_CUSTOM_BUILDER;
static {
try {
CustomSearchConfiguration customSearchConfiguration = new CustomConfiguration(
true, "search_config_builder_test.yml").customSearchConfiguration(new YAMLMapper());
TEST_CUSTOM_BUILDER = new SearchQueryBuilder(testQueryConfig, customSearchConfiguration);
} catch (IOException e) {
throw new RuntimeException(e);
}
}
@Test
public void testCustomSelectAll() {
for (String triggerQuery : List.of("*", "")) {
FunctionScoreQueryBuilder result = (FunctionScoreQueryBuilder) TEST_CUSTOM_BUILDER
.buildQuery(ImmutableList.of(TestEntitySpecBuilder.getSpec()), triggerQuery, true);
BoolQueryBuilder mainQuery = (BoolQueryBuilder) result.query();
List<QueryBuilder> shouldQueries = mainQuery.should();
assertEquals(shouldQueries.size(), 0);
}
}
@Test
public void testCustomExactMatch() {
for (String triggerQuery : List.of("test_table", "'single quoted'", "\"double quoted\"")) {
FunctionScoreQueryBuilder result = (FunctionScoreQueryBuilder) TEST_CUSTOM_BUILDER
.buildQuery(ImmutableList.of(TestEntitySpecBuilder.getSpec()), triggerQuery, true);
BoolQueryBuilder mainQuery = (BoolQueryBuilder) result.query();
List<QueryBuilder> shouldQueries = mainQuery.should();
assertEquals(shouldQueries.size(), 1, String.format("Expected query for `%s`", triggerQuery));
BoolQueryBuilder boolPrefixQuery = (BoolQueryBuilder) shouldQueries.get(0);
assertTrue(boolPrefixQuery.should().size() > 0);
List<QueryBuilder> queries = boolPrefixQuery.should().stream().map(prefixQuery -> {
if (prefixQuery instanceof MatchPhrasePrefixQueryBuilder) {
return (MatchPhrasePrefixQueryBuilder) prefixQuery;
} else {
// exact
return (TermQueryBuilder) prefixQuery;
}
}).collect(Collectors.toList());
assertFalse(queries.isEmpty(), "Expected queries with specific types");
}
}
@Test
public void testCustomDefault() {
for (String triggerQuery : List.of("foo", "bar", "foo\"bar", "foo:bar")) {
FunctionScoreQueryBuilder result = (FunctionScoreQueryBuilder) TEST_CUSTOM_BUILDER
.buildQuery(ImmutableList.of(TestEntitySpecBuilder.getSpec()), triggerQuery, true);
BoolQueryBuilder mainQuery = (BoolQueryBuilder) result.query();
List<QueryBuilder> shouldQueries = mainQuery.should();
assertEquals(shouldQueries.size(), 3);
List<QueryBuilder> queries = mainQuery.should().stream().map(query -> {
if (query instanceof SimpleQueryStringBuilder) {
return (SimpleQueryStringBuilder) query;
} else if (query instanceof MatchAllQueryBuilder) {
// custom
return (MatchAllQueryBuilder) query;
} else {
// exact
return (BoolQueryBuilder) query;
}
}).collect(Collectors.toList());
assertEquals(queries.size(), 3, "Expected queries with specific types");
// validate query injection
List<QueryBuilder> mustQueries = mainQuery.must();
assertEquals(mustQueries.size(), 1);
TermQueryBuilder termQueryBuilder = (TermQueryBuilder) mainQuery.must().get(0);
assertEquals(termQueryBuilder.fieldName(), "fieldName");
assertEquals(termQueryBuilder.value().toString(), triggerQuery);
}
}
}
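
The weights asserted in the prefix/exact-match test (100.0, 70.0, 16.8) are products of a field boost and the factors from `exactMatchConfiguration`, as built in `getPrefixAndExactMatchQuery`. The sketch below reproduces only that arithmetic. `BoostMath` is a hypothetical helper, and the concrete inputs in `main` (field boosts of 10 and 4, exact factor 10, prefix factor 6, case-sensitivity factor 0.7) are assumptions chosen so the products line up with the asserted values; they are not read from the test configuration shown in this diff.

```java
// Hypothetical helper illustrating the boost arithmetic; values in main are assumed.
final class BoostMath {
    // Exact match, case-sensitive: boost * exactFactor
    static float exactCaseSensitive(float fieldBoost, float exactFactor) {
        return fieldBoost * exactFactor;
    }

    // Exact match, case-insensitive: additionally scaled by caseSensitivityFactor
    static float exactCaseInsensitive(float fieldBoost, float exactFactor, float caseSensitivityFactor) {
        return fieldBoost * exactFactor * caseSensitivityFactor;
    }

    // Prefix match on a delimited subfield: boost * prefixFactor * caseSensitivityFactor
    static float prefix(float fieldBoost, float prefixFactor, float caseSensitivityFactor) {
        return fieldBoost * prefixFactor * caseSensitivityFactor;
    }

    public static void main(String[] args) {
        float exactFactor = 10.0f;           // assumed
        float prefixFactor = 6.0f;           // assumed
        float caseSensitivityFactor = 0.7f;  // assumed
        System.out.printf("exact, case-sensitive:   %.1f%n",
            exactCaseSensitive(10.0f, exactFactor));
        System.out.printf("exact, case-insensitive: %.1f%n",
            exactCaseInsensitive(10.0f, exactFactor, caseSensitivityFactor));
        System.out.printf("prefix (delimited):      %.1f%n",
            prefix(4.0f, prefixFactor, caseSensitivityFactor));
    }
}
```

Because the case-insensitive and prefix variants are scaled by a factor below 1, a case-sensitive exact match on the same field always ranks highest, which is the ordering the case-sensitivity test relies on.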


@ -74,7 +74,7 @@ public class SearchRequestHandlerTest extends AbstractTestNGSpringContextTests {
@Test
public void testDatasetFieldsAndHighlights() {
EntitySpec entitySpec = entityRegistry.getEntitySpec("dataset");
SearchRequestHandler datasetHandler = SearchRequestHandler.getBuilder(entitySpec, testQueryConfig);
SearchRequestHandler datasetHandler = SearchRequestHandler.getBuilder(entitySpec, testQueryConfig, null);
/*
Ensure efficient query performance, we do not expect upstream/downstream/fineGrained lineage
@ -89,7 +89,7 @@ public class SearchRequestHandlerTest extends AbstractTestNGSpringContextTests {
@Test
public void testSearchRequestHandler() {
SearchRequestHandler requestHandler = SearchRequestHandler.getBuilder(TestEntitySpecBuilder.getSpec(), testQueryConfig);
SearchRequestHandler requestHandler = SearchRequestHandler.getBuilder(TestEntitySpecBuilder.getSpec(), testQueryConfig, null);
SearchRequest searchRequest = requestHandler.getSearchRequest("testQuery", null, null, 0,
10, new SearchFlags().setFulltext(false));
SearchSourceBuilder sourceBuilder = searchRequest.source();
@ -118,7 +118,7 @@ public class SearchRequestHandlerTest extends AbstractTestNGSpringContextTests {
@Test
public void testFilteredSearch() {
final SearchRequestHandler requestHandler = SearchRequestHandler.getBuilder(TestEntitySpecBuilder.getSpec(), testQueryConfig);
final SearchRequestHandler requestHandler = SearchRequestHandler.getBuilder(TestEntitySpecBuilder.getSpec(), testQueryConfig, null);
final BoolQueryBuilder testQuery = constructFilterQuery(requestHandler, false);
@ -398,7 +398,7 @@ public class SearchRequestHandlerTest extends AbstractTestNGSpringContextTests {
));
final SearchRequestHandler requestHandler = SearchRequestHandler.getBuilder(
TestEntitySpecBuilder.getSpec(), testQueryConfig);
TestEntitySpecBuilder.getSpec(), testQueryConfig, null);
return (BoolQueryBuilder) requestHandler
.getSearchRequest("", filter, null, 0, 10, new SearchFlags().setFulltext(false))


@ -0,0 +1,74 @@
# Used for testing more real-world configurations
queryConfigurations:
  # Criteria for exact-match only
  # If the query contains `_`, `'`, or `"`, use the exact match query
  - queryRegex: >-
      ["'].+["']|\S+_\S+
    simpleQuery: false
    prefixMatchQuery: true
    exactMatchQuery: true
    functionScore:
      functions:
        - filter:
            match_all: {}
          weight: 1
        - filter:
            term:
              materialized:
                value: true
          weight: 0.5
        - filter:
            term:
              deprecated:
                value: true
          weight: 0.5
      score_mode: avg
      boost_mode: multiply
  # Select *
  - queryRegex: '[*]|'
    simpleQuery: false
    prefixMatchQuery: false
    exactMatchQuery: false
    functionScore:
      functions:
        - filter:
            match_all: {}
          weight: 1
        - filter:
            term:
              materialized:
                value: true
          weight: 0.5
        - filter:
            term:
              deprecated:
                value: true
          weight: 0.5
      score_mode: avg
      boost_mode: multiply
  # default
  - queryRegex: .*
    simpleQuery: true
    prefixMatchQuery: true
    exactMatchQuery: true
    boolQuery:
      should:
        match_all: {}
      must:
        - term:
            fieldName: '{{query_string}}'
    functionScore:
      functions:
        - filter:
            match_all: {}
          weight: 1
        - filter:
            term:
              materialized:
                value: true
          weight: 0.5
      score_mode: avg
      boost_mode: multiply
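
The `{{query_string}}` placeholder in the `boolQuery` section is replaced with the user's search text before the query is sent to Elasticsearch; `testCustomDefault` checks that the injected `term` value equals the trigger query. A minimal sketch of one way to perform that substitution on an already-parsed configuration (nested maps and lists) follows. `QueryTemplate` is a hypothetical helper written for illustration; the commit's actual substitution code is not shown in this diff and may work differently, for example on serialized JSON.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; not the substitution code from this commit.
final class QueryTemplate {
    static final String PLACEHOLDER = "{{query_string}}";

    // Returns a copy of the parsed config with the placeholder replaced in every string value.
    static Object substitute(Object node, String query) {
        if (node instanceof String) {
            // String.replace is a literal replacement, so regex metacharacters in the query are safe
            return ((String) node).replace(PLACEHOLDER, query);
        }
        if (node instanceof Map) {
            Map<String, Object> copy = new LinkedHashMap<>();
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) node).entrySet()) {
                copy.put(String.valueOf(entry.getKey()), substitute(entry.getValue(), query));
            }
            return copy;
        }
        if (node instanceof List) {
            List<Object> copy = new ArrayList<>();
            for (Object item : (List<?>) node) {
                copy.add(substitute(item, query));
            }
            return copy;
        }
        return node; // numbers, booleans and nulls pass through unchanged
    }

    public static void main(String[] args) {
        // Mirrors the `must` clause of the default profile above
        List<Object> must = List.of(Map.of("term", Map.of("fieldName", PLACEHOLDER)));
        System.out.println(substitute(must, "foo:bar"));
    }
}
```

Substituting on the parsed structure rather than on raw text means a query such as `foo"bar` needs no escaping, since the quote never passes through a string-level JSON or YAML layer.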


@ -0,0 +1,51 @@
# Use for testing with search fixtures
queryConfigurations:
  # Select *
  - queryRegex: '[*]|'
    simpleQuery: false
    prefixMatchQuery: false
    exactMatchQuery: false
    functionScore:
      functions:
        - filter:
            match_all: {}
          weight: 1
        - filter:
            term:
              materialized:
                value: true
          weight: 0.5
        - filter:
            term:
              deprecated:
                value: true
          weight: 0.5
      score_mode: avg
      boost_mode: multiply
  - queryRegex: .*
    simpleQuery: true
    prefixMatchQuery: true
    exactMatchQuery: true
    functionScore:
      functions:
        - filter:
            match_all: {}
          weight: 1
        - filter:
            term:
              materialized:
                value: true
          weight: 0.5
        - filter:
            term:
              deprecated:
                value: true
          weight: 0.5
        - filter:
            terms:
              tags:
                - urn:li:tag:pii
          weight: 1.25
      score_mode: avg
      boost_mode: multiply


@ -0,0 +1,55 @@
# https://www.elastic.co/guide/en/elasticsearch/reference/7.17/query-dsl-function-score-query.html
# First match
queryConfigurations:
  # `*` or empty, select all queries
  - queryRegex: '[*]|'
    simpleQuery: false
    prefixMatchQuery: false
    exactMatchQuery: false
    functionScore:
      functions:
        - filter:
            match_all: { }
          weight: 1
        - filter:
            term:
              materialized:
                value: true
          weight: 0.5
        - filter:
            term:
              deprecated:
                value: true
          weight: 0.5
      score_mode: avg
      boost_mode: multiply
  # default catch all
  - queryRegex: .*
    simpleQuery: true
    prefixMatchQuery: true
    exactMatchQuery: true
    # {{query_string}} is the search query string
    # https://www.elastic.co/guide/en/elasticsearch/reference/7.17/query-dsl-bool-query.html
    boolQuery:
      must:
        - term:
            name: '{{query_string}}'
    functionScore:
      functions:
        - filter:
            match_all: {}
          weight: 1
        - filter:
            term:
              materialized:
                value: true
          weight: 0.5
        - filter:
            term:
              deprecated:
                value: false
          weight: 1.5
      score_mode: avg
      boost_mode: multiply


@ -16,7 +16,8 @@ record SchemaField {
@Searchable = {
"fieldName": "fieldPaths",
"fieldType": "TEXT",
"boostScore": 5.0
"boostScore": 5.0,
"queryByDefault": "true"
}
fieldPath: SchemaFieldPath


@ -1,8 +1,13 @@
package com.linkedin.gms.factory.search;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.dataformat.yaml.YAMLMapper;
import com.linkedin.gms.factory.config.ConfigurationProvider;
import com.linkedin.gms.factory.entityregistry.EntityRegistryFactory;
import com.linkedin.gms.factory.spring.YamlPropertySourceFactory;
import com.linkedin.metadata.config.search.ElasticSearchConfiguration;
import com.linkedin.metadata.config.search.SearchConfiguration;
import com.linkedin.metadata.config.search.custom.CustomSearchConfiguration;
import com.linkedin.metadata.models.registry.EntityRegistry;
import com.linkedin.metadata.search.elasticsearch.ElasticSearchService;
import com.linkedin.metadata.search.elasticsearch.indexbuilder.EntityIndexBuilders;
@ -20,12 +25,16 @@ import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Import;
import org.springframework.context.annotation.PropertySource;
import java.io.IOException;
@Slf4j
@Configuration
@PropertySource(value = "classpath:/application.yml", factory = YamlPropertySourceFactory.class)
@Import({EntityRegistryFactory.class, SettingsBuilderFactory.class})
public class ElasticSearchServiceFactory {
private static final ObjectMapper YAML_MAPPER = new YAMLMapper();
@Autowired
@Qualifier("baseElasticSearchComponents")
private BaseElasticSearchComponentsFactory.BaseElasticSearchComponents components;
@ -43,13 +52,18 @@ public class ElasticSearchServiceFactory {
@Bean(name = "elasticSearchService")
@Nonnull
protected ElasticSearchService getInstance(ConfigurationProvider configurationProvider) {
protected ElasticSearchService getInstance(ConfigurationProvider configurationProvider) throws IOException {
log.info("Search configuration: {}", configurationProvider.getElasticSearch().getSearch());
ElasticSearchConfiguration elasticSearchConfiguration = configurationProvider.getElasticSearch();
SearchConfiguration searchConfiguration = elasticSearchConfiguration.getSearch();
CustomSearchConfiguration customSearchConfiguration = searchConfiguration.getCustom() == null ? null
: searchConfiguration.getCustom().customSearchConfiguration(YAML_MAPPER);
ESSearchDAO esSearchDAO =
new ESSearchDAO(entityRegistry, components.getSearchClient(), components.getIndexConvention(),
configurationProvider.getFeatureFlags().isPointInTimeCreationEnabled(),
configurationProvider.getElasticSearch().getImplementation(), configurationProvider.getElasticSearch().getSearch());
configurationProvider.getFeatureFlags().isPointInTimeCreationEnabled(),
elasticSearchConfiguration.getImplementation(), searchConfiguration, customSearchConfiguration);
return new ElasticSearchService(
new EntityIndexBuilders(components.getIndexBuilder(), entityRegistry, components.getIndexConvention(),
settingsBuilder), esSearchDAO,


@ -197,6 +197,9 @@ elasticsearch:
partial:
urnFactor: ${ELASTICSEARCH_QUERY_PARTIAL_URN_FACTOR:0.5} # multiplier on Urn token match, a partial match on Urn > non-Urn is assumed
factor: ${ELASTICSEARCH_QUERY_PARTIAL_FACTOR:0.4} # multiplier on possible non-Urn token match
custom:
configEnabled: ${ELASTICSEARCH_QUERY_CUSTOM_CONFIG_ENABLED:false}
configFile: ${ELASTICSEARCH_QUERY_CUSTOM_CONFIG_FILE:search_config.yml}
graph:
timeoutSeconds: ${ELASTICSEARCH_SEARCH_GRAPH_TIMEOUT_SECONDS:50} # graph dao timeout seconds
batchSize: ${ELASTICSEARCH_SEARCH_GRAPH_BATCH_SIZE:1000} # graph dao batch size


@ -0,0 +1,71 @@
# Notes:
#
# First match wins
#
# queryRegex = Java regex syntax
#
# functionScores - See the following for function score syntax
# https://www.elastic.co/guide/en/elasticsearch/reference/7.17/query-dsl-function-score-query.html
queryConfigurations:
# Select *
- queryRegex: '[*]|'
simpleQuery: false
prefixMatchQuery: false
exactMatchQuery: false
boolQuery:
must_not:
term:
deprecated:
value: true
functionScore:
functions:
- filter:
term:
materialized:
value: true
weight: 0.8
score_mode: multiply
boost_mode: multiply
# Criteria for exact-match only
# If the query is quoted or contains an underscore, use the exact-match query
- queryRegex: >-
["'].+["']|\S+_\S+
simpleQuery: false
prefixMatchQuery: true
exactMatchQuery: true
functionScore:
functions:
- filter:
term:
materialized:
value: true
weight: 0.8
- filter:
term:
deprecated:
value: true
weight: 0
score_mode: multiply
boost_mode: multiply
# default
- queryRegex: .*
simpleQuery: true
prefixMatchQuery: true
exactMatchQuery: true
boolQuery:
must_not:
term:
deprecated:
value: true
functionScore:
functions:
- filter:
term:
materialized:
value: true
weight: 0.8
score_mode: multiply
boost_mode: multiply
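
To make the effect of `score_mode: multiply` and `boost_mode: multiply` concrete, here is a rough arithmetic sketch in Java. It is not Elasticsearch or DataHub code: the class `FunctionScoreSketch` and the method `finalScore` are hypothetical. It assumes weight-only functions, where each function whose filter matches contributes its weight, and a combined function score of 1 when no function matches.

```java
// Illustrative only: models the arithmetic of weight-only functions, nothing more.
class FunctionScoreSketch {
    /**
     * @param queryScore     relevancy score produced by the query itself
     * @param matchedWeights weights of the functions whose filter matched the document
     */
    static double finalScore(double queryScore, double... matchedWeights) {
        double functionScore = 1.0; // assumption: neutral value when no function matches
        for (double weight : matchedWeights) {
            functionScore *= weight; // score_mode: multiply
        }
        return queryScore * functionScore; // boost_mode: multiply
    }

    public static void main(String[] args) {
        double queryScore = 2.0; // hypothetical relevancy score
        System.out.println("plain:        " + finalScore(queryScore));
        System.out.println("materialized: " + finalScore(queryScore, 0.8));
        System.out.println("deprecated:   " + finalScore(queryScore, 0.0));
    }
}
```

With the weights from the exact-match profile, a materialized entity keeps 80% of its relevancy score, while a weight of `0` drives a deprecated entity's score to zero. The weight only affects ranking: it does not filter the entity out, which is the job of the `must_not` clause in the other profiles.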


@ -66,7 +66,7 @@ public class ConfigSearchExport extends HttpServlet {
.filter(Optional::isPresent)
.forEach(entitySpecOpt -> {
EntitySpec entitySpec = entitySpecOpt.get();
SearchRequest searchRequest = SearchRequestHandler.getBuilder(entitySpec, searchConfiguration)
SearchRequest searchRequest = SearchRequestHandler.getBuilder(entitySpec, searchConfiguration, null)
.getSearchRequest("*", null, null, 0, 0, new SearchFlags()
.setFulltext(true).setSkipHighlighting(true).setSkipAggregates(true));