ElasticSearch - query documents by term and field priority
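I am currently working with Elasticsearch, and I am trying to implement a query from my Java backend that retrieves documents from my index not only by term but also by field priority. The documents in my index contain a term field and a type field.
e.g.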
term: "Flu Shot"
type: "procedure"
term: "Fluphenazine"
type: "drug"
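I created a query that searches by term, and the index returns the most relevant results matching that term. What I want is a query that returns results matching the same term but ordered by the priority of the 'type' field. For example, when I type "flu", I want to get the documents with type "procedure" first, followed by the documents with type "drug". Currently, because many drugs start with "flu", the index returns only documents with type "drug".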
You can use function_score.
The function_score query lets you modify the score of documents retrieved by a query. To use function_score, you define a query and one or more functions that compute a new score for each document returned by the query.
Example with your data (using Elasticsearch server 7.9):
Create the index and add documents
PUT /example_index
{
  "mappings": {
    "properties": {
      "term": {"type": "text"},
      "type": {"type": "keyword"}
    }
  }
}
PUT /_bulk
{"create": {"_index": "example_index", "_id": 1}}
{"term": "Flu Shot", "type": "procedure"}
{"create": {"_index": "example_index", "_id": 2}}
{"term": "Fluphenazine", "type": "drug"}
{"create": {"_index": "example_index", "_id": 3}}
{"term": "Flu Shot2", "type": "procedure"}
{"create": {"_index": "example_index", "_id": 4}}
{"term": "Fluphenazine2", "type": "drug"}
Query the documents using custom scoring logic
GET /example_index/_search
{
  "query": {
    "function_score": {
      "query": {
        "wildcard": {
          "term": {
            "value": "*flu*"
          }
        }
      },
      "functions": [
        {
          "filter": {
            "term": {
              "type": "procedure"
            }
          },
          "weight": 2
        },
        {
          "filter": {
            "term": {
              "type": "drug"
            }
          },
          "weight": 1
        }
      ]
    }
  }
}
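Since the question mentions a Java backend, here is a minimal sketch of how the same request body could be assembled there. It uses plain string building rather than any Elasticsearch client library; the class FunctionScoreBody and its build method are hypothetical names introduced only for illustration, and the sketch assumes the field names and values contain no characters that need JSON escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of any Elasticsearch client): builds the
// function_score request body shown above as a JSON string.
// Assumption: field names and values need no JSON escaping.
class FunctionScoreBody {

    static String build(String field, String pattern, String typeField, Map<String, Integer> weights) {
        StringBuilder functions = new StringBuilder();
        // One filter/weight function per type, in the map's iteration order.
        for (Map.Entry<String, Integer> e : weights.entrySet()) {
            if (functions.length() > 0) {
                functions.append(",");
            }
            functions.append("{\"filter\":{\"term\":{\"")
                     .append(typeField).append("\":\"").append(e.getKey())
                     .append("\"}},\"weight\":").append(e.getValue()).append("}");
        }
        return "{\"query\":{\"function_score\":{"
             + "\"query\":{\"wildcard\":{\"" + field + "\":{\"value\":\"" + pattern + "\"}}},"
             + "\"functions\":[" + functions + "]}}}";
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the priority order: procedure first, then drug.
        Map<String, Integer> weights = new LinkedHashMap<>();
        weights.put("procedure", 2);
        weights.put("drug", 1);
        System.out.println(build("term", "*flu*", "type", weights));
    }
}
```

The resulting string can be sent as the body of GET /example_index/_search with whatever HTTP or Elasticsearch client the backend already uses. Adding another type with its own priority only requires one more entry in the weights map.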
Result:
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : 2.0,
    "hits" : [
      {
        "_index" : "example_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 2.0,
        "_source" : {
          "term" : "Flu Shot",
          "type" : "procedure"
        }
      },
      {
        "_index" : "example_index",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 2.0,
        "_source" : {
          "term" : "Flu Shot2",
          "type" : "procedure"
        }
      },
      {
        "_index" : "example_index",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "term" : "Fluphenazine",
          "type" : "drug"
        }
      },
      {
        "_index" : "example_index",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 1.0,
        "_source" : {
          "term" : "Fluphenazine2",
          "type" : "drug"
        }
      }
    ]
  }
}
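You can see that the documents with type set to procedure score higher than the documents with type set to drug. This is because we assigned different weights to the different type values in function_score.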
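The arithmetic behind those scores can be sketched as follows. This is an illustration only, not Elasticsearch code: it assumes the wildcard query gives every match a constant score of 1.0 and that function_score multiplies that score by the weight of the matching filter (the default boost_mode), with a neutral factor of 1.0 when no filter matches. The class WeightedScoreSketch is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: reproduces the arithmetic behind the scores above,
// it does not call Elasticsearch.
class WeightedScoreSketch {

    // Assumption: final score = query score * weight of the matching filter,
    // with a neutral factor of 1.0 for types that have no weight configured.
    static double score(double queryScore, String type, Map<String, Double> weights) {
        return queryScore * weights.getOrDefault(type, 1.0);
    }

    public static void main(String[] args) {
        Map<String, Double> weights = new LinkedHashMap<>();
        weights.put("procedure", 2.0);
        weights.put("drug", 1.0);

        double wildcardScore = 1.0; // assumed constant score of a wildcard match
        System.out.println("procedure -> " + score(wildcardScore, "procedure", weights));
        System.out.println("drug -> " + score(wildcardScore, "drug", weights));
    }
}
```

Under these assumptions the sketch yields 2.0 for procedure documents and 1.0 for drug documents, matching the _score values in the result. If the inner query produced varying relevance scores instead of a constant, the weights would scale those scores rather than fix the order outright.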