ElasticSearch NEST combining AND with OR queries
Question
How do I write NEST code that generates an Elasticsearch query for this simple boolean logic?
term1 && (term2 || term3 || term4)
Here is pseudocode for the logic I am trying to implement, querying ElasticSearch (5.2) with NEST (5.2) statements:
// additional requirements
( truckOemName = "HYSTER" && truckModelName = "S40FT" && partCategoryCode = "RECO" && partID != "")
//Section I can't get working correctly
AND (
( SerialRangeInclusiveFrom <= "F187V-6785D" AND SerialRangeInclusiveTo >= "F187V-6060D" )
OR
( SerialRangeInclusiveFrom = "" || SerialRangeInclusiveTo = "" )
)
Interpreting the relevant documentation
The "Combining queries with || or should clauses" section of Writing Bool Queries says:
The bool query does not quite follow the same boolean logic you expect from a programming language. term1 && (term2 || term3 || term4) does not become
bool
|___must
|   |___term1
|
|___should
    |___term2
    |___term3
    |___term4
you could get back results that only contain term1
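This is exactly what I think is happening.
But their answers for solving this are beyond my understanding of how to apply them to NEST. The answers are:
- Add parentheses to force evaluation order (I am)
- Use boost factor? (what?)
Code
Here is the NEST code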
var searchDescriptor = new SearchDescriptor<ElasticPart>();
var terms = new List<Func<QueryContainerDescriptor<ElasticPart>, QueryContainer>>
{
s =>
(s.TermRange(r => r.Field(f => f.SerialRangeInclusiveFrom)
.LessThanOrEquals(dataSearchParameters.SerialRangeEnd))
&&
s.TermRange(r => r.Field(f => f.SerialRangeInclusiveTo)
.GreaterThanOrEquals(dataSearchParameters.SerialRangeStart)))
//None of the data that matches these ORs returns with the query this code generates, below.
||
(!s.Exists(exists => exists.Field(f => f.SerialRangeInclusiveFrom))
||
!s.Exists(exists => exists.Field(f => f.SerialRangeInclusiveTo))
)
};
//Terms is the piece in question
searchDescriptor.Query(s => s.Bool(bq => bq.Filter(terms))
&& !s.Terms(term => term.Field(x => x.OemID)
.Terms(RulesHelper.GetOemExclusionList(exclusions))));
searchDescriptor.Aggregations(a => a
.Terms(aggPartInformation, t => t.Script(s => s.Inline(script)).Size(50000))
);
searchDescriptor.Type(string.Empty);
searchDescriptor.Size(0);
var searchResponse = ElasticClient.Search<ElasticPart>(searchDescriptor);
Here is the ES JSON query it generates
{
"query":{
"bool":{
"must":[
{
"term":{ "truckOemName": { "value":"HYSTER" }}
},
{
"term":{ "truckModelName": { "value":"S40FT" }}
},
{
"term":{ "partCategoryCode": { "value":"RECO" }}
},
{
"bool":{
"should":[
{
"bool":{
"must":[
{
"range":{ "serialRangeInclusiveFrom": { "lte":"F187V-6785D" }}
},
{
"range":{ "serialRangeInclusiveTo": { "gte":"F187V-6060D" }}
}
]
}
},
{
"bool":{
"must_not":[
{
"exists":{ "field":"serialRangeInclusiveFrom" }
}
]
}
},
{
"bool":{
"must_not":[
{
"exists":{ "field":"serialRangeInclusiveTo" }
}
]
}
}
]
}
},
{
"exists":{
"field":"partID"
}
}
]
}
}
}
Here is the query we would like it to generate, which seems to work.
{
"query": {
"bool": {
"must": [
{
"bool": {
"must": [
{
"term": { "truckOemName": { "value": "HYSTER" }}
},
{
"term": {"truckModelName": { "value": "S40FT" }}
},
{
"term": {"partCategoryCode": { "value": "RECO" }}
},
{
"exists": { "field": "partID" }
}
],
"should": [
{
"bool": {
"must": [
{
"range": { "serialRangeInclusiveFrom": {"lte": "F187V-6785D"}}
},
{
"range": {"serialRangeInclusiveTo": {"gte": "F187V-6060D"}}
}
]
}
},
{
"bool": {
"must_not": [
{
"exists": {"field": "serialRangeInclusiveFrom"}
},
{
"exists": { "field": "serialRangeInclusiveTo"}
}
]
}
}
]
}
}
]
}
}
}
Answer
With the overloaded operators for bool queries, it is not possible to express a must clause combined with a should clause, i.e.
term1 && (term2 || term3 || term4)
becomes
bool
|___must
    |___term1
    |___bool
        |___should
            |___term2
            |___term3
            |___term4
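This is a bool query with two must clauses, where the second must clause is a bool query that has to match at least one of its should clauses. NEST combines queries like this because it matches the expectation for boolean logic within .NET.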
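As a minimal sketch of the operator form described above, the expression can be written directly with NEST's overloaded && and || operators. The Doc type and its Field1/Field2 properties are hypothetical and exist only for illustration; the sketch assumes the NEST 5.x package is referenced.

```csharp
using Nest;

// Hypothetical document type, used only for illustration.
public class Doc
{
    public string Field1 { get; set; }
    public string Field2 { get; set; }
}

public static class OperatorExample
{
    // Builds term1 && (term2 || term3 || term4) with the overloaded operators.
    // NEST is expected to combine this into a bool query with two must clauses,
    // the second being a nested bool query holding the three should clauses.
    public static SearchDescriptor<Doc> Build()
    {
        return new SearchDescriptor<Doc>()
            .Query(q =>
                q.Term(f => f.Field1, "term1") &&
                (q.Term(f => f.Field2, "term2") ||
                 q.Term(f => f.Field2, "term3") ||
                 q.Term(f => f.Field2, "term4")));
    }
}
```

Because the should clauses sit in their own bool query inside must, a document has to match at least one of them to be returned.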
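If it did become
bool
|___must
|   |___term1
|
|___should
    |___term2
    |___term3
    |___term4
then a document would be considered a match if it satisfied only the must clause. The should clauses in this case act as a boost: if a document matches one or more of the should clauses in addition to the must clause, it will have a higher relevancy score, assuming term2, term3 and term4 are queries that calculate a relevancy score.
On this basis, the query you want to generate says that, for a document to be considered a match, it must match all 4 queries in the must clause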
"must": [
{
"term": { "truckOemName": { "value": "HYSTER" }}
},
{
"term": {"truckModelName": { "value": "S40FT" }}
},
{
"term": {"partCategoryCode": { "value": "RECO" }}
},
{
"exists": { "field": "partID" }
}
],
Then, for documents matching the must clauses, if
1. it has a serialRangeInclusiveFrom less than or equal to "F187V-6785D" and a serialRangeInclusiveTo greater than or equal to "F187V-6060D"
or
2. it has neither a serialRangeInclusiveFrom nor a serialRangeInclusiveTo
then boost that document's relevancy score. The crucial point is that
If a document matches the must clauses but does not match any of the should clauses, it will still be a match for the query (but have a lower relevancy score).
If this is the intent, the query can be constructed using the longer form of the Bool query
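A rough sketch of what that longer form might look like in NEST 5.x follows. The ElasticPart class shown here is a hypothetical stand-in: its property names are assumed from the camelCase field names in the JSON above, and the serialStart/serialEnd parameters replace the dataSearchParameters values from the question. It produces the inner bool query of the desired JSON, not the extra outer must wrapper.

```csharp
using Nest;

// Hypothetical shape of the document type; property names are assumed
// from the field names in the JSON queries.
public class ElasticPart
{
    public string TruckOemName { get; set; }
    public string TruckModelName { get; set; }
    public string PartCategoryCode { get; set; }
    public string PartID { get; set; }
    public string SerialRangeInclusiveFrom { get; set; }
    public string SerialRangeInclusiveTo { get; set; }
}

public static class PartQueries
{
    public static SearchDescriptor<ElasticPart> Build(string serialStart, string serialEnd)
    {
        return new SearchDescriptor<ElasticPart>()
            .Query(q => q
                .Bool(b => b
                    // All four must clauses are required for a match.
                    .Must(
                        m => m.Term(f => f.TruckOemName, "HYSTER"),
                        m => m.Term(f => f.TruckModelName, "S40FT"),
                        m => m.Term(f => f.PartCategoryCode, "RECO"),
                        m => m.Exists(e => e.Field(f => f.PartID)))
                    // Sibling should clauses: optional here, they only affect scoring.
                    .Should(
                        s => s.Bool(bb => bb.Must(
                            mm => mm.TermRange(r => r
                                .Field(f => f.SerialRangeInclusiveFrom)
                                .LessThanOrEquals(serialEnd)),
                            mm => mm.TermRange(r => r
                                .Field(f => f.SerialRangeInclusiveTo)
                                .GreaterThanOrEquals(serialStart)))),
                        s => s.Bool(bb => bb.MustNot(
                            mn => mn.Exists(e => e.Field(f => f.SerialRangeInclusiveFrom)),
                            mn => mn.Exists(e => e.Field(f => f.SerialRangeInclusiveTo)))))));
    }
}
```

If the serial-range conditions are meant to restrict which documents match rather than only boost them, keeping the should clauses in their own bool query nested inside must, as the operator form does, expresses that instead.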