Filtered bool vs Bool query: Elasticsearch
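I have two queries in ES. They take different amounts of time on the same set of documents, even though conceptually both do the same thing. I have a few questions:
1- What is the difference between these two?
2- Which one is better to use?
3- If both are the same, why do they perform differently?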
1. Filtered bool
{
  "from": 0,
  "size": 5,
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "term": {
                "called_party_address_number": "1987112602"
              }
            },
            {
              "term": {
                "original_sender_address_number": "6870340319"
              }
            },
            {
              "range": {
                "x_event_timestamp": {
                  "gte": "2016-07-01T00:00:00.000Z",
                  "lte": "2016-07-30T00:00:00.000Z"
                }
              }
            }
          ]
        }
      }
    }
  },
  "sort": [
    {
      "x_event_timestamp": {
        "order": "desc",
        "ignore_unmapped": true
      }
    }
  ]
}
2. Simple Bool
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "called_party_address_number": "1277478699"
          }
        },
        {
          "term": {
            "original_sender_address_number": "8020564722"
          }
        },
        {
          "term": {
            "cause_code": "573"
          }
        },
        {
          "range": {
            "x_event_timestamp": {
              "gt": "2016-07-13T13:51:03.749Z",
              "lt": "2016-07-16T13:51:03.749Z"
            }
          }
        }
      ]
    }
  },
  "from": 0,
  "size": 10,
  "sort": [
    {
      "x_event_timestamp": {
        "order": "desc",
        "ignore_unmapped": true
      }
    }
  ]
}
Mapping:
{
  "ccp": {
    "mappings": {
      "type1": {
        "properties": {
          "original_sender_address_number": {
            "type": "string"
          },
          "called_party_address_number": {
            "type": "string"
          },
          "cause_code": {
            "type": "string"
          },
          "x_event_timestamp": {
            "type": "date",
            "format": "strict_date_optional_time||epoch_millis"
          },
          .
          .
          .
        }
      }
    }
  }
}
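Update 1:
I tried the bool/must query and the bool/filter query on the same set of data, but I found some strange behaviour.
1- The bool/must query is able to find the desired document: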
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "called_party_address_number": "8701662243"
          }
        },
        {
          "term": {
            "cause_code": "401"
          }
        }
      ]
    }
  }
}
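2- The bool/filter query, however, is not able to find the document. If I remove the second term condition, it finds the same record, whose second field (cause_code) has the value 401.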
{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "called_party_address_number": "8701662243"
          }
        },
        {
          "term": {
            "cause_code": "401"
          }
        }
      ]
    }
  }
}
Update 2:
Found a way to suppress the scoring phase of the bool/must query by wrapping the query in "constant_score":
{
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must": [
            {
              "term": {
                "called_party_address_number": "1235235757"
              }
            },
            {
              "term": {
                "cause_code": "304"
              }
            }
          ]
        }
      }
    }
  }
}
The record we are trying to match has "called_party_address_number": "1235235757" and "cause_code": "304".
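To make the wrapping from Update 2 concrete, here is a minimal Python sketch that builds the same request body from plain dicts. The helper name `wrap_constant_score` is hypothetical and not part of any Elasticsearch client; the sketch only shows the shape of the body and does not send anything to a cluster.

```python
import json


def wrap_constant_score(query):
    """Hypothetical helper: place a query clause in the filter slot of constant_score."""
    return {"constant_score": {"filter": query}}


# The bool/must clause from Update 2, written as a Python dict.
bool_must = {
    "bool": {
        "must": [
            {"term": {"called_party_address_number": "1235235757"}},
            {"term": {"cause_code": "304"}},
        ]
    }
}

# Request body with the bool/must clause wrapped in constant_score.
body = {"query": wrap_constant_score(bool_must)}
print(json.dumps(body, indent=2))
```

The printed JSON could then be used as the search request body. Since the clause sits in the `filter` slot of `constant_score`, matching documents should all receive the same score rather than a computed relevance score.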
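The first one uses the old 1.x query/filter syntax (filtered queries have been deprecated in favor of bool/filter).
The second one uses the new 2.x syntax but does not run in a filter context (you are using bool/must instead of bool/filter). The query in 2.x syntax that is equivalent to your first query (i.e. one that runs in a filter context without score calculation, and is therefore faster) would be this one: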
{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "called_party_address_number": "1277478699"
          }
        },
        {
          "term": {
            "original_sender_address_number": "8020564722"
          }
        },
        {
          "term": {
            "cause_code": "573"
          }
        },
        {
          "range": {
            "x_event_timestamp": {
              "gt": "2016-07-13T13:51:03.749Z",
              "lt": "2016-07-16T13:51:03.749Z"
            }
          }
        }
      ]
    }
  },
  "from": 0,
  "size": 10,
  "sort": [
    {
      "x_event_timestamp": {
        "order": "desc",
        "ignore_unmapped": true
      }
    }
  ]
}
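The same rewrite can be applied mechanically to the first query. The Python sketch below is one possible way to do it, assuming the usual migration rule that the `query` part of a `filtered` query becomes `bool.must` and its `filter` part becomes `bool.filter`. The function name `filtered_to_bool` is hypothetical; it only handles a `filtered` query at the top level of the body and ignores any other keys inside `filtered`.

```python
import copy


def filtered_to_bool(body):
    """Hypothetical helper: rewrite a top-level 1.x `filtered` query as a 2.x `bool` query."""
    new_body = copy.deepcopy(body)  # leave the caller's dict untouched
    query = new_body.get("query", {})
    if "filtered" not in query:
        return new_body  # nothing to rewrite
    filtered = query["filtered"]
    bool_query = {}
    if "query" in filtered:
        bool_query["must"] = filtered["query"]  # scored part stays in query context
    if "filter" in filtered:
        bool_query["filter"] = filtered["filter"]  # unscored part runs in filter context
    new_body["query"] = {"bool": bool_query}
    return new_body


# The first query from the question, as a Python dict.
old_body = {
    "from": 0,
    "size": 5,
    "query": {
        "filtered": {
            "filter": {
                "bool": {
                    "must": [
                        {"term": {"called_party_address_number": "1987112602"}},
                        {"term": {"original_sender_address_number": "6870340319"}},
                        {"range": {"x_event_timestamp": {
                            "gte": "2016-07-01T00:00:00.000Z",
                            "lte": "2016-07-30T00:00:00.000Z",
                        }}},
                    ]
                }
            }
        }
    },
    "sort": [{"x_event_timestamp": {"order": "desc", "ignore_unmapped": True}}],
}

new_body = filtered_to_bool(old_body)
```

Here `new_body["query"]` becomes a `bool` query whose `filter` clause holds the original inner `bool`/`must`, so the whole condition still runs in filter context. The nested `bool` could be flattened by hand into a plain list of clauses under `filter`, as in the query above.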