Search for an array of values
I have an index in Elasticsearch in which the document body contains a field holding an array of values. For example:
{
"took": 0,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 1,
"hits": [
{
"_index": "families",
"_type": "family",
"_id": "o8qxd2EB9CizMt-k15mv",
"_score": 1,
"_source": {
"names": [
"Jefferson Erickson",
"Bailee Miller",
"Ahmed Bray"
]
}
},
{
"_index": "families",
"_type": "family",
"_id": "osqxd2EB9CizMt-kfZlJ",
"_score": 1,
"_source": {
"names": [
"Nia Walsh",
"Jefferson Erickson",
"Darryl Stark"
]
}
},
{
"_index": "families",
"_type": "family",
"_id": "pMrEd2EB9CizMt-kq5m-",
"_score": 1,
"_source": {
"names": [
"lia shelton",
"joanna shaffer",
"mathias little"
]
}
}
]
}
}
Now I need a search query with which I can find documents matching any value from an array of values, like the following:
GET /families/_search
{
"query" : {
"bool" : {
"filter" : {
"bool" : {
"should" : [
{"match_phrase" : {"names" : ["ahmed bray", "nia walsh"]}}
]
}
}
}
}
}
It should return the 2 documents that contain these names, as follows:
{
"took": 0,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 0,
"hits": [
{
"_index": "families",
"_type": "family",
"_id": "o8qxd2EB9CizMt-k15mv",
"_score": 0,
"_source": {
"names": [
"Jefferson Erickson",
"Bailee Miller",
"Ahmed Bray"
]
}
},
{
"_index": "families",
"_type": "family",
"_id": "osqxd2EB9CizMt-kfZlJ",
"_score": 0,
"_source": {
"names": [
"Nia Walsh",
"Jefferson Erickson",
"Darryl Stark"
]
}
}
]
}
}
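How can I write such a query? I tried the "terms" query, but "terms" only lets me search for single words from the array, like this:
{"terms" : {"names" : ["bray", "nia"]}}
But I need to use full names, like this:
{"names" : ["ahmed bray", "nia walsh"]}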
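The "problem" you have is related to how Elasticsearch handles text fields. By default, every text field is analyzed with the Standard Tokenizer which, as you can see in the documentation, splits the text into words. The "terms" query compares its values against those individual tokens without analyzing them, so a full name such as "ahmed bray" never equals a single indexed token.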
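To see the tokenization for yourself, a request body like the following can be sent to the _analyze endpoint (POST /_analyze). This is only a quick check; the sample text is one of the names from the question.

```json
{
  "analyzer": "standard",
  "text": "Ahmed Bray"
}
```

The response should list two separate lowercased tokens, ahmed and bray, rather than one value for the whole name.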
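One option is to go beyond the default settings and mappings. All you need to do is add a multi-field (in our case entire-phrase) that is analyzed differently, and search on that field.
First, create the index with the following settings/mappings: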
{
"settings": {
"analysis": {
"normalizer": {
"case_and_accent_insensitive": {
"filter": [
"lowercase",
"asciifolding"
]
}
}
}
},
"mappings": {
"family": {
"properties": {
"names": {
"type": "text",
"fields": {
"entire-phrase": {
"type": "keyword",
"normalizer": "case_and_accent_insensitive"
}
}
}
}
}
}
}
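Once the index exists, one way to check what the normalizer produces is to send a body like this to GET /families/_analyze. This is a sketch that assumes your Elasticsearch version accepts the normalizer parameter on the _analyze API; older releases may not.

```json
{
  "normalizer": "case_and_accent_insensitive",
  "text": "Ahmed Bray"
}
```

If the normalizer is set up as intended, the response should contain a single token, ahmed bray, which is the value the names.entire-phrase sub-field stores for that name.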
Then you can search for what you expect with:
{
"query": {
"terms": {
"names.entire-phrase": [
"ahmed bray",
"nia walsh"
]
}
}
}
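I have to warn you that this search will not find any results based on a first name or last name alone; only entire phrases match. If you want to achieve both, you have to search on both fields, names and names.entire-phrase.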
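A minimal sketch of such a combined search, sent to GET /families/_search, could look like the body below. The single word "bray" in the match clause is only an illustrative value, and wrapping the two clauses in bool/should is one possible way to combine them, not the only one.

```json
{
  "query": {
    "bool": {
      "should": [
        { "terms": { "names.entire-phrase": ["ahmed bray", "nia walsh"] } },
        { "match": { "names": "bray" } }
      ],
      "minimum_should_match": 1
    }
  }
}
```

A document is returned if either clause matches: the terms clause handles exact full names, while the match clause on the analyzed names field handles individual first or last names.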