Search for an array of values

I have an index in Elasticsearch whose documents contain a field holding an array of values. For example:

{
"took": 0,
"timed_out": false,
"_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
},
"hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
    {
        "_index": "families",
        "_type": "family",
        "_id": "o8qxd2EB9CizMt-k15mv",
        "_score": 1,
        "_source": {
        "names": [
            "Jefferson Erickson",
            "Bailee Miller",
            "Ahmed Bray"
        ]
        }
    },
    {
        "_index": "families",
        "_type": "family",
        "_id": "osqxd2EB9CizMt-kfZlJ",
        "_score": 1,
        "_source": {
        "names": [
            "Nia Walsh",
            "Jefferson Erickson",
            "Darryl Stark"
        ]
        }
    },
    {
        "_index": "families",
        "_type": "family",
        "_id": "pMrEd2EB9CizMt-kq5m-",
        "_score": 1,
        "_source": {
        "names": [
            "lia shelton",
            "joanna shaffer",
            "mathias little"
        ]
        }
    }
    ]
}
}

Now I need a search query that finds documents matching any value from a given array, like this:

GET /families/_search
{
"query" : {
    "bool" : {
    "filter" : {
        "bool" : {
        "should" : [
            {"match_phrase" : {"names" : ["ahmed bray", "nia walsh"]}}
        ]
        }
    }
    }
}
}

It should return the 2 documents that contain these names, as follows:

{
"took": 0,
"timed_out": false,
"_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
},
"hits": {
    "total": 2,
    "max_score": 0,
    "hits": [
    {
        "_index": "families",
        "_type": "family",
        "_id": "o8qxd2EB9CizMt-k15mv",
        "_score": 0,
        "_source": {
        "names": [
            "Jefferson Erickson",
            "Bailee Miller",
            "Ahmed Bray"
        ]
        }
    },
    {
        "_index": "families",
        "_type": "family",
        "_id": "osqxd2EB9CizMt-kfZlJ",
        "_score": 0,
        "_source": {
        "names": [
            "Nia Walsh",
            "Jefferson Erickson",
            "Darryl Stark"
        ]
        }
    }
    ]
}
}

How can I write such a query? I tried using the "terms" keyword, but "terms" only lets me search for single words from the array, like this: {"terms" : {"names" : ["bray", "nia"]}}

But I need to use full names, like this: {"terms" : {"names" : ["ahmed bray", "nia walsh"]}}

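The "problem" you have comes from how Elasticsearch handles text fields. By default, every text field is analyzed with the standard analyzer, whose Standard Tokenizer, as you can see in the documentation, splits the text into individual words (which are then lowercased). The index therefore holds terms such as "ahmed" and "bray", but never "ahmed bray", so a terms query for the full name has nothing to match.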
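To see this for yourself, a request along the following lines can help. It is a minimal sketch that assumes the families index from the question and sends the body below to GET /families/_analyze; the sample text is only an illustration.

```json
{
  "field": "names",
  "text": "Ahmed Bray"
}
```

With the default mapping, the response should list two separate tokens, ahmed and bray, rather than one token for the full name.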

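One option to achieve this is to refine the default settings and mappings. All you need to do is add a multi-field (in our case entire-phrase) that is analyzed differently, and search against it.

First create the index with the following settings/mappings: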

{
  "settings": {
    "analysis": {
      "normalizer": {
        "case_and_accent_insensitive": {
          "filter": [
            "lowercase",
            "asciifolding"
          ]
        }
      }
    }
  },
  "mappings": {
    "family": {
      "properties": {
        "names": {
          "type": "text",
          "fields": {
            "entire-phrase": {
              "type": "keyword",
              "normalizer": "case_and_accent_insensitive"
            }
          }
        }
      }
    }
  }
}
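Once the index has been created with these settings, the normalizer can be checked with the _analyze API. This is a sketch that assumes the index is still called families and that the body below is sent to GET /families/_analyze; the accented sample text is made up for illustration.

```json
{
  "normalizer": "case_and_accent_insensitive",
  "text": "Ahméd Bray"
}
```

The response should contain a single token, ahmed bray: the whole value is kept as one term, lowercased and with the accent folded away, which is what lets a terms query match a full name.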

Then you can search for what you expect with:

{
  "query": {
    "terms": {
      "names.entire-phrase": [
        "ahmed bray",
        "nia walsh"
      ]
    }
  }
}

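I must warn you that this search will not find any results based on a first name or last name alone; it matches only the entire phrase. If you want both behaviors, you have to search on both fields, names and names.entire-phrase.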
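As a rough illustration of searching both fields, the two kinds of clause can be combined in a bool query with should. This is only a sketch, sent as the body of GET /families/_search: the names are taken from the sample documents, and using a match query for the partial name is an assumption, not the only option.

```json
{
  "query": {
    "bool": {
      "should": [
        { "terms": { "names.entire-phrase": ["ahmed bray", "nia walsh"] } },
        { "match": { "names": "shelton" } }
      ],
      "minimum_should_match": 1
    }
  }
}
```

A document matches if at least one clause does, so full names are matched exactly through names.entire-phrase while a single first or last name is still found through the analyzed names field.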