Search for an array of values

Question

I have an index in Elasticsearch whose document body contains an array field holding an array of values. For example:

{
"took": 0,
"timed_out": false,
"_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
},
"hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
    {
        "_index": "families",
        "_type": "family",
        "_id": "o8qxd2EB9CizMt-k15mv",
        "_score": 1,
        "_source": {
        "names": [
            "Jefferson Erickson",
            "Bailee Miller",
            "Ahmed Bray"
        ]
        }
    },
    {
        "_index": "families",
        "_type": "family",
        "_id": "osqxd2EB9CizMt-kfZlJ",
        "_score": 1,
        "_source": {
        "names": [
            "Nia Walsh",
            "Jefferson Erickson",
            "Darryl Stark"
        ]
        }
    },
    {
        "_index": "families",
        "_type": "family",
        "_id": "pMrEd2EB9CizMt-kq5m-",
        "_score": 1,
        "_source": {
        "names": [
            "lia shelton",
            "joanna shaffer",
            "mathias little"
        ]
        }
    }
    ]
}
}

Now I need a search query that finds documents matching any of a set of values, like this:

GET /families/_search
{
"query" : {
    "bool" : {
    "filter" : {
        "bool" : {
        "should" : [
            {"match_phrase" : {"names" : ["ahmed bray", "nia walsh"]}}
        ]
        }
    }
    }
}
}

It should return the 2 documents that contain these names, as follows:

{
"took": 0,
"timed_out": false,
"_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
},
"hits": {
    "total": 2,
    "max_score": 0,
    "hits": [
    {
        "_index": "families",
        "_type": "family",
        "_id": "o8qxd2EB9CizMt-k15mv",
        "_score": 0,
        "_source": {
        "names": [
            "Jefferson Erickson",
            "Bailee Miller",
            "Ahmed Bray"
        ]
        }
    },
    {
        "_index": "families",
        "_type": "family",
        "_id": "osqxd2EB9CizMt-kfZlJ",
        "_score": 0,
        "_source": {
        "names": [
            "Nia Walsh",
            "Jefferson Erickson",
            "Darryl Stark"
        ]
        }
    }
    ]
}
}

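How can I write such a query? I tried the "terms" keyword, but "terms" only lets me search for single words from the array, like this: {"terms" : {"names" : ["bray", "nia"]}}

But I need to use full names, like this: {"names" : ["ahmed bray", "nia walsh"]}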

Answer 1

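The "problem" you have comes from how Elasticsearch handles text fields. By default, every text field is tokenized with the Standard Tokenizer, which, as you can see in the documentation, splits the text into words (the default standard analyzer also lowercases them). That is why a "terms" query for "bray" matches, while one for "ahmed bray" does not: no single indexed term contains the whole name.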
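To make the effect concrete, here is a rough Python sketch of what happens to the names at index time. The function approximate_standard_analyzer is a hypothetical stand-in, not the real Elasticsearch tokenizer (which follows Unicode text segmentation rules); it only splits on non-word characters and lowercases, which is close enough for plain names like these.

```python
import re

def approximate_standard_analyzer(text):
    # Assumption: a crude imitation of the default analyzer,
    # splitting on runs of non-word characters and lowercasing.
    return [token.lower() for token in re.findall(r"\w+", text)]

# Collect the terms that would end up in the index for one document.
indexed_terms = set()
for name in ["Jefferson Erickson", "Bailee Miller", "Ahmed Bray"]:
    indexed_terms.update(approximate_standard_analyzer(name))

print("bray" in indexed_terms)        # True: a single word is an indexed term
print("ahmed bray" in indexed_terms)  # False: the full name is never one term
```

Since a "terms" query compares its values against indexed terms without analyzing them, a value such as "ahmed bray" has nothing to match on a plain text field.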

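One option to achieve this is to change the default settings and mappings. All you need to do is add a multi-field (entire-phrase in our case) that is analyzed differently, and search on that field.

First create the index with the following settings/mappings: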

PUT /families
{
  "settings": {
    "analysis": {
      "normalizer": {
        "case_and_accent_insensitive": {
          "filter": [
            "lowercase",
            "asciifolding"
          ]
        }
      }
    }
  },
  "mappings": {
    "family": {
      "properties": {
        "names": {
          "type": "text",
          "fields": {
            "entire-phrase": {
              "type": "keyword",
              "normalizer": "case_and_accent_insensitive"
            }
          }
        }
      }
    }
  }
}
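As a rough illustration of what the keyword sub-field stores, the Python sketch below imitates the case_and_accent_insensitive normalizer. The helper approximate_normalizer is hypothetical and only approximates the lowercase and asciifolding filters by stripping combining marks; the real asciifolding filter covers more characters.

```python
import unicodedata

def approximate_normalizer(value):
    # Assumption: approximate "asciifolding" by decomposing characters
    # and dropping combining marks, then apply "lowercase".
    decomposed = unicodedata.normalize("NFKD", value)
    folded = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return folded.lower()

# Each array element is kept as ONE term in names.entire-phrase.
indexed = {approximate_normalizer(n)
           for n in ["Nia Walsh", "Jefferson Erickson", "Darryl Stark"]}

print("nia walsh" in indexed)  # True: the whole name is a single term
print("nia" in indexed)        # False: a first name alone is not indexed here
```

This is why the sub-field can be queried with whole names, while the original names field keeps serving word-level searches.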

Then you can search for what you expect like this:

GET /families/_search
{
  "query": {
    "terms": {
      "names.entire-phrase": [
        "ahmed bray",
        "nia walsh"
      ]
    }
  }
}

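Be warned that this search will not find any results based on a first name or last name alone; only the entire phrase matches. If you want to achieve both, you have to search on both fields, names and names.entire-phrase.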
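Searching both fields could be done with a bool query whose should clauses target each field. The Python sketch below only builds and prints such a request body; build_name_query and its parameters are hypothetical names chosen for illustration, and the body is meant to be sent to GET /families/_search.

```python
import json

def build_name_query(full_names, partial_text):
    """Build a search body matching whole names OR individual name words."""
    return {
        "query": {
            "bool": {
                "should": [
                    # Whole-name match against the normalized keyword sub-field.
                    {"terms": {"names.entire-phrase": full_names}},
                    # Word-level match against the analyzed text field.
                    {"match": {"names": partial_text}},
                ],
                # A document qualifies if at least one clause matches.
                "minimum_should_match": 1,
            }
        }
    }

body = build_name_query(["ahmed bray", "nia walsh"], "shelton")
print(json.dumps(body, indent=2))
```

With the sample data from the question, this body should return the two families found by the whole-name clause plus the family containing "lia shelton", which the word-level clause picks up.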


elasticsearch

elasticsearch-5