Case insensitive exact match in ElasticSearch

Question

I need to be able to query an ElasticSearch index to see whether any documents already have a specific value for the field shown below:

"name" : {
      "type" : "text",
      "fields" : {
        "raw" : {
          "type" : "keyword"
        }
      }
 }

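I was initially going to use a normalizer, but I'm hoping to avoid having to make changes to the index itself. I then found the match_phrase query, which is almost exactly what I need. The problem is that it also returns partial matches, as long as they begin the same way. For example, if I search for the value this is a test, it returns results with the following values: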

this is a test 1
this is a test but i'm almost done now
this is a test again

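In my case I can check the data again in code after it is returned, to see whether it really is a case-insensitive exact match. But I'm relatively new to ElasticSearch, and I'm wondering whether there is any way to structure my original match_phrase query so that it does not return the examples I posted above?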

Answer 1

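For anyone who's interested, I found a couple of different ways to do this. The first is to run the match_phrase query and then use a script to check the length of the stored value (9 is the length of Test Name):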

GET definitions/_search
{
  "query": {
    "bool":{
      "must":{
        "match_phrase":{
          "name":{
             "query":"Test Name"
          }
        }
      },
      "filter": [
        {
          "script": {
            "script": {
              "source": "doc['name.raw'].value.length() == 9",
              "lang": "painless"
            }
          }
        }
      ]
    }
  }
}

Then I figured that if I can check the length in a script, maybe I can do a case-insensitive comparison there as well:

GET definitions/_search
{
  "query": {
    "bool": { 
      "filter": [
        {
          "script": {
            "script": {
              "source": "doc['name.raw'].value.toLowerCase() == 'test name'",
              "lang": "painless"
            }
          }
        }
      ]
    }
  }
}
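
A possible refinement of the query above, offered only as a sketch: the search value is passed through params rather than written into the script source, and the script first checks that the document actually has a name.raw value. The parameter name target is made up for illustration, and the caller is assumed to lowercase the value before sending it.

```console
# Sketch only: "target" is a hypothetical parameter name and must already be lowercase.
# The size() check skips documents that have no name.raw value.
GET definitions/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "script": {
            "script": {
              "source": "doc['name.raw'].size() != 0 && doc['name.raw'].value.toLowerCase() == params.target",
              "lang": "painless",
              "params": {
                "target": "test name"
              }
            }
          }
        }
      ]
    }
  }
}
```

Keeping the script source constant and changing only params should let the compiled script be reused across searches for different names.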

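So those are the options. In my case I was worried about performance, so we bit the bullet and created a normalizer that allows case-insensitive comparisons, and neither of these ended up being used. But I thought I'd post them here, since I couldn't find these answers anywhere else.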
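
For readers wondering what the normalizer route might look like, here is a minimal sketch. It is not the configuration actually used above: the index name definitions_v2 and the normalizer name lowercase_normalizer are made up for illustration, and the index is assumed to be created fresh, because a normalizer cannot simply be switched on for an existing keyword field without reindexing. The last request shows a different option that needs no mapping change, assuming the cluster runs 7.10 or later, where the term query accepts case_insensitive.

```console
# Sketch only: definitions_v2 and lowercase_normalizer are hypothetical names.
# Create an index whose name.raw keyword sub-field is lowercased at index time.
PUT definitions_v2
{
  "settings": {
    "analysis": {
      "normalizer": {
        "lowercase_normalizer": {
          "type": "custom",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "fields": {
          "raw": {
            "type": "keyword",
            "normalizer": "lowercase_normalizer"
          }
        }
      }
    }
  }
}

# The normalizer is also applied to the term at search time,
# so the query value may be given in any case.
GET definitions_v2/_search
{
  "query": {
    "term": {
      "name.raw": "Test Name"
    }
  }
}

# Alternative without a mapping change (assumes Elasticsearch 7.10 or later).
GET definitions/_search
{
  "query": {
    "term": {
      "name.raw": {
        "value": "Test Name",
        "case_insensitive": true
      }
    }
  }
}
```

Both term queries compare the whole keyword value, so longer values such as this is a test 1 would not match a search for this is a test.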

elasticsearch

elasticsearch-7