Elasticsearch query_string does not search by part of a word

Question

I am sending this request:

curl -XGET 'host/process_test_3/14/_search' -d '{
  "query" : {
    "query_string" : {
      "query" : "\"*cor interface*\"",
      "fields" : ["title", "obj_id"]
    }
  }
}'

and I get the correct result:

{
  "took": 12,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 5.421598,
    "hits": [
      {
        "_index": "process_test_3",
        "_type": "14",
        "_id": "141_dashboard_14",
        "_score": 5.421598,
        "_source": {
          "obj_type": "dashboard",
          "obj_id": "141",
          "title": "Cor Interface Monitoring"
        }
      }
    ]
  }
}

But when I want to search by part of a word, for example:

curl -XGET 'host/process_test_3/14/_search' -d '
{
  "query" : {
    "query_string" : {
      "query" : "\"*cor inter*\"",
      "fields" : ["title", "obj_id"]
    }
  }
}'

I get no results:

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : []
  }
}

What am I doing wrong?

Answer 1

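This is because your title field has probably been analyzed by the standard analyzer (the default setting), so the title Cor Interface Monitoring was tokenized into three tokens: cor, interface and monitoring. Inside the double quotes the asterisks are most likely not treated as wildcards, so your second query looks for a whole token inter, which was never indexed.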
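The tokenization can be imitated in a few lines of Python. This is only a rough sketch: `standard_like_tokens` and `phrase_matches` are hypothetical helpers, not Elasticsearch APIs, and the regex is a simplification of what the standard analyzer really does (it assumes plain ASCII words).

```python
import re


def standard_like_tokens(text):
    """Rough stand-in for the standard analyzer: keep alphanumeric runs, lowercase them."""
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]


def phrase_matches(terms, indexed):
    """True if `terms` occurs as a contiguous run inside `indexed`."""
    n = len(terms)
    return any(indexed[i:i + n] == terms for i in range(len(indexed) - n + 1))


indexed = standard_like_tokens("Cor Interface Monitoring")
print(indexed)  # ['cor', 'interface', 'monitoring']

for query in ("*cor interface*", "*cor inter*"):
    terms = standard_like_tokens(query)  # the asterisks are dropped here
    print(query, terms, phrase_matches(terms, indexed))
```

Under these assumptions the first phrase matches and the second does not, because inter is not one of the indexed tokens.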

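In order to search for any substring of a word, you need to create a custom analyzer which leverages the ngram token filter, so that all substrings of each token are indexed as well.

You can create your index like this: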

curl -XPUT localhost:9200/process_test_3 -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "substring_analyzer": {
          "tokenizer": "standard",
          "filter": ["lowercase", "substring"]
        }
      },
      "filter": {
        "substring": {
          "type": "nGram",
          "min_gram": 2,
          "max_gram": 15
        }
      }
    }
  },
  "mappings": {
    "14": {
      "properties": {
        "title": {
          "type": "string",
          "analyzer": "substring_analyzer"
        }
      }
    }
  }
}'

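Then you can reindex your data. As a result, the title Cor Interface Monitoring will now be tokenized into:

co, cor, or
in, int, inte, inter, interf, etc.
mo, mon, moni, etc.

so your second search query will now return the document you expect, because the tokens cor and inter will now match.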
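To get a feel for what the filter emits, here is a minimal Python sketch of n-gram generation using the same min_gram and max_gram values as the settings above. The `ngrams` function is a hypothetical illustration, not the Elasticsearch implementation, and the order of its output is not meant to mirror the real filter.

```python
def ngrams(token, min_gram=2, max_gram=15):
    """Return every substring of `token` whose length is between min_gram and max_gram."""
    grams = []
    for start in range(len(token)):
        for size in range(min_gram, max_gram + 1):
            if start + size > len(token):
                break  # longer sizes would run past the end of the token
            grams.append(token[start:start + size])
    return grams


tokens = ["cor", "interface", "monitoring"]  # output of the standard tokenizer + lowercase
indexed = {gram for token in tokens for gram in ngrams(token)}

print(sorted(ngrams("cor")))  # ['co', 'cor', 'or']
print("inter" in indexed)     # True
```

Note that every substring is produced, not only prefixes, so the index grows quickly with long tokens. If prefix matching is enough for your use case, an edge n-gram filter may be worth considering instead.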

Answer 2

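+1 to Val's solution. Just wanted to add something. Since your query is relatively simple, you may want to have a look at the match/match_phrase queries. Match queries do not do the regex parsing that query_string does, and are therefore lighter. You can find the details here: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-match-query.html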
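As an illustration, a match_phrase request body for the title field could be built like this. It is a sketch only: `match_phrase_body` is a hypothetical helper, and the body targets a single field, whereas the original query_string searched both title and obj_id.

```python
import json


def match_phrase_body(field, text):
    """Build a search body that runs a match_phrase query on one field."""
    return {"query": {"match_phrase": {field: text}}}


body = match_phrase_body("title", "cor interface")
print(json.dumps(body, indent=2))
```

The printed JSON can be passed to the same _search endpoint with curl -d. It still matches whole tokens only, so partial words such as inter need the ngram analyzer from the first answer.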

