Query Lucene on Kibana Discover not working as intended?

Question

I am trying to search for a server name on the "host" property of my logstash index.

I am searching from the Discover tab in Kibana.

When I type sl00pm into the search bar, I get:

No results found

But when I add an asterisk (*) and search for sl00pm*, I get this:

host:sl00pm.soo85.poly-vale.intra date:2019-03-20 15:23:10,591

I don't understand why.

However, when I do the same thing with another server name, slzq85, without the asterisk, I get this:

host:slzq85.soo85.poly-vale.intra date:21/Mar/2019:09:24:56 +0100

Which is what I expect.

Here is the definition of my Logstash index:

{
 "logstash-2019.03.20": {
  "aliases": {},
  "mappings": {
   "apache-access": {
    "_all": {
     "enabled": true,
     "norms": false
    },
    "dynamic_templates": [
     {
      "message_field": {
       "match": "message",
       "match_mapping_type": "string",
       "mapping": {
        "index": "analyzed",
        "omit_norms": true,
        "type": "string"
       }
      }
     },
     {
      "string_fields": {
       "match": "*",
       "match_mapping_type": "string",
       "mapping": {
        "fields": {
         "raw": {
          "ignore_above": 256,
          "index": "not_analyzed",
          "type": "string"
         }
        },
        "index": "analyzed",
        "omit_norms": true,
        "type": "string"
       }
      }
     }
    ],
    "properties": {
     "@timestamp": {
      "type": "date"
     },
     "@version": {
      "type": "keyword"
     },
     "date": {
      "type": "text",
      "norms": false,
      "fields": {
       "raw": {
        "type": "keyword",
        "ignore_above": 256
       }
      }
     },
     "host": {
      "type": "text",
      "norms": false,
      "fields": {
       "raw": {
        "type": "keyword",
        "ignore_above": 256
       }
      }
     }
    }
   }
  },
  "settings": {
   "index": {
    "refresh_interval": "5s",
    "number_of_shards": "5",
    "provided_name": "logstash-2019.03.20",
    "creation_date": "1553036402235",
    "number_of_replicas": "1",
    "uuid": "mCSFLYGETPm6qbgOwShHog",
    "version": {
     "created": "5060399"
    }
   }
  }
 }
}

And the version:

"version": {
 "number": "5.6.3",
 "lucene_version": "6.6.1"
},

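Can you tell me why my results are wrong?

I should add that I am using mapping types, and the same property exists in several mapping types of my index, each with the same definition as above.

Regards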

Answer 1

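The reason for this behavior is how the analyzer breaks up words. The standard analyzer splits words according to the rules laid out in UAX #29. Rules WB6 and WB11 are the ones to note here.

Basically, it will not break on letters with a "." in the middle (e.g. "ab.cd"), or on numbers with a "." in the middle (e.g. "12.34"), but it will break between a number and a letter separated by a "." (e.g. "12.cd").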

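So in your index, "sl00pm.soo85" is indexed as a single token, whereas "slzq85.soo85" is split into two tokens: "slzq85" and "soo85". A search for sl00pm therefore matches no token exactly, while the prefix query sl00pm* does match the token "sl00pm.soo85".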
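To make the difference between the two host names concrete, here is a minimal Python sketch that approximates only the "." handling described above. It is not Lucene's StandardTokenizer; the function name `approx_standard_tokens` is made up for illustration, it assumes ASCII input, and it treats every other non-alphanumeric character (such as "-") as a break.

```python
def approx_standard_tokens(text):
    """Approximate the '.' rules: keep letter.letter and digit.digit together."""
    text = text.lower()
    tokens, current = [], ""
    for i, ch in enumerate(text):
        if ch.isalnum():
            current += ch
            continue
        prev_ch = text[i - 1] if i > 0 else ""
        next_ch = text[i + 1] if i + 1 < len(text) else ""
        # A "." stays inside the token only between two letters or two digits.
        joins = bool(
            ch == "."
            and current
            and ((prev_ch.isalpha() and next_ch.isalpha())
                 or (prev_ch.isdigit() and next_ch.isdigit()))
        )
        if joins:
            current += ch
        else:
            # Any other separator (or letter.digit / digit.letter) ends the token.
            if current:
                tokens.append(current)
            current = ""
    if current:
        tokens.append(current)
    return tokens


print(approx_standard_tokens("sl00pm.soo85.poly-vale.intra"))
# ['sl00pm.soo85', 'poly', 'vale.intra']
print(approx_standard_tokens("slzq85.soo85.poly-vale.intra"))
# ['slzq85', 'soo85', 'poly', 'vale.intra']
```

Under this approximation, "sl00pm" never appears as a token of its own, while "slzq85" does, which matches the search results in the question.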

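The standard analyzer is designed for text: words and sentences. For identifiers like the ones you are dealing with, you could try a different analyzer, perhaps the pattern analyzer (PatternAnalyzer).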
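As a rough sketch of what that could look like: the settings body below defines a custom analyzer on top of Elasticsearch's `pattern` analyzer type, and a small local function imitates its effect with Python's `re` module. The analyzer name `host_analyzer` and the helper `approx_pattern_tokens` are assumptions made for illustration. The field mapping would still have to reference the analyzer, and existing data would need to be reindexed, neither of which is shown here.

```python
import json
import re

# Hypothetical index settings; "host_analyzer" is an illustrative name.
index_body = {
    "settings": {
        "analysis": {
            "analyzer": {
                "host_analyzer": {
                    "type": "pattern",
                    "pattern": "\\W+",  # split on runs of non-word characters
                    "lowercase": True,
                }
            }
        }
    }
}


def approx_pattern_tokens(text, pattern=r"\W+"):
    """Local stand-in for the pattern analyzer: lowercase, then split."""
    return [t for t in re.split(pattern, text.lower()) if t]


print(json.dumps(index_body, indent=2))
print(approx_pattern_tokens("sl00pm.soo85.poly-vale.intra"))
# ['sl00pm', 'soo85', 'poly', 'vale', 'intra']
```

With this kind of splitting, every dot-separated part of the host name becomes its own token, so a plain search for sl00pm would be expected to match.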

lucene

elasticsearch

logstash

kibana