Elastic phrase prefix working, phrase isn't

Question

I am trying to return all documents that contain a given string in userName and documentName.

Data:

{
  "userName" : "johnwick",
  "documentName": "john",
  "office":{
     "name":"my_office"
  }
},
{
  "userName" : "johnsnow",
  "documentName": "snowy",
  "office": {
     "name":"Abraham deVilliers"
  }
},
{
  "userName" : "johnnybravo",
  "documentName": "bravo",
  "office": {
     "name":"blabla"
  }
},
{
  "userName" : "moana",
  "documentName": "disney",
  "office": {
     "name":"deVilliers"
  }
},
{
  "userName" : "stark",
  "documentName": "marvel",
  "office": {
     "name":"blabla"
  }
}

I can perform an exact string match:

{
  "_source": [ "userName", "documentName"],
  "query": {
    "multi_match": {
      "query":       "johnsnow",
      "fields":      [ "userName", "documentName"]
    }
  }
}

This successfully returns:

{
  "userName" : "johnsnow",
  "documentName": "snowy",
  "office": {
     "name":"Abraham deVilliers"
  }
}

If I use type: phrase_prefix with john, I also successfully get 3 results back.

But then I tried using:

{   
  "query": {
    "multi_match": {
      "query":       "ohn",  // <---- match all docs that contain 'ohn'
      "type":        "phrase_prefix",
      "fields":      [ "userName", "documentName"]
    }
  }
}

Zero results are returned.

Answer 1

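What you are looking for is infix search. phrase_prefix only matches terms that start with the query text, so ohn cannot match a term such as johnwick. To match inside a term you need ngram tokens at index time (an ngram tokenizer or token filter) together with a search-time analyzer.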
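The difference between prefix and infix matching can be sketched in plain Python. This is only an illustration and does not call Elasticsearch: `USER_NAMES`, `prefix_hits`, and `infix_hits` are hypothetical names, and the sketch assumes each userName from the question is indexed as a single lowercase term.

```python
# Hypothetical stand-in for the indexed terms: each userName is treated as one
# term, on the assumption that a word like "johnsnow" is not split further.
USER_NAMES = ["johnwick", "johnsnow", "johnnybravo", "moana", "stark"]


def prefix_hits(query, terms=USER_NAMES):
    """Terms that start with the query, roughly how a prefix query behaves."""
    q = query.lower()
    return [t for t in terms if t.startswith(q)]


def infix_hits(query, terms=USER_NAMES):
    """Terms that contain the query anywhere, which is what the question wants."""
    q = query.lower()
    return [t for t in terms if q in t]


print(prefix_hits("john"))  # ['johnwick', 'johnsnow', 'johnnybravo']
print(prefix_hits("ohn"))   # []
print(infix_hits("ohn"))    # ['johnwick', 'johnsnow', 'johnnybravo']
```

The empty list for "ohn" mirrors the zero results in the question: no indexed term begins with "ohn", even though three of them contain it.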

Complete example using your sample data

Index mapping and settings

{
    "settings": {
        "analysis": {
            "filter": {
                "autocomplete_filter": {
                    "type": "ngram",  // note the filter type
                    "min_gram": 1,
                    "max_gram": 10
                }
            },
            "analyzer": {
                "autocomplete": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": [
                        "lowercase",
                        "autocomplete_filter"
                    ]
                }
            }
        },
        "index.max_ngram_diff" : 10  // you can reduce this based on your requirements
    },
    "mappings": {
        "properties": {
            "userName": {
                "type": "text",
                "analyzer": "autocomplete",
                "search_analyzer": "standard"
            },
            "documentName": {
                "type": "text",
                "analyzer": "autocomplete",
                "search_analyzer": "standard"
            }
        }
    }
}
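To make the effect of these settings concrete, here is a rough Python emulation of the two analyzers. It is a sketch under stated assumptions, not Elasticsearch's implementation: `standard_tokens` approximates the standard tokenizer with a simple regex, and all function names are hypothetical. The `min_gram` and `max_gram` defaults are taken from the settings above.

```python
import re


def standard_tokens(text):
    # Rough approximation of the standard tokenizer: split on non-word characters.
    return re.findall(r"\w+", text)


def ngram_filter(token, min_gram=1, max_gram=10):
    # Every substring of the token whose length is between min_gram and max_gram.
    return {
        token[i:i + n]
        for n in range(min_gram, max_gram + 1)
        for i in range(len(token) - n + 1)
    }


def index_terms(text):
    # Index-time "autocomplete" analyzer: tokenize, lowercase, then ngram.
    terms = set()
    for tok in standard_tokens(text):
        terms |= ngram_filter(tok.lower())
    return terms


def search_terms(text):
    # Search-time analyzer: tokenize and lowercase only, no ngrams.
    return [tok.lower() for tok in standard_tokens(text)]


def matches(query, field_value):
    # A document matches if any query term is among its indexed ngrams.
    indexed = index_terms(field_value)
    return any(term in indexed for term in search_terms(query))


print(sorted(index_terms("john")))
# ['h', 'hn', 'j', 'jo', 'joh', 'john', 'n', 'o', 'oh', 'ohn']
print(matches("ohn", "johnwick"))  # True
print(matches("ohn", "stark"))     # False
```

Keeping the search side free of ngrams is the point of `search_analyzer`: the query ohn stays a single term instead of being broken into o, h, n, oh, and so on, which would match almost every document.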

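Index your sample documents and then run the same search query. For brevity I indexed only the first and last documents, and the query returned the first one: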

"hits": [
      {
        "_index": "infix",
        "_type": "_doc",
        "_id": "1",
        "_score": 5.7100673,
        "_source": {
          "userName": "johnwick",
          "documentName": "john"
        }
      }
    ]


elasticsearch

elasticsearch-query