Elasticsearch phrase_prefix isn't working
I am trying to return all documents that contain a given string in the userName and documentName fields.
Data:
{
"userName" : "johnwick",
"documentName": "john",
"office":{
"name":"my_office"
}
},
{
"userName" : "johnsnow",
"documentName": "snowy",
"office": {
"name":"Abraham deVilliers"
}
},
{
"userName" : "johnnybravo",
"documentName": "bravo",
"office": {
"name":"blabla"
}
},
{
"userName" : "moana",
"documentName": "disney",
"office": {
"name":"deVilliers"
}
},
{
"userName" : "stark",
"documentName": "marvel",
"office": {
"name":"blabla"
}
}
I can perform an exact string match:
{
"_source": [ "userName", "documentName"],
"query": {
"multi_match": {
"query": "johnsnow",
"fields": [ "userName", "documentName"]
}
}
}
This successfully returns:
{
"userName" : "johnsnow",
"documentName": "snowy",
"office": {
"name":"Abraham deVilliers"
}
}
If I use type: phrase_prefix with john, I also successfully get 3 results back.
But then I tried the following, hoping to match all docs that contain 'ohn':
{
"query": {
"multi_match": {
"query": "ohn",
"type": "phrase_prefix",
"fields": [ "userName", "documentName"]
}
}
}
Zero results are returned.
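What you are looking for is infix search. To achieve it you need an ngram tokenizer (or, as in the example here, an ngram token filter) together with a search-time analyzer. phrase_prefix only matches terms that start with the query text, which is why john works and ohn does not.
Complete example using your sample data.
Index mapping and settings (note the ngram filter type; index.max_ngram_diff can be lowered to suit your requirements):
{
"settings": {
"analysis": {
"filter": {
"autocomplete_filter": {
"type": "ngram",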
"min_gram": 1,
"max_gram": 10
}
},
"analyzer": {
"autocomplete": {
"type": "custom",
"tokenizer": "standard",
"filter": [
"lowercase",
"autocomplete_filter"
]
}
}
},
"index.max_ngram_diff": 10
},
"mappings": {
"properties": {
"userName": {
"type": "text",
"analyzer": "autocomplete",
"search_analyzer": "standard"
},
"documentName": {
"type": "text",
"analyzer": "autocomplete",
"search_analyzer": "standard"
}
}
}
}
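To see why this mapping makes ohn searchable, here is a minimal Python sketch that imitates what the ngram filter does at index time. It is not Elasticsearch code: the ngrams helper is a hypothetical name introduced only for illustration, and it assumes the filter emits every substring of each lowercased token between min_gram and max_gram characters long.

```python
def ngrams(token, min_gram=1, max_gram=10):
    """Return every substring of `token` with length in [min_gram, max_gram].

    Hypothetical helper that approximates the "lowercase" + ngram filter
    chain from the mapping above; it is not part of any Elasticsearch client.
    """
    token = token.lower()
    grams = set()
    for start in range(len(token)):
        for length in range(min_gram, max_gram + 1):
            end = start + length
            if end > len(token):
                break  # longer grams would run past the end of the token
            grams.add(token[start:end])
    return grams


indexed_terms = ngrams("johnwick")

# With ngrams, "ohn" is one of the indexed terms, so a plain term lookup finds it.
print("ohn" in indexed_terms)          # True

# Without ngrams, only the whole token is indexed, and a prefix match on it fails.
print("johnwick".startswith("ohn"))    # False
```

Because search_analyzer is set to standard, the query text is not split into grams; it is looked up as a single term against the grams stored at index time.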
Index your sample documents and run the same search query. For brevity I indexed only the first and last documents, and the query returned the first one:
"hits": [
{
"_index": "infix",
"_type": "_doc",
"_id": "1",
"_score": 5.7100673,
"_source": {
"userName": "johnwick",
"documentName": "john"
}
}
]
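For a rough idea of what to expect with all five sample documents indexed, the following Python sketch simulates the match in memory. It is an approximation under stated assumptions, not a replacement for running the query: index_terms and search are hypothetical helpers, tokenization is simplified to whitespace splitting, scoring is ignored, and the query is treated as a single lowercased term.

```python
# The userName and documentName values from the question's sample data.
DOCS = [
    {"userName": "johnwick", "documentName": "john"},
    {"userName": "johnsnow", "documentName": "snowy"},
    {"userName": "johnnybravo", "documentName": "bravo"},
    {"userName": "moana", "documentName": "disney"},
    {"userName": "stark", "documentName": "marvel"},
]
FIELDS = ("userName", "documentName")


def index_terms(value, min_gram=1, max_gram=10):
    """Approximate the index-time analyzer: lowercase, split, emit ngrams."""
    terms = set()
    for token in value.lower().split():
        for start in range(len(token)):
            last_end = min(start + max_gram, len(token))
            for end in range(start + min_gram, last_end + 1):
                terms.add(token[start:end])
    return terms


def search(query, docs=DOCS):
    """Approximate the search side: the query is one lowercased term, no ngrams."""
    term = query.lower()
    return [
        doc for doc in docs
        if any(term in index_terms(doc[field]) for field in FIELDS)
    ]


print([doc["userName"] for doc in search("ohn")])
# ['johnwick', 'johnsnow', 'johnnybravo']
```

In this simulation the three john* documents match ohn, while moana and stark do not.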