Desired feature of searching for part of a word in Elasticsearch returns nothing. It only works with complete words.

I tried two different ways of creating the index, and both return nothing if I search for part of a word. Basically, if I search by the first letters of a word, or by letters from its middle, I want to get all matching documents.

First attempt, creating the index this way (taken from another, somewhat old, Stack Overflow question):

POST correntistas/correntista
{
  "index": {
    "index": "correntistas",
    "type": "correntista",
    "analysis": {
      "index_analyzer": {
        "my_index_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "mynGram"
          ]
        }
      },
      "search_analyzer": {
        "my_search_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "standard",
            "lowercase",
            "mynGram"
          ]
        }
      },
      "filter": {
        "mynGram": {
          "type": "nGram",
          "min_gram": 2,
          "max_gram": 50
        }
      }
    }
  }
}
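For reference, the analysis chain this first attempt seems to be aiming for would be expressed as a valid index-creation request (a sketch of my assumption of the intent, in ES 6.x syntax; I drop the ngram filter from the search analyzer, which is the usual autocomplete pattern, and add `index.max_ngram_diff` so the 2-50 gram spread is allowed):

```
PUT /correntistas
{
  "settings": {
    "index.max_ngram_diff": 48,
    "analysis": {
      "filter": {
        "mynGram": {
          "type": "nGram",
          "min_gram": 2,
          "max_gram": 50
        }
      },
      "analyzer": {
        "my_index_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "mynGram"]
        },
        "my_search_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase"]
        }
      }
    }
  }
}
```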

Second attempt, creating the index this way:

PUT /correntistas
{
    "settings": {
        "analysis": {
            "filter": {
                "autocomplete_filter": {
                    "type": "edge_ngram",
                    "min_gram": 1,
                    "max_gram": 20
                }
            },
            "analyzer": {
                "autocomplete_search": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": [
                        "lowercase"
                    ]
                },
                "autocomplete_index": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": [
                        "lowercase",
                        "autocomplete_filter"
                    ]
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "nome": {
                "type": "text",
                "analyzer": "autocomplete_index",
                "search_analyzer": "autocomplete_search"
            }
        }
    }
}

The second attempt failed with:

{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "Root mapping definition has unsupported parameters:  [nome : {search_analyzer=autocomplete_search, analyzer=autocomplete_index, type=text}]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "Failed to parse mapping [properties]: Root mapping definition has unsupported parameters:  [nome : {search_analyzer=autocomplete_search, analyzer=autocomplete_index, type=text}]",
    "caused_by": {
      "type": "mapper_parsing_exception",
      "reason": "Root mapping definition has unsupported parameters:  [nome : {search_analyzer=autocomplete_search, analyzer=autocomplete_index, type=text}]"
    }
  },
  "status": 400
}
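The `mapper_parsing_exception` happens because Elasticsearch 6.x still expects field mappings to be nested under a document-type name; the typeless `"mappings": {"properties": ...}` form only became the default in 7.x. Keeping the `correntista` type used elsewhere in the question, the second attempt would map as (a sketch, same settings as above):

```
PUT /correntistas
{
  "settings": {
    "analysis": {
      "filter": {
        "autocomplete_filter": {
          "type": "edge_ngram",
          "min_gram": 1,
          "max_gram": 20
        }
      },
      "analyzer": {
        "autocomplete_search": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase"]
        },
        "autocomplete_index": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "autocomplete_filter"]
        }
      }
    }
  },
  "mappings": {
    "correntista": {
      "properties": {
        "nome": {
          "type": "text",
          "analyzer": "autocomplete_index",
          "search_analyzer": "autocomplete_search"
        }
      }
    }
  }
}
```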

While the first way of creating the index raised no exception, it does not work when I type part of the "nome" property.

I added a document like this:

POST /correntistas/correntista/1
    {
        "conta": "1234",
        "sobrenome": "Carvalho1",
        "nome": "Demetrio1"
    }

Now I want to retrieve the document above by typing the first letters (e.g. "De") or part of the word from its middle (e.g. "met"). But neither of the two ways I am searching retrieves the document.

The simple query way:

GET correntistas/correntista/_search
{
    "query": {
        "match": {
            "nome": {
                "query": "De" #### "met" should I also work from my perspective
            }
        }
    }
}

The more verbose query way also fails:

GET correntistas/correntista/_search
{
    "query": {
        "match": {
            "nome": {
                "query": "De",  #### "met" should I also work from my perspective
                "operator": "OR",
                "prefix_length": 0,
                "max_expansions": 50,
                "fuzzy_transpositions": true,
                "lenient": false,
                "zero_terms_query": "NONE",
                "auto_generate_synonyms_phrase_query": true,
                "boost": 1
            }
        }
    }
}

I think it is not relevant, but here are the versions (I am using this version because it is meant to work in production with spring-data, and there is some "delay" before Elasticsearch updates are picked up by Spring Data):

Versions: elasticsearch and kibana 6.8.4

PS: please do not suggest using regular expressions or wildcards (*).

*** EDITED

All the steps below were done in the console - Kibana/Dev Tools.

Step 1:

POST /correntistas/correntista
{
  "settings": {
    "index.max_ngram_diff" :10,
    "analysis": {
      "filter": {
        "autocomplete_filter": {
          "type": "ngram", 
          "min_gram": 2,
          "max_gram": 8
        }
      },
      "analyzer": {
        "autocomplete": { 
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "autocomplete_filter"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "autocomplete", 
        "search_analyzer": "standard" 
      }
    }
  }
}

Result in the right panel:

#! Deprecation: the default number of shards will change from [5] to [1] in 7.0.0; if you wish to continue using the default of [5] shards, you must manage this on the create index request or with an index template
{
  "_index" : "correntistas",
  "_type" : "correntista",
  "_id" : "alrO-3EBU5lMnLQrXlwB",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}
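Note what this response actually says: `"result" : "created"` with an auto-generated `_id`. `POST /correntistas/correntista` indexes the request body as a document (auto-creating the index with default settings along the way); it does not create an index. Index creation is a `PUT` on the index name itself, and in 6.x the mapping also needs the type wrapper. A sketch of the corrected Step 1, reusing the same settings:

```
DELETE /correntistas

PUT /correntistas
{
  "settings": {
    "index.max_ngram_diff": 10,
    "analysis": {
      "filter": {
        "autocomplete_filter": {
          "type": "ngram",
          "min_gram": 2,
          "max_gram": 8
        }
      },
      "analyzer": {
        "autocomplete": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "autocomplete_filter"]
        }
      }
    }
  },
  "mappings": {
    "correntista": {
      "properties": {
        "title": {
          "type": "text",
          "analyzer": "autocomplete",
          "search_analyzer": "standard"
        }
      }
    }
  }
}
```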

Step 2:

POST /correntistas/correntista/1
{
    "title" : "Demetrio1"
}

Result in the right panel:

{
  "_index" : "correntistas",
  "_type" : "correntista",
  "_id" : "1",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}

Step 3:

GET correntistas/_search
{
    "query" :{
        "match" :{
            "title" :"met"
        }
    }
}

Result in the right panel:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}

In case it is relevant:

Adding the document type to the GET url:

GET correntistas/correntista/_search
{
    "query" :{
        "match" :{
            "title" :"met"
        }
    }
}

also brings nothing:

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}

Searching for the whole title text:

GET correntistas/_search
{
    "query" :{
        "match" :{
            "title" :"Demetrio1"
        }
    }
}

brings the document:

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "correntistas",
        "_type" : "correntista",
        "_id" : "1",
        "_score" : 0.2876821,
        "_source" : {
          "title" : "Demetrio1"
        }
      }
    ]
  }
}

Looking at the index settings, interestingly, I do not see the analyzers:

GET /correntistas/_settings

Result in the right panel:

{
  "correntistas" : {
    "settings" : {
      "index" : {
        "creation_date" : "1589067537651",
        "number_of_shards" : "5",
        "number_of_replicas" : "1",
        "uuid" : "jm8Kof16TAW7843YkaqWYQ",
        "version" : {
          "created" : "6080499"
        },
        "provided_name" : "correntistas"
      }
    }
  }
}
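That settings output (no `analysis` section, default 5 shards) confirms the index was auto-created by the document POST rather than by the settings sent in Step 1. A quick way to check whether a custom analyzer really exists on an index is the `_analyze` API; on a correctly created index this returns the ngram tokens, while here it should fail with an "analyzer not found"-style error:

```
GET /correntistas/_analyze
{
  "analyzer": "autocomplete",
  "text": "Demetrio1"
}
```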

How I run Elasticsearch and Kibana:

docker network create eknetwork

docker run -d --name elasticsearch --net eknetwork -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" elasticsearch:6.8.4

docker run -d --name kibana --net eknetwork -p 5601:5601 kibana:6.8.4

In my previous answer, the requirement was more of a prefix search, i.e. for the text Demetrio1 only searching for de, demet and so on was required, which worked as I created an edge-ngram tokenizer to address it. But in this question, the requirement is to provide infix search, for which we will use the ngram token filter in our custom analyzer.

Below is a step-by-step example.

Index definition (note the ngram filter type, not edge_ngram):

{
  "settings": {
    "index.max_ngram_diff" :10,
    "analysis": {
      "filter": {
        "autocomplete_filter": {
          "type": "ngram",  --> note this
          "min_gram": 2,
          "max_gram": 8
        }
      },
      "analyzer": {
        "autocomplete": { 
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "autocomplete_filter"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "autocomplete", 
        "search_analyzer": "standard" 
      }
    }
  }
}

Index a sample document:

{
    "title" : "Demetrio1"
}

Search query:

{
    "query" :{
        "match" :{
            "title" :"met"
        }
    }
}

And the search result brings the sample document :)

 "hits": [
            {
                "_index": "ngram",
                "_type": "_doc",
                "_id": "1",
                "_score": 0.47766083,
                "_source": {
                    "title": "Demetrio1"
                }
            }
        ]
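To see why "met" now matches, it helps to look at what the lowercase + ngram filter chain puts into the index for Demetrio1. The sketch below is a rough Python simulation of the gram generation (an approximation for illustration, not the Lucene implementation):

```python
def ngram_tokens(text, min_gram=2, max_gram=8):
    """Roughly simulate a lowercase + ngram token-filter chain for one token:
    emit every substring whose length is between min_gram and max_gram."""
    token = text.lower()
    grams = []
    for start in range(len(token)):
        for size in range(min_gram, max_gram + 1):
            end = start + size
            if end > len(token):
                break
            grams.append(token[start:end])
    return grams

tokens = ngram_tokens("Demetrio1")
print("met" in tokens)  # → True: the infix search term is an indexed token
print("de" in tokens)   # → True: so is the prefix
```

One caveat of min_gram=2 / max_gram=8: the 9-character term demetrio1 itself is longer than max_gram, so with the standard search analyzer a query for the whole word produces no matching gram; raising max_gram (together with index.max_ngram_diff) covers full-word queries as well.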