Elasticsearch query is not returning the expected result

I have a JSON structure like this:

{"DocumentName":"es","DocumentId":"2","Content": [{"PageNo":1,"Text": "The full text queries enable you to search analyzed text fields such as the body of an email. The query string is processed using the same analyzer that was applied to the field during indexing."},{"PageNo":2,"Text": "The query string is processed using the same analyzer that was applied to the field during indexing."}]}

I need stemmed analysis on the Content.Text field. To get that, I defined a mapping when creating the index. It looks like this:

curl -X PUT "localhost:9200/myindex?pretty" -H "Content-Type: application/json" -d"{
    "settings": {
        "analysis": {
            "analyzer": {
                "my_analyzer": {
                    "tokenizer": "standard",
                    "filter": ["lowercase", "my_stemmer"]
                }
            },
            "filter": {
                "my_stemmer": {
                    "type": "stemmer",
                    "name": "english"
                }
            }
        }
    }
}, {
    "mappings": {
        "properties": {
            "DocumentName": {
                "type": "text"
            },
            "DocumentId": {
                "type": "keyword"
            },
            "Content": {
                "properties": {
                    "PageNo": {
                        "type": "integer"
                    },
                    "Text": "_all": {
                        "type": "text",
                        "analyzer": "my_analyzer",
                        "search_analyzer": "my_analyzer"
                    }
                }
            }
        }
    }
}
}"

I checked the analyzer that was created:

curl -X GET "localhost:9200/myindex/_analyze?pretty" -H "Content-Type: application/json" -d"{\"analyzer\":\"my_analyzer\",\"text\":\"indexing\"}"

It gave this result:

{
  "tokens" : [
    {
      "token" : "index",
      "start_offset" : 0,
      "end_offset" : 8,
      "type" : "<ALPHANUM>",
      "position" : 0
    }
  ]
}

But after uploading the JSON to the index, a search for "index" returns 0 results.

res = requests.get('http://localhost:9200') 
es = Elasticsearch([{'host': 'localhost', 'port': '9200'}])
res= es.search(index='myindex', body={"query": {"match": {"Content.Text": "index"}}})

Any help would be greatly appreciated. Thank you in advance.

Ignore my comment. The stemmer is working. Try the following:

Mapping:

curl -X DELETE "localhost:9200/myindex"

curl -X PUT "localhost:9200/myindex?pretty" -H "Content-Type: application/json" -d'
{ 
    "settings":{ 
       "analysis":{ 
          "analyzer":{ 
             "english_exact":{ 
                "tokenizer":"standard",
                "filter":[ 
                   "lowercase"
                ]
             }
          }
       }
    },
    "mappings":{ 
       "properties":{ 
          "DocumentName":{ 
             "type":"text"
          },
          "DocumentId":{ 
             "type":"keyword"
          },
          "Content":{ 
             "properties":{ 
                "PageNo":{ 
                   "type":"integer"
                },
                "Text":{ 
                   "type":"text",
                   "analyzer":"english",
                   "fields":{ 
                      "exact":{ 
                         "type":"text",
                         "analyzer":"english_exact"
                      }
                   }
                }
             }
          }
       }
    }
 }'
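The same index definition can also be assembled in Python, which makes it easy to confirm that "settings" and "mappings" are two keys of one JSON object. This is only a minimal sketch using the standard library; the helper name build_index_body is made up for illustration.

```python
import json


def build_index_body():
    """Return the settings and mappings from the curl command as one dict."""
    settings = {
        "analysis": {
            "analyzer": {
                # Unstemmed analyzer used by the "exact" sub-field.
                "english_exact": {
                    "tokenizer": "standard",
                    "filter": ["lowercase"],
                }
            }
        }
    }
    mappings = {
        "properties": {
            "DocumentName": {"type": "text"},
            "DocumentId": {"type": "keyword"},
            "Content": {
                "properties": {
                    "PageNo": {"type": "integer"},
                    "Text": {
                        "type": "text",
                        # Built-in analyzer that stems English words.
                        "analyzer": "english",
                        "fields": {
                            "exact": {"type": "text", "analyzer": "english_exact"}
                        },
                    },
                }
            },
        }
    }
    # Both sections are keys of a single top-level object.
    return {"settings": settings, "mappings": mappings}


body = build_index_body()
payload = json.dumps(body, indent=2)
print(payload)
```

With the 7.x Python client, a dict like this should be usable as es.indices.create(index='myindex', body=body) in place of the curl call.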

Data:

curl -XPOST "localhost:9200/myindex/_doc/1" -H "Content-Type: application/json" -d'
{ 
   "DocumentName":"es",
   "DocumentId":"2",
   "Content":[ 
      { 
         "PageNo":1,
         "Text":"The full text queries enable you to search analyzed text fields such as the body of an email. The query string is processed using the same analyzer that was applied to the field during indexing."
      },
      { 
         "PageNo":2,
         "Text":"The query string is processed using the same analyzer that was applied to the field during indexing."
      }
   ]
}'

Query:

curl -XGET 'localhost:9200/myindex/_search?pretty' -H "Content-Type: application/json"  -d '
{ 
   "query":{ 
      "simple_query_string":{ 
         "fields":[ 
            "Content.Text"
         ],
         "query":"index"
      }
   }
}'
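The mapping above exposes two views of the text: Content.Text (stemmed by the english analyzer) and Content.Text.exact (lowercased only). A rough sketch of query bodies for each is shown here; the helper names are hypothetical, and the dicts would be passed as the body of a search request.

```python
import json


def match_query(field, text):
    """Build a `match` query body for one field."""
    return {"query": {"match": {field: text}}}


def simple_query_string(fields, text):
    """Build a `simple_query_string` query body over the given fields."""
    return {
        "query": {
            "simple_query_string": {"fields": list(fields), "query": text}
        }
    }


# Stemmed field: "index" and "indexing" reduce to the same term.
stemmed = simple_query_string(["Content.Text"], "index")
# Unstemmed sub-field: only the literal token "index" would match.
exact = match_query("Content.Text.exact", "index")

print(json.dumps(stemmed))
print(json.dumps(exact))
```

Since the sample document contains "indexing" but not the bare word "index", the query against Content.Text.exact should come back empty, while the one against Content.Text should find the document.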

As expected, exactly one document is returned. I also tested the following stems, and they all work with the suggested mapping: apply, texts, use.

Python example:
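The reason this works is that the same analysis runs at index time and at query time, so both sides are reduced to the same term. The toy below illustrates only that idea. Its suffix stripper is a deliberately crude stand-in and is not the algorithm Elasticsearch's english analyzer uses; all names in it are made up.

```python
import re

SUFFIXES = ("ing", "s")


def toy_stem(token):
    """Strip one common suffix; a crude stand-in for a real stemmer."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def analyze(text):
    """Lowercase, split on non-alphanumerics, then stem each token."""
    return [toy_stem(t) for t in re.findall(r"[a-z0-9]+", text.lower())]


def matches(query, document):
    """True if any analyzed query term also occurs in the analyzed document."""
    return bool(set(analyze(query)) & set(analyze(document)))


doc = (
    "The query string is processed using the same analyzer "
    "that was applied to the field during indexing."
)
print(analyze("indexing"))            # ['index']
print(matches("index", doc))          # True
print(matches("elasticsearch", doc))  # False
```

If the document side were indexed without stemming, the stored term would be "indexing" and the query term "index" would have nothing to match.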

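import requests
from elasticsearch import Elasticsearch

# Optional sanity check that the node is reachable over HTTP.
res = requests.get('http://localhost:9200')

# Connect with the Python client and run the same match query as in the question.
es = Elasticsearch([{'host': 'localhost', 'port': 9200}])
res = es.search(index='myindex', body={"query": {"match": {"Content.Text": "index"}}})

print(res)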
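Printing the whole response is noisy, so a small helper can pull out the hit count and document IDs. This is a sketch only: summarize_hits is a hypothetical name, and the sample dict is a hand-written stand-in shaped like a 7.x search response, not output captured from a real cluster.

```python
def summarize_hits(response):
    """Return (total, ids) from a search response dict."""
    hits = response["hits"]
    total = hits["total"]
    # 7.x reports total as {"value": ..., "relation": ...};
    # older versions return a plain integer.
    if isinstance(total, dict):
        total = total["value"]
    ids = [hit["_id"] for hit in hits["hits"]]
    return total, ids


# Hand-written stand-in for a response (assumption, not real output).
sample = {
    "hits": {
        "total": {"value": 1, "relation": "eq"},
        "hits": [
            {"_index": "myindex", "_id": "1", "_source": {"DocumentName": "es"}}
        ],
    }
}
print(summarize_hits(sample))
```

In the script above, the same helper could be called as summarize_hits(res) after es.search returns.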

Tested on Elasticsearch 7.4.