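Searching for a unique ID returns multiple hits
I have 5 million documents, and each document has a unique customerid as its mapping ID. When I search for a single customer, the query returns 1992 documents. This happens for every unique ID, with a different count each time, although it should return exactly one document.
I ran the following query in Kibana: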
GET /my_index/_search
{
"query": {
"match": {
"customerid": "e32e6b34-5e3f-4bb9-a3af-e89714b418ca"
}
}
}
It returned the following result for that unique customer ID:
{
"took" : 20,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1992,
"relation" : "eq"
},
"max_score" : 59.505646,
"hits" : [
....
....
....
Below is my index mapping:
{
"pb_2409" : {
"mappings" : {
"dynamic_date_formats" : [
"yyyy-MM-dd||yyyy-MM-dd HH:mm:ss.S||yyyy-MM-dd HH:mm:ss"
],
"dynamic_templates" : [
{
"objects" : {
"match_mapping_type" : "object",
"mapping" : {
"type" : "nested"
}
}
}
],
"properties" : {
"customerid" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
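Am I missing something?
Search customerid as a keyword (either change its type to keyword or use the existing customerid.keyword sub-field) and add a normalizer in the index settings. A match query on a text field analyzes the ID, and the standard analyzer splits the UUID at the hyphens; because match combines the resulting tokens with OR by default, any document that shares a single group (for example 4bb9) counts as a hit.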
"settings": {
"analysis": {
"normalizer": {
"my_custom_normalizer": {
"type": "custom",
"filter": [
"lowercase"
]
}
}
}
}
Then add "normalizer": "my_custom_normalizer"
to the keyword sub-field of customerid (this makes lookups case-insensitive, in case the IDs may differ in letter case):
"properties" : {
"customerid" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256,
"normalizer": "my_custom_normalizer"
}
}
}
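What the lowercase normalizer changes can be sketched in a few lines of Python. This is only a simulation under the assumption that the normalizer is applied both to the value at index time and to the term query input; the helper names and the upper-case stored ID are hypothetical, and nothing here calls Elasticsearch.

```python
def lowercase_normalizer(value):
    """Stand-in for a custom normalizer with the single "lowercase" filter."""
    return value.lower()


def term_query(indexed_values, query_value, normalizer=None):
    """Simulate a term query on a keyword field: exact equality on the
    (optionally normalized) whole value."""
    norm = normalizer or (lambda v: v)
    indexed = {norm(v) for v in indexed_values}  # normalization at index time
    return norm(query_value) in indexed          # and again at search time


# Hypothetical document whose ID was stored in upper case.
stored = ["E32E6B34-5E3F-4BB9-A3AF-E89714B418CA"]
query = "e32e6b34-5e3f-4bb9-a3af-e89714b418ca"

print(term_query(stored, query))                        # without normalizer
print(term_query(stored, query, lowercase_normalizer))  # with normalizer
```

Without the normalizer the lookup misses because the stored value differs in case; with it, both sides are lowercased before comparison.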
Your search query will then look like this:
GET /my_index/_search
{
"query": {
"term": {
"customerid.keyword": {
"value":"e32e6b34-5e3f-4bb9-a3af-e89714b418ca"
}
}
}
}
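To see why this term query should return a single document while the original match query returned 1992, here is a rough Python sketch. It assumes the standard analyzer behaves approximately like "lowercase and split on non-alphanumeric characters" and that match uses its default OR operator; the functions and the three extra IDs are hypothetical illustrations, not Elasticsearch code.

```python
import re


def standard_analyze(text):
    """Rough stand-in for the standard analyzer: lowercase the text and
    split it on anything that is not a letter or digit."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


def match_hits(docs, value):
    """match on a text field: a document is a hit if it shares ANY token."""
    wanted = set(standard_analyze(value))
    return [d for d in docs if wanted & set(standard_analyze(d))]


def term_hits(docs, value):
    """term on a keyword field: the whole value must be equal."""
    return [d for d in docs if d == value]


docs = [
    "e32e6b34-5e3f-4bb9-a3af-e89714b418ca",  # the ID from the question
    "0a1b2c3d-1111-4bb9-8c2d-000000000001",  # hypothetical, shares "4bb9"
    "0a1b2c3d-2222-4abc-a3af-000000000002",  # hypothetical, shares "a3af"
    "ffffffff-3333-4def-9999-000000000003",  # hypothetical, shares nothing
]
target = docs[0]

print(standard_analyze(target))
print(len(match_hits(docs, target)), len(term_hits(docs, target)))
```

In this toy data set the match-style lookup finds three documents and the term-style lookup finds one. With 5 million random UUIDs, collisions on the short four-character groups are common, which is consistent with counts in the thousands.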
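Your new index definition (replace index with the name of your new index). Settings and mappings go at the top level of the request body rather than under the index name, and because the normalizer of an existing field generally cannot be changed in place, create a new index and reindex the documents into it:
PUT /index
{
"settings": {
"analysis": {
"normalizer": {
"my_custom_normalizer": {
"type": "custom",
"filter": [
"lowercase"
]
}
}
}
},
"mappings": {
"dynamic_date_formats": [
"yyyy-MM-dd||yyyy-MM-dd HH:mm:ss.S||yyyy-MM-dd HH:mm:ss"
],
"dynamic_templates": [
{
"objects": {
"match_mapping_type": "object",
"mapping": {
"type": "nested"
}
}
}
],
"properties": {
"customerid": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256,
"normalizer": "my_custom_normalizer"
}
}
}
}
}
}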
https://www.elastic.co/blog/strings-are-dead-long-live-strings
Hope this helps.