Elasticsearch filtered aggregations, returned bucketed keys are not split up specifically

Question

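I have a number of file assets stored across multiple folders. What I want to do is run a text string query against the set of filenames and return each matching filename together with how often it appears in each folder. With the query below, however, I am not getting the full filename for each filtered result.

The query is as follows: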

{
  "aggs": {
    "filenames": {
      "filter": {
        "term": {"filename": "foo"} 
      },
      "aggs": {
        "files_count": {
          "terms": {
            "field": "filename",
            "size": 100
          },
          "aggs": {
            "folder_count": {
              "terms": {
                "field": "folder"
              }
            }
          }
        }
      }
    }
  },
  "size": 0
}

The result looks like this:

"aggregations": {
        "filenames": {
            "doc_count": 1218,
            "files_count": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 0,
                "buckets": [
                    {
                        "key": "foo",
                        "doc_count": 1218,
                        "folder_count": {
                            "doc_count_error_upper_bound": 0,
                            "sum_other_doc_count": 1139,
                            "buckets": [
                                {
                                    "key": "1575569706838",
                                    "doc_count": 8
                                },
                                {
                                    "key": "1575656106314",
                                    "doc_count": 8
                                },
                                {
                                    "key": "1575742506771",
                                    "doc_count": 8
                                },
                                {
                                    "key": "1575828907233",
                                    "doc_count": 8
                                },
                                {
                                    "key": "1575915306570",
                                    "doc_count": 8
                                },
                                {
                                    "key": "1576001707455",
                                    "doc_count": 8
                                },
                                {
                                    "key": "1576088108154",
                                    "doc_count": 8
                                },
                                {
                                    "key": "1576174506235",
                                    "doc_count": 8
                                },
                                {
                                    "key": "1576347307560",
                                    "doc_count": 8
                                },
                                {
                                    "key": "1576260907130",
                                    "doc_count": 7
                                }
                            ]
                        }
                    },
...

Here is the mapping for my index:

{
    "screens": {
        "mappings": {
            "properties": {
                "date": {
                    "type": "date"
                },
                "extension": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                },
                "filename": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    },
                    "fielddata": true
                },
                "folder": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    },
                    "fielddata": true
                },
                "format": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                },
                "path": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                }
            }
        }
    }
}

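The key returned (key: queryString) is only a part, or a fragment, of the filename field. What do I need to include to get the full matching filenames in this query? Ideally, instead of key: queryString, I would like the results split up by unique filename rather than everything that matched lumped together. Do I need another level of aggregation on the filename, between the filtered results and the folders? How would I do that? Thanks in advance.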

Answer 1

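The filename field is most likely of type text (your mapping shows that it is, with fielddata enabled), so its values are analyzed and indexed as individual tokens. That is why your bucket keys are tokens rather than full filenames.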
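
To illustrate the difference, here is a toy Python model (it does not talk to Elasticsearch). The `toy_analyze` function is a hypothetical stand-in for the analyzer that only lowercases and splits on whitespace; the real standard analyzer follows more detailed word-boundary rules. The filenames are made up for the example.

```python
from collections import Counter


def toy_analyze(value):
    # Hypothetical stand-in for a text analyzer:
    # lowercase the value and split it on whitespace.
    return value.lower().split()


# Made-up filenames, for illustration only.
filenames = ["foo report.pdf", "foo summary.pdf", "bar report.pdf"]

# Filter step: keep documents whose analyzed filename contains the token "foo".
matching = [name for name in filenames if "foo" in toy_analyze(name)]

# Terms aggregation on the analyzed (text) field:
# one bucket per token, counting the documents that contain it.
token_buckets = Counter(
    token for name in matching for token in set(toy_analyze(name))
)

# Terms aggregation on the keyword sub-field:
# one bucket per whole, unanalyzed value.
keyword_buckets = Counter(matching)

print(dict(token_buckets))
print(dict(keyword_buckets))
```

In this model the token buckets include a single "foo" bucket covering every matching document, which mirrors the result in the question, while the keyword buckets are keyed by complete filenames.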

You need to run the terms aggregations on the keyword sub-fields (filename.keyword and folder.keyword). That means changing the field name in three places: the term filter, the files_count aggregation, and the folder_count aggregation, like this:

{
  "aggs": {
    "filenames": {
      "filter": {
        "term": {
          "filename.keyword": "queryString"
        }
      },
      "aggs": {
        "files_count": {
          "terms": {
            "field": "filename.keyword",
            "size": 100
          },
          "aggs": {
            "folder_count": {
              "terms": {
                "field": "folder.keyword"
              }
            }
          }
        }
      }
    }
  },
  "size": 0
}
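
As a rough sketch of how this request could be built and its response consumed from Python, the helpers below construct the body as a plain dict and flatten the nested buckets into rows. The names `build_body`, `flatten`, `filter_field`, and `folder_size` are hypothetical, and the sample response is hand-written rather than real output. Two points are worth keeping in mind: a term filter on filename.keyword matches only documents whose entire filename equals the query string, so passing `filter_field="filename"` keeps the token match used in the original question; and a terms aggregation returns 10 buckets by default, so `folder_size` may need raising if there are more folders than that.

```python
import json


def build_body(query_string, filter_field="filename.keyword", folder_size=10):
    """Build the aggregation request body as a plain dict (hypothetical helper)."""
    return {
        "size": 0,
        "aggs": {
            "filenames": {
                "filter": {"term": {filter_field: query_string}},
                "aggs": {
                    "files_count": {
                        # Bucket on the unanalyzed sub-field to get whole filenames.
                        "terms": {"field": "filename.keyword", "size": 100},
                        "aggs": {
                            "folder_count": {
                                "terms": {
                                    "field": "folder.keyword",
                                    "size": folder_size,
                                }
                            }
                        },
                    }
                },
            }
        },
    }


def flatten(response):
    """Return (filename, folder, doc_count) rows from the nested buckets."""
    rows = []
    file_buckets = response["aggregations"]["filenames"]["files_count"]["buckets"]
    for file_bucket in file_buckets:
        for folder_bucket in file_bucket["folder_count"]["buckets"]:
            rows.append(
                (file_bucket["key"], folder_bucket["key"], folder_bucket["doc_count"])
            )
    return rows


# Hand-written sample response with made-up filenames and folders.
sample_response = {
    "aggregations": {
        "filenames": {
            "doc_count": 3,
            "files_count": {
                "buckets": [
                    {
                        "key": "foo report.pdf",
                        "doc_count": 2,
                        "folder_count": {
                            "buckets": [
                                {"key": "folder-a", "doc_count": 1},
                                {"key": "folder-b", "doc_count": 1},
                            ]
                        },
                    },
                    {
                        "key": "foo summary.pdf",
                        "doc_count": 1,
                        "folder_count": {
                            "buckets": [{"key": "folder-a", "doc_count": 1}]
                        },
                    },
                ]
            },
        }
    }
}

body = build_body("foo", filter_field="filename", folder_size=50)
rows = flatten(sample_response)
print(json.dumps(body, indent=2))
for row in rows:
    print(row)
```

The resulting `body` dict can be sent as the JSON body of a search request against the index with whichever HTTP or Elasticsearch client is already in use.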

Tags: elasticsearch, elasticsearch-aggregation