ElasticSearch - how to 'collapse' documents within an aggregation

I'm struggling with this, so any help would be appreciated!

I have an aggregation that gives document counts grouped by role, gender and age within date histogram buckets:

"aggs": {
"period": {
  "date_histogram": {
    "field": "timestamp",
    "fixed_interval": "15m",
    "time_zone": "America/Los_Angeles",
    "order": {
      "_key": "desc"
    }
  },
  "aggs": {
    "role": {
      "terms": {
        "field": "role",
        "size": 3
      },
      "aggs": {
        "gender": {
          "terms": {
            "field": "gender",
            "size": 3
          },
          "aggs": {
            "age": {
              "terms": {
                "field": "age",
                "size": 10
              }
            }
          }
        }
      }
    }
  }
}
}

Each document has a visitorId, and there may be many documents with the same visitorId within the same date histogram bucket.

I would like each date histogram bucket to count only unique visitorIds. In effect, I want to avoid double, triple, etc. counting of the same visitor. Is this possible?

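Provided that role, gender and age are the same for every document of a given visitor, the query below (which adds a cardinality sub-aggregation on visitorId) should work: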

"aggs": {
"period": {
  "date_histogram": {
    "field": "timestamp",
    "fixed_interval": "15m",
    "time_zone": "America/Los_Angeles",
    "order": {
      "_key": "desc"
    }
  },
  "aggs": {
    "role": {
      "terms": {
        "field": "role",
        "size": 3
      },
      "aggs": {
        "gender": {
          "terms": {
            "field": "gender",
            "size": 3
          },
          "aggs": {
            "age": {
              "terms": {
                "field": "age",
                "size": 10
              },
              "aggs": {
                "visitors": {
                  "cardinality": {
                    "field": "visitorId"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
}
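
With this query, the per-group unique visitor count is `visitors.value` in each innermost `age` bucket, while `doc_count` still counts every document. Below is a minimal Python sketch of reading the result, assuming the response has already been parsed into a dict. The helper name `unique_visitor_rows` and the sample response are made up for illustration; only the bucket nesting follows the aggregation above. Keep in mind that cardinality is an approximate count, so the values are estimates when a bucket holds many distinct visitors.

```python
def unique_visitor_rows(response):
    """Flatten period > role > gender > age buckets into one row per group."""
    rows = []
    for period in response["aggregations"]["period"]["buckets"]:
        for role in period["role"]["buckets"]:
            for gender in role["gender"]["buckets"]:
                for age in gender["age"]["buckets"]:
                    rows.append({
                        # key_as_string is the formatted date; fall back to the epoch key
                        "period": period.get("key_as_string", period["key"]),
                        "role": role["key"],
                        "gender": gender["key"],
                        "age": age["key"],
                        # doc_count counts every document, repeats included
                        "documents": age["doc_count"],
                        # the cardinality sub-aggregation counts distinct visitorId values
                        "unique_visitors": age["visitors"]["value"],
                    })
    return rows


# Hand-written sample response (assumed values): 5 documents from 2 visitors.
sample_response = {
    "aggregations": {
        "period": {
            "buckets": [{
                "key_as_string": "2021-01-01T10:00:00.000-08:00",
                "key": 1609524000000,
                "doc_count": 5,
                "role": {"buckets": [{
                    "key": "member",
                    "doc_count": 5,
                    "gender": {"buckets": [{
                        "key": "female",
                        "doc_count": 5,
                        "age": {"buckets": [{
                            "key": 30,
                            "doc_count": 5,
                            "visitors": {"value": 2},
                        }]},
                    }]},
                }]},
            }]
        }
    }
}

rows = unique_visitor_rows(sample_response)
for row in rows:
    print(row)
```

If a visitor's role, gender or age can differ between documents, that visitor is counted once in each group they appear in, so the per-group values would add up to more than the number of distinct visitors in the period.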