Get the fields of the input document after $group in an aggregation pipeline in MongoDB

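The problem I'm facing in MongoDB is how to access the original document after a group operation, and carry its fields past $group in the aggregation pipeline.

For example: [group, unwind, group]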

The original document is:

{
"_id" : ObjectId("361de42f1938e89b179dda42"),
"user_id" : ObjectId("9424021bafbde55512e39b83"),
"candidate_id" : ObjectId("54f65356294160421ead3ca1"),
"OVERALL_SCORE" : 150,
"SCORES" : [ 
    { "NAME" : "asd", "OBTAINED_SCORE" : 28}, { "NAME" : "acd", "OBTAINED_SCORE" : 36 }, { "NAME" : "abc", "OBTAINED_SCORE" : 40}
 ]
}

Aggregation function:

 db.coll.aggregate([ { $group : { _id : { user_id : "$user_id"}, BEST_SCORE : { $max : "$OVERALL_SCORE"}, AVG_SCORE : { $avg : "$OVERALL_SCORE" }}} ])

Below is the sample output (after the first group):

{
"result" : [ 
    {
        "_id" : {
            "user_id" : ObjectId("9424021bafbde55512e39b83")
        },
        "BEST_SCORE" : 150,
        "AVG_SCORE" : 132
    }
],
"ok" : 1
 }

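The question is (I don't know whether this is possible): I want the fields from the original documents (the aggregation input).

For example:

1) Unwind "SCORES" from the original document and, in the next group, use "candidate_id" and "user_id".

2) I want the "BEST_SCORE" and "AVG_SCORE" fields (from after the first group) to be accessible in the second group as well.

The aggregation function should look like this: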

   db.coll.aggregate([
       { $group : { _id : { user_id : "$user_id" }, BEST_SCORE : { $max : "$OVERALL_SCORE" }, AVG_SCORE : { $avg : "$OVERALL_SCORE" } } },
       // problem: after the group operation, the "SCORES" field from the original document is not available
       { $unwind : "$SCORES" },
       // problem: these fields are also only in the original document
       { $group : { _id : { NAME : "$SCORES.NAME" }, AVG_OBTAINED_SCORE : { $avg : "$SCORES.OBTAINED_SCORE" } } }
   ])

The output should look like this:

  "BEST_SCORE": 150,                      // after 1st group
  "AVG_SCORE": 132,                       // after 1st group
  "SCORES": [                             // problem: unwinding "SCORES" and then grouping needs a field that is not available after the 1st group (get it from the original document)
    {
      "NAME": "abc",
      "AVG_OBTAINED_SCORE": 25.5
    },
    {
      "NAME": "asd",
      "AVG_OBTAINED_SCORE": 24
    },
    {
      "NAME": "acd",
      "AVG_OBTAINED_SCORE": 32
    }
  ]

Can anyone help me?

Thanks.

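When grouping, if you want to keep values from all of the documents considered in the group, you need to use $push. The catch is that the field being pushed is itself an array, so the result is an array of arrays. So you process $unwind twice, and follow that with two further $group stages: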

db.coll.aggregate([
    {  "$group" : { 
        "_id": "$user_id", 
        "BEST_SCORE": { "$max": "$OVERALL_SCORE" },
        "AVG_SCORE": { "$avg": "$OVERALL_SCORE" },
        "SCORES": { "$push": "$SCORES" }
    }}, 

    // SCORES is an array of arrays. Unwind twice
    { "$unwind": "$SCORES" },
    { "$unwind": "$SCORES" },

    // Group for averages on elements
    { "$group": {
        "_id": {
            "user_id": "$_id",
            "NAME": "$SCORES.NAME"
        },
        "BEST_SCORE": { "$first": "$BEST_SCORE" },
        "AVG_SCORE": { "$first": "$AVG_SCORE" },
        "AVG_OBTAINED_SCORE": { "$avg": "$SCORES.OBTAINED_SCORE" }
    }},

    // Group to user_id
    { "$group": {
        "_id": "$_id.user_id",
        "BEST_SCORE": { "$first": "$BEST_SCORE" },
        "AVG_SCORE": { "$first": "$AVG_SCORE" },
        "SCORES": { "$push": {
            "NAME": "$_id.NAME",
            "AVG_OBTAINED_SCORE": "$AVG_OBTAINED_SCORE"
        }}     
    }}
])
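To make the "array of arrays" point concrete, here is a rough sketch in plain JavaScript (runnable in Node.js, no MongoDB required). It is not the MongoDB API: `pushGroup` and `unwind` are hypothetical helpers that only imitate `$push` inside `$group` and `$unwind` on an array field, and the two sample documents use made-up values.

```javascript
// Hypothetical sample data: two attempts by the same user (values are made up).
const docs = [
  { user_id: "u1", OVERALL_SCORE: 150,
    SCORES: [{ NAME: "asd", OBTAINED_SCORE: 28 }, { NAME: "abc", OBTAINED_SCORE: 40 }] },
  { user_id: "u1", OVERALL_SCORE: 114,
    SCORES: [{ NAME: "asd", OBTAINED_SCORE: 20 }, { NAME: "abc", OBTAINED_SCORE: 11 }] }
];

// Imitates { $group: { _id: "$user_id", SCORES: { $push: "$SCORES" } } }.
// Each input document contributes its whole SCORES array as ONE pushed element.
function pushGroup(input) {
  const groups = new Map();
  for (const d of input) {
    if (!groups.has(d.user_id)) {
      groups.set(d.user_id, { _id: d.user_id, SCORES: [] });
    }
    groups.get(d.user_id).SCORES.push(d.SCORES);
  }
  return [...groups.values()];
}

// Imitates { $unwind: "$SCORES" } for array fields:
// one output document per element of SCORES.
function unwind(input) {
  return input.flatMap(d => d.SCORES.map(s => ({ ...d, SCORES: s })));
}

const grouped = pushGroup(docs); // 1 document, SCORES is an array of arrays
const once = unwind(grouped);    // 2 documents, SCORES is still an array
const twice = unwind(once);      // 4 documents, SCORES is a single sub-document

console.log(grouped.length, once.length, twice.length);
console.log(Array.isArray(once[0].SCORES), twice[0].SCORES.NAME);
```

After the first unwind each document still holds an inner array, so a path like "$SCORES.NAME" would not yet refer to a single element; only after the second unwind does each document carry exactly one score entry to group on.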

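You might consider using $unwind before the first $group, but if you did, the calculated averages would be skewed by the number of elements present in the array that was "un-wound". So the "double $unwind" here is a necessary step.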
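The skew is easy to see with a small plain-JavaScript calculation (again only a sketch with made-up numbers, not MongoDB code; `avg` is a hypothetical helper). Unwinding first repeats each document's OVERALL_SCORE once per array element, so documents with longer arrays count more heavily in the average.

```javascript
// Hypothetical data: two documents whose SCORES arrays differ in length.
const docs = [
  { OVERALL_SCORE: 150, SCORES: [1, 2, 3] },
  { OVERALL_SCORE: 100, SCORES: [1] }
];

// Arithmetic mean of a non-empty array of numbers.
const avg = xs => xs.reduce((a, b) => a + b, 0) / xs.length;

// Average per document, as $avg sees it when $group runs first.
const avgPerDocument = avg(docs.map(d => d.OVERALL_SCORE));

// Unwinding first yields one row per array element: [150, 150, 150, 100].
const unwound = docs.flatMap(d => d.SCORES.map(() => d.OVERALL_SCORE));
const avgAfterUnwind = avg(unwound);

console.log(avgPerDocument); // 125
console.log(avgAfterUnwind); // 137.5
```

If every document had the same number of array elements the two averages would coincide, but that is not something the pipeline can rely on in general.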