How can I put null values into one field and other values into a different field in MongoDB aggregation?

I have the following document in my collection.

{
"_id" : ObjectId("55961a28bffebcb8058b4570"),
"title" : "BackOffice 2",
"cts" : NumberLong(1435900456),
"todo_items" : [
    {
        "id" : "55961a42bffebcb7058b4570",
        "task_desc" : "test 1",
        "completed_by" : "557fccb5bffebcf7048b457c",
        "completed_date" : NumberLong(1436161096)
    },
    {
        "id" : "559639afbffebcc7098b45a6",
        "task_desc" : "test 2",
        "completed_by" : "557fccb5bffebcf7048b457c",
        "completed_date" : NumberLong(1435911809)
    },
    {
        "id" : "559a22f5bffebcb0048b476c",
        "task_desc" : "test 3",
    }
],
"uts" : NumberLong(1436164853)
}

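I need an aggregation query that does the following: if a todo item has the "completed_by" and "completed_date" fields and their values are not null, push it into a "completed" array field; otherwise push it into an "incomplete" field.

Below is a sample of the result I want.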

{
  "_id" : ObjectId("55961a28bffebcb8058b4570"),
  "completed" : [
     {
       "id":"557fccb5bffebcf7048b457c",
       "title":"test 1",
       "completed_by" : "557fccb5bffebcf7048b457c",
       "completed_date" : NumberLong(1436161096)
     },
     {
       "id":"557fccb5bffebcf7048b457c",
       "title":"test 1",
       "completed_by" : "557fccb5bffebcf7048b457c",
       "completed_date" : NumberLong(1436161096)
     }
    ],
 "incomplete":[
     {
       "id" : "559a22f5bffebcb0048b476c",
       "title" : "test 3"
     }
   ]
}

As long as your "array" items have "distinct" identifiers (which they do), there are a couple of ways to approach this.

First, when you are not actually "aggregating across documents":

db.collection.aggregate([
    { "$project": {
        "title": 1,
        "cts": 1,
        "completed": { "$setDifference": [
            { "$map": {
                "input": "$todo_items",
                "as": "i",
                "in": {
                    "$cond": [
                        "$$i.completed_date",
                        "$$i",
                        false
                    ]
                }
            }},
           [false]
        ]},
        "incomplete": { "$setDifference": [
            { "$map": {
                "input": "$todo_items",
                "as": "i",
                "in": {
                    "$cond": [
                        "$$i.completed_date",
                        false,
                        "$$i"
                    ]
                }
            }},
           [false]
        ]}
    }}
])

This requires at least MongoDB 2.6 on the server in order to use the $map and $setDifference operators. It's pretty fast, considering that all the work is done in a single $project stage.
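To see why the "distinct" identifiers matter, here is a rough plain-JavaScript emulation of what the $map / $cond / $setDifference combination does to one array. The helper names (setDifference, splitWithSets) are made up for illustration, and comparing elements through JSON.stringify is only an approximation of how the server compares values:

```javascript
// Hypothetical helper: a rough stand-in for $setDifference. Like the real
// operator it treats its inputs as sets, so duplicate entries collapse.
// JSON.stringify is a simplistic equality check (an assumption of this sketch).
function setDifference(a, b) {
  const exclude = new Set(b.map((x) => JSON.stringify(x)));
  const seen = new Set();
  const out = [];
  for (const item of a) {
    const key = JSON.stringify(item);
    if (!exclude.has(key) && !seen.has(key)) {
      seen.add(key);
      out.push(item);
    }
  }
  return out;
}

// Hypothetical helper mirroring the $project stage for one todo_items array.
function splitWithSets(todoItems) {
  // $map + $cond: keep the item or swap in a `false` placeholder
  const done = todoItems.map((i) => (i.completed_date ? i : false));
  const notDone = todoItems.map((i) => (i.completed_date ? false : i));
  // $setDifference: strip the placeholders (and any exact duplicates)
  return {
    completed: setDifference(done, [false]),
    incomplete: setDifference(notDone, [false]),
  };
}
```

Two array items that were identical in every field would be merged into one by the set operation, which is why each item needs something like a unique "id".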

The alternative, which should only be used when "aggregating across documents", works in every version that supports the aggregation framework, from MongoDB 2.2 onwards:

db.collection.aggregate([
    { "$unwind": "$todo_items" },
    { "$group": {
        "_id": "$_id",
        "title": { "$first": "$title" },
        "cts": { "$first": "$cts" },
        "completed": { 
            "$addToSet": {
                "$cond": [
                    "$todo_items.completed_date",
                    "$todo_items",
                    null
                ]
            }
        },
        "incomplete": {
            "$addToSet": {
                "$cond": [
                    "$todo_items.completed_date",
                    null,
                    "$todo_items",
                ]
            }
        }
    }},
    { "$unwind": "$completed" },
    { "$match": { "completed": { "$ne": null } } },
    { "$group": {
        "_id": "$_id",
        "title": { "$first": "$title" },
        "cts": { "$first": "$cts" },
        "completed": { "$push": "$completed" },
        "incomplete": { "$first": "$incomplete" }
    }},
    { "$unwind": "$incomplete" },
    { "$match": { "incomplete": { "$ne": null } } },
    { "$group": {
        "_id": "$_id",
        "title": { "$first": "$title" },
        "cts": { "$first": "$cts" },
        "completed": { "$first": "$completed" },
        "incomplete": { "$push": "$incomplete" }
    }}
])

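That's not quite all of it, since you would also need to cater for the cases where one of the arrays ends up empty. But that's not really the lesson here, since MongoDB 2.6 has been out for a few years now.

In aggregation you cannot really exclude the "null/false" results, but you can "filter" them out.

Also, unless you are actually "aggregating across documents", the second form, which processes the array with $unwind, carries a lot of overhead. So you really should be altering the array content in client code as each document is read instead.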
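A minimal sketch of that client-side approach in plain JavaScript follows. The function name splitTodoItems is hypothetical, and it assumes each document looks like the one in the question:

```javascript
// Hypothetical helper: split one document's todo_items on the client.
function splitTodoItems(doc) {
  const completed = [];
  const incomplete = [];
  for (const item of doc.todo_items || []) {
    // `!= null` is true only when the value is neither null nor undefined,
    // so missing fields and explicit nulls are both treated as "not done".
    const isDone = item.completed_by != null && item.completed_date != null;
    (isDone ? completed : incomplete).push(item);
  }
  return { _id: doc._id, completed: completed, incomplete: incomplete };
}
```

It could be called on each document as it comes off the cursor, for example inside a forEach callback. Unlike the $setDifference form, this keeps the original item order and does not rely on the items being distinct.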

Could you please check the following:

db.collection.aggregate([
    { $unwind: "$todo_items" },
    // $push has to be the accumulator itself, so the $cond goes inside it;
    // items that belong to the other array are pushed as null placeholders.
    { $group: {
        _id: "$_id",
        completed: { $push: { $cond: [
            { $and: ["$todo_items.completed_by", "$todo_items.completed_date"] },
            "$todo_items",
            null
        ] } },
        incompleted: { $push: { $cond: [
            { $and: ["$todo_items.completed_by", "$todo_items.completed_date"] },
            null,
            "$todo_items"
        ] } }
    } },
    // Strip the null placeholders ($setDifference needs MongoDB 2.6+).
    { $project: {
        completed: { $setDifference: ["$completed", [null]] },
        incompleted: { $setDifference: ["$incompleted", [null]] }
    } }
]);