MongoDB: $pull / $unset with multiple conditions

Question

Example document:

{
  _id: 42,
  foo: {
    bar: [1, 2, 3, 3, 4, 5, 5]
  }
}

Query:

I would like to "remove all entries from foo.bar that are $lt: 4 and the first matching entry that matches $eq: 5". Important: the $eq part must remove only a single entry!

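I have a working solution that uses 3 update queries, but that is too many for such a simple task. Still, here is what I have so far:

1. Find the first entry matching $eq: 5 and $unset it. (As you know, $unset does not remove it. It just sets it to null):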

update(
  { 'foo.bar': 5 },
  { $unset: { 'foo.bar.$': 1 } } 
)

2. $pull all entries $eq: null, so the original 5 is really gone:

update(
  {},
  { $pull: { 'foo.bar': null } } 
)

3. $pull all entries $lt: 4:

update(
  {},
  { $pull: { 'foo.bar': { $lt: 4 } } } 
)

Resulting document:

{
  _id: 42,
  foo: {
    bar: [4, 5]
  }
}
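
The three updates above can be traced on a plain JavaScript array. This is only a sketch of the intended semantics, with no MongoDB connection involved; unsetFirst and pull are hypothetical helpers that stand in for the $unset on 'foo.bar.$' and for $pull.

```javascript
// Plain-JavaScript model of the three updates; nothing here talks to MongoDB.
// unsetFirst and pull are hypothetical helpers for illustration only.

// Step 1: like $unset on 'foo.bar.$', replace the first match with null.
function unsetFirst(arr, value) {
  const copy = arr.slice();
  const i = copy.indexOf(value);
  if (i !== -1) copy[i] = null;
  return copy;
}

// Steps 2 and 3: like $pull, drop every element that satisfies the predicate.
function pull(arr, predicate) {
  return arr.filter((el) => !predicate(el));
}

const bar = [1, 2, 3, 3, 4, 5, 5];
const afterUnset = unsetFirst(bar, 5);
const afterPullNull = pull(afterUnset, (el) => el === null);
const afterPullLt4 = pull(afterPullNull, (el) => el < 4);

console.log(afterUnset);    // [ 1, 2, 3, 3, 4, null, 5 ]
console.log(afterPullNull); // [ 1, 2, 3, 3, 4, 5 ]
console.log(afterPullLt4);  // [ 4, 5 ]
```

The null placeholder is what keeps the second 5 safe: only the position found by the first step is ever removed.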

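Ideas and thoughts:

Extend query 1. so that it will $unset the entries $lt: 4 and one entry $eq: 5. Afterwards we can execute query 2. and query 3. is not needed.
Extend query 2. to $pull everything that matches $or: [{$lt: 4}, {$eq: 5}]. Then query 3. is not needed.
Extend query 2. to $pull everything that is $not: { $gte: 4 }. This expression should match both $lt: 4 and $eq: null.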
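
The last idea depends on a single predicate covering both the values below 4 and the null placeholder left by query 1. A minimal sketch of that predicate on a plain JavaScript array, assuming query 1 has already run; notGte4 is a hypothetical helper, and this says nothing about whether $pull accepts the equivalent expression.

```javascript
// Hypothetical predicate modelling the intent of { $not: { $gte: 4 } }:
// an element is selected for removal unless it is a number that is >= 4.
function notGte4(el) {
  return !(typeof el === "number" && el >= 4);
}

// Assumed state of foo.bar after query 1 replaced the first 5 with null.
const afterQuery1 = [1, 2, 3, 3, 4, null, 5];

// Keep only the elements the predicate does not select.
const remaining = afterQuery1.filter((el) => !notGte4(el));

console.log(remaining); // [ 4, 5 ]
```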

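I have tried to implement these queries, but sometimes it complained about the query syntax, and sometimes the query executed but removed nothing.

It would be great if someone had a working solution for this.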

Answer 1

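Not sure I understand everything you mean, but to "bulk" update documents you can always take this approach: apart from the original $pull, add some "detection" of which documents need the "duplicate" 5 removed: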

// Remove less than four first
db.collection.update({},{ "$pull": { "foo.bar": { "$lt": 4 } } },{ "multi": true });

// Initialize Bulk
var bulk = db.collection.initializeOrderedBulkOp(),
    count = 0;

// Detect and cycle documents with duplicate five to be removed
db.collection.aggregate([
    // Project a "reduced" array and calculate if the same size as orig
    { "$project": { 
         "foo.bar": { "$setUnion": [ "$foo.bar", [] ] },
         "same": { "$eq": [
             { "$size": {  "$setUnion": [ "$foo.bar", [] ] } },
             { "$size": "$foo.bar" }
         ] }
    }},
    // Keep only the documents whose array shrank, i.e. those with duplicates
    { "$match": { "same": false } }
]).forEach(function(doc) {
    bulk.find({ "_id": doc._id })
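        // Numeric comparator: the default sort() compares elements as strings
        .updateOne({ "$set": { "foo.bar": doc.foo.bar.sort(function(a, b) { return a - b; }) } });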
    count++;

    // Execute per 1000 processed and re-init
    if ( count % 1000 == 0 ) {
        bulk.execute();
        bulk = db.collection.initializeOrderedBulkOp();
    }
});

// Clean up any batched
if ( count % 1000 != 0 )
    bulk.execute();

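This removes anything less than "4", as well as all duplicates, where a "duplicate" is detected from the difference in length between the array and its "set".

If you only want to remove the value 5 as a duplicate, you can take a similar logical approach to detect and modify, just without using the "set operators" that remove every "duplicate" to make a valid "set".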
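
The detection idea can be sketched in plain JavaScript, outside the aggregation pipeline. Both helpers below are hypothetical: hasDuplicates mirrors the "set" length comparison, and hasDuplicateOf narrows the check to a single value such as 5.

```javascript
// Hypothetical helper: true when the array holds at least one repeated value,
// i.e. when its de-duplicated "set" is shorter than the array itself.
function hasDuplicates(arr) {
  return new Set(arr).size !== arr.length;
}

// Hypothetical helper: true when `value` occurs more than once in the array.
function hasDuplicateOf(arr, value) {
  return arr.filter((el) => el === value).length > 1;
}

console.log(hasDuplicates([4, 5, 5]));     // true
console.log(hasDuplicates([4, 5]));        // false
console.log(hasDuplicateOf([4, 4, 5], 5)); // false
console.log(hasDuplicateOf([4, 5, 5], 5)); // true
```

The second helper ignores the repeated 4, which is the difference between "no duplicates at all" and "no duplicate 5".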

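Either way, some detection strategy is going to be better than iterating updates until "all but one" of the values are gone.

Of course you can simplify your statements a little and drop one update operation. It is not pretty, because $pull does not allow an $or condition in its query, but I hope you get the idea if it applies: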

db.collection.update(
    { "foo.bar": 5 },
    { "$unset": { "foo.bar.$": 1 } },
    { "multi": true }
); // same approach

// So include all the values "less than four"
db.collection.update(
    { "foo.bar": { "$in": [1,2,3,null] } },
    { "$pull": { "foo.bar": { "$in": [1,2,3,null] } }},
    { "multi": true }
);

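That is a bit less processing, but of course these need to be exact integer values. Otherwise stick with the three updates you are doing. That is still better than looping in code.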
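
To see why the values must be exact, here is a sketch that models the $in form of the $pull on a plain JavaScript array. pullIn is a hypothetical helper, not a MongoDB API, and the second input array is made up for illustration.

```javascript
// Plain-array model of { $pull: { "foo.bar": { $in: [1, 2, 3, null] } } }.
// pullIn is a hypothetical helper: it drops every element found in `values`.
function pullIn(arr, values) {
  return arr.filter((el) => !values.includes(el));
}

// After the $unset step the first 5 has become null.
const cleaned = pullIn([1, 2, 3, 3, 4, null, 5], [1, 2, 3, null]);
console.log(cleaned); // [ 4, 5 ]

// A non-integer below four is not in the list, so it survives.
const leftover = pullIn([2.5, 3, 4, null, 5], [1, 2, 3, null]);
console.log(leftover); // [ 2.5, 4, 5 ]
```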

For reference, the "nicer" syntax that unfortunately does not work would be something like this:

db.collection.update(
    { 
        "$or": [
            { "foo.bar": { "$lt": 4 } },
            { "foo.bar": null }
        ]
    },
    { 
        "$pull": { 
            "$or": [
                { "foo.bar": { "$lt": 4 } },
                { "foo.bar": null }
            ]
        }
    },
    { "multi": true }
);

Probably worth a JIRA issue, but I suspect it fails mostly because the array element is not the "first" argument immediately following $pull.

Answer 2

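You can use the Array.prototype.filter() and Array.prototype.splice() methods.

The filter() method creates a new array of the foo.bar values that are $lt: 4, and the splice method is then used to remove those values, along with the first value equal to 5, from foo.bar.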

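db.collection.find().forEach(function(doc) {
    // Collect every value below 4 (duplicates included)
    var idx = doc.foo.bar.filter(function(el) {
        return el < 4;
    });
    // Remove each collected value from the original array, one at a time
    idx.forEach(function(el) {
        doc.foo.bar.splice(doc.foo.bar.indexOf(el), 1);
    });
    // Remove only the first 5, and only if one is present:
    // indexOf() returns -1 on a miss, and splice(-1, 1) would drop the last element
    var firstFive = doc.foo.bar.indexOf(5);
    if (firstFive !== -1) {
        doc.foo.bar.splice(firstFive, 1);
    }
    db.collection.save(doc);
});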
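
The array manipulation itself can also be written as one pass that does not mutate its input, which makes it easy to check outside the shell. This is a sketch only; removeBelowAndFirstEqual is a hypothetical helper, not part of the answer above.

```javascript
// Hypothetical helper: drop every element below `bound` and the first
// element equal to `value`, in a single pass, without mutating the input.
function removeBelowAndFirstEqual(arr, bound, value) {
  let removedOne = false;
  return arr.filter((el) => {
    if (el < bound) return false;      // the "$lt" part: remove all matches
    if (!removedOne && el === value) { // the "$eq" part: remove one match only
      removedOne = true;
      return false;
    }
    return true;
  });
}

console.log(removeBelowAndFirstEqual([1, 2, 3, 3, 4, 5, 5], 4, 5)); // [ 4, 5 ]
console.log(removeBelowAndFirstEqual([1, 4, 6], 4, 5));             // [ 4, 6 ]
```

When no 5 is present, nothing beyond the values below 4 is removed.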
