How do I append Mongo DB aggregation results to an existing collection?

Question

I'm trying to perform multiple inserts into an existing Mongo DB collection using the following code:

db.dados_meteo.aggregate( [
                  { $match : { "POM" : "AguiardaBeira" } },
                  { $project : {
                     _id : { $concat: [
                        "0001:",
                      { $substr: [ "$DTM", 0, 4 ] },
                      { $substr: [ "$DTM", 5, 2 ] },
                      { $substr: [ "$DTM", 8, 2 ] },
                      { $substr: [ "$DTM", 11, 2 ] },
                      { $substr: [ "$DTM", 14, 2 ] },
                      { $substr: [ "$DTM", 17, 2 ] }
                       ] },
                    "RNF" : 1, "WET":1,"HMD":1,"TMP":1 } },
                  { $out : "dados_meteo_reloaded" }
              ] )

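But every time I change the $match parameters and run a new aggregation, Mongo DB deletes the previous documents and inserts the new result.

Could you help me?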

Answer 1

The short answer is "you can't":

If the collection specified by the $out operation already exists, then upon completion of the aggregation, the $out stage atomically replaces the existing collection with the new results collection. The $out operation does not change any indexes that existed on the previous collection. If the aggregation fails, the $out operation makes no changes to the pre-existing collection.

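As a workaround, you can copy the documents from the collection specified by $out to a "permanent" collection right after the aggregation, in one of several ways (none of which is ideal, though):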

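copyTo() is the easiest, but mind the warning in its documentation: don't use it for anything other than small result sets. It was also deprecated in MongoDB 4.0 and removed in 4.2.
Use JS: db.out.find().forEach(function(doc) {db.target.insert(doc)})
Use mongoexport / mongoimport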
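
The forEach variant above inserts one document per round trip. A minimal sketch of the same copy done in batches follows; the helper name appendInBatches, the batch size, and the target collection name dados_meteo_archive are assumptions for illustration, not something from the question. It only relies on the cursor's hasNext()/next() and the collection's insertMany().

```javascript
// Copy every document from a cursor into `target`, `batchSize` documents at a time.
// `cursor` needs hasNext()/next(); `target` needs insertMany(docs).
// Returns the number of documents handed to insertMany.
function appendInBatches(cursor, target, batchSize) {
  let batch = [];
  let copied = 0;
  while (cursor.hasNext()) {
    batch.push(cursor.next());
    if (batch.length === batchSize) {
      target.insertMany(batch);
      copied += batch.length;
      batch = [];
    }
  }
  // Flush the last, possibly partial, batch.
  if (batch.length > 0) {
    target.insertMany(batch);
    copied += batch.length;
  }
  return copied;
}

// In the shell, right after the $out aggregation (hypothetical target name):
// appendInBatches(db.dados_meteo_reloaded.find(), db.dados_meteo_archive, 1000);
```

Note that insertMany fails on a duplicate _id, so this only appends cleanly when the new _id values do not already exist in the target.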

Answer 2

It's not the prettiest thing ever, but as another alternative syntax (from a post-processing archive/append operation)...

db.targetCollection.insertMany(db.runCommand(
{
    aggregate: "sourceCollection",
    pipeline: 
    [
        { $skip: 0 },
        { $limit: 5 },
        { 
            $project:
            {
                myObject: "$$ROOT",
                processedDate: { $add: [new ISODate(), 0] }
            }
        }
    ]
}).result)

I'm not sure how this compares to the forEach variant, but I find it more intuitive to read.
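
As far as I know, newer servers (3.6 and later) return the aggregate command's output under a cursor rather than in a .result field, so the same idea may be easier to express through the aggregate() shell helper. The sketch below is an assumption-laden variant, not the answer's original code: appendAggregation is a made-up helper name, and it loads the whole result into memory with toArray(), so it only suits modest result sets.

```javascript
// Run `pipeline` on `source` and append the resulting documents to `target`.
// `source` needs aggregate(pipeline) returning a cursor with toArray();
// `target` needs insertMany(docs).
// Returns the number of documents appended.
function appendAggregation(source, target, pipeline) {
  const docs = source.aggregate(pipeline).toArray();
  if (docs.length === 0) {
    return 0; // insertMany rejects an empty array, so skip the call
  }
  target.insertMany(docs);
  return docs.length;
}

// In the shell, with the collection names used in this answer:
// appendAggregation(db.sourceCollection, db.targetCollection, [
//   { $limit: 5 },
//   { $project: { myObject: "$$ROOT", processedDate: { $add: [new Date(), 0] } } }
// ]);
```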

Answer 3

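Starting in Mongo 4.2, the new $merge aggregation stage (similar to $out) allows merging the result of an aggregation pipeline into the specified collection.

Given this input: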

db.source.insert([
  { "_id": "id_1", "a": 34 },
  { "_id": "id_3", "a": 38 },
  { "_id": "id_4", "a": 54 }
])
db.target.insert([
  { "_id": "id_1", "a": 12 },
  { "_id": "id_2", "a": 54 }
])

the $merge aggregation stage can be used as such:

db.source.aggregate([
  // { $whatever aggregation stage, for this example, we just keep records as is }
  { $merge: { into: "target" } }
])

to produce:

// > db.target.find()
{ "_id" : "id_1", "a" : 34 }
{ "_id" : "id_2", "a" : 54 }
{ "_id" : "id_3", "a" : 38 }
{ "_id" : "id_4", "a" : 54 }

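Note that the $merge stage comes with many options to specify how to merge inserted records that conflict with existing records.

In this case (with the default options), this:

keeps the target collection's existing documents (this is the case of { "_id": "id_2", "a": 54 })
inserts documents from the aggregation pipeline's output into the target collection when they are not already present (based on the _id; this is the case of { "_id" : "id_3", "a" : 38 })
replaces the target collection's record when the aggregation pipeline produces a document that already exists in the target collection (based on the _id; this is the case of { "_id": "id_1", "a": 12 } replaced by { "_id" : "id_1", "a" : 34 })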
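
Applied to the question's pipeline, this could look like the sketch below. The helper buildStationPipeline and its idPrefix parameter are assumptions for illustration (the question only shows the "0001:" prefix for one station). The defaults, as far as I know, are whenMatched: "merge" and whenNotMatched: "insert"; the sketch spells out "replace" explicitly so that re-running a station overwrites its earlier documents.

```javascript
// Build the question's pipeline for one station, ending in $merge instead of $out
// so each run appends to (or updates) dados_meteo_reloaded instead of replacing it.
// `station` is the POM value to match; `idPrefix` is prepended to the generated _id.
function buildStationPipeline(station, idPrefix) {
  // Small helper: one $substr slice of the DTM timestamp string.
  const part = function (start, length) {
    return { $substr: ["$DTM", start, length] };
  };
  return [
    { $match: { POM: station } },
    { $project: {
        _id: { $concat: [
          idPrefix,
          part(0, 4), part(5, 2), part(8, 2),
          part(11, 2), part(14, 2), part(17, 2)
        ] },
        RNF: 1, WET: 1, HMD: 1, TMP: 1
    } },
    { $merge: {
        into: "dados_meteo_reloaded",
        on: "_id",
        whenMatched: "replace",
        whenNotMatched: "insert"
    } }
  ];
}

// In the shell (requires MongoDB 4.2+); the second station name and prefix are hypothetical:
// db.dados_meteo.aggregate(buildStationPipeline("AguiardaBeira", "0001:"));
// db.dados_meteo.aggregate(buildStationPipeline("OtherStation", "0002:"));
```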
