Link each element of an array in a document to the corresponding element in an array of another document with MongoDB

Question

Using MongoDB 4.2 and MongoDB Atlas to test aggregation pipelines.

I have this products collection, containing documents with the following schema:

 {
    "name": "TestProduct",
    "relatedList": [
      {id:ObjectId("someId")},
      {id:ObjectId("anotherId")}
    ]
 }

And this cities collection, containing documents with the following schema:

{
        "name": "TestCity",
        "instructionList": [
          { related_id: ObjectId("anotherId"), foo: bar},
          { related_id: ObjectId("someId"), foo: bar},
          { related_id: ObjectId("notUsefulId"), foo: bar}
          ...
        ]
 }

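My objective is to join the two collections and output something like this (the operation is picking each related object from the instructionList in the city document to put it into the relatedList of the product document):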

{
        "name": "TestProduct",
        "relatedList": [
          { related_id: ObjectId("someId"), foo: bar},
          { related_id: ObjectId("anotherId"), foo: bar},
        ]
}

I tried using the $lookup operator in an aggregation, like this:

$lookup:{
  from: 'cities',
  let: {rId:'$relatedList._id'},
  pipeline: [
         {
           $match: {
             $expr: {
               $eq: ["$instructionList.related_id", "$$rId"]
             }
           }
         },
  ]
}

But it's not working, and I'm a bit lost with this complex pipeline syntax.

Edit

By using $unwind on both arrays:

    [
         {$unwind: "$relatedList"},
         {$lookup:{
             from: "cities",
             let: { "rId": "$relatedList.id" },
             pipeline: [
                {$unwind:"$instructionList"},
                {$match:{$expr:{$eq:["$instructionList.related_id","$$rId"]}}},
             ],
             as:"instructionList",
         }},
         {$group: {
             _id: "$_id",
             instructionList: {$addToSet:"$instructionList"}
          }}
    ]

I was able to achieve what I wanted; however, the result is not clean at all:

{
 "name": "TestProduct",
 instructionList: [
    [
      {
        "name": "TestCity",
        "instructionList": {
         "related_id":ObjectId("someId")
        }
      }
    ],
    [
      {
        "name": "TestCity",
        "instructionList": {
         "related_id":ObjectId("anotherId")
        }
      }
    ]
 ]
}

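How can I group everything so that the result is as clean as the one in my original question? Again, I'm completely lost with the aggregation framework.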

Answer 1

I believe you just need to $unwind the array in order to look up the relations, then $group to recollect them. Perhaps something like:

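db.products.aggregate([
    // One document per relatedList element
    {$unwind:"$relatedList"},
    {$lookup:{
         from:"cities",
         let:{rId:"$relatedList.id"},
         pipeline:[
             // Keep only cities whose instructionList mentions this id
             // (assumes every city document has an instructionList array)
             {$match:{$expr:{$in:["$$rId", "$instructionList.related_id"]}}},
             {$unwind:"$instructionList"},
             // Keep only the matching instruction
             {$match:{$expr:{$eq:["$instructionList.related_id", "$$rId"]}}},
             {$project:{_id:0, instruction:"$instructionList"}}
         ],
         as: "lookedup"
     }},
     // Copy foo from the first looked-up instruction, if there is one
     {$addFields: {"relatedList.foo":{$arrayElemAt:["$lookedup.instruction.foo", 0]}}},
     {$group: {
                _id:"$_id",
                root: {$first:"$$ROOT"},
                relatedList:{$push:"$relatedList"}
     }},
     {$addFields:{"root.relatedList":"$relatedList"}},
     {$replaceRoot:{newRoot:"$root"}}
])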

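Some information about each stage:

$unwind duplicates the entire document for each element of the array, replacing the array with the single element.
$lookup can then consider each element separately. The stages in $lookup.pipeline:
  a. $match, so we only unwind documents that contain a matching ID
  b. $unwind the array, so we can consider individual elements
  c. repeat the $match, so we are left with only the matching elements (hopefully just 1)
$addFields assigns the foo field retrieved from the lookup to the relatedList element.
$group collects all documents with the same _id (i.e. those unwound from a single original document), stores the first as 'root', and pushes all of the relatedList elements back into an array.
$addFields moves relatedList into root.
$replaceRoot returns root, which should now be the original document with the matching foo added to each relatedList element.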
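The stages described above can also be traced in memory. The following is a minimal sketch in plain JavaScript, not MongoDB code: the sample documents, the string ids, and the helper name joinProduct are assumptions made for illustration, and the sketch only mirrors the unwind, lookup, and group steps loosely.

```javascript
// In-memory model of the stages above; plain JavaScript, no MongoDB driver involved.
// The sample documents and the helper name joinProduct are illustrative assumptions.
const product = {
  _id: 1,
  name: "TestProduct",
  relatedList: [{ id: "a" }, { id: "b" }],
};
const cities = [
  {
    name: "TestCity",
    instructionList: [
      { related_id: "b", foo: "y" },
      { related_id: "a", foo: "x" },
      { related_id: "c", foo: "z" },
    ],
  },
];

function joinProduct(productDoc, cityDocs) {
  // $unwind: one copy of the product per relatedList element.
  const unwound = productDoc.relatedList.map((related) => ({
    ...productDoc,
    relatedList: related,
  }));

  // $lookup + $addFields: find the matching instruction for each copy and copy its foo.
  const enriched = unwound.map((doc) => {
    const lookedup = cityDocs
      .flatMap((cityDoc) => cityDoc.instructionList)            // inner $unwind
      .filter((ins) => ins.related_id === doc.relatedList.id);  // inner $match
    const first = lookedup[0]; // first match, undefined when nothing matched
    const related =
      first === undefined ? doc.relatedList : { ...doc.relatedList, foo: first.foo };
    return { ...doc, relatedList: related };
  });

  // $group + $replaceRoot: push the elements back into one array on the original document.
  return { ...productDoc, relatedList: enriched.map((doc) => doc.relatedList) };
}

const joined = joinProduct(product, cities);
console.log(JSON.stringify(joined.relatedList));
// [{"id":"a","foo":"x"},{"id":"b","foo":"y"}]
```

Because the sketch walks relatedList in order, the output keeps the product's own element order regardless of how the city lists its instructions.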

Answer 2

"(the operation is picking each related object from the instructionList in the city document to put it into the relatedList of the product document)"

Given an example document in the cities collection:

{"_id": ObjectId("5e4a22a08c54c8e2380b853b"),
  "name": "TestCity",
  "instructionList": [
    {"related_id": "a", "foo": "x"},
    {"related_id": "b", "foo": "y"},
    {"related_id": "c", "foo": "z"}
]}

And an example document in the products collection:

{"_id": ObjectId("5e45cdd8e8d44a31a432a981"),
  "name": "TestProduct",
  "relatedList": [
    {"id": "a"},
    {"id": "b"}
]}

You can try the following aggregation pipeline:

db.products.aggregate([
    {"$lookup":{
        "from": "cities", 
        "let": { "rId": "$relatedList.id" }, 
        "pipeline": [
            {"$unwind":"$instructionList"},
            {"$match":{
                "$expr":{
                    "$in":["$instructionList.related_id", "$$rId"]
                }
            }
        }], 
        "as":"relatedList",
    }}, 
    {"$project":{
        "name":"$name",
        "relatedList":{
            "$map":{
                "input":"$relatedList",
                "as":"x",
                "in":{
                    "related_id":"$$x.instructionList.related_id",
                    "foo":"$$x.instructionList.foo"
                }                
            }
        }
    }}
]);

This produces the following result:

{  "_id": ObjectId("5e45cdd8e8d44a31a432a981"),
   "name": "TestProduct",
   "relatedList": [
          {"related_id": "a", "foo": "x"},
          {"related_id": "b", "foo": "y"}
]}

The above was tested on MongoDB v4.2.x.

"But it's not working, I'm a bit lost with this complex pipeline syntax."

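It is slightly complicated here because you have an array, relatedList, and an array of sub-documents, instructionList. When you reference instructionList.related_id with the $eq operator, the path resolves to an array of every related_id value, and $eq compares that whole array against $$rId instead of matching element by element, so the pipeline has no way to tell which one to match.

In the pipeline above, I added a $unwind stage to turn instructionList into multiple single documents. Afterward, $in expresses a match of a single instructionList.related_id value against the array of relatedList ids held in $$rId.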
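To illustrate the difference between comparing whole arrays and testing membership, here is a minimal sketch in plain JavaScript. It is a model of the comparison rather than MongoDB code, and the helper names sameArray and unwindAndMatch are assumptions introduced for this example.

```javascript
// Plain JavaScript model of the comparison, not MongoDB code.
// The helper names sameArray and unwindAndMatch are illustrative assumptions.
const rId = ["a", "b"]; // what "$relatedList.id" resolves to for the sample product
const city = {
  name: "TestCity",
  instructionList: [
    { related_id: "a", foo: "x" },
    { related_id: "b", foo: "y" },
    { related_id: "c", foo: "z" },
  ],
};

// $eq on the un-unwound field compares two whole arrays.
function sameArray(left, right) {
  return left.length === right.length && left.every((value, i) => value === right[i]);
}
const allRelatedIds = city.instructionList.map((ins) => ins.related_id);
const eqMatches = sameArray(allRelatedIds, rId); // ["a","b","c"] versus ["a","b"]

// $unwind followed by $in tests one related_id at a time for membership in rId.
function unwindAndMatch(cityDoc, ids) {
  return cityDoc.instructionList.filter((ins) => ids.includes(ins.related_id));
}
const matched = unwindAndMatch(city, rId);

console.log(eqMatches);               // false
console.log(JSON.stringify(matched)); // [{"related_id":"a","foo":"x"},{"related_id":"b","foo":"y"}]
```

Note that the matched elements come back in the order they appear in instructionList, which is the same behaviour to expect from the unwind-then-match approach.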
