MongoDB embedded documents: size limit and aggregation performance concerns

Question

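The MongoDB documentation recommends putting as much data as possible in a single document. It also recommends against ObjectId-ref-based subdocuments unless the subdocument data has to be referenced from more than one document.

In my case, I have a one-to-many relationship like this:

Log schema: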

const model = (mongoose) => {
    const LogSchema = new mongoose.Schema({
        result: { type: String, required: true },
        operation: { type: Date, required: true },
        x: { type: Number, required: true },
        y: { type: Number, required: true },
        z: { type: Number, required: true }
    });
    const model = mongoose.model("Log", LogSchema);
    return model;
};

Machine schema:

const model = (mongoose) => {
    const MachineSchema = new mongoose.Schema({
        model: { type: String, required: true },
        description: { type: String, required: true },
        // Every log is embedded here, so this array is what grows toward the 16 MB limit.
        logs: [ mongoose.model("Log").schema ]
    });
    const model = mongoose.model("Machine", MachineSchema);
    return model;
};
module.exports = model;

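Each machine will have many Production_Log documents (more than one million). With embedded documents, I hit the 16 MB per-document limit very quickly during testing, and I could no longer add Production_Log documents to the Machine document.

My doubts

In this case, should I use ObjectId references for the subdocuments instead of embedding them?
Is there any other solution I could evaluate?
I will access the Production_Log documents with the aggregation framework to generate statistics for each Machine. Are there any additional schema design considerations I should take into account?

Thanks a lot for any suggestions!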

Answer 1

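Please see whether this approach fits your needs.

The Log collection will keep generating data, whereas a Machine document will never grow beyond 16 MB. So rather than embedding the Log documents in the Machine document, do the reverse and embed the Machine in each Log.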
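A rough back-of-the-envelope check shows why the embedded logs array hits the limit so quickly. The sketch below assumes an average of about 100 bytes per embedded log entry, which is only a guess for a small subdocument with an _id, one short string, one date, and three numbers. The function maxEmbeddedLogs and its parameter names are hypothetical and exist only for this illustration.

```javascript
// MongoDB's maximum BSON document size: 16 MiB.
const MAX_BSON_DOCUMENT_BYTES = 16 * 1024 * 1024;

// Estimate how many embedded subdocuments fit in one parent document.
// avgSubdocBytes: ASSUMED average size of one embedded log entry.
// parentOverheadBytes: ASSUMED size of the parent's other fields.
function maxEmbeddedLogs(avgSubdocBytes, parentOverheadBytes = 0) {
    if (!(avgSubdocBytes > 0)) {
        throw new RangeError("avgSubdocBytes must be a positive number");
    }
    const available = MAX_BSON_DOCUMENT_BYTES - parentOverheadBytes;
    return Math.max(0, Math.floor(available / avgSubdocBytes));
}

console.log(maxEmbeddedLogs(100)); // 167772
```

Under that assumption a single Machine document holds roughly 168 thousand logs, far short of the million or more mentioned in the question, so the array has to live outside the Machine document.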

Your modified schemas would look like this:

Machine schema:

const model = (mongoose) => {
    const MachineSchema = new mongoose.Schema({
        model: { type: String, required: true },
        description: { type: String, required: true }
    });
    const model = mongoose.model("Machine", MachineSchema);
    return model;
};
module.exports = model;

Log schema:

const model = (mongoose) => {
    const LogSchema = new mongoose.Schema({
        result: { type: String, required: true },
        operation: { type: Date, required: true },
        x: { type: Number, required: true },
        y: { type: Number, required: true },
        z: { type: Number, required: true },
        machine: [ mongoose.model("Machine").schema ]
    });
    const model = mongoose.model("Log", LogSchema);
    return model;
};

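If we still exceed the document size limit (16 MB), then in the Log schema we can create a new document for each day/hour/week, depending on the volume of logs we generate.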
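One way to read the per-day or per-hour suggestion is as time bucketing: each machine gets one document per time window, and new logs are pushed into the bucket for their window. The sketch below only builds the arguments for such an upsert. The field names machine, bucketStart, logs, and count, and the helper names, are assumptions made for illustration and do not come from the schemas above. Weekly buckets are left out to keep it short.

```javascript
// Truncate a timestamp to the start of its UTC hour or day.
function bucketStart(date, granularity) {
    const d = new Date(date.getTime()); // copy so the caller's Date is not mutated
    if (granularity === "hour") {
        d.setUTCMinutes(0, 0, 0);
    } else if (granularity === "day") {
        d.setUTCHours(0, 0, 0, 0);
    } else {
        throw new RangeError(`unsupported granularity: ${granularity}`);
    }
    return d;
}

// Build filter/update/options for an upsert that appends one log
// to the bucket document for (machine, time window).
function buildBucketUpsert(machineId, log, granularity = "day") {
    return {
        filter: {
            machine: machineId,
            bucketStart: bucketStart(log.operation, granularity)
        },
        update: { $push: { logs: log }, $inc: { count: 1 } },
        options: { upsert: true } // create the bucket on first write
    };
}
```

With a hypothetical Mongoose model named LogBucket, the result could be passed along as LogBucket.updateOne(op.filter, op.update, op.options). Choosing a smaller window keeps each bucket document further from the 16 MB limit at the cost of more documents.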

Answer 2

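Database normalization does not apply to MongoDB

MongoDB scales better when the complete information is stored in a single document (accepting data redundancy). Database normalization requires splitting the data across different collections, but once the data grows, that becomes a bottleneck.

Use only the Log schema: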

const model = (mongoose) => {
    const LogSchema = new mongoose.Schema({
        model: { type: String, required: true },
        description: { type: String, required: true },
        result: { type: String, required: true },
        operation: { type: Date, required: true },
        x: { type: Number, required: true },
        y: { type: Number, required: true },
        z: { type: Number, required: true }
    });
    const model = mongoose.model("Log", LogSchema);
    return model;
};

Read and write operations scale well this way.

Using aggregation, you can process the data and compute the results you need.
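As a minimal sketch of what such per-machine statistics could look like with the flat Log schema above, the function below builds an aggregation pipeline that filters by a time range on operation and groups by the model field. The function name, the output field names (logCount, avgX, avgY, avgZ), and the choice of averages are assumptions, since the question does not say which statistics are needed.

```javascript
// Build a pipeline that summarizes logs per machine model
// for operations in the half-open range [from, to).
function buildMachineStatsPipeline(from, to) {
    return [
        // Filter first so later stages see fewer documents.
        { $match: { operation: { $gte: from, $lt: to } } },
        // One output document per machine model.
        {
            $group: {
                _id: "$model",
                logCount: { $sum: 1 },
                avgX: { $avg: "$x" },
                avgY: { $avg: "$y" },
                avgZ: { $avg: "$z" }
            }
        },
        // Stable, readable output order.
        { $sort: { _id: 1 } }
    ];
}
```

With the Mongoose model from this answer, the pipeline could be run as Log.aggregate(buildMachineStatsPipeline(from, to)). An index on operation would typically let the $match stage avoid scanning the whole collection.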
