How to handle large data sets in MongoDB

Question

I need help deciding which schema type is more suitable for my MongoDB collection.

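Say I want to store a list of the things a person owns. There will be relatively few people, but one person can own many things. Let's say people are counted in the hundreds, but the things one person owns are counted in the hundreds of thousands.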

I can think of two options:

Option 1:

    [{
        id: 1,
        name: "Tom",
        things: [
            {
                name: 'red tie',
                weight: 0.3,
                value: 5
            },
            {
                name: 'carpet',
                weight: 15,
                value: 700
            } //... and 300'000 other things 
        ]
    },
    {
        id: 2,
        name: "Rob",
        things: [
            {
                name: 'can of olives',
                weight: 0.4,
                value: 2
            },
            {
                name: 'Porsche',
                weight: 1500,
                value: 40000
            }// and 170'000 other things
        ]
    } // and 214 other people
    ]

Option 2:

    [
        {
            name: 'red tie',
            weight: 0.3,
            value: 5,
            owner: {
                name: 'Tom',
                id: 1
            }
        },
        {
            name: 'carpet',
            weight: 15,
            value: 700,
            owner: {
                name: 'Tom',
                id: 1
            }
        },
        {
            name: 'can of olives',
            weight: 0.4,
            value: 2,
            owner: {
                name: 'Rob',
                id: 2
            }
        },
        {
            name: 'Porsche',
            weight: 1500,
            value: 40000,
            owner: {
                name: 'Rob',
                id: 2
            }
        } // and 20'000'000 other things
    ];

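1. I will only ask for things from one owner in a single request and never ask for things from multiple owners.
2. I will need pagination for the returned list of things, so...
3. ...the things will need to be sorted by one of the parameters.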

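As I understand it, the first point suggests that Option 1 would be more efficient (querying only a few hundred documents instead of millions), but points 2 and 3 are easier to handle with Option 2 (the limit, skip, and sort methods instead of the $slice projection and the aggregation framework).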

Can anyone tell me which way is more suitable? Or have I got something wrong, and is there an even better solution?

Answer 1

I will only ask for things from one owner in a single request and never ask for things from multiple owners.

I will need a pagination for the returned list of things so...

things will need to be sorted by one of the parameters

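Your requirements 2 and 3 are better met by creating a collection in which each item is a separate document. With arrays, you would have to use the aggregation framework to $unwind the array, which can become quite slow. Your first requirement can easily be optimized by creating an index on the owner.name or owner.id field of that collection, depending on which of the two you use for querying.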
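To make the comparison concrete, here is a minimal sketch of both access patterns. It assumes a Node.js-driver-style collection object for Option 2; the helper names (`ensureOwnerIndex`, `findThingsPage`, `embeddedThingsPipeline`) and the zero-based `page` parameter are hypothetical and not part of the question.

```javascript
// Option 2: one document per thing.
// Hypothetical helper: creates the index on owner.id suggested above.
// `things` is assumed to be a driver collection exposing createIndex().
function ensureOwnerIndex(things) {
  return things.createIndex({ 'owner.id': 1 });
}

// Hypothetical helper: builds a cursor for one page of a single owner's
// things, sorted ascending by `sortField`. `page` is zero-based (assumption).
function findThingsPage(things, ownerId, sortField, page, pageSize) {
  return things
    .find({ 'owner.id': ownerId })   // requirement 1: exactly one owner
    .sort({ [sortField]: 1 })        // requirement 3: sort by one parameter
    .skip(page * pageSize)           // requirement 2: pagination
    .limit(pageSize);
}

// Option 1, for comparison: the same page has to be produced by an
// aggregation pipeline that unwinds the embedded array first.
function embeddedThingsPipeline(ownerId, sortField, page, pageSize) {
  return [
    { $match: { id: ownerId } },
    { $unwind: '$things' },                      // one output doc per array entry
    { $sort: { ['things.' + sortField]: 1 } },
    { $skip: page * pageSize },
    { $limit: pageSize },
  ];
}
```

With the Node.js driver, the first page of Tom's things sorted by value could then be fetched with something like `await findThingsPage(db.collection('things'), 1, 'value', 0, 50).toArray()`, where the collection name `things` and the page size are again assumptions.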

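Also, MongoDB does not handle growing documents well. To discourage users from creating documents that grow indefinitely, MongoDB limits each document to 16MB. If each of your items is a few hundred bytes, a few hundred thousand array entries will exceed that limit.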
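The size argument can be checked with back-of-the-envelope arithmetic. The sketch below assumes 200 bytes per item as a stand-in for "a few hundred bytes"; the real BSON size of an item depends on its fields, and the function names are hypothetical.

```javascript
// MongoDB's maximum document size: 16 MiB.
const MAX_DOCUMENT_BYTES = 16 * 1024 * 1024; // 16,777,216 bytes

// Rough size of an embedded array, ignoring per-document overhead.
function estimateArrayBytes(itemCount, bytesPerItem) {
  return itemCount * bytesPerItem;
}

// True if the embedded array alone would already exceed the limit.
function exceedsDocumentLimit(itemCount, bytesPerItem) {
  return estimateArrayBytes(itemCount, bytesPerItem) > MAX_DOCUMENT_BYTES;
}

// Largest number of items of a given size that fits under the limit.
function maxItemsWithinLimit(bytesPerItem) {
  return Math.floor(MAX_DOCUMENT_BYTES / bytesPerItem);
}

// Tom's 300,000 things at an assumed 200 bytes each:
// 300,000 * 200 = 60,000,000 bytes, well above 16,777,216.
const tomExceedsLimit = exceedsDocumentLimit(300000, 200); // true
```

Under that assumption, a single owner document in Option 1 would need to hold several times more data than MongoDB allows.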

database

database-design

mongodb

nosql