MongoDB, PyMongo 如何根据 uniques 字段的计数过滤结果?
MongoDB, PyMongo how to filter results by count of uniques field?
MongoDb包含下一组数据
[{"user": "a", "domain": "some.com"},
{"user": "b", "domain": "some.com"},
{"user": "b1", "domain": "some.com"},
{"user": "c", "domain": "test.com"},
{"user": "d", "domain": "work.com"},
{"user": "aaa", "domain": "work.com"},
{"user": "some user", "domain": "work.com"} ]
我需要 select 按域过滤的第一个项目,结果不超过 2 个相同的域。
mongo 后查询结果应该类似于
[{"user": "a", "domain": "some.com"},
{"user": "b", "domain": "some.com"},
{"user": "c", "domain": "test.com"},
{"user": "d", "domain": "work.com"},
{"user": "aaa", "domain": "work.com"}]
只有 2 个具有相同域的结果,必须跳过其他具有相同域的结果。这可能与 $aggregation、$filter 或其他东西有关吗?
是否可以按域分组并仅获取前 N(示例中为 2)个用户数据?示例:
[{"domain": "some.com", "users": [a, b]}]
所以
{"user": "b1", "domain": "some.com"} will be skip
执行 MongoDB 聚合可能会得到想要的结果。
它分为四个阶段:
1、我们按domain
字段分组,累计成data
个同域名的文档
2. 然后,我们拼接数组以设置每个域最多 2 个项目
3. 我们用 $unwind
operator
展平 data
字段
4.我们return原始文件结构用$replaceRoot
运算符
db.collection.aggregate([
{
"$group": {
"_id": "$domain",
"data": { "$push": "$$ROOT" }
}
},
{
"$addFields": {
"data": {
"$slice": [ "$data", 0, 2 ]
}
}
},
{
"$unwind": "$data"
},
{
$replaceRoot: { "newRoot": "$data" }
}
])
MongoDb包含下一组数据
[{"user": "a", "domain": "some.com"},
{"user": "b", "domain": "some.com"},
{"user": "b1", "domain": "some.com"},
{"user": "c", "domain": "test.com"},
{"user": "d", "domain": "work.com"},
{"user": "aaa", "domain": "work.com"},
{"user": "some user", "domain": "work.com"} ]
我需要 select 按域过滤的第一个项目,结果不超过 2 个相同的域。 mongo 后查询结果应该类似于
[{"user": "a", "domain": "some.com"},
{"user": "b", "domain": "some.com"},
{"user": "c", "domain": "test.com"},
{"user": "d", "domain": "work.com"},
{"user": "aaa", "domain": "work.com"}]
只有 2 个具有相同域的结果,必须跳过其他具有相同域的结果。这可能与 $aggregation、$filter 或其他东西有关吗?
是否可以按域分组并仅获取前 N(示例中为 2)个用户数据?示例:
[{"domain": "some.com", "users": [a, b]}]
所以
{"user": "b1", "domain": "some.com"} will be skip
执行 MongoDB 聚合可能会得到想要的结果。
它分为四个阶段:
1、我们按domain
字段分组,累计成data
个同域名的文档
2. 然后,我们拼接数组以设置每个域最多 2 个项目
3. 我们用 $unwind
operator
展平 data
字段
4.我们return原始文件结构用$replaceRoot
运算符
db.collection.aggregate([
{
"$group": {
"_id": "$domain",
"data": { "$push": "$$ROOT" }
}
},
{
"$addFields": {
"data": {
"$slice": [ "$data", 0, 2 ]
}
}
},
{
"$unwind": "$data"
},
{
$replaceRoot: { "newRoot": "$data" }
}
])