Pymongo and Aggregation framework query
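I am trying to write an aggregation query in pymongo, but I cannot work it out on my own. I have tried the documentation, Google, and Stack Overflow, but could not find an answer. A sample of my data: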
{"_id": id, "site": "site A", "weekday": 1, "value": 1}
{"_id": id, "site": "site B", "weekday": 2, "value": 0}
{"_id": id, "site": "site C", "weekday": 3, "value": 1}
{"_id": id, "site": "site A", "weekday": 2, "value": 0}
{"_id": id, "site": "site B", "weekday": 3, "value": -1}
{"_id": id, "site": "site C", "weekday": 2, "value": 1}
{"_id": id, "site": "site A", "weekday": 1, "value": -1}
{"_id": id, "site": "site B", "weekday": 3, "value": 1}
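What I need is:
For a single site, say "site A", I need a list of dictionaries, one per weekday (7 in total), each holding the counts of "value" entries that are greater than 0, equal to 0, and less than 0, all sorted by weekday.
So my output should look like this: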
{"weekday": 1, "greaterCount": x, "lesserCount": y, "zeroCount": z}
{"weekday": 2, "greaterCount": x, "lesserCount": y, "zeroCount": z}
{"weekday": 3, "greaterCount": x, "lesserCount": y, "zeroCount": z}
{"weekday": 4, "greaterCount": x, "lesserCount": y, "zeroCount": z}
{"weekday": 5, "greaterCount": x, "lesserCount": y, "zeroCount": z}
{"weekday": 6, "greaterCount": x, "lesserCount": y, "zeroCount": z}
{"weekday": 7, "greaterCount": x, "lesserCount": y, "zeroCount": z}
Of course, the values of greaterCount, lesserCount and zeroCount will differ from weekday to weekday; every dictionary in my example output shows x, y and z only because I am lazy.
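What you are basically looking for here is the $cond operator. It is a "ternary" condition: it evaluates a logical condition and returns one value when the condition is true and another when it is false.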
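As a rough plain-Python analogy (not MongoDB code), the sketch below mimics what summing a $cond result per weekday does. The helper names `cond` and `count_by_weekday` are made up for illustration, and the documents are the question's sample with the "_id" field left out.

```python
from collections import defaultdict

# Sample documents from the question ("_id" omitted for brevity).
docs = [
    {"site": "site A", "weekday": 1, "value": 1},
    {"site": "site B", "weekday": 2, "value": 0},
    {"site": "site C", "weekday": 3, "value": 1},
    {"site": "site A", "weekday": 2, "value": 0},
    {"site": "site B", "weekday": 3, "value": -1},
    {"site": "site C", "weekday": 2, "value": 1},
    {"site": "site A", "weekday": 1, "value": -1},
    {"site": "site B", "weekday": 3, "value": 1},
]


def cond(test, if_true, if_false):
    """Hypothetical stand-in for $cond: pick one of two values from a boolean."""
    return if_true if test else if_false


def count_by_weekday(documents):
    """Group by weekday and add 1 or 0 per document, like $sum over $cond."""
    counts = defaultdict(
        lambda: {"greaterCount": 0, "lesserCount": 0, "zeroCount": 0}
    )
    for doc in documents:
        bucket = counts[doc["weekday"]]
        bucket["greaterCount"] += cond(doc["value"] > 0, 1, 0)
        bucket["lesserCount"] += cond(doc["value"] < 0, 1, 0)
        bucket["zeroCount"] += cond(doc["value"] == 0, 1, 0)
    return dict(counts)
```

Each document contributes exactly 1 to one of the three counters and 0 to the other two, which is why summing the ternary result amounts to a conditional count.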
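In this case, each logical test looks at the current "value" field. For $gt, the test decides whether to pass 1 or 0 to $sum, so $sum ends up counting the documents for which the condition holds. The same pattern applies with $lt and $eq in place of $gt: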
db.collection.aggregate([
    { "$group": {
        "_id": "$weekday",
        "greaterCount": {
            "$sum": {
                "$cond": [
                    { "$gt": [ "$value", 0 ] },
                    1,
                    0
                ]
            }
        },
        "lesserCount": {
            "$sum": {
                "$cond": [
                    { "$lt": [ "$value", 0 ] },
                    1,
                    0
                ]
            }
        },
        "zeroCount": {
            "$sum": {
                "$cond": [
                    { "$eq": [ "$value", 0 ] },
                    1,
                    0
                ]
            }
        }
    }}
])
On the sample data this produces:
{ "_id" : 3, "greaterCount" : 2, "lesserCount" : 1, "zeroCount" : 0 }
{ "_id" : 2, "greaterCount" : 1, "lesserCount" : 0, "zeroCount" : 2 }
{ "_id" : 1, "greaterCount" : 1, "lesserCount" : 1, "zeroCount" : 0 }
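The result above covers all sites and comes back in no particular order, whereas the question asks for a single site, sorted by weekday, with all seven weekdays present. A minimal sketch of how that might look from pymongo follows. It assumes weekdays are numbered 1 to 7 as in the question, and the helper names `build_pipeline` and `weekday_counts` are hypothetical. Because $group only emits weekdays that actually occur in the data, the missing ones are zero-filled on the Python side.

```python
def build_pipeline(site):
    """Build the aggregation pipeline for one site, sorted by weekday."""

    def count_if(operator):
        # $sum adds 1 for each document where the comparison holds, else 0.
        return {"$sum": {"$cond": [{operator: ["$value", 0]}, 1, 0]}}

    return [
        {"$match": {"site": site}},       # keep only the requested site
        {"$group": {
            "_id": "$weekday",
            "greaterCount": count_if("$gt"),
            "lesserCount": count_if("$lt"),
            "zeroCount": count_if("$eq"),
        }},
        {"$sort": {"_id": 1}},            # ascending by weekday
    ]


def weekday_counts(collection, site):
    """Return seven dicts (weekdays 1-7), zero-filled where no data exists."""
    found = {row["_id"]: row for row in collection.aggregate(build_pipeline(site))}
    result = []
    for weekday in range(1, 8):  # assumption: weekdays are numbered 1..7
        row = found.get(weekday, {})
        result.append({
            "weekday": weekday,
            "greaterCount": row.get("greaterCount", 0),
            "lesserCount": row.get("lesserCount", 0),
            "zeroCount": row.get("zeroCount", 0),
        })
    return result
```

Here `collection` would be a pymongo collection object, for example `MongoClient()["mydb"]["mycollection"]`, where the database and collection names are placeholders. Calling `weekday_counts(collection, "site A")` should then give a list in the shape the question asks for.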