Conditionally Count in dplyr
I have some member order data that I want to aggregate by order week.
The data looks like this:
memberorders <- data.frame(MemID = c('A','A','B','B','B','C','C','D'),
                           week  = c(1,2,1,4,5,1,4,1),
                           value = c(10,20,10,10,2,5,30,3))
I am using dplyr to group_by MemID and sum "value" for week<=2 and week<=4 (to see how much each member ordered in weeks 1-2 and weeks 1-4). My current code is:
MemberLTV <- memberorders %>%
  group_by(MemID) %>%
  summarize(
    sum2 = sum(value[week<=2]),
    sum4 = sum(value[week<=4]))
I am now trying to add two fields to the summary, count2 and count4, which would count the number of rows matching each condition (week<=2 and week<=4).
The desired output is:
output <- data.frame(MemID  = c('A','B','C','D'),
                     sum2   = c(30,10,5,3),
                     sum4   = c(30,20,35,3),
                     count2 = c(2,1,1,1),
                     count4 = c(2,2,2,1))
I'm guessing this is just a small tweak to the sum function, but I'm having trouble figuring it out.
This can be done with the plyr package:
library(plyr)
ddply(memberorders, .(MemID),
      summarise,
      val1 = sum(value[week<=2]),
      val2 = sum(value[week<=4]),
      val3 = length(value[week<=2]),
      val4 = length(value[week<=4]))
  MemID val1 val2 val3 val4
1     A   30   30    2    2
2     B   10   20    1    2
3     C    5   35    1    2
4     D    3    3    1    1
Try:
library(dplyr)
memberorders %>%
  group_by(MemID) %>%
  summarise(sum2 = sum(value[week<=2]), sum4 = sum(value[week<=4]),
            count2 = sum(week<=2), count4 = sum(week<=4))
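The reason sum(week<=2) works as a count is that a comparison returns a logical vector, and sum() treats TRUE as 1 and FALSE as 0. The sketch below uses only base R to show this on member B's weeks from the question; the vector with a missing value (week_na) is a made-up illustration and not part of the original data.

```r
# Weeks for member B in the question's data
week <- c(1, 4, 5)

# The comparison gives a logical vector: TRUE TRUE FALSE
is_early <- week <= 4

# sum() coerces TRUE/FALSE to 1/0, so this counts the matching rows
count_by_sum <- sum(is_early)

# Subsetting and taking the length gives the same count here
count_by_length <- length(week[is_early])

# Hypothetical vector with a missing week, to show where the two differ
week_na <- c(1, NA, 5)

# sum() needs na.rm = TRUE, otherwise the result is NA
count_na_sum <- sum(week_na <= 4, na.rm = TRUE)

# Logical subsetting keeps an NA element, so length() counts it too
count_na_length <- length(week_na[week_na <= 4])
```

With no missing values the two approaches agree. If week can be NA, they diverge: the sum() form with na.rm = TRUE ignores the unknown row, while the length() form counts it.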
Combining the two previous ideas and keeping the style consistent:
library(tidyverse)
MemberLTV_2 <- memberorders %>%
  group_by(MemID) %>%
  summarize(
    count2 = length(value[week<=2]),
    count4 = length(value[week<=4]),
    sum2 = sum(value[week<=2]),
    sum4 = sum(value[week<=4])
  )
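As a quick sanity check, the summarised result can be compared column by column against the desired output from the question. This is only a sketch, assuming dplyr is installed; the name result is introduced here for illustration.

```r
library(dplyr)

# Data from the question
memberorders <- data.frame(MemID = c('A','A','B','B','B','C','C','D'),
                           week  = c(1,2,1,4,5,1,4,1),
                           value = c(10,20,10,10,2,5,30,3))

# Conditional sums and counts per member
result <- memberorders %>%
  group_by(MemID) %>%
  summarise(sum2   = sum(value[week <= 2]),
            sum4   = sum(value[week <= 4]),
            count2 = sum(week <= 2),
            count4 = sum(week <= 4))

# Compare each column with the desired output given in the question;
# stopifnot() raises an error if any comparison fails
stopifnot(all(result$sum2   == c(30, 10, 5, 3)),
          all(result$sum4   == c(30, 20, 35, 3)),
          all(result$count2 == c(2, 1, 1, 1)),
          all(result$count4 == c(2, 2, 2, 1)))
```

If the script runs without an error, the counts and sums match the desired output for all four members.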