Calculate top % of variable according to condition
The sample data are structured as follows:
Individ <- data.frame(Participant = c("Bill", "Bill", "Bill", "Bill", "Bill", "Bill", "Bill", "Bill", "Bill", "Bill", "Bill", "Bill",
"Harry", "Harry", "Harry", "Harry", "Harry", "Harry", "Harry", "Harry", "Harry", "Harry", "Harry", "Harry"),
Time = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12),
Power = c(400, 250, 180, 500, 300, 450, 600, 512, 300, 500, 450, 200, 402, 210, 130, 520, 310, 451, 608, 582, 390, 570, NA, NA))
I have calculated two-, three-, and four-second rolling averages of Power. I know I can compute each rolling average separately for each Participant by doing the following:
library(zoo)  # provides rollapply()
Individ$TwoSec <- ave(Individ$Power, Individ$Participant,
    FUN = function(x) rollapply(x, 2, mean, na.rm = TRUE, fill = NA))
Individ$ThreeSec <- ave(Individ$Power, Individ$Participant,
    FUN = function(x) rollapply(x, 3, mean, na.rm = TRUE, fill = NA))
Individ$FourSec <- ave(Individ$Power, Individ$Participant,
    FUN = function(x) rollapply(x, 4, mean, na.rm = TRUE, fill = NA))
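I now want to find the top 5% of Power for each rolling average (TwoSec, ThreeSec, and FourSec). How can I do this calculation while accounting for changes in Participant?
My actual data.frame has over 3 million rows, so a fast solution would be ideal.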
We can try data.table together with RcppRoll:
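library(data.table)
library(RcppRoll)
# For each Participant, loop over the window widths 2, 3 and 4.
# The unnamed result columns V1, V2, V3 correspond to those widths.
setDT(Individ)[, lapply(2:4, function(n) {
    r1 <- roll_mean(Power, n, fill = NA)   # rolling mean, NA-padded
    r2 <- r1[order(-r1)]                   # sort descending; NAs go last
    r2[seq_len(ceiling(.N * 0.05))]        # keep the top 5% of values
}), by = Participant]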
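As a variation on the idea above, here is a minimal sketch that uses only data.table (its `frollmean()` function) and names the output columns after the question's TwoSec, ThreeSec and FourSec. The helper `top_share()` and the `windows` vector are hypothetical names introduced for illustration, and the sample data are rebuilt so the block runs on its own. `frollmean()` is right-aligned by default, which shifts where the NA padding falls but does not change the set of rolling means being ranked.

```r
library(data.table)

# Sample data from the question, rebuilt compactly.
Individ <- data.table(
  Participant = rep(c("Bill", "Harry"), each = 12),
  Time        = rep(1:12, times = 2),
  Power       = c(400, 250, 180, 500, 300, 450, 600, 512, 300, 500, 450, 200,
                  402, 210, 130, 520, 310, 451, 608, 582, 390, 570, NA, NA)
)

# Hypothetical helper: the largest `share` of the n-point rolling means of x.
top_share <- function(x, n, share = 0.05) {
  r <- frollmean(x, n)                 # rolling mean, NA where the window is incomplete
  k <- ceiling(length(x) * share)      # number of values to keep (at least 1)
  # Sort descending and keep NAs at the end, so the result always has length k.
  sort(r, decreasing = TRUE, na.last = TRUE)[seq_len(k)]
}

# Assumed mapping from the question's column names to window widths.
windows <- c(TwoSec = 2, ThreeSec = 3, FourSec = 4)

top5 <- Individ[, lapply(windows, function(n) top_share(Power, n)),
                by = Participant]
print(top5)
```

With only 12 rows per participant, `ceiling(12 * 0.05)` is 1, so each participant contributes a single value per window here. On the full data each group would return roughly 5% of its rows.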