Summarise to get the mean and first value by group in data.table (R)
I am trying to do some aggregation with data.table to get the mean of some columns and the first value of another. For example:
library(data.table)
dt <- data.table(mtcars)
# mean of disp and hp within each cyl group
dt[, .(disp = mean(disp, na.rm = TRUE),
       hp = mean(hp, na.rm = TRUE)),
   by = cyl]
out:
cyl disp hp
1: 6 183.3143 122.28571
2: 4 105.1364 82.63636
3: 8 353.1000 209.21429
desired:
cyl disp hp wt
1: 6 183.3143 122.28571 2.62
2: 4 105.1364 82.63636 2.32
3: 8 353.1000 209.21429 3.44
To extract the first row of each group on its own, I can do:
dt[, .SD[1], by=cyl][,.(cyl,wt)]
out:
cyl wt
1: 6 2.62
2: 4 2.32
3: 8 3.44
But how can I do this together with the other aggregation functions?
In dplyr, I would simply do:
library(dplyr)
mtcars %>% group_by(cyl) %>%
  summarise(disp = mean(disp, na.rm = TRUE),
            hp = mean(hp, na.rm = TRUE),
            wt = first(wt))
out:
cyl disp hp wt
1 4 105.1364 82.63636 2.32
2 6 183.3143 122.28571 2.62
3 8 353.1000 209.21429 3.44
data.table also has a first() function:
library(data.table)
dt[, .(disp = mean(disp, na.rm = TRUE),
       hp = mean(hp, na.rm = TRUE),
       wt = data.table::first(wt)),
   by = cyl]
cyl disp hp wt
1: 6 183.3143 122.28571 2.62
2: 4 105.1364 82.63636 2.32
3: 8 353.1000 209.21429 3.44
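If there are many columns to average, the same idea can be written without naming each one. The following is only a sketch under a few assumptions: the vector name mean_cols and the result name res are made up for illustration, and wt[1L] is used as a base R stand-in for data.table::first(wt).

```r
library(data.table)

dt <- as.data.table(mtcars)

# Hypothetical helper vector: the columns to average within each group
mean_cols <- c("disp", "hp")

# .SD holds only the columns named in .SDcols, so lapply() averages those;
# wt is still reachable by name in j, and wt[1L] takes the first value per group
res <- dt[, c(lapply(.SD, mean, na.rm = TRUE), list(wt = wt[1L])),
          by = cyl, .SDcols = mean_cols]

print(res)
```

This should return the same cyl, disp, hp and wt columns as the call above. Groups come back in order of first appearance in the data; use keyby = cyl instead of by = cyl if a result sorted by cyl is preferred.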