Summarizing columns using a vector with dplyr
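I want to compute the mean of certain columns (whose names are stored in a vector) while grouping by another column. Here is a reproducible example: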
Cities <- c("London","New_York")
df <- data.frame(Grade = c(rep("Bad",2),rep("Average",4),rep("Good",4)),
London = seq(1,10,1),
New_York = seq(11,20,1),
Shanghai = seq(21,30,1))
> df
Grade London New_York Shanghai
1 Bad 1 11 21
2 Bad 2 12 22
3 Average 3 13 23
4 Average 4 14 24
5 Average 5 15 25
6 Average 6 16 26
7 Good 7 17 27
8 Good 8 18 28
9 Good 9 19 29
10 Good 10 20 30
My desired output:
> df %>% group_by(Grade) %>% summarise(London = mean(London), New_York = mean(New_York))
# A tibble: 3 x 3
Grade London New_York
<fct> <dbl> <dbl>
1 Average 4.5 14.5
2 Bad 1.5 11.5
3 Good 8.5 18.5
I want to select the elements of the vector Cities inside summarise (without typing out their names), while keeping the original names they have in the vector.
You can do it like this:
df %>%
group_by(Grade) %>%
summarise_at(vars(one_of(Cities)), mean)
Grade London New_York
<fct> <dbl> <dbl>
1 Average 4.5 14.5
2 Bad 1.5 11.5
3 Good 8.5 18.5
From the documentation:
one_of(): Matches variable names in a character vector.
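As a side note on selecting by a character vector: in tidyselect 1.0.0 and later, one_of() is superseded by all_of() and any_of(), which differ in how they treat names that are not in the data. The sketch below is only an illustration; the vector wanted and the column name "Paris" are hypothetical and not part of the question.

```r
library(dplyr)

df <- data.frame(Grade = c(rep("Bad", 2), rep("Average", 4), rep("Good", 4)),
                 London = seq(1, 10, 1),
                 New_York = seq(11, 20, 1),
                 Shanghai = seq(21, 30, 1))

# Hypothetical vector containing one name that is not a column of df
wanted <- c("London", "New_York", "Paris")

# any_of() silently skips names that are missing from the data
lenient <- df %>% select(any_of(wanted))

# all_of() is strict: a missing name is an error, captured here with try()
strict <- try(df %>% select(all_of(wanted)), silent = TRUE)
```

any_of() is convenient when the vector may contain names that are absent from some data frames; all_of() is the safer choice when a missing column should be treated as a mistake.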
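vars can take a vector of column names directly. The select-helpers (matches, starts_with, ends_with) are for cases where we want to match some pattern. With the current implementation, vars is more general: it can select columns and also deselect them (with -).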
library(dplyr)
df %>%
group_by(Grade) %>%
summarise_at(vars(Cities), mean)
# A tibble: 3 x 3
# Grade London New_York
# <fct> <dbl> <dbl>
#1 Average 4.5 14.5
#2 Bad 1.5 11.5
#3 Good 8.5 18.5
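For readers on dplyr 1.0.0 or later, where summarise_at() is superseded, the same result can be sketched with across() and all_of(). This is a minimal sketch that assumes the df and Cities objects from the question; the name res is introduced here only for illustration.

```r
library(dplyr)

Cities <- c("London", "New_York")
df <- data.frame(Grade = c(rep("Bad", 2), rep("Average", 4), rep("Good", 4)),
                 London = seq(1, 10, 1),
                 New_York = seq(11, 20, 1),
                 Shanghai = seq(21, 30, 1))

# across() applies mean to every column named in Cities;
# all_of() marks Cities as an external character vector of column names
res <- df %>%
  group_by(Grade) %>%
  summarise(across(all_of(Cities), mean))

res
```

The output columns keep the names stored in Cities, and Shanghai is dropped because it is neither a grouping column nor one of the selected columns.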