使用 mutate_at 和 which.max 对数据框的选定列进行操作
Using mutate_at and which.max to operate on selected columns of a data frame
我正在尝试结合使用 mutate_at
和 which.max
来操作数据框,如下所述。
#This is basically what I want to achieve
df_want <- iris %>% group_by(Species) %>% mutate(Sepal.Length = Sepal.Length[which.max(Petal.Width)],
Sepal.Width = Sepal.Width[which.max(Petal.Width)])
#Here is my attempt at a smarter solution, but it does not work
df_attempt <- iris %>% group_by(Species) %>% mutate_at(c("Sepal.Length", "Sepal.Width"), function(x) x[which.max("Petal.Width")])
#However, this works
df_test <- iris %>% group_by(Species) %>% mutate_at(c("Sepal.Length", "Sepal.Width"), function(x) x + 100)
生成 df_attempt
的代码无效。我收到以下错误消息:
Error in mutate_impl(.data, dots) :
Column `Sepal.Length` must be length 50 (the group size) or one, not 0
有什么想法可以在仍然使用 mutate_at
的同时解决这个问题吗?
标准的 dplyr
方法是:
df_want <- iris %>%
group_by(Species) %>%
mutate(Sepal.Length = Sepal.Length[which.max(Petal.Width)],
Sepal.Width = Sepal.Width[which.max(Petal.Width)])
df_attempt <- iris %>%
group_by(Species) %>%
mutate_at(vars(Sepal.Length, Sepal.Width), funs(.[which.max(Petal.Width)]))
结果:
# A tibble: 150 x 5
# Groups: Species [3]
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
<dbl> <dbl> <dbl> <dbl> <fctr>
1 5 3.5 1.4 0.2 setosa
2 5 3.5 1.4 0.2 setosa
3 5 3.5 1.3 0.2 setosa
4 5 3.5 1.5 0.2 setosa
5 5 3.5 1.4 0.2 setosa
6 5 3.5 1.7 0.4 setosa
7 5 3.5 1.4 0.3 setosa
8 5 3.5 1.5 0.2 setosa
9 5 3.5 1.4 0.2 setosa
10 5 3.5 1.5 0.1 setosa
# ... with 140 more rows
> identical(df_want, df_attempt)
[1] TRUE
注:
使用 vars
您可以使用 NSE 引用变量。
使用 funs
你可以用 .
引用每一列,相当于 function(x) x
我正在尝试结合使用 mutate_at
和 which.max
来操作数据框,如下所述。
#This is basically what I want to achieve
df_want <- iris %>% group_by(Species) %>% mutate(Sepal.Length = Sepal.Length[which.max(Petal.Width)],
Sepal.Width = Sepal.Width[which.max(Petal.Width)])
#Here is my attempt at a smarter solution, but it does not work
df_attempt <- iris %>% group_by(Species) %>% mutate_at(c("Sepal.Length", "Sepal.Width"), function(x) x[which.max("Petal.Width")])
#However, this works
df_test <- iris %>% group_by(Species) %>% mutate_at(c("Sepal.Length", "Sepal.Width"), function(x) x + 100)
生成 df_attempt
的代码无效。我收到以下错误消息:
Error in mutate_impl(.data, dots) :
Column `Sepal.Length` must be length 50 (the group size) or one, not 0
有什么想法可以在仍然使用 mutate_at
的同时解决这个问题吗?
标准的 dplyr
方法是:
df_want <- iris %>%
group_by(Species) %>%
mutate(Sepal.Length = Sepal.Length[which.max(Petal.Width)],
Sepal.Width = Sepal.Width[which.max(Petal.Width)])
df_attempt <- iris %>%
group_by(Species) %>%
mutate_at(vars(Sepal.Length, Sepal.Width), funs(.[which.max(Petal.Width)]))
结果:
# A tibble: 150 x 5
# Groups: Species [3]
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
<dbl> <dbl> <dbl> <dbl> <fctr>
1 5 3.5 1.4 0.2 setosa
2 5 3.5 1.4 0.2 setosa
3 5 3.5 1.3 0.2 setosa
4 5 3.5 1.5 0.2 setosa
5 5 3.5 1.4 0.2 setosa
6 5 3.5 1.7 0.4 setosa
7 5 3.5 1.4 0.3 setosa
8 5 3.5 1.5 0.2 setosa
9 5 3.5 1.4 0.2 setosa
10 5 3.5 1.5 0.1 setosa
# ... with 140 more rows
> identical(df_want, df_attempt)
[1] TRUE
注:
使用
vars
您可以使用 NSE 引用变量。使用
funs
你可以用.
引用每一列,相当于function(x) x