将参数向量传递给 map 函数
Pass a vector of arguments to map function
我正在尝试创建一个函数来映射嵌套的 tibble。此函数需要采用每行不同的参数向量。
当我对嵌套数据调用 purrr:map2()
时,purrr 尝试遍历参数向量的所有值 和 数据集中的所有行。我该怎么做才能将整个向量作为单个参数传递?
library(tidyverse)
myf <- function(x, params) {
print(params)
x %>%
mutate(new_mpg = mpg + rnorm(n(), params[1], params[2])) %>%
summarise(old = mean(mpg), new = mean(new_mpg)) %>%
as.list()
}
# Calling function with params defined is great!
myf(mtcars, params = c(5, 10))
#> [1] 5 10
#> $old
#> [1] 20.09062
#>
#> $new
#> [1] 25.62049
# Cannot work in purr as vector, tries to loop over param
mtcars %>%
group_by(cyl) %>% # from base R
nest() %>%
mutate(
newold = map2(data, c(5, 10), myf),
)
#> [1] 5
#> Warning in rnorm(n(), params[1], params[2]): NAs produced
#> [1] 10
#> Warning in rnorm(n(), params[1], params[2]): NAs produced
#> Error: Problem with `mutate()` column `newold`.
#> ℹ `newold = map2(data, c(5, 10), myf)`.
#> ℹ `newold` must be size 1, not 2.
#> ℹ The error occurred in group 1: cyl = 4.
# New function wrapper with hard-coded params
myf2 <- function(x){
myf(x, c(5, 10))
}
# works great! but not what I need
mtcars %>%
group_by(cyl) %>% # from base R
nest() %>%
mutate(
mean = 5,
sd = 10,
newold = map(data, myf2),
)
#> [1] 5 10
#> [1] 5 10
#> [1] 5 10
#> # A tibble: 3 × 5
#> # Groups: cyl [3]
#> cyl data mean sd newold
#> <dbl> <list> <dbl> <dbl> <list>
#> 1 6 <tibble [7 × 10]> 5 10 <named list [2]>
#> 2 4 <tibble [11 × 10]> 5 10 <named list [2]>
#> 3 8 <tibble [14 × 10]> 5 10 <named list [2]>
由 reprex package (v2.0.0)
于 2021-11-29 创建
跳过 group_by()
步骤,只需使用 nest()
- 否则您的数据在嵌套后将保持分组状态,需要取消分组。要使您的函数正常工作,只需将参数作为列表传递即可。
library(tidyverse)
mtcars %>%
nest(data = -cyl) %>%
mutate(
newold = map2_df(data, list(c(5, 10)), myf)
) %>%
unpack(newold)
# A tibble: 3 x 4
cyl data old new
<dbl> <list> <dbl> <dbl>
1 6 <tibble [7 x 10]> 19.7 30.7
2 4 <tibble [11 x 10]> 26.7 31.1
3 8 <tibble [14 x 10]> 15.1 17.0
你不需要map2
。我想你需要的是 map
.
mtcars %>%
group_by(cyl) %>% # from base R
nest() %>%
mutate(
newold = map(data, myf, params = c(5, 10)),
)
# [1] 5 10
# [1] 5 10
# [1] 5 10
# # A tibble: 3 x 3
# # Groups: cyl [3]
# cyl data newold
# <dbl> <list> <list>
# 1 6 <tibble [7 x 10]> <named list [2]>
# 2 4 <tibble [11 x 10]> <named list [2]>
# 3 8 <tibble [14 x 10]> <named list [2]>
如果你有多套params
。您可以取消分组数据框,使用 params
添加列表列,然后使用 map2
.
mtcars %>%
group_by(cyl) %>%
nest() %>%
ungroup() %>%
# Add different sets of params
mutate(Params = list(a = c(5, 10), b = c(6, 11), c = c(7, 12))) %>%
mutate(
newold = map2(data, Params, myf)
)
# [1] 5 10
# [1] 6 11
# [1] 7 12
# # A tibble: 3 x 4
# cyl data Params newold
# <dbl> <list> <named list> <list>
# 1 6 <tibble [7 x 10]> <dbl [2]> <named list [2]>
# 2 4 <tibble [11 x 10]> <dbl [2]> <named list [2]>
# 3 8 <tibble [14 x 10]> <dbl [2]> <named list [2]>
我正在尝试创建一个函数来映射嵌套的 tibble。此函数需要采用每行不同的参数向量。
当我对嵌套数据调用 purrr:map2()
时,purrr 尝试遍历参数向量的所有值 和 数据集中的所有行。我该怎么做才能将整个向量作为单个参数传递?
library(tidyverse)
myf <- function(x, params) {
print(params)
x %>%
mutate(new_mpg = mpg + rnorm(n(), params[1], params[2])) %>%
summarise(old = mean(mpg), new = mean(new_mpg)) %>%
as.list()
}
# Calling function with params defined is great!
myf(mtcars, params = c(5, 10))
#> [1] 5 10
#> $old
#> [1] 20.09062
#>
#> $new
#> [1] 25.62049
# Cannot work in purr as vector, tries to loop over param
mtcars %>%
group_by(cyl) %>% # from base R
nest() %>%
mutate(
newold = map2(data, c(5, 10), myf),
)
#> [1] 5
#> Warning in rnorm(n(), params[1], params[2]): NAs produced
#> [1] 10
#> Warning in rnorm(n(), params[1], params[2]): NAs produced
#> Error: Problem with `mutate()` column `newold`.
#> ℹ `newold = map2(data, c(5, 10), myf)`.
#> ℹ `newold` must be size 1, not 2.
#> ℹ The error occurred in group 1: cyl = 4.
# New function wrapper with hard-coded params
myf2 <- function(x){
myf(x, c(5, 10))
}
# works great! but not what I need
mtcars %>%
group_by(cyl) %>% # from base R
nest() %>%
mutate(
mean = 5,
sd = 10,
newold = map(data, myf2),
)
#> [1] 5 10
#> [1] 5 10
#> [1] 5 10
#> # A tibble: 3 × 5
#> # Groups: cyl [3]
#> cyl data mean sd newold
#> <dbl> <list> <dbl> <dbl> <list>
#> 1 6 <tibble [7 × 10]> 5 10 <named list [2]>
#> 2 4 <tibble [11 × 10]> 5 10 <named list [2]>
#> 3 8 <tibble [14 × 10]> 5 10 <named list [2]>
由 reprex package (v2.0.0)
于 2021-11-29 创建跳过 group_by()
步骤,只需使用 nest()
- 否则您的数据在嵌套后将保持分组状态,需要取消分组。要使您的函数正常工作,只需将参数作为列表传递即可。
library(tidyverse)
mtcars %>%
nest(data = -cyl) %>%
mutate(
newold = map2_df(data, list(c(5, 10)), myf)
) %>%
unpack(newold)
# A tibble: 3 x 4
cyl data old new
<dbl> <list> <dbl> <dbl>
1 6 <tibble [7 x 10]> 19.7 30.7
2 4 <tibble [11 x 10]> 26.7 31.1
3 8 <tibble [14 x 10]> 15.1 17.0
你不需要map2
。我想你需要的是 map
.
mtcars %>%
group_by(cyl) %>% # from base R
nest() %>%
mutate(
newold = map(data, myf, params = c(5, 10)),
)
# [1] 5 10
# [1] 5 10
# [1] 5 10
# # A tibble: 3 x 3
# # Groups: cyl [3]
# cyl data newold
# <dbl> <list> <list>
# 1 6 <tibble [7 x 10]> <named list [2]>
# 2 4 <tibble [11 x 10]> <named list [2]>
# 3 8 <tibble [14 x 10]> <named list [2]>
如果你有多套params
。您可以取消分组数据框,使用 params
添加列表列,然后使用 map2
.
mtcars %>%
group_by(cyl) %>%
nest() %>%
ungroup() %>%
# Add different sets of params
mutate(Params = list(a = c(5, 10), b = c(6, 11), c = c(7, 12))) %>%
mutate(
newold = map2(data, Params, myf)
)
# [1] 5 10
# [1] 6 11
# [1] 7 12
# # A tibble: 3 x 4
# cyl data Params newold
# <dbl> <list> <named list> <list>
# 1 6 <tibble [7 x 10]> <dbl [2]> <named list [2]>
# 2 4 <tibble [11 x 10]> <dbl [2]> <named list [2]>
# 3 8 <tibble [14 x 10]> <dbl [2]> <named list [2]>