Using pmap with c(...) part 2

Question

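I have recently been exploring various applications of the pmap function and its variants, and I am particularly interested in using c(...) to pass all the arguments. The following data set belongs to another question that we discussed earlier today with a number of knowledgeable users. The task is to repeat each value in the Weight column along its own row, as many times as the value in the Days column, to get the following output: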

library(tidyverse)  # attaches tibble, dplyr, purrr and tidyr

df <- tribble(
  ~Name,    ~School,   ~Weight, ~Days,
  "Antoine", "Bach",     0.03,   5,
  "Antoine", "Ken",      0.02,   7,
  "Barbara", "Franklin", 0.04,   3
)

Output:

df %>%
  mutate(map2_dfr(Weight, Days, ~ set_names(rep(.x, .y), 1:.y))) %>%
  select(-c(Weight, Days))

# A tibble: 3 x 9
  Name    School     `1`   `2`   `3`   `4`   `5`   `6`   `7`
  <chr>   <chr>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Antoine Bach      0.03  0.03  0.03  0.03  0.03 NA    NA   
2 Antoine Ken       0.02  0.02  0.02  0.02  0.02  0.02  0.02
3 Barbara Franklin  0.04  0.04  0.04 NA    NA    NA    NA

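This output can be achieved with a variety of solutions, but the following one, suggested by one of the contributors, caught my attention. My question is how I could rewrite it with c(...).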

# This is not my code and it works:

pmap_dfr(df, function(Weight, Days, ...) c(..., setNames(rep(Weight, Days), 1:Days)))

# And I can also rewrite it in the following way which also works:

df %>%
  mutate(data = pmap(list(Weight, Days), ~ setNames(rep(.x, .y), 1:.y))) %>%
  unnest_wider(data)

But I would like to know why neither of these works:

df %>% 
  mutate(pmap_dfr(., ~ c(..., setNames(rep(Weight, Days), 1:Days))))


df %>% 
  pmap_dfr(., ~ c(..., setNames(rep(Weight, Days), 1:Days)))

Thank you very much in advance, and I apologize for the lengthy description.

Answer 1

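The issue seems to be mixing up a custom anonymous/lambda function (function(Weight, Days, ...), whose arguments are named the same as the columns) with the default lambda function (~, whose arguments are .x and .y if there are only two elements, or ..1, ..2, ..3, etc. if there are more than two). In the OP's code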

library(dplyr)
library(purrr)
df %>% 
   mutate(pmap_dfr(., ~ c(..., setNames(rep(Weight, Days), 1:Days))))

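'Weight' and 'Days' return the full column values from the original data set, not the values of the current row. If we still want to use the command above, we need to convert the data captured for each row to a tibble and use with: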

df %>%
     pmap_dfr(., ~ with(as_tibble(list(...)), 
             setNames(rep(Weight, Days), seq_len(Days))))
# A tibble: 3 x 7
#     `1`   `2`   `3`   `4`   `5`   `6`   `7`
#   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1  0.03  0.03  0.03  0.03  0.03 NA    NA   
#2  0.02  0.02  0.02  0.02  0.02  0.02  0.02
#3  0.04  0.04  0.04 NA    NA    NA    NA
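To make the point about 'Weight' and 'Days' concrete, here is a minimal sketch (not the OP's code) that compares what a bare column name and a positional argument see inside a formula lambda called from mutate(). The names probe, len_by_name and len_by_position are made up for this illustration. Outside mutate() there is no data mask at all, so a bare Weight would presumably only be found if an object of that name happened to exist in the calling environment.

```r
library(dplyr)
library(purrr)
library(tibble)

df <- tribble(
  ~Name,     ~School,    ~Weight, ~Days,
  "Antoine", "Bach",     0.03,    5,
  "Antoine", "Ken",      0.02,    7,
  "Barbara", "Franklin", 0.04,    3
)

# `probe`, `len_by_name` and `len_by_position` are hypothetical names.
probe <- df %>%
  mutate(
    # `Weight` is not an argument of the lambda, so it is looked up in
    # mutate()'s data mask and resolves to the whole column.
    len_by_name = pmap_int(list(Weight, Days), ~ length(Weight)),
    # `..1` is the first value pmap passes for the current row only.
    len_by_position = pmap_int(list(Weight, Days), ~ length(..1))
  )

probe$len_by_name      # one entry per row, each equal to nrow(df)
probe$len_by_position  # one entry per row, each equal to 1
```

With rep(Weight, Days) the same lookup happens, which is why the whole column, rather than one value, ends up being repeated.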

If we want the other columns as well,

df %>%
     pmap_dfr(., ~ c(list(...)[-(3:4)], with(as_tibble(list(...)), 
             setNames(rep(Weight, Days), seq_len(Days)))))
# A tibble: 3 x 9
#  Name    School     `1`   `2`   `3`   `4`   `5`   `6`   `7`
#  <chr>   <chr>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 Antoine Bach      0.03  0.03  0.03  0.03  0.03 NA    NA   
#2 Antoine Ken       0.02  0.02  0.02  0.02  0.02  0.02  0.02
#3 Barbara Franklin  0.04  0.04  0.04 NA    NA    NA    NA

Or use rowwise:

library(tidyr)
df %>% 
    rowwise %>% 
    mutate(out = list(setNames(rep(Weight, Days), seq_len(Days)))) %>%
    ungroup %>%
    unnest_wider(c(out))  %>%
    select(-Weight, -Days)
# A tibble: 3 x 9
#  Name    School     `1`   `2`   `3`   `4`   `5`   `6`   `7`
#  <chr>   <chr>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 Antoine Bach      0.03  0.03  0.03  0.03  0.03 NA    NA   
#2 Antoine Ken       0.02  0.02  0.02  0.02  0.02  0.02  0.02
#3 Barbara Franklin  0.04  0.04  0.04 NA    NA    NA    NA
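As a variation on the with(as_tibble(list(...)), ...) idea, the following sketch forwards `...` from the formula lambda to a small helper that picks the row values by name instead of by position. The helper name expand_row is hypothetical and not part of purrr; pmap_dfr is assumed to be available together with dplyr, which it uses for row binding.

```r
library(purrr)
library(tibble)

df <- tribble(
  ~Name,     ~School,    ~Weight, ~Days,
  "Antoine", "Bach",     0.03,    5,
  "Antoine", "Ken",      0.02,    7,
  "Barbara", "Franklin", 0.04,    3
)

# `expand_row` is a hypothetical helper written for this example.
expand_row <- function(...) {
  row <- list(...)  # named list with one value per column of the current row
  c(
    row[c("Name", "School")],                                # keep the id columns
    setNames(rep(row$Weight, row$Days), seq_len(row$Days))   # repeated weights
  )
}

# The formula lambda only forwards the row's values through `...`.
res <- pmap_dfr(df, ~ expand_row(...))
res
```

Selecting by name rather than with [-(3:4)] means the sketch does not depend on the column order of df.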

Answer 2

This may not add much value, but it may help in understanding what goes on inside the lambda function.

pmap_df(df, ~ c(setNames(c(..1, ..2), names(df[1:2])), setNames(rep(..3, ..4), seq_len(..4))))

# A tibble: 3 x 9
  Name    School   `1`   `2`   `3`   `4`   `5`   `6`   `7`  
  <chr>   <chr>    <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 Antoine Bach     0.03  0.03  0.03  0.03  0.03  NA    NA   
2 Antoine Ken      0.02  0.02  0.02  0.02  0.02  0.02  0.02 
3 Barbara Franklin 0.04  0.04  0.04  NA    NA    NA    NA

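pmap_df is sufficient here; pmap_dfr would be redundant, since pmap_df is an alias of pmap_dfr.
You can pass specific arguments by position, e.g. ..1, ..2, etc.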

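Or this also works. Because list(...)[1:2] keeps each value as a separate list element instead of combining everything into one atomic vector, the numeric columns stay <dbl> rather than being coerced to <chr>: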

pmap_df(df, ~ c(list(...)[1:2], setNames(rep(..3, ..4), seq_len(..4))))

# A tibble: 3 x 9
  Name    School     `1`   `2`   `3`   `4`   `5`   `6`   `7`
  <chr>   <chr>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Antoine Bach      0.03  0.03  0.03  0.03  0.03 NA    NA   
2 Antoine Ken       0.02  0.02  0.02  0.02  0.02  0.02  0.02
3 Barbara Franklin  0.04  0.04  0.04 NA    NA    NA    NA
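The difference in column types between the two outputs above comes down to how base R's c() combines its inputs. A minimal sketch with made-up values (the names mixed_atomic and mixed_list are only for illustration):

```r
# Combining atomic vectors: the number is coerced to character.
mixed_atomic <- c(Name = "Antoine", `1` = 0.03)
class(mixed_atomic)          # "character"

# Combining a list with an atomic vector: the result is a list,
# so each element keeps its own type.
mixed_list <- c(list(Name = "Antoine"), c(`1` = 0.03))
class(mixed_list)            # "list"
class(mixed_list[["1"]])     # "numeric"
```

This is why the first pmap_df call returns every column as <chr>, while the list(...)[1:2] version keeps the repeated weights as <dbl>.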

