Within tidyverse, create a seq() column based on existing variables
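Goal: within the tidyverse, create a sequence column called my_seq. Each seq() call should take its "from" and "to" values from existing columns ("from" is column x, "to" is column y).
Bonus points for a self-referential "dot" combination (and an explanation of the dot grammar).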
library(tidyverse)

boo <- tribble(
  ~x, ~y,
   5, 20,
   6, 10,
   2, 20)
# Desired results should reflect these results in new column:
seq(5, 20, by = 2)
#> [1] 5 7 9 11 13 15 17 19
seq(6, 10, by = 2)
#> [1] 6 8 10
seq(2, 20, by = 2)
#> [1] 2 4 6 8 10 12 14 16 18 20
# These straightforward solutions do not work
boo %>%
  mutate(my_seq = seq(x, y, by = 2))
boo %>%
  mutate(my_seq = seq(boo$x, boo$y, by = 2))
# The grammar of self-referential dots is super arcane, but
# here are some additional tries. All fail.
boo %>%
  mutate(my_seq = map_int(boo, ~seq(.$x, .$y, by = 2)))
boo %>%
  mutate(my_seq = seq(.$x, .$y, by = 2))
With purrr, you can use map2 to loop over x and y in parallel. It is similar to Map/mapply in base R, but with a different syntax:
boo %>% mutate(my_seq = map2(x, y, seq, by=2))
# A tibble: 3 x 3
# x y my_seq
# <dbl> <dbl> <list>
#1 5 20 <dbl [8]>
#2 6 10 <dbl [3]>
#3 2 20 <dbl [10]>
my_seq is a list-type column; we can pull the column out to see its contents:
boo %>% mutate(my_seq = map2(x, y, seq, by=2)) %>% pull(my_seq)
#[[1]]
#[1] 5 7 9 11 13 15 17 19
#[[2]]
#[1] 6 8 10
#[[3]]
# [1] 2 4 6 8 10 12 14 16 18 20
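For comparison, here is a minimal sketch of the base R counterpart mentioned above, using mapply with SIMPLIFY = FALSE so the result stays a list. The names via_map2 and via_mapply are only illustrative and are not part of the original answer.

```r
library(dplyr)
library(purrr)
library(tibble)

boo <- tribble(
  ~x, ~y,
   5, 20,
   6, 10,
   2, 20
)

# purrr: the vectors come first, then the function, then constant arguments
via_map2 <- boo %>%
  mutate(my_seq = map2(x, y, seq, by = 2))

# base R: the function comes first; constant arguments go in MoreArgs,
# and SIMPLIFY = FALSE keeps the result as a list (one element per row)
via_mapply <- boo %>%
  mutate(my_seq = mapply(seq, x, y, MoreArgs = list(by = 2), SIMPLIFY = FALSE))

# Number of elements in each row's sequence, for both approaches
lengths(via_map2$my_seq)
lengths(via_mapply$my_seq)
```

The main practical difference is argument order and how constant arguments such as by are supplied; either way the new column has to be a list, because each row holds a vector of a different length.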
More generally, when there are multiple arguments, you can also use pmap:
library(dplyr)
library(purrr)
res <- boo %>%
  mutate(my_seq = pmap(., .f = ~seq(..1, ..2, by = 2)))
res
# A tibble: 3 x 3
# x y my_seq
# <dbl> <dbl> <list>
#1 5.00 20.0 <dbl [8]>
#2 6.00 10.0 <dbl [3]>
#3 2.00 20.0 <dbl [10]>
res$my_seq
#[[1]]
#[1] 5 7 9 11 13 15 17 19
#[[2]]
#[1] 6 8 10
#[[3]]
#[1] 2 4 6 8 10 12 14 16 18 20
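On the dot in the pmap call: with the magrittr pipe, `.` stands for the whole data frame on the left-hand side, so pmap walks over its columns row by row, and ..1 and ..2 pick the first and second column by position. That depends on column order. As a hedged alternative, the sketch below passes the two columns explicitly under the names from and to; the name res_named is only illustrative.

```r
library(dplyr)
library(purrr)
library(tibble)

boo <- tribble(
  ~x, ~y,
   5, 20,
   6, 10,
   2, 20
)

# Build the list of inputs explicitly instead of passing the whole data frame,
# so extra or reordered columns in boo cannot change which values are used
res_named <- boo %>%
  mutate(my_seq = pmap(list(from = x, to = y),
                       function(from, to) seq(from, to, by = 2)))

res_named$my_seq
```

Because the list elements are named, pmap matches them to the function's arguments by name rather than by position.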