How can I use `mutate_at` for list-columns in a tibble?
I have a tibble structured as follows:
df <-
tibble(
x = 1:3,
light_93 = list(1:3, 5:7, 18:20),
light_94 = list(3:5, 9:11, 18:20),
light_95 = list(5:7, 44:46, 30:32))
I want to create several new columns, each giving the mean of every element in one of the light_ list-columns. So I want this result:
out <-
df %>%
mutate(light_93_mean = map_dbl(light_93, mean),
light_94_mean = map_dbl(light_94, mean),
light_95_mean = map_dbl(light_95, mean))
Can I use mutate_at to automate this? (I have hundreds of list-columns.) I can't figure out how to make it work.
Specify the columns to transform in the vars argument of mutate_at, then use map_dbl within each column to loop over the list and get the mean of each element:
library(dplyr)
library(purrr)
df %>%
mutate_at(vars(starts_with('light')),
list(mean = ~ map_dbl(., mean)))
# A tibble: 3 x 7
# x light_93 light_94 light_95 light_93_mean light_94_mean light_95_mean
# <int> <list> <list> <list> <dbl> <dbl> <dbl>
#1 1 <int [3]> <int [3]> <int [3]> 2 4 6
#2 2 <int [3]> <int [3]> <int [3]> 6 10 45
#3 3 <int [3]> <int [3]> <int [3]> 19 19 31
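Or use across with mutate. This was only in the devel version of dplyr when this answer was written; it is available in dplyr 1.0.0 and later, where the naming argument is spelled .names:
df %>%
mutate(across(starts_with('light'), ~ map_dbl(., mean), .names = "{col}_mean"))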
# A tibble: 3 x 7
# x light_93 light_94 light_95 light_93_mean light_94_mean light_95_mean
# <int> <list> <list> <list> <dbl> <dbl> <dbl>
#1 1 <int [3]> <int [3]> <int [3]> 2 4 6
#2 2 <int [3]> <int [3]> <int [3]> 6 10 45
#3 3 <int [3]> <int [3]> <int [3]> 19 19 31
Different functions can also be applied to different sets of columns:
df %>%
mutate(across(starts_with('light'), ~ map_dbl(., mean), .names = "{col}_mean"),
across(matches('(94|95)$'), ~ map_dbl(., sum), .names = "{col}_sum"))
# A tibble: 3 x 9
# x light_93 light_94 light_95 light_93_mean light_94_mean light_95_mean light_94_sum light_95_sum
# <int> <list> <list> <list> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 1 <int [3]> <int [3]> <int [3]> 2 4 6 12 18
#2 2 <int [3]> <int [3]> <int [3]> 6 10 45 30 135
#3 3 <int [3]> <int [3]> <int [3]> 19 19 31 57 93
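If the same set of columns needs more than one summary, a single across call can take a named list of functions. The following is a minimal sketch, assuming dplyr 1.0.0 or later and purrr are installed; it rebuilds the example tibble from the question so it runs on its own, and the `{col}_{fn}` naming pattern is just one possible choice.

```r
library(dplyr)
library(purrr)

# Example data from the question (tibble() is re-exported by dplyr)
df <- tibble(
  x = 1:3,
  light_93 = list(1:3, 5:7, 18:20),
  light_94 = list(3:5, 9:11, 18:20),
  light_95 = list(5:7, 44:46, 30:32))

# Apply both summaries to every light_ list-column in one across() call.
# The list names ("mean", "sum") fill the {fn} placeholder in .names.
out <- df %>%
  mutate(across(starts_with("light"),
                list(mean = ~ map_dbl(.x, mean),
                     sum  = ~ map_dbl(.x, sum)),
                .names = "{col}_{fn}"))

names(out)
```

This adds a mean and a sum column for each of the three list-columns, so the result has ten columns in total.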
In base R, we can use grep to select the columns whose names start with "light", compute the mean of each list element, and add the results as new columns.
cols <- grep('^light', names(df), value = TRUE)
df[paste0(cols, "_mean")] <- lapply(df[cols], function(x) sapply(x, mean))
df
# A tibble: 3 x 7
# x light_93 light_94 light_95 light_93_mean light_94_mean light_95_mean
# <int> <list> <list> <list> <dbl> <dbl> <dbl>
#1 1 <int [3]> <int [3]> <int [3]> 2 4 6
#2 2 <int [3]> <int [3]> <int [3]> 6 10 45
#3 3 <int [3]> <int [3]> <int [3]> 19 19 31
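A slightly stricter variant of the base R approach swaps sapply for vapply, which fails loudly if any list element does not reduce to a single number. This is a sketch under the assumption that a plain data.frame with list-columns is acceptable; it does not need any packages.

```r
# Build a plain data.frame with list-columns (same values as the question)
df <- data.frame(x = 1:3)
df$light_93 <- list(1:3, 5:7, 18:20)
df$light_94 <- list(3:5, 9:11, 18:20)
df$light_95 <- list(5:7, 44:46, 30:32)

# Names of the list-columns to summarise
cols <- grep("^light", names(df), value = TRUE)

# vapply() guarantees one double per list element, unlike sapply(),
# whose return type depends on the input
df[paste0(cols, "_mean")] <- lapply(
  df[cols],
  function(col) vapply(col, mean, numeric(1)))

df$light_95_mean
```

The design choice here is only about type safety: with hundreds of columns, an unexpected element (for example a NULL) raises an error in vapply instead of silently producing a list.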