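How do I extract the maximum value from a list in a list column and add it as a new column in a tibble?
I have a dataset of survey responses and want to identify respondents who gave the same answer to a series of items. Using base::rle, I get the run lengths in a new list column; I want to extract the maximum run length for each case and add these values as a new column.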
library(tidyverse)
x <- tribble(
  ~x1, ~x2, ~x3, ~x4, ~x5, ~x6,
  1, 1, 1, 1, 1, 1,
  3, 3, 3, 2, 5, 3,
  3, 3, 3, 3, 3, 3,
  4, 4, 5, 5, 5, 5
)
# Add list col of runs
x <- x %>%
  rowwise() %>%
  mutate(runs = list(base::rle(c(x1, x2, x3, x4, x5, x6))))
# The list col is a list with 2 elements, 'lengths' and 'values'
str(x$runs[1])
#> List of 1
#> $ :List of 2
#> ..$ lengths: int 6
#> ..$ values : num 1
#> ..- attr(*, "class")= chr "rle"
# I can obtain max values of "lengths" for each row
map_int(map(x$runs, "lengths"), max)
#> [1] 6 3 6 4
# But I can't work out how to use 'mutate' to create a new variable containing
# the maximum for each case. I tried the following but it doesn't work.
x <- x %>%
  rowwise() %>%
  mutate(run_max = map_int(map(x$runs, "lengths"), max))
#> Error: Problem with `mutate()` column `run_max`.
#> i `run_max = map_int(map(x$runs, "lengths"), max)`.
#> i `run_max` must be size 1, not 4.
#> i Did you mean: `run_max = list(map_int(map(x$runs, "lengths"), max))` ?
#> i The error occurred in row 1.
Created on 2021-09-17 by the reprex package (v2.0.1)
We need to ungroup first:
library(dplyr)
library(purrr)
x %>%
  rowwise() %>% ungroup %>%
  mutate(run_max = map_int(map(runs, "lengths"), max))
# A tibble: 4 x 8
     x1    x2    x3    x4    x5    x6 runs   run_max
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <list>   <int>
1     1     1     1     1     1     1 <rle>        6
2     3     3     3     2     5     3 <rle>        3
3     3     3     3     3     3     3 <rle>        6
4     4     4     5     5     5     5 <rle>        4
Alternatively, if the intent is to loop with map, the rowwise grouping is not needed at all:
x %>%
  ungroup %>%
  mutate(run_max = map_int(map(runs, "lengths"), max))
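For context on why the original attempt errors, here is a minimal sketch. It reflects my reading of the error message rather than anything stated in the question: under rowwise(), mutate() evaluates the expression once per row and expects a length-1 result, while x$runs reaches outside the data mask and always returns the full 4-element column. The names chk, n_seen, and res are only illustrative.

```r
library(dplyr)
library(purrr)
library(tibble)

x <- tribble(
  ~x1, ~x2, ~x3, ~x4, ~x5, ~x6,
  1, 1, 1, 1, 1, 1,
  3, 3, 3, 2, 5, 3,
  3, 3, 3, 3, 3, 3,
  4, 4, 5, 5, 5, 5
)

# Build the list column of rle objects; x stays rowwise after this step
x <- x %>%
  rowwise() %>%
  mutate(runs = list(rle(c(x1, x2, x3, x4, x5, x6))))

# Inside a rowwise mutate(), x$runs is still the whole column,
# so every row sees a list of length 4 (illustrative check only)
chk <- x %>% mutate(n_seen = length(x$runs))
chk$n_seen

# After ungroup(), the bare column name `runs` is the full list column,
# and map_int() returns exactly one integer per row
res <- x %>%
  ungroup() %>%
  mutate(run_max = map_int(runs, ~ max(.x$lengths)))
res$run_max
#> [1] 6 3 6 4
```

The single map_int() call with a lambda is equivalent to the nested map_int(map(runs, "lengths"), max) form used above; which one reads better is a matter of taste.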
When we do use rowwise, map is not needed for the extraction:
x %>%
  rowwise %>%
  mutate(run_max = max(runs$lengths)) %>%
  ungroup
# A tibble: 4 x 8
     x1    x2    x3    x4    x5    x6 runs   run_max
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <list>   <int>
1     1     1     1     1     1     1 <rle>        6
2     3     3     3     2     5     3 <rle>        3
3     3     3     3     3     3     3 <rle>        6
4     4     4     5     5     5     5 <rle>        4
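If the intermediate runs column is not needed for anything else, the two steps can probably be collapsed into one. The sketch below is a variation I am suggesting, not part of the answer above. It assumes dplyr 1.0.0 or later, where c_across() is available, and the name out is only illustrative.

```r
library(dplyr)
library(tibble)

x <- tribble(
  ~x1, ~x2, ~x3, ~x4, ~x5, ~x6,
  1, 1, 1, 1, 1, 1,
  3, 3, 3, 2, 5, 3,
  3, 3, 3, 3, 3, 3,
  4, 4, 5, 5, 5, 5
)

# c_across() gathers the selected columns of the current row into one vector,
# so rle() can be applied directly without storing a list column
out <- x %>%
  rowwise() %>%
  mutate(run_max = max(rle(c_across(x1:x6))$lengths)) %>%
  ungroup()

out$run_max
#> [1] 6 3 6 4
```

This relies on x1 through x6 being adjacent and of a common type; for mixed column types, c_across() would coerce them, which may or may not be what you want.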