R nested tibble map2 comparisons
I am trying to use map2 to compare nested tibble columns. Here is the format of my data:
> tbl
# A tibble: 3 x 3
ID data.x data.y
<chr> <list> <list>
1 a <tibble [2 x 2]> <tibble [2 x 2]>
2 b <tibble [2 x 2]> <tibble [2 x 2]>
3 c <tibble [2 x 2]> <tibble [2 x 2]>
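The tibbles in data.x and data.y have the same column names, but their values may differ. I want to get the maximum value of the val column across both. I thought the following would work, but it only returns the maximum from data.x. I don't fully understand how map2 works.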
tbl %>%
mutate(col1 = map2_dbl(data.x, data.y, ~ max(.$val)))
The result should be:
# A tibble: 3 x 4
ID data.x data.y col1
<chr> <list> <list> <dbl>
1 a <tibble [2 x 2]> <tibble [2 x 2]> 7.
2 b <tibble [2 x 2]> <tibble [2 x 2]> 8.
3 c <tibble [2 x 2]> <tibble [2 x 2]> 8.
Data:
> dput(tbl)
structure(list(ID = c("a", "b", "c"), data.x = list(structure(list(
text = c("Y", "Y"), val = c(1, 1)), .Names = c("text", "val"
), row.names = c(NA, -2L), class = c("tbl_df", "tbl", "data.frame"
)), structure(list(text = c("N", "N"), val = c(2, 2)), .Names = c("text",
"val"), row.names = c(NA, -2L), class = c("tbl_df", "tbl", "data.frame"
)), structure(list(text = c("Y", "Y"), val = c(3, 3)), .Names = c("text",
"val"), row.names = c(NA, -2L), class = c("tbl_df", "tbl", "data.frame"
))), data.y = list(structure(list(text = c("Y", "Y"), val = c(6,
7)), .Names = c("text", "val"), row.names = c(NA, -2L), class = c("tbl_df",
"tbl", "data.frame")), structure(list(text = c("Y", "Y"), val = c(8,
6)), .Names = c("text", "val"), row.names = c(NA, -2L), class = c("tbl_df",
"tbl", "data.frame")), structure(list(text = c("N", "N"), val = c(7,
8)), .Names = c("text", "val"), row.names = c(NA, -2L), class = c("tbl_df",
"tbl", "data.frame")))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -3L), .Names = c("ID", "data.x", "data.y"
))
Based on the expected output, we extract the 'val' column from the data.frame in each element of the 'data.x' and 'data.y' lists, concatenate them (c), and take the max value:
tbl %>%
mutate(col1 = map2_dbl(data.x, data.y, ~ max(c(.x$val, .y$val))))
# A tibble: 3 x 4
# ID data.x data.y col1
# <chr> <list> <list> <dbl>
#1 a <tibble [2 x 2]> <tibble [2 x 2]> 7.00
#2 b <tibble [2 x 2]> <tibble [2 x 2]> 8.00
#3 c <tibble [2 x 2]> <tibble [2 x 2]> 8.00
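As for why the original attempt returned only the data.x maximum: in a purrr formula lambda, `.` is an alias for `.x` (the first argument), and `.y` is the second. A minimal sketch with plain numeric vectors standing in for the nested 'val' columns (the names `x`, `y`, `first_only`, `both`, and `both_alt` are illustrative, not from the question):

```r
library(purrr)

# Two small lists standing in for the val columns of data.x and data.y
# (values copied from the question's data).
x <- list(c(1, 1), c(2, 2), c(3, 3))
y <- list(c(6, 7), c(8, 6), c(7, 8))

# `.` refers to the first argument only, so `y` is silently ignored.
first_only <- map2_dbl(x, y, ~ max(.))

# Using both `.x` and `.y` compares each pair of elements.
both <- map2_dbl(x, y, ~ max(c(.x, .y)))

# Equivalent form: take the max of each side, then the larger of the two.
both_alt <- map2_dbl(x, y, ~ max(max(.x), max(.y)))
```

Here `first_only` only ever sees the elements of `x`, which matches the behaviour described in the question.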
For multiple 'data' columns, pmap can be used:
tbl %>%
mutate(col1 = pmap_dbl(.[-1], ~ max(c(..1$val, ..2$val))))
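If the number of nested columns is not fixed, one possible variation is to pass them through `...` instead of naming `..1`, `..2` individually. This is only a sketch under assumptions: `tbl3`, `data.z`, and `nested` are hypothetical and not part of the question's data, and every nested tibble is assumed to have a numeric 'val' column.

```r
library(dplyr)
library(purrr)
library(tibble)

# Hypothetical example with three nested columns instead of two.
tbl3 <- tibble(
  ID = c("a", "b"),
  data.x = list(tibble(val = c(1, 1)), tibble(val = c(2, 2))),
  data.y = list(tibble(val = c(6, 7)), tibble(val = c(8, 6))),
  data.z = list(tibble(val = c(9, 0)), tibble(val = c(4, 5)))
)

# Keep only the list-columns whose names start with "data".
nested <- select(tbl3, starts_with("data"))

# For each row, collect all nested tibbles via `...`,
# take the max of each 'val' column, then the overall max.
res <- mutate(
  tbl3,
  col1 = pmap_dbl(nested, function(...) max(map_dbl(list(...), ~ max(.x$val))))
)
```

The trade-off is that the `..1`/`..2` form is shorter for two columns, while the `...` form does not need editing when another 'data' column is added.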