tibble with list columns: Convert to array if possible
I have a tibble like the following:
uuu <- structure(list(IsCharacter = c("a", "b"),
                      ShouldBeCharacter = list("One", "Another"),
                      IsList = list("Element1", c("Element2", "Element3"))
                      ),
                 .Names = c("IsCharacter", "ShouldBeCharacter", "IsList"),
                 row.names = c(NA, -2L), class = c("tbl_df", "tbl", "data.frame"))
uuu
## A tibble: 2 × 3
#  IsCharacter ShouldBeCharacter    IsList
#        <chr>            <list>    <list>
#1           a         <chr [1]> <chr [1]>
#2           b         <chr [1]> <chr [2]>
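I would like to convert columns such as "ShouldBeCharacter", where every element has the same length (one) and the same type, into atomic columns like "IsCharacter", while leaving the remaining columns untouched.
So far I have the following function, which does the job but looks hacky to me. I wonder whether there is a better solution that I have not considered: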
lists_to_atomic <- function(data) {
  # Elements of length larger than one should be kept as lists.
  # So we compute the maximum length for each column
  length_column_elements <- apply(data, 2,
                                  function(x) max(sapply(x, function(y) length(y))))
  # to_simplify will contain column names of class list and with all elements of length 1
  to_simplify <- colnames(data)[length_column_elements == 1 & sapply(data, class) == "list"]
  # Do the conversion
  data[,to_simplify] <- tibble::as_tibble(lapply(as.list(data[,to_simplify]), function(x) {do.call(c, x)}))
  return(data)
}
This is the result I get. Note how the type of ShouldBeCharacter has changed:
lists_to_atomic(uuu)
## A tibble: 2 × 3
#  IsCharacter ShouldBeCharacter    IsList
#        <chr>             <chr>    <list>
#1           a               One <chr [1]>
#2           b           Another <chr [2]>
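The as_tibble(lapply(as.list(... do.call(c,...))) line looks too complicated to me, but I could not find a simpler alternative.
Is there any simplification that would make my lists_to_atomic function more robust?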
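Update
I had not considered using tidyr::unnest on list columns whose elements all have length 1, but following @taavi-p's answer I have been able to simplify the function to: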
lists_to_atomic <- function(data) {
  # Elements of length larger than one should be kept as lists.
  # So we compute the maximum length for each column
  length_column_elements <- apply(data, 2,
                                  function(x) max(sapply(x, function(y) length(y))))
  # to_simplify will contain column names of class list and with all elements of length 1
  to_simplify <- colnames(data)[length_column_elements == 1 &
                                  vapply(data,
                                         FUN = function(x) "list" %in% class(x),
                                         FUN.VALUE = logical(1))]
  # Do the conversion
  data2 <- tidyr::unnest_(data, unnest_cols = to_simplify)
  data2 <- data2[, colnames(data)] # Preserve original column order
  return(data2)
}
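A note for readers on a current tidyr: as far as I know, unnest_() has been deprecated since tidyr 1.0.0 in favour of unnest() with a cols argument. The sketch below shows how the same idea might look with that interface. It assumes tidyr >= 1.0.0, tibble and tidyselect are installed; the name lists_to_atomic_tidyr is hypothetical and not part of any package.

```r
library(tidyr)

# Same example data as in the question, built with tibble::tibble()
uuu <- tibble::tibble(
  IsCharacter = c("a", "b"),
  ShouldBeCharacter = list("One", "Another"),
  IsList = list("Element1", c("Element2", "Element3"))
)

# Hypothetical helper: unnest every list column whose elements all have length 1
lists_to_atomic_tidyr <- function(data) {
  is_simple <- vapply(
    data,
    function(col) is.list(col) && !is.data.frame(col) && all(lengths(col) == 1L),
    logical(1)
  )
  to_simplify <- names(data)[is_simple]
  if (length(to_simplify) == 0L) {
    return(data) # nothing to simplify
  }
  out <- tidyr::unnest(data, cols = tidyselect::all_of(to_simplify))
  out[names(data)] # keep the original column order
}

lists_to_atomic_tidyr(uuu)
```

Using lengths() on each column avoids the apply(data, 2, ...) call, which goes through a matrix conversion of the whole table first.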
You can try:
library(tidyr)
uuu %>% unnest(ShouldBeCharacter)
More examples of how to work with list columns can be found in "R for Data Science": http://r4ds.had.co.nz/many-models.html#list-columns-1
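If you would rather not depend on tidyr for this, the same simplification can probably be done in base R. This is only a sketch: lists_to_atomic_base is a hypothetical name, and the example uses a plain data.frame so that it runs without extra packages. Note that unlist() coerces mixed types to a common type and drops attributes such as factor levels or date classes, so it is only a safe choice when every element of a column already has the same basic type.

```r
# Example data as a plain data.frame with two list columns
uuu <- data.frame(IsCharacter = c("a", "b"), stringsAsFactors = FALSE)
uuu$ShouldBeCharacter <- list("One", "Another")
uuu$IsList <- list("Element1", c("Element2", "Element3"))

# Hypothetical helper: flatten list columns whose elements all have length 1
lists_to_atomic_base <- function(data) {
  is_simple <- vapply(
    data,
    function(col) is.list(col) && !is.data.frame(col) && all(lengths(col) == 1L),
    logical(1)
  )
  for (nm in names(data)[is_simple]) {
    # unlist() turns list("One", "Another") into c("One", "Another")
    data[[nm]] <- unlist(data[[nm]], use.names = FALSE)
  }
  data
}

str(lists_to_atomic_base(uuu))
```

Columns containing NULL elements are left as lists, because a NULL has length 0 and the column then fails the length-1 check.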