Embed nested list of data.frames in R

Question

Setup:

I have a tibble (named data) with an embedded list of data.frames.

library(tidyverse)  # provides tibble(), %>%, dplyr and tidyr

df1 <- data.frame(name = c("columnName1","columnName2","columnName3"),
                  value = c("yes", 1L, 0L),
                  stringsAsFactors = FALSE)

df2 <- data.frame(name = c("columnName1","columnName2","columnName3"),
                  value = c("no", 1L, 1L),
                  stringsAsFactors = FALSE)

df3 <- data.frame(name = c("columnName1","columnName2","columnName3"),
                  value = c("yes", 0L, 0L),
                  stringsAsFactors = FALSE)

responses = list(df1,
                 df2,
                 df3)

data <- tibble(ids = c(23L, 42L, 84L),
               responses = responses)

Please note that this is a simplified example of the data. The original data comes from a flat JSON file and is loaded with the jsonlite::stream_in() function.

Objective:

My goal is to convert this tibble into another tibble in which the embedded data.frames are spread (transposed) into columns; for example, my goal tibble is:

goal <- tibble(ids = c(23L, 42L, 84L),
               columnName1 = c("yes","no","yes"),
               columnName2 = c(1L, 1L, 0L),
               columnName3 = c(0L, 1L, 0L))

# goal tibble
> goal
# A tibble: 3 x 4
    ids columnName1 columnName2 columnName3
  <int> <chr>             <int>       <int>
1    23 yes                   1           0
2    42 no                    1           1
3    84 yes                   0           0

My inelegant solution:

Using dplyr::bind_rows() and tidyr::spread():

rdf <- dplyr::bind_rows(data$responses, .id = "id") %>%
  tidyr::spread(key = "name", -id)

goal2 <- cbind(ids = data$ids, rdf[,-1]) %>%
  as_tibble()  # as.tibble() is deprecated in favor of as_tibble()

Comparing my solution with the goal:

# produced tibble
> goal2
# A tibble: 3 x 4
    ids columnName1 columnName2 columnName3
* <int> <chr>       <chr>       <chr>      
1    23 yes         1           0          
2    42 no          1           1          
3    84 yes         0           0

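Overall, my solution works, but it has a few problems:

I don't know how to pass the unique IDs through bind_rows(), which forces me to create a dummy ID ("id") that doesn't match the original ID numbering ("ids"). That in turn forces me to use cbind() (which I dislike) and to drop the dummy ID manually (the -1 slice on rdf).
The column types are lost, because my approach converts the integer columns to character.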

Any suggestions on how to improve my solution (especially using tidyverse-based packages such as tidyjson or tidyr)?
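
On the second problem, it may help to see where the character type comes from. The following is a minimal base R sketch (the vector name value is only for illustration): c() coerces mixed inputs to one common type, so each value column is already character before any reshaping happens, and type.convert() is one way to recover a suitable type afterwards.

```r
# c() coerces mixed inputs to a single common type, here character.
value <- c("yes", 1L, 0L)
class(value)
# "character"

# type.convert() picks a suitable type for a character vector.
class(type.convert(c("1", "0"), as.is = TRUE))
# "integer"
class(type.convert(c("yes", "no"), as.is = TRUE))
# "character"
```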

Answer 1

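We can loop over the 'responses' column with map, spread each data.frame to 'wide' format with convert = TRUE so that the column types are restored, create the result as a list-column with transmute, and then unnest it.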

library(tidyverse)
data %>%
     transmute(ids, ind = map(responses, ~ .x %>%
                                  spread(name, value, convert = TRUE))) %>%
     unnest(cols = ind)  # tidyr >= 1.0.0 expects the list-column to be named
# A tibble: 3 x 4
#    ids columnName1 columnName2 columnName3
#   <int> <chr>             <int>       <int>
#1    23 yes                   1           0
#2    42 no                    1           1
#3    84 yes                   0           0
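
In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(). Below is a sketch of the same idea with the newer verbs, assuming tidyr >= 1.0.0 and dplyr >= 1.0.0. pivot_wider() has no convert argument, so this sketch falls back on base type.convert() to restore the column types. The helper make_df() and the name wide are hypothetical and only rebuild the question's example data.

```r
library(tibble)
library(dplyr)
library(tidyr)

# Hypothetical helper that rebuilds one embedded data.frame from the question.
make_df <- function(values) {
  data.frame(name = c("columnName1", "columnName2", "columnName3"),
             value = values,
             stringsAsFactors = FALSE)
}

data <- tibble(ids = c(23L, 42L, 84L),
               responses = list(make_df(c("yes", 1L, 0L)),
                                make_df(c("no", 1L, 1L)),
                                make_df(c("yes", 0L, 0L))))

wide <- data %>%
  unnest(cols = responses) %>%                          # one row per (ids, name) pair
  pivot_wider(names_from = name, values_from = value) %>%
  mutate(across(-ids, ~ type.convert(.x, as.is = TRUE)))  # restore column types

wide
```

Because unnest() carries the ids column along, no dummy ID and no cbind() are needed in this variant.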

Alternatively, starting from the OP's code, we set the names of the list to the 'ids' column, call bind_rows, and then spread.

bind_rows(setNames(data$responses, data$ids), .id = 'ids') %>% 
            spread(name, value, convert = TRUE)
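
One caveat with this variant: bind_rows() fills the .id column from the list names, so 'ids' comes back as character, and convert = TRUE in spread() only applies to the newly spread columns. A sketch of one way to restore the integer IDs, assuming every ID parses cleanly as an integer (the helper make_df() and the name result are hypothetical):

```r
library(tibble)
library(dplyr)
library(tidyr)

# Hypothetical helper that rebuilds one embedded data.frame from the question.
make_df <- function(values) {
  data.frame(name = c("columnName1", "columnName2", "columnName3"),
             value = values,
             stringsAsFactors = FALSE)
}

data <- tibble(ids = c(23L, 42L, 84L),
               responses = list(make_df(c("yes", 1L, 0L)),
                                make_df(c("no", 1L, 1L)),
                                make_df(c("yes", 0L, 0L))))

result <- bind_rows(setNames(data$responses, data$ids), .id = "ids") %>%
  spread(name, value, convert = TRUE) %>%
  mutate(ids = as.integer(ids))  # .id values are character; cast them back

str(result)
```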

r

dplyr

jsonlite

tidyr