Pull list from tibble for tidy data

Question

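I'm trying to work with data that contains nested tibbles: for each observation, one column holds a tibble, so the column as a whole is a list of tibbles. I'd like to pull the data out of this list to get tidy data.

Here is a short example: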

library(tibble)

data_1 <- tibble(year = c(2015, 2016), 
                 pop = c(100, 200))

data_2 <- tibble(year = c(2015, 2016), 
                 pop = c(300, 500))

data_combined <- list(data_1, data_2)

x <- tibble(country = (c('1', '2')), 
            data = data_combined)

print(x)
# A tibble: 2 x 2
  country data            
  <chr>   <list>          
1 1       <tibble [2 x 2]>
2 2       <tibble [2 x 2]>

print(x$data)
[[1]]
# A tibble: 2 x 2
   year   pop
  <dbl> <dbl>
1  2015   100
2  2016   200

[[2]]
# A tibble: 2 x 2
   year   pop
  <dbl> <dbl>
1  2015   300
2  2016   500

What I'd like is the following tidy format (I don't mind whether it is a data.frame or a tibble):

  country year pop
        1 2015 100
        1 2016 200
        2 2015 300
        2 2016 500

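I think the easiest way to do this would be to return the x$data list while keeping the country field, and then call do.call(rbind). I don't know how to do the first part.

Perhaps I've misunderstood, and it is actually useful to have data in this format. If so, and there is an efficient way to work with tibbles that contain lists of tibbles, I'd welcome any information on that.

The context for all this is that I'm trying to work with data produced by this API: https://cran.r-project.org/web/packages/eia/eia.pdf. The API is limited to 100 rows per call. I assume the author used this data format for that reason, to let people get more data per row. If you want a general example, see below: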

#load libraries
library("eia")
library("dplyr")

#set API key for the session
eia_set_key(key = "[YOUR_KEY_HERE]")

#select a variable of interest by looking through: eia_cats() -> eia_child_cats(2134384)
anth_production <- eia_cats(2134515) %>% #select data for Anthracite (as a list)
  .$childseries %>% #subset the childseries element of the list
  filter(units == "Million Metric Tons of Oil Equivalent") %>% #filter to only have MMTOe
  .$series_id #subset the IDs to use in the eia_series() call 

#call the eia_series() function of the API
anth_production_tibble <- eia_series(id = anth_production)

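anth_production_tibble now comes out in the same format as the one I generated in my reproducible example above. I'll write a function to handle the 100-row limit later.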

Answer 1

We can use unnest() from tidyr:

library(tidyr)
library(dplyr)
x %>%
   unnest(data)
# A tibble: 4 x 3
#  country  year   pop
#  <chr>   <dbl> <dbl>
#1 1        2015   100
#2 1        2016   200
#3 2        2015   300
#4 2        2016   500
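
For comparison, the do.call(rbind) route described in the question can also be made to work in base R. The following is only a minimal sketch, assuming the same x as in the question's example; the names pieces and tidy are introduced here purely for illustration. It attaches each country label to its nested tibble and then stacks the results into a plain data.frame.

```r
library(tibble)

# Rebuild the example from the question so the sketch is self-contained.
x <- tibble(
  country = c("1", "2"),
  data = list(
    tibble(year = c(2015, 2016), pop = c(100, 200)),
    tibble(year = c(2015, 2016), pop = c(300, 500))
  )
)

# Pair each country with its nested tibble; data.frame() recycles the
# single country value across that tibble's rows.
# unname() drops the names Map() takes from x$country, so rbind() does not
# turn them into row-name prefixes.
pieces <- unname(Map(
  function(country, d) data.frame(country = country, d, stringsAsFactors = FALSE),
  x$country, x$data
))

# Stack the per-country data frames into one.
tidy <- do.call(rbind, pieces)
print(tidy)
```

unnest() is shorter and returns a tibble, so this sketch is mainly useful for seeing what the "keep the country field" step looks like when written out by hand.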
