How do I filter a list based on a certain value and unnest it?
I am trying to create a data frame containing the URLs of all the directories in my GitHub repository. This is how I fetch the data:
library(httr)
req <- GET("https://api.github.com/repos/thedivtagguy/daily-data/git/trees/master")
In urls, I want to store only the URLs of the items whose type is tree. This is what req looks like:
{
"sha": "e5acaf1fd8973e010922ffc4366af68359de8456",
"url": "https://api.github.com/repos/thedivtagguy/daily-data/git/trees/e5acaf1fd8973e010922ffc4366af68359de8456",
"tree": [
{
"path": ".gitignore",
"mode": "100644",
"type": "blob",
"sha": "aaf221658979cc888d499702eea62beafa52d4e4",
"size": 631,
"url": "https://api.github.com/repos/thedivtagguy/daily-data/git/blobs/aaf221658979cc888d499702eea62beafa52d4e4"
},
{
"path": "README.md",
"mode": "100644",
"type": "blob",
"sha": "2b0515e164c8a76d877324984f01e4bde6725410",
"size": 1119,
"url": "https://api.github.com/repos/thedivtagguy/daily-data/git/blobs/2b0515e164c8a76d877324984f01e4bde6725410"
},
{
"path": "_config.yml",
"mode": "100644",
"type": "blob",
"sha": "2f7efbeab578c8042531ea7908ee8ffd7589fe46",
"size": 27,
"url": "https://api.github.com/repos/thedivtagguy/daily-data/git/blobs/2f7efbeab578c8042531ea7908ee8ffd7589fe46"
},
{
"path": "byrne_pebbles_8.png",
"mode": "100644",
"type": "blob",
"sha": "95d9552636489a812d164e63776892c91c859785",
"size": 10862,
"url": "https://api.github.com/repos/thedivtagguy/daily-data/git/blobs/95d9552636489a812d164e63776892c91c859785"
},
{
"path": "dd01_kharifAndRabiCrops",
"mode": "040000",
"type": "tree",
"sha": "b0863850cf04b73f76a8ed1f60558c6d340d142a",
"url": "https://api.github.com/repos/thedivtagguy/daily-data/git/trees/b0863850cf04b73f76a8ed1f60558c6d340d142a"
}
This is what I have tried:
response <- content(req)$tree
urls <- response %>%
map( ~ ., ~ filter(.x, type == "tree")) %>%
unnest(url)
But this does not work, and I get this error:
Error in UseMethod("unnest") :
no applicable method for 'unnest' applied to an object of class "list"
How can I filter so that I store only the URLs of the items whose type is tree? I know how to do this in base R, but I would prefer a tidy approach.
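A likely reason for the error can be sketched without a network call. The list below is a hand-built stand-in for content(req)$tree, an assumption built from two entries of the JSON shown above. The parsed response is a plain list of named lists rather than a data frame, and as far as I know unnest() only has methods for data frames, which matches the "no applicable method" message.

```r
# Stand-in for content(req)$tree (assumption: values copied from the JSON above).
response <- list(
  list(path = ".gitignore", mode = "100644", type = "blob",
       sha = "aaf221658979cc888d499702eea62beafa52d4e4", size = 631L,
       url = "https://api.github.com/repos/thedivtagguy/daily-data/git/blobs/aaf221658979cc888d499702eea62beafa52d4e4"),
  list(path = "dd01_kharifAndRabiCrops", mode = "040000", type = "tree",
       sha = "b0863850cf04b73f76a8ed1f60558c6d340d142a",
       url = "https://api.github.com/repos/thedivtagguy/daily-data/git/trees/b0863850cf04b73f76a8ed1f60558c6d340d142a")
)

# The object handed to unnest() is a bare list, not a data frame.
is_plain_list <- identical(class(response), "list")
is_df <- is.data.frame(response)
print(is_plain_list)
print(is_df)
```

So the list has to be turned into a data frame (or filtered as a list) before any tidyr verb can be applied to it.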
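We can loop over the nested list 'response' and convert each named inner list to a one-row tibble by binding its elements with bind_cols. If we loop with map, it returns a list of tibbles; with the _dfr suffix (map_dfr), those tibbles are row-bound into a single tibble. Then we filter on the 'type' column.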
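library(purrr)
library(dplyr)
# each inner list becomes a one-row tibble; map_dfr row-binds them into one tibble
map_dfr(response, bind_cols) %>%
  # keep only the directory entries
  filter(type == 'tree')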
-output
# A tibble: 8 × 6
  path                    mode   type  sha                                       size url
  <chr>                   <chr>  <chr> <chr>                                    <int> <chr>
1 dd01_kharifAndRabiCrops 040000 tree  b0863850cf04b73f76a8ed1f60558c6d340d142a    NA https://api.github.com/repos/thedivta…
2 dd02_commonSenseMedia   040000 tree  93461ea8897a3dcd026a88e62a284ba4a835a219    NA https://api.github.com/repos/thedivta…
3 dd03_cropYieldsD3       040000 tree  63b5de06defae6e8524e40d7942ba61ed2a599fc    NA https://api.github.com/repos/thedivta…
4 dd04_digitsPi           040000 tree  0f7e99e051a41fef6cde2706e6ded7538c3b6f8f    NA https://api.github.com/repos/thedivta…
5 dd05_indiaR             040000 tree  2bd552a7f9918602df043d54130c3d0b33d27294    NA https://api.github.com/repos/thedivta…
6 dd06_ttDrWho            040000 tree  7a0e0461228e0b39ba202032e09b46ba49e7eab6    NA https://api.github.com/repos/thedivta…
7 dd07_ggWaves            040000 tree  06def68fecdf576d45efa16ba6d98bba27edcc8f    NA https://api.github.com/repos/thedivta…
8 resources               040000 tree  ed4cb9ff46f19764292dea8e655dd4a500389dda    NA https://api.github.com/repos/thedivta…
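If only the URLs are needed, another option is to filter the list itself and pull out the url field, without building the full tibble first. The following is a minimal sketch, not the only way to do it: the response list is a hand-built stand-in for content(req)$tree (an assumption, with two entries copied from the JSON in the question), and urls_df is a hypothetical name for the final data frame.

```r
library(purrr)

# Stand-in for content(req)$tree (assumption: values copied from the question's JSON).
response <- list(
  list(path = ".gitignore", mode = "100644", type = "blob",
       sha = "aaf221658979cc888d499702eea62beafa52d4e4", size = 631L,
       url = "https://api.github.com/repos/thedivtagguy/daily-data/git/blobs/aaf221658979cc888d499702eea62beafa52d4e4"),
  list(path = "dd01_kharifAndRabiCrops", mode = "040000", type = "tree",
       sha = "b0863850cf04b73f76a8ed1f60558c6d340d142a",
       url = "https://api.github.com/repos/thedivtagguy/daily-data/git/trees/b0863850cf04b73f76a8ed1f60558c6d340d142a")
)

urls <- response %>%
  keep(~ .x$type == "tree") %>%  # keep only the entries whose type is "tree"
  map_chr("url")                 # extract the url field as a character vector

# Wrap the vector in a one-column data frame (urls_df is a hypothetical name).
urls_df <- data.frame(url = urls)
print(urls_df)
```

With the real response, urls should contain one URL per directory. Note also that map_dfr() is marked as superseded in purrr 1.0.0 and later; there, map(response, bind_cols) followed by list_rbind() should give the same tibble as the answer's code.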