How to arrange nested data (i.e., data with parenting) in R?

Question

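I have a dataset with several levels:

Level 1: Category (e.g., "Countries")
Level 2: Country (e.g., "USA")
Level 3: City (e.g., "New York")
Level 4: County (e.g., "Manhattan")
Level 5: Place (e.g., "Times Square")

Every row (except the Level 1 entry) is linked to a parent one level up.

For example: Times Square -> Manhattan -> New York -> USA -> Countries

My question: how can I sort this dataset: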

df2 <- structure(list(ID = c(3,6,9,11,12,19,41,50,77,83,105),
                      Parent = c(12,12,77,105,19,NA,3,41,19,77,19),
                      Level = c(3,3,3,3,2,1,4,5,2,3,2),
                      Name = c("New York","Boston","Oxford","Vancouver","USA","Countries",
                               "Manhattan","Times Square","UK","London","Canada")),
                 class = "data.frame",
                 row.names = c(NA, -11L))

into this:

df2 <- structure(list(ID = c(19,12,3,41,50,6,77,83,9,105,11),
                      Parent = c(NA,19,12,3,41,12,19,77,77,19,105),
                      Level = c(1,2,3,4,5,3,2,3,3,2,3),
                      Name = c("Countries","USA","New York","Manhattan","Times Square",
                               "Boston","UK","London","Oxford","Canada","Vancouver")),
                 class = "data.frame",
                 row.names = c(NA, -11L))

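In the desired df2, the rows start at the top level, but every child row sits directly below its parent, followed by its own children.

I tried several variations of dplyr::arrange() (e.g., arrange(Level, Parent)), but none of them account for the nesting. I think the solution might be a combination of group_by() and arrange(, .by_group = TRUE), as done here (R, dplyr - combination of group_by() and arrange() does not produce expected result?). Unfortunately, I could not work it out myself.

Can anyone help? A tidyverse/dplyr solution would be preferred :-)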

Answer 1

Here is a solution using igraph::dfs:

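library(igraph)
library(dplyr)  # for left_join()

# Build a directed graph of Parent -> ID edges; na.omit() drops the root row (Parent is NA)
g <- with(na.omit(df2), graph_from_data_frame(cbind(Parent, ID), directed = TRUE))

# Visit the vertices depth-first from the root (ID 19), then join the other columns back on
data.frame(ID = as.integer(names(dfs(g, root = "19")$order))) |>
  left_join(df2)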
##> + Joining, by = "ID"
##>     ID Parent Level         Name
##> 1   19     NA     1    Countries
##> 2   12     19     2          USA
##> 3    3     12     3     New York
##> 4   41      3     4    Manhattan
##> 5   50     41     5 Times Square
##> 6    6     12     3       Boston
##> 7   77     19     2           UK
##> 8    9     77     3       Oxford
##> 9   83     77     3       London
##> 10 105     19     2       Canada
##> 11  11    105     3    Vancouver
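
If you would rather not depend on igraph, the same depth-first ordering can be sketched with a small recursive helper in base R. This is only a rough sketch, not part of the answer above: preorder_ids() is a hypothetical helper name, it assumes the data has exactly one root row (the one whose Parent is NA) and no cycles, and it visits siblings in the order they appear in the data frame. The data is repeated here so the block runs on its own, using ID 41 for Manhattan.

```r
# Input data from the question (ID 41 is Manhattan, the parent of Times Square)
df2 <- data.frame(
  ID     = c(3, 6, 9, 11, 12, 19, 41, 50, 77, 83, 105),
  Parent = c(12, 12, 77, 105, 19, NA, 3, 41, 19, 77, 19),
  Level  = c(3, 3, 3, 3, 2, 1, 4, 5, 2, 3, 2),
  Name   = c("New York", "Boston", "Oxford", "Vancouver", "USA", "Countries",
             "Manhattan", "Times Square", "UK", "London", "Canada")
)

# Hypothetical helper: return the IDs of `node` and all its descendants in
# depth-first (pre-order) sequence. Children are visited in row order.
preorder_ids <- function(data, node) {
  children <- data$ID[!is.na(data$Parent) & data$Parent == node]
  c(node, unlist(lapply(children, function(child) preorder_ids(data, child))))
}

# Assumption: exactly one root row, identified by a missing Parent
root <- df2$ID[is.na(df2$Parent)]

# Reorder the rows to follow the pre-order sequence of IDs
ordered <- df2[match(preorder_ids(df2, root), df2$ID), ]
rownames(ordered) <- NULL
ordered
```

Because siblings are taken in row order, this should give the same row order as the igraph output above (Oxford before London). If you need a specific sibling order, sort df2 first, for example with dplyr::arrange(), and then apply the helper.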

r

dplyr

tidyverse