R: sort ddply summarise output by group sum

Question

I have a data.frame like this:

x <- data.frame(Category=factor(c("One", "One", "Four", "Two","Two",
"Three", "Two", "Four","Three")),
City=factor(c("D","A","B","B","A","D","A","C","C")),
Frequency=c(10,1,5,2,14,8,20,3,5))

  Category City Frequency
1      One    D        10
2      One    A         1
3     Four    B         5
4      Two    B         2
5      Two    A        14
6    Three    D         8
7      Two    A        20
8     Four    C         3
9    Three    C         5

I want to create a pivot table of sum(Frequency) using the ddply function, like this:

library(plyr)
ddply(x, .(Category, City), summarize, Total = sum(Frequency))
  Category City Total
1     Four    B     5
2     Four    C     3
3      One    A     1
4      One    D    10
5    Three    C     5
6    Three    D     8
7      Two    A    34
8      Two    B     2

But I need this result sorted by the total of each Category group, and by Total within each group. Like this:

Category City Frequency
1      Two    A        34
2      Two    B         2
3    Three    D         8
4    Three    C         5
5      One    D        10
6      One    A         1
7     Four    B         5
8     Four    C         3

I have looked at and tried sort, order, and arrange, but nothing seems to do what I need. How can I do this in R?

Answer 1

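This is a nice question, and I can't think of a more direct way than creating a total-size index per Category and then sorting by it. Here is a possible data.table approach. It uses the setorder function, which sorts the data by reference: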

library(data.table)
Res <- setDT(x)[, .(Total = sum(Frequency)), by = .(Category, City)]
setorder(Res[, size := sum(Total), by = Category], -size, -Total, Category)[]
#    Category City Total size
# 1:      Two    A    34   36
# 2:      Two    B     2   36
# 3:    Three    D     8   13
# 4:    Three    C     5   13
# 5:      One    D    10   11
# 6:      One    A     1   11
# 7:     Four    B     5    8
# 8:     Four    C     3    8

Or, if you are deep in the Hadleyverse, we can reach a similar result with the newer dplyr package (as suggested by @akrun):

library(dplyr)
x %>% 
  group_by(Category, City) %>% 
  summarise(Total = sum(Frequency)) %>% 
  mutate(size= sum(Total)) %>% 
  ungroup %>%
  arrange(-size, -Total, Category)
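
The pipeline above can also be written so that it runs on its own and drops the helper column at the end. This is only a sketch under a few assumptions: it rebuilds x with plain character columns, it uses the .groups argument of summarise (which assumes dplyr 1.0.0 or later), and it puts Category ahead of Total in arrange so that two categories with equal group sums cannot interleave. The name res is just an illustrative choice.

```r
library(dplyr)

# Same data as in the question (character columns instead of factors)
x <- data.frame(
  Category  = c("One", "One", "Four", "Two", "Two", "Three", "Two", "Four", "Three"),
  City      = c("D", "A", "B", "B", "A", "D", "A", "C", "C"),
  Frequency = c(10, 1, 5, 2, 14, 8, 20, 3, 5)
)

res <- x %>%
  group_by(Category, City) %>%
  # "drop_last" keeps the result grouped by Category (assumes dplyr >= 1.0.0)
  summarise(Total = sum(Frequency), .groups = "drop_last") %>%
  # size = sum of Total within each Category, repeated on every row of it
  mutate(size = sum(Total)) %>%
  ungroup() %>%
  # largest group first, then keep each Category together, then largest Total first
  arrange(desc(size), Category, desc(Total)) %>%
  # the helper column is only needed for sorting
  select(-size)

print(res)
```

For this data the group sums are all different, so the ordering matches the arrange(-size, -Total, Category) call above; the two only differ when group sums tie.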

Answer 2

Here is a base R version, where DF is the result of your ddply call:

with(DF, DF[order(-ave(Total, Category, FUN=sum), Category, -Total), ])

This produces:

  Category City Total
7      Two    A    34
8      Two    B     2
6    Three    D     8
5    Three    C     5
4      One    D    10
3      One    A     1
1     Four    B     5
2     Four    C     3

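The logic is basically the same as David's: compute the sum of Total for each Category, use that number for every row in that Category (which is what ave(..., FUN=sum) does), and then order by it, plus a couple of tie-breakers so the result comes out as expected.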
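
To make the two steps visible, here is a minimal base R sketch that splits the one-liner apart. It assumes aggregate as a stand-in for the ddply call so that nothing outside base R is needed, and the names size and sorted are just illustrative.

```r
# Same data as in the question (character columns instead of factors)
x <- data.frame(
  Category  = c("One", "One", "Four", "Two", "Two", "Three", "Two", "Four", "Three"),
  City      = c("D", "A", "B", "B", "A", "D", "A", "C", "C"),
  Frequency = c(10, 1, 5, 2, 14, 8, 20, 3, 5)
)

# Stand-in for the ddply call (assumption: aggregate replaces plyr here)
DF <- aggregate(Frequency ~ Category + City, data = x, FUN = sum)
names(DF)[names(DF) == "Frequency"] <- "Total"

# Step 1: per-Category sum of Total, repeated on every row of that Category
DF$size <- ave(DF$Total, DF$Category, FUN = sum)

# Step 2: order by group sum (descending), then Category, then Total (descending)
sorted <- DF[order(-DF$size, DF$Category, -DF$Total), ]

# Drop the helper column and reset the row names
sorted$size <- NULL
rownames(sorted) <- NULL

print(sorted)
```

Keeping size as a real column while debugging makes it easy to check that each Category carries a single group total before the sort is applied.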

