Get counts of categorical factors across multiple variables/columns in R

Question

I have the following minimal example in R:

testing = data.frame(c("Once a week", "Once a week", "Rarely", "Once a month", "Once a month"), c("Once a month", "Once a month", "Once a week", "Rarely", "Rarely"))
colnames(testing) = c("one", "two")
testing

           one          two
1  Once a week Once a month
2  Once a week Once a month
3       Rarely  Once a week
4 Once a month       Rarely
5 Once a month       Rarely

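I would like the end result to be a data frame in which one column holds every possible category and the remaining columns hold the counts for each column/variable, like this: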

categories    one    two
Rarely        1      2
Once a month  2      2
Once a week   2      1

I have no restrictions on R libraries, so whatever is simplest here is fine (perhaps plyr/dplyr?).

Thanks.

Answer 1

You can tidy your table with the tidyr and dplyr packages and then count the categories with the base table() function:

testing = data.frame(c("Once a week", "Once a week", "Rarely", "Once a month", "Once a month"), c("Once a month", "Once a month", "Once a week", "Rarely", "Rarely"))
colnames(testing) = c("one", "two")
testing
#>            one          two
#> 1  Once a week Once a month
#> 2  Once a week Once a month
#> 3       Rarely  Once a week
#> 4 Once a month       Rarely
#> 5 Once a month       Rarely

library(tidyr)
library(dplyr)

testing %>%
  gather("type", "categories") %>%
  table()
#>      categories
#> type  Once a month Once a week Rarely
#>   one            2           2      1
#>   two            2           1      2

# or reorder the columns before calling table()
testing %>%
  gather("type", "categories") %>%
  select(categories, type) %>%
  table()
#>               type
#> categories     one two
#>   Once a month   2   2
#>   Once a week    2   1
#>   Rarely         1   2

Answer 2

Here is another approach using tidyr::gather, tidyr::spread, and dplyr::count:

library(dplyr)
library(tidyr)

testing %>%
  gather(measure, value) %>%
  count(measure, value) %>%
  spread(measure, n)

# Source: local data frame [3 x 3]
# 
#          value   one   two
#          (chr) (int) (int)
# 1 Once a month     2     2
# 2  Once a week     2     1
# 3       Rarely     1     2

Also, see this fantastic gist on the topic.
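In tidyr 1.0.0 and later, gather() and spread() are superseded by pivot_longer() and pivot_wider(). The following is a minimal sketch of the same reshape-count-reshape idea with the newer functions, assuming a recent tidyr and dplyr are installed. The column names "variable" and "categories" and the values_fill = 0 choice are illustrative and not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question
testing <- data.frame(
  one = c("Once a week", "Once a week", "Rarely", "Once a month", "Once a month"),
  two = c("Once a month", "Once a month", "Once a week", "Rarely", "Rarely")
)

result <- testing %>%
  # Stack all columns into (variable, categories) pairs
  pivot_longer(everything(), names_to = "variable", values_to = "categories") %>%
  # Count each category within each original column
  count(variable, categories) %>%
  # One column of counts per original variable; categories missing
  # from a column get 0 instead of NA
  pivot_wider(names_from = variable, values_from = n, values_fill = 0)

result
```

The result is a tibble with a categories column followed by one count column per original variable, which is the shape asked for in the question.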

Answer 3

table() works without any external packages:

sapply(testing, table)
#             one two
#Once a month   2   2
#Once a week    2   1
#Rarely         1   2
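One caveat: sapply(testing, table) only simplifies to a matrix when every column contains the same set of categories. If a category is missing from one column, the result is a list instead. A possible base R sketch that guards against this, and also returns a data frame with a categories column as the question asks, is shown below. The helper names lv, counts, and result are illustrative only.

```r
# Same example data as in the question
testing <- data.frame(
  one = c("Once a week", "Once a week", "Rarely", "Once a month", "Once a month"),
  two = c("Once a month", "Once a month", "Once a week", "Rarely", "Rarely")
)

# Every category that appears in any column
lv <- sort(unique(unlist(lapply(testing, as.character))))

# Count each column against the shared set of levels, so categories
# absent from a column are counted as 0 and the rows always line up
counts <- sapply(testing, function(x) table(factor(x, levels = lv)))

# Move the row names into a regular column
result <- data.frame(categories = rownames(counts), counts, row.names = NULL)
result
```

Using a common factor level set is what keeps the rows aligned across columns; without it, the match between rows depends on each column happening to contain the same categories.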

Tags: r, plyr, dplyr