是否有一个函数可以在 R 中的 table 内获得多个频率 tables
Is there a function to get multiple frequency tables within a table in R
我想在单个 table 内获得多个频率 tables。
这是我的数据:
df<-read.table(text=" group score night ticket book gender course
A Y 1 0 0 Male M
A Y 1 0 0 Female N
A N 1 1 1 Female N
A Y 2 1 1 Female M
A Y 2 1 1 Male N
A Y 2 0 0 Female N
A N 3 1 0 Male N
B N 3 1 1 Female N
B N 1 0 1 Female M
B Y 1 0 1 Female M
",header=TRUE)
输出为:
Frequency Percent
Group
A 7 70
B 3 30
score
Y 4 40
N 6 60
night
1 5 50
2 3 30
3 2 20
book
0 4 40
1 6 60
gender
Female 7 70
Male 3 30
course
M 4 40
N 6 60
我使用了以下代码:
df%>%
group_by( group, score, night, ticket, book, gender, course) %>%
summarise(n = n()) %>%
mutate(freq = n / sum(n)
但它没有用。
一个通用的解决方案是在框架的每一列上应用 table
函数。通常 table
returns 一个命名向量,但你想要一个更像框架的表示,所以我们将用 as.data.frame.table
.
来扩充它
lst2 <- lapply(df, function(x) {
out <- as.data.frame.table(table(x))
out$Pct <- 100*out$Freq/sum(out$Freq)
out
})
# or code-golf:
# lapply(df, function(x) transform(as.data.frame.table(table(x)), Pct = 100*Freq/sum(Freq)))
lst2
# $group
# x Freq Pct
# 1 A 7 70
# 2 B 3 30
# $score
# x Freq Pct
# 1 N 4 40
# 2 Y 6 60
# $night
# x Freq Pct
# 1 1 5 50
# 2 2 3 30
# 3 3 2 20
# $ticket
# x Freq Pct
# 1 0 5 50
# 2 1 5 50
# $book
# x Freq Pct
# 1 0 4 40
# 2 1 6 60
# $gender
# x Freq Pct
# 1 Female 7 70
# 2 Male 3 30
# $course
# x Freq Pct
# 1 M 4 40
# 2 N 6 60
您可以将所有这些元素与类似的东西结合起来:
do.call(rbind, c(Map(cbind, nm=names(lst2), lst2), list(make.row.names = FALSE)))
# nm x Freq Pct
# 1 group A 7 70
# 2 group B 3 30
# 3 score N 4 40
# 4 score Y 6 60
# 5 night 1 5 50
# 6 night 2 3 30
# 7 night 3 2 20
# 8 ticket 0 5 50
# 9 ticket 1 5 50
# 10 book 0 4 40
# 11 book 1 6 60
# 12 gender Female 7 70
# 13 gender Male 3 30
# 14 course M 4 40
# 15 course N 6 60
已编辑 以默认删除行名称。
带有 tidyverse
的选项将是
library(purrr)
library(dplyr)
map(names(df), ~ df %>%
count(!!rlang::sym(.x)) %>%
mutate(Pct = 100 * n/sum(n)))
我想在单个 table 内获得多个频率 tables。
这是我的数据:
df<-read.table(text=" group score night ticket book gender course
A Y 1 0 0 Male M
A Y 1 0 0 Female N
A N 1 1 1 Female N
A Y 2 1 1 Female M
A Y 2 1 1 Male N
A Y 2 0 0 Female N
A N 3 1 0 Male N
B N 3 1 1 Female N
B N 1 0 1 Female M
B Y 1 0 1 Female M
",header=TRUE)
输出为:
Frequency Percent
Group
A 7 70
B 3 30
score
Y 4 40
N 6 60
night
1 5 50
2 3 30
3 2 20
book
0 4 40
1 6 60
gender
Female 7 70
Male 3 30
course
M 4 40
N 6 60
我使用了以下代码:
df%>%
group_by( group, score, night, ticket, book, gender, course) %>%
summarise(n = n()) %>%
mutate(freq = n / sum(n)
但它没有用。
一个通用的解决方案是在框架的每一列上应用 table
函数。通常 table
returns 一个命名向量,但你想要一个更像框架的表示,所以我们将用 as.data.frame.table
.
lst2 <- lapply(df, function(x) {
out <- as.data.frame.table(table(x))
out$Pct <- 100*out$Freq/sum(out$Freq)
out
})
# or code-golf:
# lapply(df, function(x) transform(as.data.frame.table(table(x)), Pct = 100*Freq/sum(Freq)))
lst2
# $group
# x Freq Pct
# 1 A 7 70
# 2 B 3 30
# $score
# x Freq Pct
# 1 N 4 40
# 2 Y 6 60
# $night
# x Freq Pct
# 1 1 5 50
# 2 2 3 30
# 3 3 2 20
# $ticket
# x Freq Pct
# 1 0 5 50
# 2 1 5 50
# $book
# x Freq Pct
# 1 0 4 40
# 2 1 6 60
# $gender
# x Freq Pct
# 1 Female 7 70
# 2 Male 3 30
# $course
# x Freq Pct
# 1 M 4 40
# 2 N 6 60
您可以将所有这些元素与类似的东西结合起来:
do.call(rbind, c(Map(cbind, nm=names(lst2), lst2), list(make.row.names = FALSE)))
# nm x Freq Pct
# 1 group A 7 70
# 2 group B 3 30
# 3 score N 4 40
# 4 score Y 6 60
# 5 night 1 5 50
# 6 night 2 3 30
# 7 night 3 2 20
# 8 ticket 0 5 50
# 9 ticket 1 5 50
# 10 book 0 4 40
# 11 book 1 6 60
# 12 gender Female 7 70
# 13 gender Male 3 30
# 14 course M 4 40
# 15 course N 6 60
已编辑 以默认删除行名称。
带有 tidyverse
的选项将是
library(purrr)
library(dplyr)
map(names(df), ~ df %>%
count(!!rlang::sym(.x)) %>%
mutate(Pct = 100 * n/sum(n)))