Tidyverse Solution to Get Counts Based on Filtering Different Combinations of Variables
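library(tidyverse)
Using the code below, I would like to use table() or dplyr to get counts of the Sat variables (Q1Sat, Q2Sat, Q3Sat). However, Q1Sat is tied to the variable Q1Used, Q2Sat to Q2Used, and Q3Sat to Q3Used. For each combination I want to filter out "No" in the Used variable, as well as "No" in the House variable.
So, for example, to get the counts for Q1Sat I need to filter out "No" in both Q1Used and House. For Q2Sat I need to filter out "No" in Q2Used and House, and for Q3Sat I need to filter out "No" in Q3Used and House.
What is a simple way to do this with the Tidyverse (with the least amount of code)? If needed, I am happy to use the latest versions of the Tidyverse packages, including the development version of dplyr.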
Q1Sat<-c("Neutral","Neutral","VSat","Sat","Neutral","Sat","VDis","Sat","Sat","VSat")
Q2Sat<-c("Neutral","VSat","Dis","Dis","VDis","Sat","Sat","VSat","Neutral","Dis")
Q3Sat<-c("Sat","Sat","Diss","Neutral","VSat","VDis","Sat","Sat","Sat","Neutral")
Q3Used<-c("Yes","No","Yes","Yes","Yes","Yes","Yes","Yes","Yes","No")
Q2Used<-c("Yes","Yes","Yes","Yes","No","No","Yes","Yes","Yes","Yes")
Q1Used<-c("Yes","Yes","Yes","No","No","Yes","Yes","Yes","No","Yes")
House<-c("Yes","No","Unsure","Yes","Yes","No","Unsure","Unsure","Yes","Yes")
Test <- tibble(Q1Sat, Q2Sat, Q3Sat, Q1Used, Q2Used, Q3Used, House)
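# Keep each Sat value only where the matching Used column is "Yes",
# then tabulate each of the three resulting columns
Test %>%
  mutate(q1 = ifelse(Q1Used == "Yes", Q1Sat, NA),
         q2 = ifelse(Q2Used == "Yes", Q2Sat, NA),
         q3 = ifelse(Q3Used == "Yes", Q3Sat, NA)) %>%
  select(q1:q3) %>%
  sapply(table)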
$q1
Neutral     Sat    VDis    VSat
      2       2       1       2

$q2
    Dis Neutral     Sat    VSat
      3       2       1       2

$q3
   Diss Neutral     Sat    VDis    VSat
      1       1       4       1       1
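The answer above filters only on the Used columns. Below is a minimal sketch of how the House condition from the question could also be applied in tidyverse style. It assumes tidyr 1.0.0 or later for pivot_longer(); the object names `counts` and `wide` are illustrative and are not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Sample data from the question
Test <- tibble(
  Q1Sat  = c("Neutral","Neutral","VSat","Sat","Neutral","Sat","VDis","Sat","Sat","VSat"),
  Q2Sat  = c("Neutral","VSat","Dis","Dis","VDis","Sat","Sat","VSat","Neutral","Dis"),
  Q3Sat  = c("Sat","Sat","Diss","Neutral","VSat","VDis","Sat","Sat","Sat","Neutral"),
  Q1Used = c("Yes","Yes","Yes","No","No","Yes","Yes","Yes","No","Yes"),
  Q2Used = c("Yes","Yes","Yes","Yes","No","No","Yes","Yes","Yes","Yes"),
  Q3Used = c("Yes","No","Yes","Yes","Yes","Yes","Yes","Yes","Yes","No"),
  House  = c("Yes","No","Unsure","Yes","Yes","No","Unsure","Unsure","Yes","Yes")
)

counts <- Test %>%
  filter(House != "No") %>%                       # drop House == "No" once, for all questions
  pivot_longer(
    cols = matches("^Q[0-9]+(Sat|Used)$"),
    names_to = c("Qs", ".value"),                 # "Q1Sat" -> Qs = "Q1", value goes to column Sat
    names_pattern = "^(Q[0-9]+)(Sat|Used)$"
  ) %>%
  filter(Used == "Yes") %>%                       # keep only rows where that question was used
  count(Qs, Sat)

# Optional: one row per question, one column per answer level
wide <- counts %>%
  pivot_wider(names_from = Sat, values_from = n, values_fill = list(n = 0L))
```

Because each Sat column is paired with its own Used column during the reshape, a single filter(Used == "Yes") covers all three questions, and further Qn pairs would be picked up without extra code.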
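Here is an option using data.table. We convert the 'data.frame' to a 'data.table' (setDT(Test)), reshape it to 'long' format by specifying patterns in melt, group by 'Qs' and 'Sat', get the count of rows where 'Used' is 'Yes', and reshape the result back to 'wide' format.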
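library(data.table)
# Melt the Sat/Used pairs to long format, count rows with Used == "Yes"
# per question and answer level, then cast back to wide
dcast(melt(setDT(Test), measure.vars = patterns("Sat", "Used"),
     value.name = c("Sat", "Used"), variable.name = "Qs")[
     Used == "Yes", .N, by = .(Qs, Sat)], Qs ~ Sat, value.var = "N", fill = 0)[,
     Qs := paste0("Q", Qs)][]   # relabel 1, 2, 3 as Q1, Q2, Q3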
#   Qs Dis Diss Neutral Sat VDis VSat
#1: Q1   0    0       2   2    1    2
#2: Q2   3    0       2   1    0    2
#3: Q3   0    1       1   4    1    1
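We can also do this more compactly with base R (the column indexing below assumes Test is still a data.frame or tibble, i.e. setDT(Test) has not been run on it)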
un1 <- unique(unlist(Test[1:3]))
t(mapply(function(x,y) table(factor(x[y == "Yes"], levels = un1)), Test[1:3], Test[4:6]))
or even more compactly
table(col(Test[1:3]), unlist(replace(Test[1:3], Test[4:6]!= "Yes", NA)))
#   Dis Diss Neutral Sat VDis VSat
#1    0    0       2   2    1    2
#2    3    0       2   1    0    2
#3    0    1       1   4    1    1
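The base R versions above, like the other answers, condition only on the Used columns. Below is a hedged sketch of how the mapply() approach might be extended to also drop rows where House is "No", as the question asks. The names `keep_house`, `lv` and `res` are illustrative, and the data is rebuilt as a plain data.frame so the block runs on its own.

```r
# Sample data from the question, as a plain data.frame
Test <- data.frame(
  Q1Sat  = c("Neutral","Neutral","VSat","Sat","Neutral","Sat","VDis","Sat","Sat","VSat"),
  Q2Sat  = c("Neutral","VSat","Dis","Dis","VDis","Sat","Sat","VSat","Neutral","Dis"),
  Q3Sat  = c("Sat","Sat","Diss","Neutral","VSat","VDis","Sat","Sat","Sat","Neutral"),
  Q1Used = c("Yes","Yes","Yes","No","No","Yes","Yes","Yes","No","Yes"),
  Q2Used = c("Yes","Yes","Yes","Yes","No","No","Yes","Yes","Yes","Yes"),
  Q3Used = c("Yes","No","Yes","Yes","Yes","Yes","Yes","Yes","Yes","No"),
  House  = c("Yes","No","Unsure","Yes","Yes","No","Unsure","Unsure","Yes","Yes"),
  stringsAsFactors = FALSE
)

keep_house <- Test$House != "No"            # row-level condition shared by all questions
lv <- sort(unique(unlist(Test[1:3])))       # common set of answer levels

# For each Sat/Used pair, tabulate Sat where Used is "Yes" and House is not "No"
res <- t(mapply(
  function(sat, used) table(factor(sat[used == "Yes" & keep_house], levels = lv)),
  Test[1:3], Test[4:6]
))
```

Fixing the factor levels up front gives every question the same columns, so levels with no remaining responses show up as zero rather than being dropped.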