R Frequency Table of Likert Data with Labels
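I'm trying to create a frequency table from multiple variables that share the same answer categories (Likert-type). With three variables (Question 1-3) and 5 answer categories (-- to ++), the result should look like this: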
| | -- | - | ~ | + | ++ |
| --------- | --- | --- | --- | --- | --- |
| Question1 | 5% | 20% | 25% | 30% | 20% |
| Question2 | 15% | 10% | 20% | 25% | 30% |
| Question3 | 10% | 30% | 10% | 30% | 20% |
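I found a working solution that uses functions from the `expss` package, which is very helpful for creating weighted and labelled frequency tables. However, I ran into trouble with labels, because the solution does not seem to work once the variables are labelled:

1) The `expss` solution by @GregoryDemin: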
```r
# The data we'll also use in the examples below.
q1 <- c(2, 2, 3, 3, 3, 4, 4, 4, 5, 5)
q2 <- c(2, 3, 3, 4, 4, 4, 4, 5, 5, 5)
q3 <- c(2, 2, 2, 3, 4, 4, 4, 5, 5, 5)
df <- data.frame(q1, q2, q3)

library(expss)

# add value labels for preserving empty categories
val_lab(df) = autonum(1:5)

res = df
for(each in colnames(df)){
    res = res %>%
        tab_cells(list(each)) %>%
        tab_cols(vars(each)) %>%
        tab_stat_rpct(total_row_position = "none")
}
res = res %>% tab_pivot()

# add percentage sign
recode(res[,-1]) = other ~ function(x) ifelse(is.na(x), NA, paste0(round(x, 0), "%"))
res
```
Output:
| | 1 | 2 | 3 | 4 | 5 |
| -- | -- | --- | --- | --- | --- |
| q1 | | 20% | 30% | 30% | 20% |
| q2 | | 10% | 20% | 40% | 30% |
| q3 | | 30% | 10% | 30% | 30% |
Looks good, although the NAs (the empty cells) should be zeros, shouldn't they? How can we make sure that unused categories show 0% instead of NA?
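The formatting step can be looked at in isolation. The following is only a minimal base-R sketch, assuming that an empty cell arrives as `NA` (as in the output above) and that `NA` there means "no respondent chose this category". `fmt_pct` is a hypothetical helper name, not part of expss.

```r
# Hypothetical helper: turn numeric percentages into labels,
# printing "0%" for cells that are NA because nobody chose the category.
fmt_pct <- function(x) {
    ifelse(is.na(x), "0%", paste0(round(x, 0), "%"))
}

# One table row with an unused first category.
row_q1 <- c(NA, 20, 30, 30, 20)
fmt_pct(row_q1)
```

Mapping `NA` to "0%" is only safe when the row has at least one valid case. For a question nobody answered, every cell is `NA` and "0%" would be misleading.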
2) Now we add some variable and value labels:
```r
q1 <- c(2, 2, 3, 3, 3, 4, 4, 4, 5, 5)
q2 <- c(2, 3, 3, 4, 4, 4, 4, 5, 5, 5)
q3 <- c(2, 2, 2, 3, 4, 4, 4, 5, 5, 5)
df <- data.frame(q1, q2, q3)

library(expss)

# Label variables and categories
# (plain assignment instead of %<>%, so magrittr does not need to be attached)
df = apply_labels(df,
                  q1 = "Question 1",
                  q2 = "Question 2",
                  q3 = "Question 3",
                  q1 = c("strongly agree" = 5, "agree" = 4, "neutral" = 3, "disagree" = 2, "strongly disagree" = 1),
                  q2 = c("strongly agree" = 5, "agree" = 4, "neutral" = 3, "disagree" = 2, "strongly disagree" = 1),
                  q3 = c("strongly agree" = 5, "agree" = 4, "neutral" = 3, "disagree" = 2, "strongly disagree" = 1))

# add value labels for preserving empty categories
#val_lab(df) = autonum(1:5) # we labelled before, so no need for that anymore

# Now for the table
res = df
for(each in colnames(df)){
    res = res %>%
        tab_cells(list(each)) %>%
        tab_cols(vars(each)) %>%
        tab_stat_rpct(total_row_position = "none")
}
res = res %>% tab_pivot()

# add percentage sign
recode(res[,-1]) = other ~ function(x) ifelse(is.na(x), NA, paste0(round(x, 0), "%"))
res
```
Output:
|    | Question 1        |          |         |       |                | Question 2        |          |         |       |                | Question 3        |          |         |       |                |
|    | strongly disagree | disagree | neutral | agree | strongly agree | strongly disagree | disagree | neutral | agree | strongly agree | strongly disagree | disagree | neutral | agree | strongly agree |
| -- | ----------------- | -------- | ------- | ----- | -------------- | ----------------- | -------- | ------- | ----- | -------------- | ----------------- | -------- | ------- | ----- | -------------- |
| q1 |                   | 20%      | 30%     | 30%   | 20%            |                   |          |         |       |                |                   |          |         |       |                |
| q2 |                   |          |         |       |                |                   | 10%      | 20%     | 40%   | 30%            |                   |          |         |       |                |
| q3 |                   |          |         |       |                |                   |          |         |       |                |                   | 30%      | 10%     | 30%   | 30%            |
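The variables are no longer stacked but placed side by side. The working solution seems to break as soon as variable labels are added. Do you know how to prevent this?

That method only works for variables without variable labels. A more general version, which also shows zero percentages: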
```r
q1 <- c(2, 2, 3, 3, 3, 4, 4, 4, 5, 5)
q2 <- c(2, 3, 3, 4, 4, 4, 4, 5, 5, 5)
q3 <- c(2, 2, 2, 3, 4, 4, 4, 5, 5, 5)
df <- data.frame(q1, q2, q3)

library(expss)

# Label variables and categories
# (plain assignment instead of %<>%, so magrittr does not need to be attached)
df = apply_labels(df,
                  q1 = "Question 1",
                  q2 = "Question 2",
                  q3 = "Question 3",
                  q1 = c("strongly agree" = 5, "agree" = 4, "neutral" = 3, "disagree" = 2, "strongly disagree" = 1),
                  q2 = c("strongly agree" = 5, "agree" = 4, "neutral" = 3, "disagree" = 2, "strongly disagree" = 1),
                  q3 = c("strongly agree" = 5, "agree" = 4, "neutral" = 3, "disagree" = 2, "strongly disagree" = 1))

# add value labels for preserving empty categories
#val_lab(df) = autonum(1:5) # we labelled before, so no need for that anymore

# Now for the table
res = df
for(each in colnames(df)){
    res = res %>%
        tab_cells(total(label = "|")) %>% # suppress total label
        tab_cols(unvr(vars(each))) %>% # remove variable label
        tab_stat_rpct(total_row_position = "none", label = var_lab(vars(each))) # use variable label as statistic label
}
res = res %>% tab_pivot()

# add percentage sign; empty cells (NA) are shown as "0%"
recode(res[,-1]) = other ~ function(x) ifelse(is.na(x), "0%", paste0(round(x, 0), "%"))
res
# |            | strongly disagree | disagree | neutral | agree | strongly agree |
# | ---------- | ----------------- | -------- | ------- | ----- | -------------- |
# | Question 1 |                0% |      20% |     30% |   30% |            20% |
# | Question 2 |                0% |      10% |     20% |   40% |            30% |
# | Question 3 |                0% |      30% |     10% |   30% |            30% |
```
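The percentages in that table can be cross-checked without expss. The following is only a base-R sketch and not the method used above: it relies on `factor()` with explicit levels so that unused categories are counted as 0. The names `likert_levels` and `row_pct` are made up for this illustration.

```r
# Same data as in the examples above.
q1 <- c(2, 2, 3, 3, 3, 4, 4, 4, 5, 5)
q2 <- c(2, 3, 3, 4, 4, 4, 4, 5, 5, 5)
q3 <- c(2, 2, 2, 3, 4, 4, 4, 5, 5, 5)
df <- data.frame(q1, q2, q3)

# Hypothetical lookup of answer codes, ordered from 1 to 5.
likert_levels <- c("strongly disagree" = 1, "disagree" = 2, "neutral" = 3,
                   "agree" = 4, "strongly agree" = 5)

# Hypothetical helper: row percentages for one question.
row_pct <- function(x, lv) {
    # Explicit levels keep categories nobody chose,
    # so table() reports a count of 0 for them instead of dropping them.
    counts <- table(factor(x, levels = lv, labels = names(lv)))
    setNames(as.vector(100 * counts / sum(counts)), names(lv))
}

# One row per question, one column per answer category.
pct <- t(sapply(df, row_pct, lv = likert_levels))
rownames(pct) <- c("Question 1", "Question 2", "Question 3")
pct
```

Here the "strongly disagree" column is a true 0 for all three questions, which matches the "0%" cells in the expss output. Weights are not handled in this sketch.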