Working with factors: reordering, relabeling and recoding in R
I'm having some trouble "factoring" a character variable and labeling/recoding its levels. Is there an efficient way (i.e. less code) to do this with the tidyverse or base R, for example with recode or fct_collapse? Thanks.
#This is what I have (a character variable)
x <- c("No", "Yes", "No2", "No3", "Maybe", "undecided",
"probably", "dont know", NA)
x
#I want a factor with three ordered levels as follows:
#where No = c("No", "No2", "No3")
#Yes = c("Yes")
#other = c("Maybe", "undecided", "probably")
#NA = c("dont know", NA)
# and the levels would be 0 = "No", 1 = "Yes" and 2 = "Maybe"
#that is:
#xfact
# [1] No Yes other
# Levels: No Yes other
#
# as.integer(xfact)
# [1] 0, 1, 2
This should do it:
library(tidyverse)
x <- c("No", "Yes", "No2", "No3", "Maybe", "undecided",
       "probably", "dont know", NA)
na_if(x, "dont know") %>%
  fct_collapse(
    no = c("No", "No2", "No3"),
    yes = c("Yes"),
    other = c("Maybe", "undecided", "probably")
  ) %>%
  fct_inorder()
#> [1] no yes no no other other other <NA> <NA>
#> Levels: no yes other
Created on 2020-01-13 by the reprex package (v0.3.0)
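Since the question also mentions base R and asks for 0/1/2 codes, here is a minimal base R sketch of the same recoding. The `lookup` vector and the `codes` name are my own illustration, not part of the answer above. Note that `as.integer()` on a factor returns 1-based codes, so the sketch subtracts 1 to get 0, 1, 2.

```r
x <- c("No", "Yes", "No2", "No3", "Maybe", "undecided",
       "probably", "dont know", NA)

# Hypothetical lookup table: names are the old values, elements the new labels.
# Anything not listed here (e.g. "dont know") becomes NA when indexed by name.
lookup <- c(
  No = "No", No2 = "No", No3 = "No",
  Yes = "Yes",
  Maybe = "other", undecided = "other", probably = "other"
)

# Recode by name, then fix the level order explicitly.
# ordered = TRUE makes it an ordered factor: No < Yes < other.
xfact <- factor(unname(lookup[x]),
                levels = c("No", "Yes", "other"),
                ordered = TRUE)

# Factor codes start at 1, so shift them down to get 0, 1, 2.
codes <- as.integer(xfact) - 1L

xfact
codes
```

The same `- 1L` shift should work on the factor returned by the `fct_collapse()` pipeline if you prefer to stay in the tidyverse.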