Convert existing dataframe variable to factor in Tidyverse
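I know there are many versions of this question, but I'm looking for a specific solution. When you have an existing character variable in a dataframe, is there an easy way to convert that variable to a factor using tidyverse syntax? For example, the second line of code below does not reorder the factor levels, but the last line does. How do I make the second line work? This would be useful in some situations, such as importing and modifying existing datasets. Many thanks!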
df <- data.frame(x = c(1,2), y = c('post','pre')) %>%
  as_factor(y, levels = c('pre','post'))
df$y <- factor(df$y, levels = c('pre', 'post'))
We can use fct_relevel from forcats:
library(dplyr)
library(forcats)
df1 <- data.frame(x = c(1,2), y = c('post','pre')) %>%
  mutate(y = fct_relevel(y, 'pre', 'post'))
-output
> df1$y
[1] post pre
Levels: pre post
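If the full level order is known up front, base factor() can also be called directly inside mutate(), which is probably the closest to what the second line in the question was attempting. A minimal sketch reusing the example data from above (the name df2 is only for illustration):

```r
library(dplyr)

# Same toy data as above; y starts out as a character column (R >= 4.0 default)
df2 <- data.frame(x = c(1, 2), y = c('post', 'pre')) %>%
  # factor() accepts an explicit `levels` vector, unlike as_factor()
  mutate(y = factor(y, levels = c('pre', 'post')))

levels(df2$y)
```

The values in the column stay post, pre; only the level order changes to pre, post.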
Regarding the use of as_factor, according to the documentation:
Compared to base R, when x is a character, this function creates levels in the order in which they appear, which will be the same on every platform.
i.e. post, followed by pre
> as_factor(c('post','pre'))
[1] post pre
Levels: post pre
whereas the following options will not work, because as_factor has no argument named levels
> as_factor(c('post','pre'), "pre", "post")
Error: 2 components of `...` were not used.
We detected these problematic arguments:
* `..1`
* `..2`
Did you misspecify an argument?
Run `rlang::last_error()` to see where the error occurred.
> as_factor(c('post','pre'), levels = c("pre", "post"))
Error: 1 components of `...` were not used.
We detected these problematic arguments:
* `levels`
Did you misspecify an argument?
Run `rlang::last_error()` to see where the error occurred.
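One way to still use as_factor is to convert first and then reorder with fct_relevel. A short sketch of that two-step approach on a bare character vector (the variable name f is just for illustration):

```r
library(forcats)

# Step 1: as_factor() keeps the levels in order of appearance: post, pre
f <- as_factor(c('post', 'pre'))
# Step 2: fct_relevel() moves the named levels to the front, in the order given
f <- fct_relevel(f, 'pre', 'post')

levels(f)
```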
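Also, in the tidyverse we need to extract the column with pull or .$ before passing it to such a function; otherwise the column has to be modified inside mutate.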
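To illustrate the difference: pull hands the column over as a plain vector and the result is a standalone factor, while mutate replaces the column inside the data frame. A sketch assuming the same example data (the names dat, y_only and df3 are hypothetical):

```r
library(dplyr)
library(forcats)

dat <- data.frame(x = c(1, 2), y = c('post', 'pre'))

# pull() extracts y as a vector, so as_factor() receives a vector, not a data frame
y_only <- dat %>%
  pull(y) %>%
  as_factor()

# mutate() keeps the data frame and replaces y with the releveled factor
df3 <- dat %>%
  mutate(y = fct_relevel(as_factor(y), 'pre', 'post'))

levels(y_only)
levels(df3$y)
```

Here y_only keeps the order of appearance (post, pre), whereas df3$y has the requested order (pre, post).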
We can also use relevel:
library(dplyr)
df <- data.frame(x = c(1,2), y = c('post','pre'))
# relevel() takes a single reference level and moves it to the front
df <- df %>%
  mutate(y = relevel(as.factor(y), ref = 'pre'))
df
df$y
levels(df$y)
> df
  x    y
1 1 post
2 2  pre
> df$y
[1] post pre
Levels: pre post
> levels(df$y)
[1] "pre" "post"
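Note that base relevel only moves one reference level to the front; the remaining levels keep their existing order. With more than two levels this matters. A sketch with a made-up three-level vector (the value 'followup' and the names g, g_ref and g_full are not from the original question):

```r
# Hypothetical three-level example; factor() sorts character levels alphabetically
g <- factor(c('post', 'followup', 'pre'))
levels(g)        # "followup" "post" "pre"

# relevel() moves only the reference level to the first position
g_ref <- relevel(g, ref = 'pre')
levels(g_ref)    # "pre" "followup" "post"

# For a complete custom order, pass every level to factor()
g_full <- factor(g, levels = c('pre', 'post', 'followup'))
levels(g_full)   # "pre" "post" "followup"
```

For a two-level variable such as pre/post, both approaches give the same result.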