Is there a way to create a new column based on the values of another one using dplyr in R?
I have been using base R, but I would like to use dplyr. This is what I have been doing:
data$newvariable <- 0
data$newvariable[data$oldvariable=="happy"] <- "good"
data$newvariable[data$oldvariable=="unhappy"] <- "bad"
data$newvariable[data$oldvariable=="depressed"] <- "super_bad"
In dplyr, we can use case_when to assign new values to newvariable based on oldvariable.
library(dplyr)

data = data.frame(
  oldvariable = c("happy", "unhappy", "depressed")
)

data %>%
  mutate(newvariable = case_when(
    oldvariable == "happy" ~ "good",
    oldvariable == "unhappy" ~ "bad",
    oldvariable == "depressed" ~ "super_bad"
  ))
#>   oldvariable newvariable
#> 1       happy        good
#> 2     unhappy         bad
#> 3   depressed   super_bad
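One thing worth knowing about case_when is that any row matching none of the conditions gets NA. The sketch below adds a catch-all branch to make that explicit. The extra value "excited" and the label "unknown" are made up for illustration and are not part of the original question.

```r
library(dplyr)

# Hypothetical data: "excited" has no matching rule below
data <- data.frame(
  oldvariable = c("happy", "unhappy", "depressed", "excited")
)

result <- data %>%
  mutate(newvariable = case_when(
    oldvariable == "happy"     ~ "good",
    oldvariable == "unhappy"   ~ "bad",
    oldvariable == "depressed" ~ "super_bad",
    TRUE                       ~ "unknown"  # catch-all; without it, "excited" would map to NA
  ))

result$newvariable
#> [1] "good"      "bad"       "super_bad" "unknown"
```

Conditions are evaluated in order and the first match wins, so the `TRUE ~ ...` branch has to come last.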
If the old variable is a factor and you don't mind the new variable being one as well:
library(dplyr)

set.seed(111)
data = data.frame(
  oldvariable = sample(c("happy", "unhappy", "depressed"), 10, replace = TRUE)
)

data %>%
  mutate(newvariable = recode_factor(oldvariable,
    "happy" = "good", "unhappy" = "bad", "depressed" = "super_bad"))
   oldvariable newvariable
1      unhappy         bad
2    depressed   super_bad
3    depressed   super_bad
4    depressed   super_bad
5        happy        good
6    depressed   super_bad
7        happy        good
8    depressed   super_bad
9      unhappy         bad
10       happy        good
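To see what "the new variable being a factor" means in practice, here is a small sketch that inspects the result of recode_factor. It uses a fixed three-row data frame (an assumption, chosen so the result does not depend on the random sample above) and checks the class and level order of the new column.

```r
library(dplyr)

# Fixed input instead of a random sample, so the result is deterministic
data <- data.frame(
  oldvariable = factor(c("unhappy", "depressed", "happy"))
)

result <- data %>%
  mutate(newvariable = recode_factor(oldvariable,
    "happy" = "good", "unhappy" = "bad", "depressed" = "super_bad"))

# The new column is a factor, with levels in the order the replacements were given
is.factor(result$newvariable)
#> [1] TRUE
levels(result$newvariable)
#> [1] "good"      "bad"       "super_bad"
```

If a plain character column is preferred, wrapping the call in as.character() or using the case_when approach avoids the factor.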