How to use mutate to define a new variable based on "if, then" conditions?

Question

For example, suppose I have a simple dataset like this:

  student numerical_score
1     tom            84.7
2   betty            77.3
3    jose            91.5

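I want to use mutate from dplyr to create an additional variable called "letter_grade" that assigns a letter grade based on the value in "numerical_score". For example, tom would get a B, betty a C+, and jose an A-. I can use mutate to create a variable based on a single condition, but I'm not sure how to do it here. Any tips on how to write that code? Thanks in advance.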

Answer 1

You can use cut/findInterval -

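library(dplyr)

# cut() bins each score into an interval and labels it.
# Intervals are right-closed by default, e.g. (80, 90] -> 'B'.
df <- df %>%
  mutate(letter_grade = cut(numerical_score,
                            breaks = c(0, 40, 60, 80, 90, 95, 100),
                            labels = c('F', 'D', 'C+', 'B', 'A-', 'A+')))
df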

#  student numerical_score letter_grade
#1     tom            84.7            B
#2   betty            77.3           C+
#3    jose            91.5           A-

Here, we classify 0-40 as 'F', 40-60 as 'D', 60-80 as 'C+', and so on. You can change the breaks and labels according to your exact values.
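The findInterval route mentioned above is not shown, so here is a minimal sketch of how it might look. The data frame is rebuilt from the question; the names breaks, grades, and df_fi are made up for illustration. Note that findInterval uses left-closed bands by default, so a score of exactly 80 lands in 'B' here, whereas cut with its default right-closed intervals would put it in 'C+'. The sketch also assumes no score is below 0, since those would get index 0 and break the lookup.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(student = c("tom", "betty", "jose"),
                 numerical_score = c(84.7, 77.3, 91.5))

# Lower bound of each grade band and the label for that band
# (illustrative names, same cut points as the cut() example)
breaks <- c(0, 40, 60, 80, 90, 95)
grades <- c("F", "D", "C+", "B", "A-", "A+")

# findInterval() returns the index of the band each score falls into,
# which is then used to look up the matching label
df_fi <- df %>%
  mutate(letter_grade = grades[findInterval(numerical_score, breaks)])

df_fi
```

Unlike cut, this gives a character column rather than a factor, which may or may not be what you want downstream.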

Another option is to assign each grade individually based on conditions in case_when -

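# Conditions are checked top to bottom; the first TRUE one wins,
# and TRUE ~ 'F' is the fallback for everything else.
df <- df %>%
  mutate(letter_grade = case_when(numerical_score > 95 ~ 'A+', 
                                  numerical_score > 90 ~ 'A-', 
                                  numerical_score > 80 ~ 'B', 
                                  numerical_score > 60 ~ 'C+', 
                                  numerical_score > 40 ~ 'D', 
                                  TRUE ~ 'F'))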
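The two approaches agree on the sample data, but it may be worth checking the boundaries before relying on either. The sketch below is only an illustration: it assumes the 40/60/80/90/95 cut points used above, and the edge data frame and column names are hypothetical. It shows that cut returns NA for a score of exactly 0 unless include.lowest = TRUE is set, while case_when falls through to 'F'.

```r
library(dplyr)

# Hypothetical boundary scores, not part of the original data
edge <- data.frame(numerical_score = c(0, 40, 80, 100))

brks <- c(0, 40, 60, 80, 90, 95, 100)
labs <- c("F", "D", "C+", "B", "A-", "A+")

edge <- edge %>%
  mutate(
    # Default cut(): intervals are (0, 40], (40, 60], ..., so 0 is excluded
    by_cut = as.character(cut(numerical_score, breaks = brks, labels = labs)),
    # include.lowest = TRUE makes the first interval [0, 40]
    by_cut_lowest = as.character(cut(numerical_score, breaks = brks,
                                     labels = labs, include.lowest = TRUE)),
    # case_when() with strict ">" matches the right-closed intervals
    by_case = case_when(numerical_score > 95 ~ "A+",
                        numerical_score > 90 ~ "A-",
                        numerical_score > 80 ~ "B",
                        numerical_score > 60 ~ "C+",
                        numerical_score > 40 ~ "D",
                        TRUE ~ "F"))

edge
```

With include.lowest = TRUE the cut version and the case_when version give the same grades on these boundary values; scores above 100 would still be NA under cut but 'A+' under case_when.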

r

data-manipulation

dataframe

dplyr