Mutate a new column using dplyr with values based on multiple conditions; tried lapply but still not working

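I have a data frame with multiple columns, and I want to create a new column in it whose values are based on the Status column.

I am new to R, but I assume this should be possible.

The str() of my data frame is:

My Status column contains a fault code, with values such as 240:12, 05:03:90:312, and so on. Some of these codes are not fault codes, though, only information. So I want to create a new column that says which codes are faults and which are not.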

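I know that codes starting with

"00","01","02","03","04","05","07","08","09","10","11","12","14","15","16","17","20","21","60","240","600"

are not faults; all other codes are fault codes.

The values in Status are of type character.

My attempted solution is: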

dataframe3 %>% 
  mutate(Status_fault = case_when(startsWith(Status,C("00","01","02",
            "03","04","05","07","08","09","10","11",
            "12","14","15","16","17","20","21","60","240","600"))
   ~ "No fault",
    T ~ "fault"))

But this results in

Error: Problem with mutate() input Status_problem. x object not interpretable as a factor i Input Status_problem is case_when(...).

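Does anyone have an idea how to solve this? I have searched all over Stack Overflow, but I have been looking for so long that I feel I can't think anymore...

This question was linked to another question that uses lapply, so I wrote a new attempt: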

dataframe3 %>% 
  mutate(Status_problem = case_when(lapply(c('00','01','02','03','04','05','07','08','09','10','11','12','14','15','16','17','20','21','60','240','600'),starts_with, X = Status)
   ~ "No fault",
    T ~ "fault"))

Unfortunately, this results in:

Error: Problem with mutate() input Status_problem. x c("'c("00", "01", "02", "03", "04", "05", "07", "08", "09", "10", ' is not a function, character or symbol", "' "11", "12", "14", "15", "16", "17", "20", "21", "60", "240", ' is not a function, character or symbol", "' "600")' is not a function, character or symbol") i Input Status_problem is case_when(...).

Does anyone see what I am doing wrong?

Try this:

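library(dplyr)

# Leading codes that are informational only, not faults
noFaultCodes <- c("00","01","02", "03","04","05","07","08","09","10","11",
                  "12","14","15","16","17","20","21","60","240","600")
# Drop the first ":" and everything after it, then test membership
dataframe3 %>% mutate(Status_fault = ifelse(gsub(':.*', '', Status) %in% noFaultCodes,
                                        "No fault", "fault"))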

gsub() removes the first : in each Status value together with everything after it. %in% then checks whether the trimmed string is in the set we created, noFaultCodes.
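As for why the original attempts failed, my best reading is this: C() with a capital letter is a base R function for setting contrasts on factors, not the vector constructor c(), which would explain the "not interpretable as a factor" message. Separately, startsWith() recycles its prefix argument element by element rather than testing every value against every prefix, and dplyr's starts_with() is a column-selection helper, not a string test. Below is a minimal sketch of the same idea written with case_when(), in case you prefer to stay with that function. The sample Status values are made up for illustration, since the real dataframe3 is not shown, and Status_code is a hypothetical helper column.

```r
library(dplyr)

# Hypothetical sample data; the real dataframe3 is not shown in the question.
dataframe3 <- data.frame(
  Status = c("240:12", "05:03:90:312", "99:01", "13:07", "600"),
  stringsAsFactors = FALSE
)

# Leading codes that are informational only (taken from the question).
noFaultCodes <- c("00", "01", "02", "03", "04", "05", "07", "08", "09", "10", "11",
                  "12", "14", "15", "16", "17", "20", "21", "60", "240", "600")

result <- dataframe3 %>%
  mutate(
    # Keep only the text before the first colon, e.g. "240:12" becomes "240".
    # Values without a colon are left unchanged.
    Status_code  = sub(":.*", "", Status),
    Status_fault = case_when(
      Status_code %in% noFaultCodes ~ "No fault",
      TRUE                          ~ "fault"
    )
  )

print(result)
```

Note that this compares the whole first segment rather than a prefix, so a code such as "200:1" would be classed as a fault even though it begins with "20". If true prefix matching is what you need, the comparison would have to be adapted.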