R data.table package - adding values in columns using := operator
Question
I have a data.frame, and I would like to fill one column based on the data in other columns.
Here is a simplified example of my data.frame:
Fertilization=c("N0","N0","N0","N0","N2","N2","N2","N2")
Sowing=c("S1","S1","S2","S2","S1","S1","S2","S2")
FoliarRank=c("F2","F3","F2","F3","F2","F3","F2","F3")
New_FoliarRank=rep(0,length(Fertilization))
DT=data.frame(Fertilization,Sowing,FoliarRank,New_FoliarRank)
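I would like to assign values to New_FoliarRank based on conditions on the Fertilization, Sowing and FoliarRank columns.
For example:
- if Fertilization=="N0", Sowing=="S1" and FoliarRank=="F2", then New_FoliarRank should be "F3*"
- if Fertilization=="N0", Sowing=="S1" and FoliarRank=="F3", then New_FoliarRank should be "F2*"
As for solutions:
- I could make it work with a bunch of for/if statements, but it would be slow and not very "R-ish", probably even if I used "apply"
- As far as I understand, I could use the := operator of the {data.table} package, which would likely be much better. This has already been discussed elsewhere on Stack Overflow ("Replace values with NA based on conditions of other columns"), but I cannot get the solution from that post to work, and I don't understand why, even after reading ?":=". I am missing something, maybe something obvious, so I thought I would ask. Sorry for the duplicate.
Some solutions I tried: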
library(data.table)
DT[Fertilization=="N0" & Sowing=="S1" & FoliarRank=="F2", New_FoliarRank:="F3*"] # seems to be same script as other post
DT[ , New_FoliarRank:= {Fertilization=="N0" & Sowing=="S1" & FoliarRank=="F2"; "F3*"}] # adapted from another post; doesn't work either
Both attempts return the following error:
Error in `:=`(New_FoliarRank, "F3*") :
Check that is.data.table(DT) == TRUE. Otherwise, := and `:=`(...) are defined for use in j, once only and in particular ways. See help(":=").
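The error message hints at the likely cause: DT was built with data.frame(), and := only works inside the [ method of a data.table. Below is a minimal sketch of checking and converting, assuming the data.table package is installed. The shortened data and the name DF are illustrative only, not part of the original question.

```r
library(data.table)

# Small illustrative data.frame (a shortened stand-in for the question's data)
DF <- data.frame(Fertilization = c("N0", "N0"),
                 Sowing        = c("S1", "S1"),
                 FoliarRank    = c("F2", "F3"))

is.data.table(DF)   # FALSE: := is not available on a plain data.frame

setDT(DF)           # converts by reference, so no reassignment is needed
is.data.table(DF)   # TRUE

# := now works; rows not selected by the condition in i are left as NA
DF[Fertilization == "N0" & Sowing == "S1" & FoliarRank == "F2",
   New_FoliarRank := "F3*"]
```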
Proposed solution (another solution is posted below)
library(data.table)
# Initial vectors (no need for New_FoliarRank)
Fertilization=c("N0","N0","N0","N0","N2","N2","N2","N2")
Sowing=c("S1","S1","S2","S2","S1","S1","S2","S2")
FoliarRank=c("F2","F3","F2","F3","F2","F3","F2","F3")
# Actually I was missing the class of DT (data.table instead of data.frame)
DT=data.table(Fertilization,Sowing,FoliarRank)
# And I shouldn't have created New_FoliarRank beforehand (esp. with numerical values), as it is created "on the spot"
# setDT() is a no-op here because DT is already a data.table, but it makes the lines safe to run on a data.frame too
setDT(DT)[Fertilization=="N0" & Sowing=="S1" & FoliarRank=="F2", New_FoliarRank := "F3*"]
setDT(DT)[Fertilization=="N0" & Sowing=="S1" & FoliarRank=="F3", New_FoliarRank := "F2*"]
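If the number of rules grows, writing one := line per rule gets repetitive. One possible alternative, sketched here as an assumption rather than as part of the original solution, is to keep the rules in a small lookup table and apply them all at once with an update join. The table name rules is hypothetical.

```r
library(data.table)

DT <- data.table(
  Fertilization = c("N0","N0","N0","N0","N2","N2","N2","N2"),
  Sowing        = c("S1","S1","S2","S2","S1","S1","S2","S2"),
  FoliarRank    = c("F2","F3","F2","F3","F2","F3","F2","F3")
)

# Hypothetical lookup table: one row per rule from the question
rules <- data.table(
  Fertilization  = c("N0", "N0"),
  Sowing         = c("S1", "S1"),
  FoliarRank     = c("F2", "F3"),
  New_FoliarRank = c("F3*", "F2*")
)

# Update join: rows of DT that match a rule receive its New_FoliarRank
# (the i. prefix refers to the column coming from rules);
# rows without a matching rule stay NA
DT[rules, on = .(Fertilization, Sowing, FoliarRank),
   New_FoliarRank := i.New_FoliarRank]
```

Adding a rule then only means adding a row to rules, and the assignment line stays unchanged.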
You can use factors:
library(data.table)
setDT(DT)
DT[, New_FoliarRank := interaction(Fertilization, Sowing, FoliarRank)]
#check levels
levels(DT[, New_FoliarRank])
#assign new labels
DT[, New_FoliarRank := factor(New_FoliarRank,
                              levels = levels(New_FoliarRank),
                              labels = c("012", "212", "022", "222", "013", "213", "023", "223"))]
#   Fertilization Sowing FoliarRank New_FoliarRank
#1:            N0     S1         F2            012
#2:            N0     S1         F3            013
#3:            N0     S2         F2            022
#4:            N0     S2         F3            023
#5:            N2     S1         F2            212
#6:            N2     S1         F3            213
#7:            N2     S2         F2            222
#8:            N2     S2         F3            223
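The hard-coded labels above have to follow the order returned by levels(), in which the first factor passed to interaction() varies fastest. As a hedged variation (not part of the answer itself), the labels could instead be derived from the level names, which avoids depending on that order. The gsub() pattern assumes the prefixes are exactly N, S and F, as in this example.

```r
library(data.table)

DT <- data.table(
  Fertilization = c("N0","N0","N0","N0","N2","N2","N2","N2"),
  Sowing        = c("S1","S1","S2","S2","S1","S1","S2","S2"),
  FoliarRank    = c("F2","F3","F2","F3","F2","F3","F2","F3")
)

DT[, New_FoliarRank := interaction(Fertilization, Sowing, FoliarRank)]

# Build each label from its own level name by stripping the letters and dots,
# e.g. "N0.S1.F2" becomes "012"
DT[, New_FoliarRank := factor(New_FoliarRank,
                              levels = levels(New_FoliarRank),
                              labels = gsub("[NSF.]", "", levels(New_FoliarRank)))]
```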