Row indices in data.table in R

Question

How can I work with row indices in a data.table in R?

I want to check whether the value in a row matches the value in the previous row:

patient    produkt    output
1          Meg        Initiation
1          Meg        Continue
1          Gem        Switch
2          Pol        Initiation
2          Pol        Continue
2          Pol        Continue

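The output column is the result I want (if it is easier, numbers can be used instead: initiation=0, continue=1, switch=2).

I can't find out how to control the index in data.table, and the following does not work: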

test[ , switcher2 := identical(produkt, produkt[-1]),by=patient]

Any ideas are welcome. It has to be in data.table, though.

Answer 1

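Here is an attempt using the new shift function from the devel version on GitHub.

I used the 0:2 notation here because it is shorter to write, but you could use words instead.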

test[ , output2 := c(0, (2:1)[(produkt == shift(produkt)) + 1][-1]), by = patient]
#    patient produkt     output output2
# 1:       1     Meg Initiation       0
# 2:       1     Meg   Continue       1
# 3:       1     Gem     Switch       2
# 4:       2     Pol Initiation       0
# 5:       2     Pol   Continue       1
# 6:       2     Pol   Continue       1

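I basically always start each group with 0, then compare each value with the previous value within the group. shift(produkt) returns the previous value (NA for the first row of a group), so the comparison yields TRUE or FALSE, and adding 1 turns that into an index into 2:1. If TRUE, assign 1; if FALSE, assign 2.

If you want words instead, here is the alternative version: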
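To see what the one-liner does, it may help to split it into intermediate steps. The sketch below rebuilds the sample data from the question and assumes a data.table version that includes shift; the column names prev, same, and code are illustrative only and do not appear in the original answer.

```r
library(data.table)

# Sample data rebuilt from the question
test <- data.table(
  patient = c(1, 1, 1, 2, 2, 2),
  produkt = c("Meg", "Meg", "Gem", "Pol", "Pol", "Pol")
)

# Step 1: previous value within each patient (NA in the first row of a group)
test[, prev := shift(produkt), by = patient]

# Step 2: does the current value equal the previous one? (NA, TRUE, or FALSE)
test[, same := produkt == prev]

# Step 3: TRUE + 1 = 2 picks 1 (continue), FALSE + 1 = 1 picks 2 (switch);
# the NA result for the first row is dropped with [-1] and replaced by 0
test[, code := c(0, (2:1)[same + 1][-1]), by = patient]

print(test)
```

The intermediate columns can be removed afterwards with test[, c("prev", "same") := NULL].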

test[ ,output3 := c("Initiation", c("Switch", "Continue")[(produkt == shift(produkt)) + 1][-1]), by = patient]

Installation instructions:

library(devtools)
install_github("Rdatatable/data.table", build_vignettes = FALSE)

Answer 2

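Here is an option using diff. I use ifelse to turn the integer differences into character labels. Finally, the first element of each group is set to the initial value, "Initiation".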

setDT(dx)[, output := {
   # a non-zero difference between consecutive factor codes means the product changed
   xx <- ifelse(c(0, diff(as.integer(factor(produkt)))) != 0,
                "Switch", "Continue")
   xx[1] <- "Initiation"   # first row of each patient
   xx
   },
by = patient]

#    patient produkt     output
# 1:       1     Meg Initiation
# 2:       1     Meg   Continue
# 3:       1     Gem     Switch
# 4:       2     Pol Initiation
# 5:       2     Pol   Continue
# 6:       2     Pol   Continue
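One detail of the diff approach is worth checking: factor levels are sorted alphabetically by default, so a switch can show up as either a positive or a negative difference in the factor codes. The sketch below uses hypothetical data (not from the question) in which the switch goes from "Gem" to "Meg", and tests the difference against zero so that both directions are caught.

```r
library(data.table)

# Hypothetical data: the switch goes "up" in alphabetical order
dx <- data.table(
  patient = c(1, 1, 1),
  produkt = c("Gem", "Gem", "Meg")
)

dx[, output := {
  codes <- as.integer(factor(produkt))   # Gem = 1, Meg = 2
  # any non-zero difference means the product changed
  xx <- ifelse(c(0, diff(codes)) != 0, "Switch", "Continue")
  xx[1] <- "Initiation"                  # first row of each patient
  xx
}, by = patient]

print(dx)
```

With a comparison of the form < 0, the last row here would be labelled "Continue", because the factor code increases from 1 to 2.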
