How to change type of target column when doing := by group in a data.table in R?

Question

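I am trying to do := by group on an existing column of type 'integer', where the new values are of type 'double', and it fails.

My scenario is converting a column that represents time to POSIXct based on the values in other columns. I could modify the creation of the data.table as a workaround, but I am still interested in how to actually change the type of the column, as suggested in the error message.

Here is a simple toy example of my problem: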

library(data.table)
db = data.table(id=rep(1:2, each=5), x=1:10, y=runif(10))
db
    id  x          y
 1:  1  1 0.47154470
 2:  1  2 0.03325867
 3:  1  3 0.56784494
 4:  1  4 0.47936031
 5:  1  5 0.96318208
 6:  2  6 0.83257416
 7:  2  7 0.10659533
 8:  2  8 0.23103810
 9:  2  9 0.02900567
10:  2 10 0.38346531

db[, x:=mean(y), by=id]   

Error in `[.data.table`(db, , `:=`(x, mean(y)), by = id) : 
Type of RHS ('double') must match LHS ('integer'). To check and coerce would impact performance too much for the fastest cases. Either change the type of the target column, or coerce the RHS of := yourself (e.g. by using 1L instead of 1)

Answer 1

Because the class of 'x' is 'integer', we can convert the 'x' column to 'numeric' before assigning 'mean(y)' to it. This is also useful when replacing 'x' with the mean of any other numeric variable (including 'x' itself).

db[, x:= as.numeric(x)][, x:= mean(y), by=id][]
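
As a self-contained sketch of the coercion approach above, the column type can be inspected before and after. The fixed seed and the final check are additions for illustration only and are not part of the original answer.

```r
library(data.table)

set.seed(1)  # assumption: fixed seed, only so the run is reproducible
db <- data.table(id = rep(1:2, each = 5), x = 1:10, y = runif(10))

class(db$x)  # "integer"

# Coerce x to double first. Without 'by', := replaces the whole column,
# so its type is allowed to change.
db[, x := as.numeric(x)]
class(db$x)  # "numeric"

# Now both sides of the grouped assignment are double, so it succeeds.
db[, x := mean(y), by = id]

# Every row of x should now hold the mean of y for its id.
all.equal(db$x, ave(db$y, db$id))
```

The ungrouped step is what changes the type; the grouped step then only fills in values of a matching type.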

Or assign to a new column and then rename it:

setnames(db[, x1:= mean(y),by=id][,x:=NULL],'x1', 'x')

Or we can assign 'NULL' to 'x' to drop it, and then create 'x' as the mean of 'y' (suggested by @David Arenburg):

db[, x:=NULL][, x:= mean(y), by= id][]
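
One side effect worth noting for the drop-and-recreate variants: the re-created 'x' is appended as the last column. If column order matters, a minimal sketch of restoring it with setcolorder() might look like the following (the variable name old_order is hypothetical).

```r
library(data.table)

db <- data.table(id = rep(1:2, each = 5), x = 1:10, y = runif(10))

# Take a copy of the names, so the saved vector is not affected by
# later by-reference changes to the table.
old_order <- copy(names(db))

# Drop the integer column, then re-create it as double by group.
db[, x := NULL][, x := mean(y), by = id]
names(db)  # "id" "y" "x"

# Restore the original column order by reference.
setcolorder(db, old_order)
names(db)  # "id" "x" "y"
```

The coercion approach with as.numeric() avoids this reordering, since the column is never removed.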
