Vector-wise subtraction during inner join
I have the following dataset:
structure(list(Time = c(0L, 0L, 0L, 0L, 0L, 200L, 200L, 200L,
200L, 200L, 400L, 400L, 400L, 400L, 400L), AgentID = c(1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), AC = c("c",
"c", "c", "c", "c", "c", "c", "c", "c", "c", "c", "c", "c", "c",
"c"), Layer = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L), Type = c("b", "b", "b", "b", "b", "b", "b", "b",
"b", "b", "b", "b", "b", "b", "b"), Data = c(0, 0, 0, 0, 0, 0.117073864,
0.13028602, 0.11111003, -0.11538354, 0.07852934, 0.24280901,
0.24271743, 0.21535376, -0.2213944, 0.23355752), SimulationID = c(4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L), discountFactor = c(0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), N = c(80L,
80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L,
80L)), row.names = c(NA, -15L), class = c("data.table", "data.frame"))
What I want to do is group the rows with on=.(Layer,Type,AC,AgentID,Time,SimulationID,N,discountFactor) and then compute nodeVelocity:=Data-oldData, where oldData is the Data column at oldTime=Time-200.
I tried:
d[d[,.(Time=Time+200,Data), by=.(Layer,Type,AC,AgentID,SimulationID,N,discountFactor)],
`:=` (nodeVelocity=x.Data-i.Data, iData=i.Data)
,on=.(Layer,Type,AC,AgentID,Time = Time,SimulationID,N,discountFactor)]
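However, as the iData column produced by the code above shows, the subtraction is done with the last row of i.Data (i.Data =: oldData). Why is that? I expected an element-wise vector subtraction, but what I get is the vector minus the last row of the other vector.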
Here is an option:
DT[, c("ri", "T2") := .(rowid(rleid(Time, Layer, Type, AC, AgentID, SimulationID, N, discountFactor)),
Time+200L)]
DT[DT, on=.(Time=T2, Layer, Type, AC, AgentID, SimulationID, N, discountFactor, ri),
diff := x.Data - i.Data]
Output:
Time AgentID AC Layer Type Data SimulationID discountFactor N ri T2 diff
1: 0 1 c 0 b 0.00000000 4 0 80 1 200 NA
2: 0 1 c 0 b 0.00000000 4 0 80 2 200 NA
3: 0 1 c 0 b 0.00000000 4 0 80 3 200 NA
4: 0 1 c 0 b 0.00000000 4 0 80 4 200 NA
5: 0 1 c 0 b 0.00000000 4 0 80 5 200 NA
6: 200 1 c 0 b 0.11707386 4 0 80 1 400 0.11707386
7: 200 1 c 0 b 0.13028602 4 0 80 2 400 0.13028602
8: 200 1 c 0 b 0.11111003 4 0 80 3 400 0.11111003
9: 200 1 c 0 b -0.11538354 4 0 80 4 400 -0.11538354
10: 200 1 c 0 b 0.07852934 4 0 80 5 400 0.07852934
11: 400 1 c 0 b 0.24280901 4 0 80 1 600 0.12573515
12: 400 1 c 0 b 0.24271743 4 0 80 2 600 0.11243141
13: 400 1 c 0 b 0.21535376 4 0 80 3 600 0.10424373
14: 400 1 c 0 b -0.22139440 4 0 80 4 600 -0.10601086
15: 400 1 c 0 b 0.23355752 4 0 80 5 600 0.15502818
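The role of the ri column may be easier to see on a smaller table. The following is only a minimal sketch on a made-up table called toy (its name and values are assumptions, not part of the original data): rleid() numbers consecutive runs of identical key values, rowid() counts rows within each run, and adding ri to the join keys pairs row k of one time step with row k of the next.

```r
library(data.table)

# Hypothetical toy table: two time steps with three rows each
toy <- data.table(Time = c(0L, 0L, 0L, 200L, 200L, 200L),
                  Data = c(0, 0, 0, 0.1, 0.2, 0.3))

# rleid() numbers consecutive runs of identical Time values;
# rowid() then counts the rows inside each run
toy[, ri := rowid(rleid(Time))]

# Shifted key, so that row k at Time t pairs with row k at Time t + 200
toy[, T2 := Time + 200L]

# Update join: with ri among the keys, each x row has at most one matching i row
toy[toy, on = .(Time = T2, ri), diff := x.Data - i.Data]
print(toy)
```

This assumes, as in the data above, that rows within each time step are stored contiguously and in a consistent order, since rleid() only detects consecutive runs.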
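On your second question: this happens because the join keys are not unique, so there are multiple rows of x for each row of i in the join (and every x row is matched by several i rows). The assignment walks through all of the join results, and for each x row the value from the last matching row is the one that remains. See also a recent post on this behavior.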
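The "last row wins" behavior can be reproduced on a tiny example. This is a sketch with two made-up tables, x and i, and a made-up key column k (all names and values are assumptions for illustration); the key is duplicated on both sides, just as the join keys in the question are duplicated within each time step.

```r
library(data.table)

# Hypothetical toy tables: the key "k" is duplicated on both sides
x <- data.table(k = c("a", "a"), Data = c(10, 20))
i <- data.table(k = c("a", "a"), Data = c(1, 2))

# Every x row matches both i rows. The assignment is applied for each
# match in turn, so the value from the later matching i row overwrites
# the value from the earlier one.
x[i, on = .(k), `:=`(iData = i.Data, delta = x.Data - i.Data)]
print(x)
```

Both rows of x end up with iData taken from the second row of i, which mirrors what the iData column in the question shows. Adding a within-group row counter to the join keys, as in the option above, makes each match unique and avoids the overwrite.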