Duplicate entries with `%in%`

I need to sum the data for certain rows, like this:

DT[(Rows %in% Is) & (Cols %in% Js), sum(Values), ]

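The problem is that both `Is` and `Js` contain duplicate entries, and I want the duplicates to count toward the sum.

For example, if `Is = c(1,2,2,3)` and `Js = c(4,5,5,6)`, the condition matches (2,5) twice, so I want the corresponding `Values` to be summed twice. How can I do this?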

Edit:

     flowSpeed channelCrossSectionInPX   pxToMeters  timeStep     h      w channelCrossSectionInUM
1: 1.732085e-05                    -475 3.999282e-06 0.1265946 4e-04 0.0038            -0.001899659
2: 1.732085e-05                    -474 3.999282e-06 0.1265946 4e-04 0.0038            -0.001895660
3: 1.732085e-05                    -473 3.999282e-06 0.1265946 4e-04 0.0038            -0.001891660
4: 1.732085e-05                    -472 3.999282e-06 0.1265946 4e-04 0.0038            -0.001887661
5: 1.978421e-05                    -471 3.999282e-06 0.1265946 4e-04 0.0038            -0.001883662
6: 1.978421e-05                    -470 3.999282e-06 0.1265946 4e-04 0.0038            -0.001879663
DT[channelCrossSectionInPX %in% c(-472, -472), sum(h),]
[1] 4e-04

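The data contains only one instance of -472, and `%in%` returns a logical vector with one element per value of the lhs column, `TRUE` wherever that value occurs in the rhs vector. Repeating a value in the rhs therefore changes nothing. If we want rows to be replicated, we need an index instead, i.e. use `match`: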

library(data.table)
DT[match(c(-472, -472), channelCrossSectionInPX), sum(h)]
[1] 8e-04

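`match` works because it returns, for each value in its first argument, the index of the matching value in the column, so a repeated lookup value produces a repeated row index: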

> DT[, match(c(-472, -472), channelCrossSectionInPX)]
[1] 4 4
> DT[match(c(-472, -472), channelCrossSectionInPX)]
      flowSpeed channelCrossSectionInPX   pxToMeters  timeStep     h      w channelCrossSectionInUM
1: 1.732085e-05                    -472 3.999282e-06 0.1265946 4e-04 0.0038            -0.001887661
2: 1.732085e-05                    -472 3.999282e-06 0.1265946 4e-04 0.0038            -0.001887661
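
To make the difference concrete, here is a minimal sketch on plain vectors. The vectors `x` and `lookup` are made up for illustration and are not part of the question's data. It also shows two caveats of the `match` approach: a lookup value that is absent yields `NA`, and only the first occurrence in the column is ever returned.

```r
# Toy vectors (hypothetical, not the question's data)
x      <- c(10, 20, 30, 20)   # "column" values; note that 20 occurs twice
lookup <- c(20, 20, 99)       # lookup values; 20 is duplicated, 99 is absent

# %in% asks "is each element of x present in lookup?": one flag per element of x
x %in% lookup
#> [1] FALSE  TRUE FALSE  TRUE

# match asks "where in x is each element of lookup first found?":
# one index per element of lookup, NA when there is no match
idx <- match(lookup, x)
idx
#> [1]  2  2 NA

# Drop the NA before indexing so an unmatched lookup does not turn the sum into NA
sum(x[idx[!is.na(idx)]])
#> [1] 40
```

Because `match` stops at the first hit, the second 20 in `x` (position 4) is never selected. If the column itself can contain duplicates that should all be counted, a join is probably a better fit than `match`.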

Data

DT <- structure(list(flowSpeed = c(1.732085e-05, 1.732085e-05, 1.732085e-05, 
1.732085e-05, 1.978421e-05, 1.978421e-05), channelCrossSectionInPX = -475:-470, 
    pxToMeters = c(3.999282e-06, 3.999282e-06, 3.999282e-06, 
    3.999282e-06, 3.999282e-06, 3.999282e-06), timeStep = c(0.1265946, 
    0.1265946, 0.1265946, 0.1265946, 0.1265946, 0.1265946), h = c(4e-04, 
    4e-04, 4e-04, 4e-04, 4e-04, 4e-04), w = c(0.0038, 0.0038, 
    0.0038, 0.0038, 0.0038, 0.0038), channelCrossSectionInUM = c(-0.001899659, 
    -0.00189566, -0.00189166, -0.001887661, -0.001883662, -0.001879663
    )), class = c("data.table", "data.frame"), row.names = c(NA, 
-6L), index = structure(integer(0), "`__channelCrossSectionInPX`" = integer(0)))
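
For the two-column case in the original question, one possible approach is a join on both columns. This is only a sketch under assumptions: the table `DT2` and its contents are invented for illustration, and `Is` and `Js` are read as pairs (`Is[k]` goes with `Js[k]`), which is stricter than the independent `%in%` tests in the question. It also assumes a data.table version that accepts `nomatch = NULL`.

```r
library(data.table)

# Hypothetical toy table using the column names from the question
DT2 <- data.table(Rows   = c(1, 2, 3, 7),
                  Cols   = c(4, 5, 6, 8),
                  Values = c(10, 20, 30, 40))

Is <- c(1, 2, 2, 3)
Js <- c(4, 5, 5, 6)

# One lookup row per (Is[k], Js[k]) pair; the pair (2, 5) appears twice
lookup <- data.table(Rows = Is, Cols = Js)

# Join on both columns: each lookup row pulls in its matching row of DT2,
# so a duplicated pair is duplicated in the result.
# nomatch = NULL drops lookup pairs that have no match in DT2.
total <- DT2[lookup, sum(Values), on = .(Rows, Cols), nomatch = NULL]
total
#> [1] 80
```

The row with `Values = 20` is counted twice (10 + 20 + 20 + 30), which is the behaviour the question asks for.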

You can try:

sum(DT[Is,Js,with=FALSE])

For example, with `mtcars`:

library(data.table)

DT <- as.data.table(mtcars)

Is = c(1,2,2,3)
Js = c(4,5,5,6)

DT[Is,Js,with=FALSE]
#>       hp  drat  drat    wt
#>    <num> <num> <num> <num>
#> 1:   110  3.90  3.90 2.620
#> 2:   110  3.90  3.90 2.875
#> 3:   110  3.90  3.90 2.875
#> 4:    93  3.85  3.85 2.320

sum(DT[Is,Js,with=FALSE])
#> [1] 464.79
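
Note that in this form `Is` and `Js` act as row and column positions, not as values looked up in a column, and the sum runs over every combination of the selected rows and columns. A short sketch to check this, assuming the same `mtcars` setup as above (the comparison with base matrix indexing is an added illustration, not part of the original answer):

```r
library(data.table)

DT <- as.data.table(mtcars)
Is <- c(1, 2, 2, 3)
Js <- c(4, 5, 5, 6)

# Positional subsetting keeps every repeated row and column: 4 rows x 4 columns
dims <- dim(DT[Is, Js, with = FALSE])
dims
#> [1] 4 4

# Base R matrix indexing repeats positions the same way and gives the same total
total <- sum(as.matrix(mtcars)[Is, Js])
total
#> [1] 464.79
```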