Count rows in the past year according to a condition using data.table

Question

Following this previous post, I can add a column containing the number of occurrences in the past year like this:

df[, boundary := date - 365]
df[, counts := df[df, .N,  on = .(id, date < date, date > boundary), by = .EACHI]$N]

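This works well for me. However, I would like to count only the occurrences where another column has a specific value. For example, given a dataset like this: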

id  type  date
ny  0     2021-09-27   
ny  0     2021-09-09
ny  1     2021-08-01
ny  1     2021-07-07
ch  0     2020-04-01
ch  1     2020-03-01 
ch  0     2020-02-01

I only want to count the rows where type is 1. How can I modify the code above to do this? I tried something like the following, but it does not work:

df[, counts := df[df, .N(type = 1),  on = .(id, date < date, date > boundary), by = .EACHI]$N]

Edit: the expected output for the dataset above is:

id  type  date         counts
ny  0     2021-09-27   2
ny  0     2021-09-09   2
ny  1     2021-08-01   1  
ny  1     2021-07-07   0  
ch  0     2020-04-01   1
ch  1     2020-03-01   0 
ch  0     2020-02-01   0

Answer 1

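You can compute sum(type == 1) instead of .N. Unlike .N, this returns NA for rows that have no match in the join, so those rows are set to 0 afterwards.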

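library(data.table)

setDT(df)                      # convert df to a data.table by reference
df[, boundary := date - 365]   # start of the one-year look-back window

# For each row, look at earlier rows with the same id inside the window
# and count those whose type is 1
df[, counts := df[df, sum(type == 1),  
      on = .(id, date < date, date > boundary), by = .EACHI]$V1]
df[is.na(counts), counts := 0L]  # rows with no match give NA; set them to 0
df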

#   id type       date   boundary counts
#1: ny    0 2021-09-27 2020-09-27      2
#2: ny    0 2021-09-09 2020-09-09      2
#3: ny    1 2021-08-01 2020-08-01      1
#4: ny    1 2021-07-07 2020-07-07      0
#5: ch    0 2020-04-01 2019-04-02      1
#6: ch    1 2020-03-01 2019-03-02      0
#7: ch    0 2020-02-01 2019-02-01      0
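
For anyone who wants to reproduce this from scratch, here is a minimal sketch that rebuilds the sample data from the question and shows two variations on the same join. It assumes `date` is stored as a `Date`. The column name `counts_alt` and the helper table `type1` are only illustrative names, not part of the original post. The first variant passes `na.rm = TRUE` to `sum()` so that unmatched rows come out as 0 without a separate fix-up step. The second filters to `type == 1` before the join so that `.N` can be used as in the original approach.

```r
library(data.table)

# Sample data copied from the question; `date` must be of class Date
df <- data.table(
  id   = c("ny", "ny", "ny", "ny", "ch", "ch", "ch"),
  type = c(0L, 0L, 1L, 1L, 0L, 1L, 0L),
  date = as.Date(c("2021-09-27", "2021-09-09", "2021-08-01", "2021-07-07",
                   "2020-04-01", "2020-03-01", "2020-02-01"))
)

# Lower bound of the look-back window (365 days, as in the question)
df[, boundary := date - 365]

# Variant 1: non-equi self-join. For each row of i (the `df` inside the
# brackets), match rows of x (the `df` being subset) with the same id and
# boundary < x.date < i.date, then count the matches whose type is 1.
# na.rm = TRUE makes rows without any match evaluate to 0 instead of NA.
df[, counts := df[df, sum(type == 1, na.rm = TRUE),
                  on = .(id, date < date, date > boundary),
                  by = .EACHI]$V1]

# Variant 2: keep only the type == 1 rows on the x side and count matches
# with .N, which is 0 for rows of i that have no match.
type1 <- df[type == 1]   # hypothetical helper table
df[, counts_alt := type1[df, .N,
                         on = .(id, date < date, date > boundary),
                         by = .EACHI]$N]

print(df)
```

Both columns should agree with the expected output in the question. Filtering first keeps the join smaller when only a small share of rows has `type == 1`, while the `sum()` form is easier to extend to other conditions.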

Tags: r, data.table