Create summed lagged variable by group in R

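Any help would be greatly appreciated!

Essentially, I need a variable that sums the values of the previous observations within each group, taking the date variable into account.

For example: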

My current data:

ID <- c("A", "A", "A","A", "B", "B", "B") 
YEAR <- c(1900, 1901, 1902, 1903, 1900, 1901, 1902) 
CASH <- c(1, 2, 3, 1, 0, 1, 0) 
DF <- data.frame(ID, YEAR, CASH) 
print(DF)

What I would like my data to look like:

ID <- c("A", "A", "A","A", "B", "B", "B") 
YEAR <- c(1900, 1901, 1902, 1903, 1900, 1901, 1902) 
CASH <- c(1, 2, 3, 1, 0, 1, 0)
PREV_CASH <- c(NA, 1, 3, 6, NA, NA, 1)
DF2 <- data.frame(ID, YEAR, CASH, PREV_CASH)
print(DF2)

For each group, I want to sum the cash amounts from all previous years.

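After grouping by 'ID', we can take the lag of the cumsum of 'CASH', then replace any 0 with NA so the result matches the desired output. This assumes the rows are already ordered by 'YEAR' within each 'ID'.

library(dplyr)
DF %>%
    group_by(ID) %>%
    # lag(cumsum(CASH)) is the running total of CASH over the earlier rows of each ID
    mutate(PREV_CASH = lag(cumsum(CASH)),
           # a running total of 0 is replaced with NA to match the desired output
           PREV_CASH = replace(PREV_CASH, PREV_CASH == 0, NA))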
#       ID  YEAR  CASH PREV_CASH 
#    <fctr> <dbl> <dbl>     <dbl>
#1      A  1900     1        NA
#2      A  1901     2         1
#3      A  1902     3         3
#4      A  1903     1         6
#5      B  1900     0        NA
#6      B  1901     1        NA
#7      B  1902     0         1
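
The answer above relies on the rows already being in chronological order within each ID. If that is not guaranteed, a minimal sketch along the same lines sorts first and then applies the same lag-of-cumsum idea. The `arrange()` step, the use of `na_if()` in place of `replace()`, and the name `result` are my additions, not part of the original answer. Note too that turning every 0 into NA is an assumption taken from the desired output: it would also hide a running total that is genuinely zero after several zero-cash years.

```r
library(dplyr)

# Same example data as in the question
DF <- data.frame(
  ID   = c("A", "A", "A", "A", "B", "B", "B"),
  YEAR = c(1900, 1901, 1902, 1903, 1900, 1901, 1902),
  CASH = c(1, 2, 3, 1, 0, 1, 0)
)

result <- DF %>%
  arrange(ID, YEAR) %>%   # added step: put rows in chronological order within each ID
  group_by(ID) %>%
  mutate(
    # running total of CASH over the earlier rows of the same ID
    PREV_CASH = dplyr::lag(cumsum(CASH)),
    # assumption: a running total of 0 should be reported as NA
    PREV_CASH = na_if(PREV_CASH, 0)
  ) %>%
  ungroup()

print(result)
```

For the example data this should give the same PREV_CASH column as the desired output in the question (NA, 1, 3, 6 for A and NA, NA, 1 for B).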