Is there an R function that counts the number of previous dates in a data frame based on a condition?

Question

For each student, I want to count the number of absences before the most recent one and add those counts to the data frame as a column.

 Student ID       Absent Date       Subject        

    4567           08/30/2018          M
    4567           09/22/2019          M
    8345           09/01/2019          S
    8345           03/30/2019         PE         
    8345           07/18/2017          M
    5601           01/08/2019         SS

This is the desired output:

 Student ID       Absent Date       Subject       Previous Absence            

    4567           08/30/2018          M                 1
    4567           09/22/2019          M                 1
    8345           09/01/2019          S                 2
    8345           03/30/2019         PE                 2        
    8345           07/18/2017          M                 2
    5601           01/08/2019         SS                 0

Next, I want to count each student's previous absences in math (M) and add those counts to the data frame as another column.

 Student ID       Absent Date       Subject       Previous Absence            

    4567           08/30/2018          M                 1
    4567           09/22/2019          M                 1
    8345           09/01/2019          S                 2
    8345           03/30/2019         PE                 2        
    8345           07/18/2017          M                 2
    5601           01/08/2019         SS                 0

Desired output:

 Student ID  Absent Date  Subject  Prior Absence  Prior M Absence              

    4567      08/30/2018       M           1            1
    4567      09/22/2019       M           1            1
    8345      09/01/2019       S           2            0
    8345      03/30/2019      PE           2            0        
    8345      07/18/2017       M           2            0
    5601      01/08/2019      SS           0            0

Thanks!

Answer 1

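This assumes the data is already sorted by Absent_Date (at least within each Student_ID), because the code treats the last row of each group as the most recent absence: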

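library(dplyr)

df %>%
  group_by(Student_ID) %>%
  mutate(
    # every row of the student except one (the most recent absence)
    n_prior_absence = n() - 1,
    # math absences among all rows of the group except the last one
    n_prior_absence_math = sum(head(Subject, -1) == "M")
  )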
# # A tibble: 6 × 5
# # Groups:   Student_ID [3]
#   Student_ID Absent_Date Subject n_prior_absence n_prior_absence_math
#        <int> <chr>       <chr>             <dbl>                <int>
# 1       4567 08/30/2018  M                     1                    1
# 2       4567 09/22/2019  M                     1                    1
# 3       8345 09/01/2019  S                     2                    0
# 4       8345 03/30/2019  PE                    2                    0
# 5       8345 07/18/2017  M                     2                    0
# 6       5601 01/08/2019  SS                    0                    0
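
To make the `head(Subject, -1)` step easier to follow, here is a small base-R illustration of what it does on two of the groups. The vectors are copied from the sample data; the variable names are made up for this sketch only.

```r
# Subjects for student 8345, in the row order shown in the question.
subjects_8345 <- c("S", "PE", "M")

# head(x, -1) returns everything except the last element,
# so the final row of the group is left out of the count.
prior_8345 <- head(subjects_8345, -1)      # "S" "PE"
n_math_8345 <- sum(prior_8345 == "M")      # 0

# A student with a single absence: dropping the last element leaves an
# empty vector, and sum() over an empty logical vector is 0.
subjects_5601 <- "SS"
n_math_5601 <- sum(head(subjects_5601, -1) == "M")  # 0
```

This is also why the result depends on row order: whichever row comes last in the group is the one that gets excluded.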

Using this data:

df = read.table(text = 'Student_ID       Absent_Date       Subject        
4567           08/30/2018          M
4567           09/22/2019          M
8345           09/01/2019          S
8345           03/30/2019         PE         
8345           07/18/2017          M
5601           01/08/2019         SS', header = TRUE)
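
If the rows are not guaranteed to be in date order, one option is to parse the dates and compare each one against the student's latest date. The following is a minimal base-R sketch under that assumption, not the method from the answer above; the column `date` and the helper vectors `latest` and `is_prior` are names invented for illustration. Note that for student 8345 it treats the 07/18/2017 math absence as prior, so it gives 1 where the desired output in the question shows 0. Which result is right depends on whether "prior" is meant chronologically.

```r
# Sample data copied from the question.
df <- read.table(text = "Student_ID Absent_Date Subject
4567 08/30/2018 M
4567 09/22/2019 M
8345 09/01/2019 S
8345 03/30/2019 PE
8345 07/18/2017 M
5601 01/08/2019 SS", header = TRUE, stringsAsFactors = FALSE)

# Parse the month/day/year strings so they compare chronologically.
df$date <- as.Date(df$Absent_Date, format = "%m/%d/%Y")

# Each student's most recent absence date, repeated on all of that student's rows.
latest <- ave(as.numeric(df$date), df$Student_ID, FUN = max)

# TRUE for every absence strictly before the student's most recent one.
is_prior <- as.numeric(df$date) < latest

# Per-student totals, written back onto every row of that student.
df$n_prior_absence <- ave(as.numeric(is_prior), df$Student_ID, FUN = sum)
df$n_prior_absence_math <- ave(as.numeric(is_prior & df$Subject == "M"),
                               df$Student_ID, FUN = sum)
```

Because the comparison is strict, two absences sharing the latest date would both be excluded from the count; that tie-handling is a choice of this sketch, not something the question specifies.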

