Filtering Time to Event Data in Tidyverse

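I have some time-to-event data that I am working with. I would like to filter each subject's data from the point they first enter the study up to their first observed event (I am not concerned with recurrent events after the first one; I only want to explore the time to first event).

I am using between inside the filter function, which has worked for me in the past. It fails here because some subjects never have an event, so I get the error message Error: Expecting a single value: [extent=0].

I think what I want is a way to keep each subject's rows from the start of the study through their first event or, if the subject never has an event, all of that subject's rows.

Here is an example of my data: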

## data
subject <- c("A", "A", "A", "A", "B", "B", "C", "C", "C", "D", "E", "E", "E", "E", "E", "F", "F", "F", "F", "F")
event <- c(0,0,1,0,0,0,0,0,1,0,0,1,0,1,1,0,0,0,0,0)

df <- data.frame(subject, event)

## create index to count the days the subject is in the study
library(tidyverse)

df <- df %>%
    group_by(subject) %>%
    mutate(ID = seq_along(subject))

df

# A tibble: 20 x 3
# Groups:   subject [6]
   subject event    ID
   <fct>   <dbl> <int>
 1 A           0     1
 2 A           0     2
 3 A           1     3
 4 A           0     4
 5 B           0     1
 6 B           0     2
 7 C           0     1
 8 C           0     2
 9 C           1     3
10 D           0     1
11 E           0     1
12 E           1     2
13 E           0     3
14 E           1     4
15 E           1     5
16 F           0     1
17 F           0     2
18 F           0     3
19 F           0     4
20 F           0     5

## filter event times between the start of the trial and when the subject has the event for the first time

df %>%
    group_by(subject) %>%
    filter(., between(row_number(), 
        left = which(ID == 1),
        right = which(event == 1)))

That last part is where I get the error.

Is this what you want?

df2 <- df %>%
  group_by(subject) %>%
  filter(cumsum(event) == 0 | (cumsum(event) == 1 & event == 1))

Result:

# A tibble: 16 x 2
# Groups:   subject [6]
   subject event
   <fct>   <dbl>
 1 A           0
 2 A           0
 3 A           1
 4 B           0
 5 B           0
 6 C           0
 7 C           0
 8 C           1
 9 D           0
10 E           0
11 E           1
12 F           0
13 F           0
14 F           0
15 F           0
16 F           0
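
For what it's worth, here is a rough sketch of why the between() call breaks and why the cumsum() filter does not. Within a group, which(event == 1) returns a zero-length vector when the subject has no event, and several positions when the subject has repeated events, so it cannot serve as a single right bound. The helper first_event_row() below is a hypothetical name I made up for illustration, not part of dplyr; it just picks the first event position, or the last row when there is none.

```r
library(dplyr)

subject <- c("A", "A", "A", "A", "B", "B", "C", "C", "C", "D",
             "E", "E", "E", "E", "E", "F", "F", "F", "F", "F")
event <- c(0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0)
df <- data.frame(subject, event)

# which() does not give a single value in these two cases
which(c(0, 0) == 1)           # subject B: integer(0)
which(c(0, 1, 0, 1, 1) == 1)  # subject E: 2 4 5

# cumsum() counts the events seen so far, so it is 0 before the first event
cumsum(c(0, 1, 0, 1, 1))      # subject E: 0 1 1 2 2

# Hypothetical helper: position of the first event in a group,
# or the last row of the group if the subject never has an event
first_event_row <- function(event) {
  hits <- which(event == 1)
  if (length(hits) == 0) length(event) else hits[1]
}

# Keep rows up to and including the first event for each subject
df3 <- df %>%
  group_by(subject) %>%
  filter(row_number() <= first_event_row(event)) %>%
  ungroup()
```

On the example data this should keep the same 16 rows as the cumsum() version. The cumsum() filter is shorter; the helper only makes the "no event means keep everything" rule explicit.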