How to aggregate time periods grouped by identifiers?

data <- read.table(text=
"ID1    ID2 From    To
12  127 20090701    20090703
12  127 20090704    20090711
12  127 20090707    20100831
12  127 20100901    99991231
18  880 19740401    20091129
18  880 20100608    99991231
12  127 20080102    20080305
12  127 20080306    20080329
12  128 20080620    20090204"
, header = TRUE)

I want to convert the data frame above into the following form:

result <- read.table(text=
"ID1    ID2 From    To
12  127 20080102    20080329
12  127 20090701    99991231
12  128 20080620    20090204
18  880 19740401    20091129
18  880 20100608    99991231"
, header = TRUE)

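In short, I want to group by ID1, ID2 and each unique time period during which the subject was continuously active (without even a one-day break). In effect this removes rows that are not needed: each continuous activity period should appear as a single row running from date 1 to date 2.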

Thanks for pointing me to a solution.

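First, convert the dates:

data$From <- as.Date(as.character(data$From), format = "%Y%m%d")
data$To <- as.Date(as.character(data$To), format = "%Y%m%d")

One approach I came up with is the following:

library(dplyr)
# Sort within each ID pair; Difference = 9999 flags the first row of every group as a period start
data <- data %>% arrange(ID1, ID2, From) %>% mutate(Difference = 9999)
marker <- 1  # row index of the period currently being extended
for (i in seq_len(nrow(data))[-1]) {
  if (data$ID1[i] != data$ID1[i - 1] || data$ID2[i] != data$ID2[i - 1]) {
    marker <- i  # new ID pair, so a new period starts here
  } else {
    # Gap in days between this row's start and the end of the current period
    data$Difference[i] <- as.numeric(data$From[i] - data$To[marker])
    if (data$Difference[i] > 1) {
      marker <- i  # a break of at least one day: start a new period
    } else if (data$To[i] > data$To[marker]) {
      data$To[marker] <- data$To[i]  # adjacent or overlapping: extend the current period
    }
  }
}
# Keep only the rows that start a period, then drop the helper column
data <- filter(data, Difference > 1)
data$Difference <- NULL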

Can anyone suggest a solution that does not rely on a for loop?
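
One loop-free possibility, offered as a sketch rather than a definitive answer: sort within each ID1/ID2 pair, compare every start date with the latest end date seen so far in that pair, and start a new period whenever the gap exceeds one day. The names `periods`, `prev_end`, `new_spell`, `spell` and `merged` are made up for this illustration, and the code assumes dplyr 1.0.0 or later for the `.groups` argument of `summarise()`.

```r
library(dplyr)

# Sample data from the question, re-created so the sketch runs on its own
periods <- read.table(text = "ID1 ID2 From To
12 127 20090701 20090703
12 127 20090704 20090711
12 127 20090707 20100831
12 127 20100901 99991231
18 880 19740401 20091129
18 880 20100608 99991231
12 127 20080102 20080305
12 127 20080306 20080329
12 128 20080620 20090204", header = TRUE)

merged <- periods %>%
  mutate(From = as.Date(as.character(From), format = "%Y%m%d"),
         To   = as.Date(as.character(To),   format = "%Y%m%d")) %>%
  arrange(ID1, ID2, From) %>%
  group_by(ID1, ID2) %>%
  mutate(
    # Latest end date among all earlier rows of the same ID pair, in days.
    # cummax() is not defined for Date objects, hence as.numeric().
    prev_end  = lag(cummax(as.numeric(To))),
    # A new period starts on the first row of a pair or after a gap of more than one day
    new_spell = is.na(prev_end) | as.numeric(From) - prev_end > 1,
    # Running count of period starts gives each period its own id within the pair
    spell     = cumsum(new_spell)
  ) %>%
  group_by(ID1, ID2, spell) %>%
  summarise(From = min(From), To = max(To), .groups = "drop") %>%
  select(-spell)

print(merged)
```

Using the running maximum of `To` instead of just the previous row's `To` matters when a long period is followed by a shorter one that lies entirely inside it; comparing only with the previous row would then split the period too early.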