Splitting ID based on values

Question

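I have a large data.frame with p columns and n rows.
I want to change the ID, i.e. split the frame whenever a value of 1 occurs. However, the value can occur more than once per ID, which makes it tricky.
I was thinking of building an order column, so that each time df$Value==1 the counter restarts: the next row gets df$Order==1, then 2, and so on until df$Value==1 occurs again.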

# Example data
df <- data.frame(ID= c(rep(1,3), rep(2,7), rep(3,5)),
             Value= c(0,0,1,
                      0,0,1,0,1,1,0,
                      0,0,1,0,1))

# Desired result
df <- data.frame(ID= c(rep(1,3), rep(2,3), rep(2.1,2), rep(2.2,1),rep(2.3,1), rep(3,3), rep(3.1,2)),
             Value= c(0,0,1,
                      0,0,1,
                      0,1,
                      1,
                      0,
                      0,0,1,
                      0,1))

# Alternative desired result
df <- data.frame(ID= c(rep(2,3), rep(2.1,2), rep(2.2,1),rep(2.3,1), rep(3,3), rep(3.1,2)),
             Value= c(0,0,1,
                      0,1,
                      1,
                      0,
                      0,0,1,
                      0,1))

I tried this:

df %>% group_by(ID) %>% mutate(Order= seq(from=Value[1], to=which(Value==1), by=1))

But it doesn't really give me what I want.
Any suggestions?

Answer 1

Here is an option using data.table:

library(data.table)
setDT(df)[, ID := seq(0, 1, by = 0.1)[shift(cumsum(Value==1), fill=0)+1] + ID, ID]

Or the same with dplyr:

library(dplyr)
df %>%
  group_by(ID) %>%
  mutate(ID1 = seq(0, 1, by = 0.1)[lag(cumsum(Value==1), default=0)+1] + ID) %>%
  ungroup() %>%
  mutate(ID = ID1) %>%
  select(-ID1)
# A tibble: 15 × 2
#      ID Value
#   <dbl> <dbl>
#1    1.0     0
#2    1.0     0
#3    1.0     1
#4    2.0     0
#5    2.0     0
#6    2.0     1
#7    2.1     0
#8    2.1     1
#9    2.2     1
#10   2.3     0
#11   3.0     0
#12   3.0     0
#13   3.0     1
#14   3.1     0
#15   3.1     1
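
To make the idea behind both versions easier to follow, here is a minimal base R sketch of the same logic, split into steps. It is an illustration under stated assumptions, not part of the original answer: the names seg and NewID are made up for this example, and it assumes no ID is split into more than 10 segments (the seq(0, 1, by = 0.1) lookup above has a similar limit, since it only holds 11 values).

```r
# Example data from the question
df <- data.frame(ID = c(rep(1, 3), rep(2, 7), rep(3, 5)),
                 Value = c(0, 0, 1,
                           0, 0, 1, 0, 1, 1, 0,
                           0, 0, 1, 0, 1))

# Within each ID, count how many 1s were seen *before* the current row.
# cumsum(v == 1) counts the 1s up to and including the row; shifting it
# down by one position keeps a row holding a 1 in the segment it closes.
# (seg is a hypothetical helper name used only in this sketch.)
seg <- ave(df$Value, df$ID,
           FUN = function(v) c(0, head(cumsum(v == 1), -1)))

# Turn the segment counter into a decimal suffix on the ID.
# Assumption: at most 10 segments per ID, otherwise suffixes would
# spill over into the next integer ID.
df$NewID <- df$ID + seg / 10

print(df)
```

Because the new IDs are floating-point numbers, comparing them with == can be unreliable; if the IDs are only used as labels, a character ID such as paste(df$ID, seg, sep = ".") may be a safer choice.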
