R: how to create a counter and print its value to a new column before every reset

Suppose I have the following data frame:

  df
R1 R2
0  0 
1  1
1  1
0  1
1  1 
0  0
0  1
1  0           
0  0
1  0
1  0
1  1        

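I would like to create a counter that counts the occurrences of "1" in each column separately, resets every time it meets a 0, and writes the count to a new column. That is, for column R1 it would reset at the first step, then count to 1, then count to 2, then reset, then count 1, then reset, reset, and so on. The count should appear only on the last 1 before each reset, so the desired output for column R1 is: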

  df
R1(Counted) 
N/A   
N/A  
2  
N/A  
1   
N/A  
N/A  
1             
N/A  
N/A  
N/A  
3          

I think I need something like this:

counter <- 0
for (i in seq_len(nrow(df))) {
  if (??? == 1) {
    counter <- counter + 1
  } else {
    counter <- 0
  }
}

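But I really have no experience with counters, and I don't know how to make it write its running count to a new column just before the counter resets, or anything along those lines.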

Any help is much appreciated.

I would do it like this:

a <- c(0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1)

b <- sequence(rle(as.character(a))$lengths)
b[a == 0] <- NA                 
b[!is.na(dplyr::lead(b))] <- NA # blank every position whose next value isn't NA, keeping only the end of each run

b
# NA NA  2 NA  1 NA NA  1 NA NA NA  3
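
To see why these three lines work, it can help to look at the intermediate values. The sketch below repeats the same steps, but swaps dplyr::lead for a base R shift (c(pos[-1], NA)); that substitution is mine and not part of the answer, and the names runs, pos and next_pos are made up for illustration.

```r
a <- c(0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1)

# Run-length encoding: lengths of consecutive identical values
runs <- rle(a)
runs$lengths
# 1 2 1 1 2 1 1 3

# Position of each element inside its own run
pos <- sequence(runs$lengths)
pos
# 1 1 2 1 1 1 2 1 1 1 2 3

# Positions inside runs of 0 are not counts of 1s, so blank them
pos[a == 0] <- NA

# Shift left by one to look at the following element
# (assumed base R stand-in for dplyr::lead)
next_pos <- c(pos[-1], NA)

# Keep a count only where a run of 1s ends, i.e. the next element is NA
pos[!is.na(next_pos)] <- NA
pos
# NA NA  2 NA  1 NA NA  1 NA NA NA  3
```

The last element always survives because nothing follows it, which is what gives the trailing run of three 1s its count.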

You can turn this into a function and lapply it over your data.frame to process every column at once if you have more than one column to handle, like this:

counter <- function(x){

  count <- sequence(rle(as.character(x))$lengths)
  count[x == 0] <- NA
  count[!is.na(dplyr::lead(count))] <- NA

  return(count)
}

df <- data.frame(
  R1 = sample(c(0, 1), 20, replace = TRUE, prob = c(0.2, 0.8)),
  R2 = sample(c(0, 1), 20, replace = TRUE, prob = c(0.7, 0.3))
)

df[paste0(names(df), '_ct')] <- lapply(df, counter)

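We can write a function with the help of data.table::rleid, which creates a new group at every change in value. Within each group, every value is turned into NA except a value of 1 that is the last element of its group.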

get_counter <- function(ct) {
   ave(ct, data.table::rleid(ct), FUN = function(x) 
           replace(seq_along(x), x != 1 | seq_along(x) != length(x), NA))
}

This function can be applied to multiple columns using lapply:

df[paste0("ct_", names(df))] <- lapply(df, get_counter)
df

#   R1 R2 ct_R1 ct_R2
#1   0  0    NA    NA
#2   1  1    NA    NA
#3   1  1     2    NA
#4   0  1    NA    NA
#5   1  1     1     4
#6   0  0    NA    NA
#7   0  1    NA     1
#8   1  0     1    NA
#9   0  0    NA    NA
#10  1  0    NA    NA
#11  1  0    NA    NA
#12  1  1     3     1

Data

df <- structure(list(R1 = c(0L, 1L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 
1L, 1L), R2 = c(0L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 1L
)), class = "data.frame", row.names = c(NA, -12L))
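
If data.table is not installed, the grouping that rleid provides can be approximated in base R. The sketch below is an assumption of mine rather than part of the answer: run_id and get_counter_base are hypothetical names, the run id is built with cumsum over value changes, and the input is assumed to be non-empty with no NA values.

```r
# Hypothetical base R stand-in for data.table::rleid:
# the id increases by one every time the value changes
run_id <- function(x) {
  cumsum(c(TRUE, x[-1] != x[-length(x)]))
}

# Same idea as get_counter, but without the data.table dependency
get_counter_base <- function(ct) {
  ave(ct, run_id(ct), FUN = function(x)
    replace(seq_along(x), x != 1 | seq_along(x) != length(x), NA))
}

R1 <- c(0L, 1L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 1L, 1L)

run_id(R1)
# 1 2 2 3 4 5 5 6 7 8 8 8

get_counter_base(R1)
# NA NA  2 NA  1 NA NA  1 NA NA NA  3
```

Because each run gets its own id, seq_along(x) inside ave restarts at 1 for every run, and only the final position of a run of 1s keeps its value.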

Here is a (rather convoluted) base R solution that uses one while loop per column (one for R1 and one for R2):

df <- data.frame(R1 = c(0,1,1,0,1,0,0,1,0,1,1,1), R2 = c(0,1,1,1,1,0,1,0,0,0,0,1))

#For R1
mycount <- 0
i <- 1
df$R1_counted <- NA
while(i <= nrow(df)){

  mycount <- mycount + df$R1[i]
  if(df$R1[i] == 0 & i == 1){
    df$R1_counted[i] <- NA
  } else if(df$R1[i] != 0 & i == 1){
    df$R1_counted[i] <- NA # leave empty for now; the run's count is written when the run ends
  }

  if(df$R1[i] == 0 & i > 1){
    df$R1_counted[i] <- NA
    if(df$R1[i-1] != 0){df$R1_counted[i-1] <- mycount}

    mycount <- 0
  } else if(df$R1[i] != 0 & i > 1){
    df$R1_counted[i] <- NA
  }

  if(i == nrow(df) & df$R1[i] != 0){
    df$R1_counted[i] <- mycount
  }

  i <- i + 1
}


#For R2
mycount <- 0
i <- 1
df$R2_counted <- NA
while(i <= nrow(df)){

  mycount <- mycount + df$R2[i]
  if(df$R2[i] == 0 & i == 1){
    df$R2_counted[i] <- NA
  } else if(df$R2[i] != 0 & i == 1){
    df$R2_counted[i] <- NA # leave empty for now; the run's count is written when the run ends
  }

  if(df$R2[i] == 0 & i > 1){
    df$R2_counted[i] <- NA
    if(df$R2[i-1] != 0){df$R2_counted[i-1] <- mycount}

    mycount <- 0
  } else if(df$R2[i] != 0 & i > 1){
    df$R2_counted[i] <- NA
  }

  if(i == nrow(df) & df$R2[i] != 0){
    df$R2_counted[i] <- mycount
  }

  i <- i + 1
}

df
#   R1 R2 R1_counted R2_counted
#1   0  0         NA         NA
#2   1  1         NA         NA
#3   1  1          2         NA
#4   0  1         NA         NA
#5   1  1          1          4
#6   0  0         NA         NA
#7   0  1         NA          1
#8   1  0          1         NA
#9   0  0         NA         NA
#10  1  0         NA         NA
#11  1  0         NA         NA
#12  1  1          3          1
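
The two loops above differ only in the column name, so the same logic could be wrapped in a function and applied to every column. The following is a minimal sketch along the lines of the counter loop from the question, not the answer author's code; count_runs is a hypothetical name, and it assumes the columns hold only 0 and 1 with no NA values.

```r
# Hypothetical helper: for each run of 1s, write the run length at the run's
# last position and NA everywhere else
count_runs <- function(x) {
  out <- rep(NA_integer_, length(x))
  counter <- 0L
  for (i in seq_along(x)) {
    if (x[i] == 1) {
      counter <- counter + 1L
      # A run ends at the last element or when the next value is not 1
      if (i == length(x) || x[i + 1] != 1) {
        out[i] <- counter
      }
    } else {
      counter <- 0L  # reset on 0
    }
  }
  out
}

df <- data.frame(R1 = c(0,1,1,0,1,0,0,1,0,1,1,1), R2 = c(0,1,1,1,1,0,1,0,0,0,0,1))
df[paste0(names(df), "_counted")] <- lapply(df, count_runs)

df$R1_counted
# NA NA  2 NA  1 NA NA  1 NA NA NA  3
df$R2_counted
# NA NA NA NA  4 NA  1 NA NA NA NA  1
```

Looking one element ahead (x[i + 1]) avoids having to go back and fill in the previous row after a reset, which is what makes the while-loop version harder to follow.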