Count repeated elements (variable length) in a vector with R

We know the vector will contain repeated elements, following the pattern

c("A","B","C","D")

but each repetition uses only a subset of this pattern, always starting at A and always in the same order.

A simple example is

c("A","A","B","A","A","B","A","B","C","D")

which can be broken down like this:

c("A",
"A","B",
"A",
"A","B",
"A","B","C","D")

I would like an output vector that gives the length of each repetition of the pattern:

c(1,2,1,2,4)

This will do it:

x1 <- c("A","A","B","A","A","B","A","B","C","D")
diff(c(which(x1 == "A"), length(x1)+1))
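
To see why the one-liner works, it can help to unpack it into its intermediate steps. The sketch below assumes, as the question states, that every repetition starts with "A"; the names starts, bounds and group_len are only illustrative.

```r
x1 <- c("A","A","B","A","A","B","A","B","C","D")

# Positions where a new repetition starts (every "A")
starts <- which(x1 == "A")           # 1 2 4 5 7

# Append a sentinel one past the end so the last repetition gets closed
bounds <- c(starts, length(x1) + 1)  # 1 2 4 5 7 11

# Differences between consecutive boundaries are the repetition lengths
group_len <- diff(bounds)            # 1 2 1 2 4
```

If the vector did not start with "A", the elements before the first "A" would not be counted, so this relies on the stated assumption.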

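I don't know whether you are interested in keeping the intermediate list of vectors. If not, @nicola's answer in the comments is the most elegant solution. If you are, then I think adapting this answer and taking the lengths afterwards could be useful: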

inp <- c("A","A","B","A","A","B","A","B","C","D")

# Split vec into pieces, starting a new piece at every element equal to sep
split.vec <- function(vec, sep = 0) {
    is.sep <- vec == sep
    split(vec, cumsum(is.sep))
}
out <- split.vec(inp, "A")
# Length of each piece
sapply(out, length)

# 1 2 3 4 5 
# 1 2 1 2 4
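
As a possible variation on the same cumsum idea, the group ids can be counted directly when the list of pieces is not needed, and lengths() can stand in for sapply(out, length) when it is. This is only a sketch; grp, counts, pieces and piece_len are illustrative names, and lengths() requires R 3.2.0 or later.

```r
inp <- c("A","A","B","A","A","B","A","B","C","D")

# Group id for every element: increases by one at each "A"
grp <- cumsum(inp == "A")     # 1 2 2 3 4 4 5 5 5 5

# Count how many elements fall in each group, without building the list
counts <- tabulate(grp)       # 1 2 1 2 4

# If the list of pieces is wanted anyway, lengths() returns the size of each element
pieces <- split(inp, grp)
piece_len <- lengths(pieces)  # named integer vector, names "1".."5"
```

Both counts and piece_len carry the same values as the diff()-based result; the only difference is whether the intermediate list is kept.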