Recognize a given pattern in a vector and add the missing elements to get the repetition of the given pattern
This question is related to "Wide a dataframe and insert missing columns".
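Suppose we have a given pattern of 5 elements in this order: "A", "B", "C", "D", "E".
This pattern is repeated 10 times, but some elements are sometimes missing (see the picture "My vector", in orange).
Is it possible in R to recognize the repeating pattern and fill in the missing elements (see the picture "My desired output")?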
My vector:
my.vector <- c("A", "B", "C", "D", "E", "A", "B", "C", "D", "E", "B", "C",
"D", "E", "B", "C", "D", "E", "B", "C", "D", "E", "B", "C", "D",
"E", "B", "C", "D", "E", "B", "C", "D", "E", "A", "B", "C", "D",
"E", "B")
my.vector
[1] "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "B" "C" "D" "E" "B" "C" "D" "E" "B" "C" "D" "E" "B" "C" "D" "E" "B" "C" "D" "E" "B" "C" "D" "E" "A" "B" "C" "D" "E" "B"
Illustration:
Given pattern:
My vector:
My desired output: the elements to be added are marked in red
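Create a grouping vector from the diff of the match() index against LETTERS[1:5], split on it (or use any grouping function such as tapply), take the union of each group with LETTERS[1:5], then unlist the list and unname the result.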
unname( unlist(lapply(split(my.vector, cumsum(c(TRUE,
diff(match(my.vector, LETTERS[1:5])) != 1))),
function(x) union(LETTERS[1:5], x))))
-output
[1] "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A"
[37] "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E"
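To make the grouping step easier to follow, the intermediate values can be inspected on a shorter input. The sketch below uses hypothetical names (short.vector, pattern, fill_pattern) that are not in the original answer. It also includes a variant, which is an assumption and not part of the answer above, that starts a new cycle whenever the matched index fails to increase (diff(idx) <= 0). That variant would also cope with an element missing from the middle of a cycle, whereas the diff(...) != 1 test splits such a cycle in two. Both versions assume every element of the input belongs to the pattern.

```r
pattern <- LETTERS[1:5]
# Hypothetical short input: a full cycle, a cycle missing "A", then a lone "B"
short.vector <- c("A", "B", "C", "D", "E", "B", "C", "D", "E", "B")

idx  <- match(short.vector, pattern)   # position of each element in the pattern
step <- diff(idx)                      # 1 while the pattern continues unbroken
grp  <- cumsum(c(TRUE, step != 1))     # group id increases at every break

# Same pipeline as the answer, written step by step
filled <- unname(unlist(lapply(split(short.vector, grp),
                               function(x) union(pattern, x))))

# Variant (assumption, not from the answer): a new cycle starts only when
# the index does not increase, so a gap inside a cycle does not split it
fill_pattern <- function(x, pattern) {
  idx <- match(x, pattern)
  n_cycles <- sum(c(TRUE, diff(idx) <= 0))
  rep(pattern, times = n_cycles)
}

# "C" is missing inside the first cycle; the input still has two cycles
gappy <- fill_pattern(c("A", "B", "D", "E", "B", "C"), pattern)
```

Here grp takes the values 1 1 1 1 1 2 2 2 2 3, so short.vector is completed to three full cycles, and gappy is completed to two.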
Another option is complete from tidyr:
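library(dplyr)
library(tidyr)
library(data.table)  # for rowid()
tibble(col1 = my.vector) %>%
  # rn counts the occurrences of each letter, so it serves as a cycle id
  group_by(rn = rowid(col1)) %>%
  # add the letters missing from each cycle
  complete(col1 = LETTERS[1:5]) %>%
  ungroup() %>%
  pull(col1)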
-output
[1] "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A"
[37] "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E"
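The grouping in this second approach relies on rowid() numbering each occurrence of a value, so the n-th "B" lands in group n. As a rough illustration, the same counter can be sketched in base R with ave(). The names x, occurrence, n_groups and completed are hypothetical, and the shortcut at the end assumes every element of the input belongs to the pattern.

```r
# Hypothetical short input: a full cycle, a cycle missing "A", then a lone "B"
x <- c("A", "B", "C", "D", "E", "B", "C", "D", "E", "B")

# Running count of each value, i.e. "this is the n-th time this letter appears"
occurrence <- ave(seq_along(x), x, FUN = seq_along)

# The most frequent letter determines how many groups there are
n_groups <- max(occurrence)

# Completing every group to the full pattern then amounts to repeating it
completed <- rep(LETTERS[1:5], times = n_groups)
```

Note that these groups are built from occurrence counts, not from positions in the vector, so a group does not have to coincide with one contiguous cycle of the input, even though the completed result is the same here.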