Replicate a data frame by group

Question

I have the following data frame:

df = structure(list(Group = c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 
3), index = c(1, 2, 3, 4, 1, 2, 3, 4, 5, 6, 1, 2, 3)), row.names = c(NA, 
-13L), class = c("tbl_df", "tbl", "data.frame"))

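I want to replicate the index column within each Group in two ways: first with each value repeated n times consecutively, and second with the whole sequence of values repeated n times as a block, where n is the size of the group (like rep with each versus rep with times).

So the output would look like this (showing only group 1, because the full output is too long):

First option: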

df = structure(list(Group = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1), index = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 
4, 4, 4)), row.names = c(NA, -16L), class = c("tbl_df", "tbl", 
"data.frame"))

Second option:

df = structure(list(Group = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1), index = c(1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 
2, 3, 4)), row.names = c(NA, -16L), class = c("tbl_df", "tbl", 
"data.frame"))

How can I do this with group_by?

Answer 1

You can use rep together with slice, like this:

library(dplyr)

Option 1:

df %>%
  group_by(Group) %>%
  slice(rep(seq_len(n()), each = n()))

Option 2:

df %>%
  group_by(Group) %>%
  slice(rep(seq_len(n()), n()))
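To see why the two calls differ, here is a minimal sketch of the row indices that slice would receive for a group of 4 rows. The variable names (n, idx_each, idx_times) are illustrative only and do not appear in the answer.

```r
# Row indices passed to slice() for a single group of size n = 4
n <- 4

# Option 1: each row index repeated n times in a row
idx_each <- rep(seq_len(n), each = n)

# Option 2: the whole sequence of row indices repeated n times
idx_times <- rep(seq_len(n), times = n)

print(idx_each)   # 1 1 1 1 2 2 2 2 3 3 3 3 4 4 4 4
print(idx_times)  # 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4
```

Because slice is evaluated per group after group_by, n() plays the role of n here for each group in turn.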

Answer 2

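You can combine do and lapply to replicate the whole group. The first call uses times (option 2), the second uses each (option 1):

df %>% group_by(Group) %>% 
  do(lapply(., rep, times = nrow(.)) %>% as.data.frame())
df %>% group_by(Group) %>% 
  do(lapply(., rep, each = nrow(.)) %>% as.data.frame())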
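Since do is marked as superseded in dplyr 1.0.0 and later, the same idea could probably be written with group_modify instead. This is only a sketch under that assumption, not part of the original answer; the data frame is rebuilt here so the block runs on its own, and the names opt1 and opt2 are illustrative.

```r
library(dplyr)

# Same data as in the question, rebuilt so the example is self-contained
df <- tibble(
  Group = c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3),
  index = c(1, 2, 3, 4, 1, 2, 3, 4, 5, 6, 1, 2, 3)
)

# Option 1: each row of a group repeated n times consecutively
opt1 <- df %>%
  group_by(Group) %>%
  group_modify(~ .x[rep(seq_len(nrow(.x)), each = nrow(.x)), ]) %>%
  ungroup()

# Option 2: the whole group repeated n times as a block
opt2 <- df %>%
  group_by(Group) %>%
  group_modify(~ .x[rep(seq_len(nrow(.x)), times = nrow(.x)), ]) %>%
  ungroup()
```

Inside group_modify, .x is the data for one group without the grouping column, and dplyr adds Group back to the result.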

Answer 3

We can use uncount, which repeats each row n times in place (option 1):

library(tidyverse)
df %>% 
  group_by(Group) %>% 
  uncount(n())
# A tibble: 61 x 2
# Groups:   Group [3]
#   Group index
#   <dbl> <dbl>
# 1     1     1
# 2     1     1
# 3     1     1
# 4     1     1
# 5     1     2
# 6     1     2
# 7     1     2
# 8     1     2
# 9     1     3
#10     1     3
# … with 51 more rows

Or use data.table (as written this gives option 2; use each = .N for option 1):

library(data.table)
setDT(df)[, .SD[rep(seq_len(.N), .N)], Group]

Or base R (again option 2 as written):

do.call(rbind, lapply(split(df, df$Group), 
       function(x) x[rep(seq_len(nrow(x)), nrow(x)),]))
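The base R approach can be wrapped in a small function that covers both options and checks the result. This is a sketch only: replicate_by_group and its block argument are hypothetical names introduced for illustration, not an existing API.

```r
# Same data as in the question, as a plain data.frame
df <- data.frame(
  Group = c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3),
  index = c(1, 2, 3, 4, 1, 2, 3, 4, 5, 6, 1, 2, 3)
)

# Hypothetical helper: repeat the rows of every group n times, where n is
# that group's size. block = TRUE gives option 2, block = FALSE option 1.
replicate_by_group <- function(data, group_col, block = TRUE) {
  pieces <- lapply(split(data, data[[group_col]]), function(x) {
    n <- nrow(x)
    rows <- if (block) rep(seq_len(n), times = n) else rep(seq_len(n), each = n)
    x[rows, , drop = FALSE]
  })
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL  # drop the "group.row" names that rbind creates
  out
}

opt1 <- replicate_by_group(df, "Group", block = FALSE)
opt2 <- replicate_by_group(df, "Group", block = TRUE)

# Sanity check: a group of size n contributes n^2 rows (16 + 36 + 9 here)
expected_rows <- sum(table(df$Group)^2)
stopifnot(nrow(opt1) == expected_rows, nrow(opt2) == expected_rows)
```

The row-count check works for any of the approaches in this thread, since every one of them should turn a group of n rows into n^2 rows.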
