Filling / re-expanding missing values in sequence, not imputation

I have a data frame like this:

library(dplyr)
df <- expand.grid(
    id = 1:3, 
    key = 1:10) %>%
    filter(!(id == 1 & key <= 4)) %>%
    filter(!(id == 2 & key %in% c(1:3, 6, 7, 10))) %>%
    filter(!(id == 3 & key %in% c(1, 2, 4, 5, 7:10))) %>%
    arrange(id, key) %>%
    cbind(value = c(10, 11, 15, 17, 20, 30, 1, 6, 8, 100, 0.2, 0.7))

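I want to fill / re-expand the integer sequence of key up to the maximum key value. The values can be NA (this is not an imputation problem).

So for id == 3 I want keys 1, 2, 3, 4, 5, 6, ... with value == NA wherever the key was missing.

Thanks in advance!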

Add one more step to your pipe, the ironically named complete:

library(dplyr)  # for %>%, filter() and arrange()
library(tidyr)  # for complete()
df <- expand.grid(
  id = 1:3, 
  key = 1:10) %>%
  filter(!(id == 1 & key <= 4)) %>%
  filter(!(id == 2 & key %in% c(1:3, 6, 7, 10))) %>%
  filter(!(id == 3 & key %in% c(1, 2, 4, 5, 7:10))) %>%
  arrange(id, key) %>%
  cbind(value = c(10, 11, 15, 17, 20, 30, 1, 6, 8, 100, 0.2, 0.7)) %>%
  complete(id, key)
#    id key value
# 1   1   3    NA
# 2   1   4    NA
# 3   1   5  10.0
# 4   1   6  11.0
# 5   1   7  15.0
# 6   1   8  17.0
# 7   1   9  20.0
# 8   1  10  30.0
# 9   2   3    NA
# 10  2   4   1.0

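Edit

complete(id, key) only crosses the key values that actually occur in the data (3 to 10 here). To go beyond the keys present in the data, give the range explicitly: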

complete(df, id, key = 1:10)
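
To make the difference concrete, here is a minimal sketch comparing the two calls. It assumes the question's original df (before any complete() step), written out directly with data.frame(); the names observed_only and full_range are only illustrative.

```r
library(tidyr)

# Same rows as the question's df, written out directly (assumption:
# this is the data before any complete() call has been applied)
df <- data.frame(
  id    = c(rep(1L, 6), rep(2L, 4), rep(3L, 2)),
  key   = c(5:10, 4L, 5L, 8L, 9L, 3L, 6L),
  value = c(10, 11, 15, 17, 20, 30, 1, 6, 8, 100, 0.2, 0.7)
)

# Crosses only the key values that occur somewhere in the data (3..10)
observed_only <- complete(df, id, key)

# Crosses every id with the full range 1..10
full_range <- complete(df, id, key = 1:10)

nrow(observed_only)  # 24: 3 ids x 8 observed keys
nrow(full_range)     # 30: 3 ids x 10 keys
```

Keys 1 and 2 never appear for any id, so only the second call creates rows for them.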

If you want a sequence from 1 up to the maximum key of each id:

library(dplyr)
library(tidyr)

df %>% group_by(id) %>% complete(key = seq(max(key)))
## Source: local data frame [25 x 3]
## Groups: id [3]
## 
##       id   key value
##    <int> <int> <dbl>
## 1      1     1    NA
## 2      1     2    NA
## 3      1     3    NA
## 4      1     4    NA
## 5      1     5    10
## 6      1     6    11
## 7      1     7    15
## 8      1     8    17
## 9      1     9    20
## 10     1    10    30
## # ... with 15 more rows
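
A small variation on the above, as a sketch rather than a definitive recipe: it again assumes the question's original df, uses seq_len() in place of seq(), and adds ungroup() so later steps do not run per id by accident. The name per_id is only illustrative.

```r
library(dplyr)
library(tidyr)

# Same rows as the question's df, written out directly (assumption:
# this is the data before any complete() call has been applied)
df <- data.frame(
  id    = c(rep(1L, 6), rep(2L, 4), rep(3L, 2)),
  key   = c(5:10, 4L, 5L, 8L, 9L, 3L, 6L),
  value = c(10, 11, 15, 17, 20, 30, 1, 6, 8, 100, 0.2, 0.7)
)

per_id <- df %>%
  group_by(id) %>%
  complete(key = seq_len(max(key))) %>%  # 1..max(key) within each id
  ungroup()                              # drop the grouping afterwards

# Sanity check: the row count for each id should equal its largest key
count(per_id, id)
```

With this data the counts come out as 10, 9 and 6 rows for ids 1, 2 and 3, which adds up to the 25 rows shown above.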