Filling missing values (NAs) backward and forward using another column as support

Suppose I have the following data:

input = tibble::tibble(
  group = c(rep("A", 5), rep("B", 5), rep("C", 5)),
  value = c(10, 15, 17, NA, NA, NA, NA, 12, 16, 13, 12, NA, 15, NA, 19),
  gr = c(0.1, 0.05, 0.03, 0.02, 0.05, 0.04, 0.02, 0.6, 0.03, 0.4, 0.01, 0.09, 0.05, -0.03, 0.04)
)

It looks like this:

> input
# A tibble: 15 x 3
   group value    gr
   <chr> <dbl> <dbl>
 1 A        10  0.1 
 2 A        15  0.05
 3 A        17  0.03
 4 A        NA  0.02
 5 A        NA  0.05
 6 B        NA  0.04
 7 B        NA  0.02
 8 B        12  0.6 
 9 B        16  0.03
10 B        13  0.4 
11 C        12  0.01
12 C        NA  0.09
13 C        15  0.05
14 C        NA -0.03
15 C        19  0.04

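I want to fill the missing values within each group using an auxiliary variable (gr in this case). The fill direction differs by group. For group A it should go forward, i.e. value_filled = lag(value) * (1 + gr). For group B it should go backward, i.e. value_filled = lead(value) / (1 + lead(gr)). For group C (where the missing values sit between observed values), the fill should go forward.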
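To make the two rules concrete, here is a minimal sketch in base R that applies them to one group at a time. The helper name `fill_with_growth` is hypothetical (it is not part of dplyr or any package), and the indexing is an assumption read off the numbers in this post: a forward fill uses the growth rate of the row being filled, while a backward fill divides by the growth rate of the following row (7.5 = 12 / 1.6).

```r
# Hypothetical helper, for illustration only: fill NAs in `value` from
# growth rates `gr`, first forward, then backward for any leading NAs.
fill_with_growth <- function(value, gr) {
  n <- length(value)
  if (n < 2) return(value)

  # Forward pass: value[i] = value[i - 1] * (1 + gr[i])
  for (i in 2:n) {
    if (is.na(value[i]) && !is.na(value[i - 1])) {
      value[i] <- value[i - 1] * (1 + gr[i])
    }
  }

  # Backward pass: value[i] = value[i + 1] / (1 + gr[i + 1])
  for (i in (n - 1):1) {
    if (is.na(value[i]) && !is.na(value[i + 1])) {
      value[i] <- value[i + 1] / (1 + gr[i + 1])
    }
  }

  value
}

group_a <- fill_with_growth(c(10, 15, 17, NA, NA), c(0.1, 0.05, 0.03, 0.02, 0.05))
group_b <- fill_with_growth(c(NA, NA, 12, 16, 13), c(0.04, 0.02, 0.6, 0.03, 0.4))
group_c <- fill_with_growth(c(12, NA, 15, NA, 19), c(0.01, 0.09, 0.05, -0.03, 0.04))
```

Unrounded, this gives 17.34 and 18.207 for group A, about 7.353 and 7.5 for group B, and 13.08 and 14.55 for group C.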

The desired output is:

desired_output = tibble::tibble(
  group = c(rep("A", 5), rep("B", 5), rep("C", 5)),
  value = c(10, 15, 17, NA, NA, NA, NA, 12, 16, 13, 12, NA, 15, NA, 19),
  gr = c(0.1, 0.05, 0.03, 0.02, 0.05, 0.04, 0.02, 0.6, 0.03, 0.4, 0.01, 0.09, 0.05, -0.03, 0.04),
  value_filled = c(10, 15, 17, 17.3, 18.2, 7.3, 7.5, 12, 16, 13, 12, 13, 15, 14.5, 19)
)
> desired_output
# A tibble: 15 x 4
   group value    gr value_filled
   <chr> <dbl> <dbl>        <dbl>
 1 A        10  0.1          10  
 2 A        15  0.05         15  
 3 A        17  0.03         17  
 4 A        NA  0.02         17.3
 5 A        NA  0.05         18.2
 6 B        NA  0.04         7.3
 7 B        NA  0.02         7.5
 8 B        12  0.6          12  
 9 B        16  0.03         16  
10 B        13  0.4          13  
11 C        12  0.01         12  
12 C        NA  0.09         13  
13 C        15  0.05         15  
14 C        NA -0.03         14.5
15 C        19  0.04         19 

I would like this to be done in dplyr fashion.

You could do:

library(tidyverse)
input %>%
  group_by(group) %>%
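  # Forward pass: an NA becomes previous value * (1 + current gr)
  mutate(v1 = unlist(accumulate2(value, tail(gr, -1), ~if(is.na(..2)) ..1*(1+..3) else ..2)),
  # Backward pass on the reversed vectors: a remaining NA becomes next value / (1 + next gr)
         v1 = rev(unlist(accumulate2(rev(v1), head(rev(gr), -1), ~if(is.na(..2)) ..1/(1+..3) else ..2))))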
# A tibble: 15 x 4
# Groups:   group [3]
   group value    gr    v1
   <chr> <dbl> <dbl> <dbl>
 1 A        10  0.1  10   
 2 A        15  0.05 15   
 3 A        17  0.03 17   
 4 A        NA  0.02 17.3 
 5 A        NA  0.05 18.2 
 6 B        NA  0.04  7.35
 7 B        NA  0.02  7.5 
 8 B        12  0.6  12   
 9 B        16  0.03 16   
10 B        13  0.4  13   
11 C        12  0.01 12   
12 C        NA  0.09 13.1 
13 C        15  0.05 15   
14 C        NA -0.03 14.6 
15 C        19  0.04 19
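
To see how the arguments line up, here is a small sketch that runs each pass separately on single groups, with named step functions in place of the `..1`/`..2`/`..3` formula shorthand. The names `step_forward` and `step_backward` are introduced here for illustration only. `accumulate2()` calls the function with the running result, the next element of the first vector, and the matching element of the second vector, which is why `gr` has to be one element shorter than `value`.

```r
library(purrr)

# Group A: trailing NAs, filled by the forward pass.
value_a <- c(10, 15, 17, NA, NA)
gr_a    <- c(0.1, 0.05, 0.03, 0.02, 0.05)

# acc = previous (possibly filled) value, x = current value, y = current gr
step_forward <- function(acc, x, y) if (is.na(x)) acc * (1 + y) else x

# tail(gr_a, -1) drops the first growth rate, so y pairs with value_a[2], value_a[3], ...
forward_a <- unlist(accumulate2(value_a, tail(gr_a, -1), step_forward))

# Group B: leading NAs, filled by the backward pass on reversed vectors.
value_b <- c(NA, NA, 12, 16, 13)
gr_b    <- c(0.04, 0.02, 0.6, 0.03, 0.4)

# acc = following value, x = current value, y = gr of the following row
step_backward <- function(acc, x, y) if (is.na(x)) acc / (1 + y) else x

# head(rev(gr_b), -1) drops the first row's growth rate, which is never needed
backward_b <- rev(unlist(accumulate2(rev(value_b), head(rev(gr_b), -1), step_backward)))
```

In the forward pass a leading NA stays NA, because NA * (1 + y) is still NA. This is why the second `mutate()` step is needed to fill group B.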