Incrementally increasing date using lag function for missing dates only

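In the example below, I want to derive the start dates that are missing, and only those, from the previous row's date plus the integer number of days carried in the effort variable.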

# Libraries
library("tidyverse")
library("lubridate")

work_start_date <- dmy("2/11/2020")

dta_tasks <- tribble(
  ~task_no, ~task,      ~effort,
  1,        "Task 1",   NA,
  1.1,      "Task 1.1", 1,
  1.3,      "Task 1.3", 1,
  1.4,      "Task 1.4", 2,
  1.5,      "Task 1.5", 1,
  2,        "Task 2",   NA,
  2.1,      "Task 2.1", 2
)

dta_tasks %>%
  arrange(task_no) %>%
  mutate(start_date = if_else(row_number() == 1, work_start_date, NA_Date_),
         start_date = if_else(is.na(start_date), lag(start_date) + days(effort), start_date))

Desired results

  task_no task     effort start_date
    <dbl> <chr>     <dbl> <date>
1     1   Task 1       NA 2020-11-02
2     1.1 Task 1.1      1 2020-11-03
3     1.3 Task 1.3      1 2020-11-04
4     1.4 Task 1.4      2 2020-11-06
5     1.5 Task 1.5      1 2020-11-07
6     2   Task 2       NA 2020-11-08
7     2.1 Task 2.1      2 2020-11-08  # For NA it has to skip the value

Elaboration

In the context of the code below, I want to replace Sys.Date() with the previously computed date.

dta_tasks %>%
  arrange(task_no) %>%
  mutate(
    start_date = if_else(row_number() == 1, work_start_date, NA_Date_),
    start_date = if_else(is.na(start_date), Sys.Date() + days(effort), start_date)
  )

Try this:

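dta_tasks %>%
  arrange(task_no) %>%
  # Treat a missing effort as zero days so the running total is never NA
  mutate(effort_no_na = pmax(effort, 0, na.rm = TRUE)) %>%
  mutate(cum_effort = cumsum(effort_no_na),
         # Offset from the overall start by the cumulative effort
         start_date = work_start_date + days(cum_effort),
         # Blank out rows with no effort so fill() can overwrite them
         start_date = if_else(is.na(effort), NA_Date_, start_date)) %>%
  # Rows with missing effort take the date of the next task below them
  fill(start_date, .direction = "up")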

The idea is to use cumsum to keep track of the total effort since the start. Because of the NAs, there is a fair amount of bookkeeping.
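To make the cumulative-sum idea easier to follow, here is a minimal base R sketch of the same bookkeeping on plain vectors. The start date and effort values are copied from the question. Treating a missing effort as zero days is an assumption of this sketch, and it leaves out the fill() step, so a row with NA effort simply keeps the running date reached so far.

```r
# Start date and effort values copied from the question's example
work_start_date <- as.Date("2020-11-02")
effort <- c(NA, 1, 1, 2, 1, NA, 2)

# Assumption of this sketch: a missing effort contributes zero days
effort_no_na <- ifelse(is.na(effort), 0, effort)

# Running total of effort since the start: 0 1 2 4 5 5 7
cum_effort <- cumsum(effort_no_na)

# Adding a number to a Date adds that many days
start_date <- work_start_date + cum_effort

# Show the intermediate columns side by side
print(data.frame(effort, effort_no_na, cum_effort, start_date))
```

Because each date is computed from the overall start rather than from the previous row, no row-by-row loop or lag() is needed. Whether rows with NA effort should keep the running date, as here, or take the date of the next task depends on what you want the heading rows to show.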