Incrementally increasing a date using the lag function, for missing dates only
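In the example below, I want to derive the missing dates, and only the missing dates, from the previous row's date and the integer value carried in the effort variable.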
# Libraries
library("tidyverse")
library("lubridate")

work_start_date <- dmy("2/11/2020")

dta_tasks <- tribble(
  ~task_no, ~task,      ~effort,
  1,        "Task 1",   NA,
  1.1,      "Task 1.1", 1,
  1.3,      "Task 1.3", 1,
  1.4,      "Task 1.4", 2,
  1.5,      "Task 1.5", 1,
  2,        "Task 2",   NA,
  2.1,      "Task 2.1", 2
)

dta_tasks %>%
  arrange(task_no) %>%
  mutate(start_date = if_else(row_number() == 1, work_start_date, NA_Date_),
         start_date = if_else(is.na(start_date), lag(start_date) + days(effort), start_date))
Desired result
task_no task effort start_date
<dbl> <chr> <dbl> <date>
1 1 Task 1 NA 2020-11-02
2 1.1 Task 1.1 1 2020-11-03
3 1.3 Task 1.3 1 2020-11-04
4 1.4 Task 1.4 2 2020-11-06
5 1.5 Task 1.5 1 2020-11-07
6 2 Task 2 NA 2020-11-08
7 2.1 Task 2.1 2 2020-11-08 # For NA it has to skip value
Elaboration
In the context of the code below, I want to replace Sys.Date() with the previously calculated date.
dta_tasks %>%
  arrange(task_no) %>%
  mutate(
    start_date = if_else(row_number() == 1, work_start_date, NA_Date_),
    start_date = if_else(is.na(start_date), Sys.Date() + days(effort), start_date)
  )
Try this:
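dta_tasks %>%
  arrange(task_no) %>%
  # Treat a missing effort as zero days
  mutate(effort_no_na = pmax(effort, 0, na.rm = TRUE)) %>%
  # Offset each row from the start date by the running total of effort
  mutate(cum_effort = cumsum(effort_no_na),
         start_date = work_start_date + days(cum_effort),
         # Blank out rows that have no effort of their own ...
         start_date = if_else(is.na(effort), NA_Date_, start_date)) %>%
  # ... and fill them from the next dated row below
  fill(start_date, .direction = "up")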
The idea is to use cumsum to keep track of the total effort since the start. Because of the NAs, there is a fair amount of bookkeeping.
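To make the cumsum bookkeeping concrete, here is a minimal base R sketch of just that step. It assumes an NA effort should count as zero days; the plain vectors stand in for the columns of dta_tasks and are only for illustration, and it does not try to reproduce the desired table exactly.

```r
# Effort values from the question; NA marks a row with no effort of its own.
effort <- c(NA, 1, 1, 2, 1, NA, 2)
work_start_date <- as.Date("2020-11-02")

# Assumption: a missing effort contributes zero days to the running total.
effort_no_na <- ifelse(is.na(effort), 0, effort)

# Running total of effort since the start, one value per row.
cum_effort <- cumsum(effort_no_na)

# Each row's date is the start date shifted by the cumulative effort (in days).
start_date <- work_start_date + cum_effort

print(data.frame(effort, cum_effort, start_date))
```

Under this assumption a row with NA effort leaves the running total unchanged, so it gets the same date as the row before it. Whether such rows should instead borrow the date of the following row (as fill(..., .direction = "up") does) depends on how the heading tasks are meant to be scheduled.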