Calculate elapsed days from a non-sequential table
I have a table of refuelings, a, like this:
a = setDT(structure(list(date = structure(c(NA, 16837, 16843, 16847, 16852,
16854, 16858, 16862, 16867, 16871, 16874), class = "Date"), km = c(NA,
NA, 421, 351, 286, 350, 414, 332, 401, 321, 350)), .Names = c("date",
"km"), class = c("data.table", "data.frame"), row.names = c(NA,
-11L)), key = "date")
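It holds the refueling dates and the kilometers driven on each refueling. I also have a separate table, actions, with the dates of tire-pressure adjustments and oil changes, like this: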
actions = setDT(structure(list(date = structure(c(16841, 16843, 16858, 16869), class = "Date"),
action = structure(c(1L, 2L, 2L, 2L), .Label = c("oil", "tires"
), class = "factor")), .Names = c("date", "action"), row.names = c(NA,
-4L), class = c("data.table", "data.frame")), key = "action")
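I need to relate fuel consumption (the real version of a also has gallons) to the number of days elapsed since the last tire-pressure check and since the last oil change. There must be a simple way to do this, but after several hours of trying I am stuck.
Here is what I tried: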
library(data.table)
library(lubridate)
library(reshape2)
b <- dcast(actions, date ~ action, value.var = "date")
d <- seq(min(a$date, b$date, na.rm = TRUE), max(a$date, b$date, na.rm = TRUE), by = "day")
d <- data.table(date=d)
d <- b[d,]
d$daysOil <- as.double(difftime(d$date, d$date[! is.na(d$oil)], units = "days"))
d$daysOil[which(d$daysOil < 0)] <- NA
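It gets more complicated when I try to calculate the days elapsed since the last "tires" event (that is, the closest "tires" event on or before the refueling date), and this is where I am stuck.
My expected output is: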
expected
date km daysoil daysTires
1 <NA> NA NA NA
2 2016-02-06 NA NA NA
3 2016-02-12 421 2 0
4 2016-02-16 351 6 4
5 2016-02-21 286 11 9
6 2016-02-23 350 13 11
7 2016-02-27 414 17 0
8 2016-03-02 332 21 4
9 2016-03-07 401 26 9
10 2016-03-11 321 30 2
11 2016-03-14 350 33 5
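I would appreciate any solution, but preferably one using the data.table or dplyr packages.
########## EDIT ##########
If you can think of a better structure for the information (tables) to accomplish this task, that would be much appreciated!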
Here is one option:
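# Keep a copy of the action date: inside the join, `date` takes the refueling date from a
actions[, date.copy := date]
# Per action type, roll the latest action on or before each refueling date forward,
# then reshape so that each action type becomes its own column
cbind(a,
      dcast(actions[, .SD[a, .(days = date - date.copy, N = .I), roll = TRUE, on = 'date']
                    , by = action],
            N ~ action, value.var = 'days'))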
# date km N oil tires
# 1: <NA> NA 1 NA days NA days
# 2: 2016-02-06 NA 2 NA days NA days
# 3: 2016-02-12 421 3 2 days 0 days
# 4: 2016-02-16 351 4 6 days 4 days
# 5: 2016-02-21 286 5 11 days 9 days
# 6: 2016-02-23 350 6 13 days 11 days
# 7: 2016-02-27 414 7 17 days 0 days
# 8: 2016-03-02 332 8 21 days 4 days
# 9: 2016-03-07 401 9 26 days 9 days
#10: 2016-03-11 321 10 30 days 2 days
#11: 2016-03-14 350 11 33 days 5 days
The above combines a few simple steps; run it piece by piece to understand it.
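To make the pieces easier to follow, here is a minimal sketch that does one rolling join per action type instead of the grouped one-liner. It is only an illustration under some assumptions: the data is rebuilt from the dates shown in the expected output, and the names last_date, res, days_oil and days_tires are made up for this example rather than taken from the original post.

```r
library(data.table)

# Same data as in the question, rebuilt with explicit dates
a <- data.table(
  date = as.Date(c(NA, "2016-02-06", "2016-02-12", "2016-02-16", "2016-02-21",
                   "2016-02-23", "2016-02-27", "2016-03-02", "2016-03-07",
                   "2016-03-11", "2016-03-14")),
  km = c(NA, NA, 421, 351, 286, 350, 414, 332, 401, 321, 350)
)
actions <- data.table(
  date = as.Date(c("2016-02-10", "2016-02-12", "2016-02-27", "2016-03-09")),
  action = c("oil", "tires", "tires", "tires")
)

# Step 1: copy the action date (hypothetical column name), because the join
# column `date` will carry the refueling date once the tables are joined
actions[, last_date := date]

res <- copy(a)
for (act in c("oil", "tires")) {
  # Step 2: rolling join. For every row of a, take the most recent action of
  # this type dated on or before the refueling date (NA when there is none)
  last <- actions[action == act][a, last_date, on = "date", roll = TRUE]
  # Step 3: elapsed days between the refueling and that action
  set(res, j = paste0("days_", act), value = as.integer(res$date - last))
}

print(res)
```

With this data, days_oil and days_tires should line up with the daysoil and daysTires columns of the expected output. Looping over the action types avoids the dcast step, at the cost of one join per type.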