How to add values row-wise in a grouped column
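I have some sensor data with 100 entries per second. The last column is milliseconds, and every value is currently 10. How can I add up the milliseconds row by row, grouped by time and date?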
testdata <- structure(list(local_date = c("26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", 
"26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017"),
local_time = c("13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", 
"13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24" ),
ms = c(10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10)),
.Names = c("local_date", "local_time", "ms"), row.names = c(NA, -200L), class = c("data.table", "data.frame"))
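The first 100 rows all share the same time (13:58:23) and date (26-06-2017), but each of them has 10 ms. The result should have one entry for every 10 ms within a second, with each subsequent ms value added to the ones before it.
This snippet creates the desired result as a sequence: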
testdata$ms = rep(seq(from = 10, to = 1000, by = 10), 2)
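But since the raw data is not that clean, I have to group the data by date and time and then add up the milliseconds row by row.
I would prefer a data.table solution, but dplyr is fine too.
Sounds like you need a grouped cumsum: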
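library(dplyr)
# ms2 reproduces the sequence from the question, for comparison only
testdata$ms2 = rep(seq(from = 10, to = 1000, by = 10), 2)
testdata %>%
  group_by(local_date, local_time) %>%
  mutate(cumsum_ms = cumsum(ms))  # running total of ms within each date/time group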
local_date local_time ms ms2 cumsum_ms
<chr> <chr> <dbl> <dbl> <dbl>
1 26-06-2017 13:58:23 10 10 10
2 26-06-2017 13:58:23 10 20 20
3 26-06-2017 13:58:23 10 30 30
4 26-06-2017 13:58:23 10 40 40
5 26-06-2017 13:58:23 10 50 50
And adding a data.table version:
testdata[, ms := cumsum(ms), by = .(local_time, local_date)]
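The one-liner above overwrites ms in place. As a minimal sketch of the same grouped cumsum on a small table, the following keeps the original column and writes the running total to a new one. The table dt and the column name cumsum_ms are made up for illustration and are not part of the original data.

```r
library(data.table)

# Hypothetical miniature of the sensor data: two seconds, three readings each
dt <- data.table(
  local_date = rep("26-06-2017", 6),
  local_time = rep(c("13:58:23", "13:58:24"), each = 3),
  ms         = rep(10, 6)
)

# Running total of ms within each (date, time) group, stored in a new column
dt[, cumsum_ms := cumsum(ms), by = .(local_date, local_time)]

print(dt)
```

Within each second cumsum_ms runs 10, 20, 30 and then restarts for the next second. If the table was rebuilt from dput output via structure(), as in the question, calling setDT(testdata) first is probably a sensible precaution, since := may otherwise warn about the table's internal reference.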