Aggregate data in an xts object by time and a second (factor) variable
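I'm new to time series analysis and to xts (and to R in general), so please excuse the basic nature of the question.
I want to aggregate my data by a time period (e.g. month) and by a second, factor variable. To illustrate my question, see the following: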
require(xts)
# Create example df and convert it to an xts object
date <- sample(seq(as.Date("2015/01/01"), as.Date("2016/12/31"), by="day"),12)
colour <- c("Red", "Red", "Blue", "Blue", "Blue", "Blue", "Red", "Red", "Red",
"Red", "Blue", "Blue")
value <- sample(1:10, 12, replace = TRUE)
df <- cbind.data.frame(date, colour, value)
df <- xts(df[,-1], order.by = df$date)
This creates an example xts object that looks like this:
colour value
2015-01-30 "Blue" "2"
2015-03-15 "Blue" "9"
2015-03-22 "Blue" "9"
2015-08-13 "Blue" "5"
2015-09-01 "Blue" "8"
2015-11-10 "Red" "7"
2016-04-26 "Blue" "2"
2016-07-06 "Red" "9"
2016-07-07 "Red" "6"
2016-07-08 "Red" "2"
2016-10-01 "Red" "6"
2016-11-07 "Red" "2"
I can summarise the "value" variable using:
apply.monthly(df$value, FUN = mean)
which gives me:
value
2015-01-30 2.000000
2015-03-22 9.000000
2015-08-13 5.000000
2015-09-01 8.000000
2015-11-10 7.000000
2016-04-26 2.000000
2016-07-08 5.666667
2016-10-01 6.000000
2016-11-07 2.000000
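But I can't work out how to summarise by the colour variable as well (I'd like the sum of each colour by month). Any help would be much appreciated.
How about this?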
aggregate(as.numeric(df$value),
list(Month = format(index(df), "%Y-%m"),
Colour = df$colour),
mean)
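Since the question asks for sums rather than means, the same idea can be sketched on a plain data frame. An xts object stores its data in a single-typed matrix, so keeping the character colour column next to value turns value into text; working from the data frame avoids that. The data frame `dat` and its values below are made up for illustration (they are not the asker's random sample), and `monthly_sum` is a hypothetical name.

```r
# Hypothetical fixed data so the result is reproducible
dat <- data.frame(
  date   = as.Date(c("2016-07-06", "2016-07-07", "2016-07-08",
                     "2016-07-20", "2016-10-01")),
  colour = c("Red", "Red", "Red", "Blue", "Red"),
  value  = c(9, 6, 2, 4, 6)
)

# Derive a "YYYY-MM" grouping key from the dates
dat$month <- format(dat$date, "%Y-%m")

# Sum value within each month/colour combination
monthly_sum <- aggregate(value ~ month + colour, data = dat, FUN = sum)
print(monthly_sum)
```

Swapping `sum` for `mean` gives the per-month, per-colour averages instead.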
In reply to your comment below:
# You can replace the format with the following to get a year month object
zoo::as.yearmon(index(df))
# Or you can convert to a Date by using the first of every month
as.Date(paste(format(index(df), "%Y-%m"), "-01", sep = ""))
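You may find more ideas here: Converting year and month ("yyyy-mm" format) to a date in R?
If you want to work with xts objects after subsetting by colour, it is easy to process each time series (one per colour) separately in a list, as follows: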
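As a small sketch of the two conversions above, assuming the zoo package is installed: a `yearmon` value can itself be turned into a Date, which by default falls on the first day of the month. The `dates` vector is invented for illustration.

```r
library(zoo)

# Hypothetical dates, used only to illustrate the conversion
dates <- as.Date(c("2015-01-30", "2015-03-15", "2016-07-08"))

# Year-month objects: sort chronologically and work as a grouping key
ym <- as.yearmon(dates)

# Back to Date, landing on the first day of each month
first_of_month <- as.Date(ym)
print(first_of_month)
```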
df <- cbind.data.frame(date, colour, value)
class(df)
# [1] "data.frame"
# data.frame split (not xts split) to separate data by colour in a list object:
l_out <- split(df, colour)
class(l_out[[1]])
# [1] "data.frame"
mthly_mean <- function(x) {
apply.monthly(as.xts(x[, "value"], x[, "date"]), mean)
}
# Each element in the list is an xts object (not a data.frame) containing the mean of the data for each month:
l_res <- lapply(l_out, FUN = mthly_mean)
# or more succinctly:
# l_res <- lapply(l_out, FUN = function(x) apply.monthly(as.xts(x[, "value"], x[, "date"]), mean))
l_res
# $Blue
# [,1]
# 2015-01-15 8.0
# 2015-07-21 4.5
# 2016-01-28 5.0
# 2016-04-28 4.0
# 2016-05-08 2.0
#
# $Red
# [,1]
# 2015-11-30 3
# 2016-01-18 7
# 2016-02-25 5
# 2016-04-17 1
# 2016-05-23 6
# 2016-07-14 5
class(l_res[[1]])
# [1] "xts" "zoo"
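The same split-and-apply pattern can be adapted to the monthly sums the question asks for. This is a minimal sketch on made-up data; `dat`, `by_colour`, `monthly_sum` and `res` are hypothetical names, not part of the answer above.

```r
library(xts)

# Hypothetical fixed data so the result is reproducible
dat <- data.frame(
  date   = as.Date(c("2016-07-06", "2016-07-07", "2016-07-20", "2016-10-01")),
  colour = c("Red", "Red", "Blue", "Red"),
  value  = c(9, 6, 4, 6)
)

# One data frame per colour
by_colour <- split(dat, dat$colour)

# Build a numeric xts from each piece and sum it month by month
monthly_sum <- function(x) {
  apply.monthly(xts(x$value, order.by = x$date), sum)
}

res <- lapply(by_colour, monthly_sum)
print(res)
```

Each element of `res` is an xts object stamped with the last observed date of each month for that colour, so the indexes of different colours will not generally line up.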