Convert date to month/year format for time series

I have some water quality sample data.

> dput(GrowingArealog90s[1:10,])
structure(list(SampleDate = structure(c(6948, 6949, 6950, 7516, 
7517, 7782, 7783, 7784, 8092, 8106), class = "Date"), Flog90 =  c(1.51851393987789, 
1.48970743802793, 1.81243963000062, 0.273575501327576, 0.874218895695207, 
1.89762709129044, 1.44012088794774, 0.301029995663981, 1.23603370361931, 
0.301029995663981)), .Names = c("SampleDate", "Flog90"), class = c("tbl_df", 
"data.frame"), row.names = c(NA, -10L))

The data was collected monthly over a 25-year period, although some months were missed.

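I know there is a lot of help available on converting dates to different formats, but I have not been able to figure it out. I want to create a time series with just a month/year format so that I can do things like decompose the data by month and run seasonal Kendall tests. I have tried so many different ways of converting my dates to the required format that I have thoroughly confused myself. I don't care about the exact format, as long as it is recognized as month/year.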

I also need to fill in the missing months with NA.

I tried loading the "SampleDate" column in the numeric format "yyyymm". I could then merge that data frame with another data frame containing all the dates I need.

GA90 <- merge(Dates, GrowingArealog90s, by.x = "Date", by.y = "Date", all.x = TRUE)

However, when I convert the resulting data frame to a time series, it does not recognize the frequency of 12 months.

 GA90ts <- as.ts(GA90, frequency(12))

> GA90ts
Time Series:
Start = 1 
End = 324 
Frequency = 1 

Any help would be appreciated.

Here is how to do it with zoo. You will get a warning, but that is fine for now. You will end up with a mon/yy index.

The series:
series <-structure(list(SampleDate = structure(c(6948, 6949, 6950, 7516,
7517, 7782, 7783, 7784, 8092, 8106), class = "Date"), Flog90 =  c(1.51851393987789,
1.48970743802793, 1.81243963000062, 0.273575501327576, 0.874218895695207,
1.89762709129044, 1.44012088794774, 0.301029995663981, 1.23603370361931,
0.301029995663981)), .Names = c("SampleDate", "Flog90"), class = c("tbl_df",
"data.frame"), row.names = c(NA, -10L))

library(zoo)
series <-as.data.frame(series) #to drop dplyr class
series.zoo <-zoo(series[,-1,drop=FALSE],as.yearmon(series[,1]))
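
To see what as.yearmon does to the dates, and why zoo is likely to warn here, a minimal sketch can help. It uses the first five sample dates from the question; the variable names dates and ym are only illustrative. Several samples fall in the same month, so the month/year index is not unique, which is presumably what the warning is about.

```r
library(zoo)

# First five sample dates from the question (days since 1970-01-01)
dates <- as.Date(c(6948, 6949, 6950, 7516, 7517), origin = "1970-01-01")

# Collapse each date to its month/year
ym <- as.yearmon(dates)

# A locale-independent rendering of the month/year index
format(ym, "%Y-%m")

# TRUE: three of the samples share January 1989, so the index has duplicates
anyDuplicated(ym) > 0
```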

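A best practice is to keep your series indexed by the actual dates and to use as.yearmon only when you actually need it for a calculation, or to use aggregate.zoo to compute by month and year.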

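The following is a matter of personal preference, but I have worked with a lot of time series and I find zoo superior to ts and xts. It is more flexible.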

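Now, to fill in the missing values, you have to create a vector of dates. Here I use a zoo object indexed by the actual dates. Then I use na.locf, which stands for "last observation carried forward". You could also look at na.approx.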

series.zoo <-zoo(series[,-1,drop=FALSE],(series[,1]))
#first()/last() come from xts, which is not loaded here, so index the Date column directly
my.seq <-seq.Date(series[1,1], series[nrow(series),1],by="month")
merged <-merge.zoo(series.zoo,zoo(,my.seq))
na.locf(merged)
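
The difference between the two fill strategies mentioned above can be sketched on a toy series. The values and the name z are hypothetical and not taken from the question's data.

```r
library(zoo)

# Toy monthly series (hypothetical values) with two missing months in the middle
z <- zoo(c(1, NA, NA, 4), as.yearmon(2000 + 0:3 / 12))

# Last observation carried forward: 1, 1, 1, 4
filled_locf <- na.locf(z)

# Linear interpolation between the neighbouring observations: 1, 2, 3, 4
filled_approx <- na.approx(z)
```

If the missing months should stay NA, as the question asks, the merged object can be used as it is, without calling na.locf or na.approx.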

Update

With aggregation:

GrowingArealog90s <-structure(list(SampleDate = structure(c(6948, 6949, 6950, 7516,
7517, 7782, 7783, 7784, 8092, 8106), class = "Date"), Flog90 =  c(1.51851393987789,
1.48970743802793, 1.81243963000062, 0.273575501327576, 0.874218895695207,
1.89762709129044, 1.44012088794774, 0.301029995663981, 1.23603370361931,
0.301029995663981)), .Names = c("SampleDate", "Flog90"), class = c("tbl_df",
"data.frame"), row.names = c(NA, -10L))

library(zoo);library(xts)
GrowingArealog90s <-as.data.frame(GrowingArealog90s) #to remove dplyr format
GrowingArealog90s.zoo <-zoo(GrowingArealog90s[,-1,drop=FALSE],as.Date(GrowingArealog90s[,1]))

#First aggregate by month. I chose to get the mean per month
GrowingArealog90s.agg <-aggregate(GrowingArealog90s.zoo, as.yearmon, mean) #replace mean with last to get last reading of the month

#Then create a sequence of months and merge it
my.seq <-seq.Date(first(GrowingArealog90s[,1]), last(GrowingArealog90s[,1]),by="month")
merged <-merge.zoo(GrowingArealog90s.agg ,zoo(,as.yearmon(my.seq)))
na.locf(merged)
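
If a plain ts object with a frequency of 12 is what the downstream functions need, the following base R sketch is one way to get there. It is not the zoo approach above: df is a small hypothetical stand-in for the question's data, and ym, monthly, all_months and full are illustrative names. The idea is to average within each month, merge against a complete sequence of months so the gaps become NA, and pass frequency = 12 to ts() by name.

```r
# Hypothetical stand-in for the question's data frame
df <- data.frame(
  SampleDate = as.Date(c("1989-01-09", "1989-01-10", "1990-07-31", "1990-08-01")),
  Flog90 = c(1.5, 1.7, 0.3, 0.9)
)

# Month key as "YYYY-MM"; sorting it as text is also chronological order
df$ym <- format(df$SampleDate, "%Y-%m")

# One value per month (the mean here; another summary could be used instead)
monthly <- aggregate(Flog90 ~ ym, data = df, FUN = mean)
monthly$ym <- as.character(monthly$ym)

# Every month between the first and the last sampled month
first_month <- as.Date(paste0(min(monthly$ym), "-01"))
last_month <- as.Date(paste0(max(monthly$ym), "-01"))
all_months <- data.frame(
  ym = format(seq(first_month, last_month, by = "month"), "%Y-%m"),
  stringsAsFactors = FALSE
)

# Left join: months without a sample get NA
full <- merge(all_months, monthly, by = "ym", all.x = TRUE)

# Start is c(year, month); frequency must be passed as a named argument
start_ym <- as.integer(strsplit(as.character(full$ym[1]), "-")[[1]])
GA90ts <- ts(full$Flog90, start = start_ym, frequency = 12)
frequency(GA90ts)
```

Note that the call in the question, as.ts(GA90, frequency(12)), passes the result of frequency(12) as an unnamed argument rather than setting frequency = 12, which is consistent with the reported Frequency = 1.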