ts() frequency for a yearly data series of 30-minute observations

I want to create a ts() object from a data frame in order to forecast a physical phenomenon.

My data are sampled every 30 minutes over one year (2018-01-01 to 2018-12-31), and I have observed that they show a daily (1-day) seasonality.

> head(pleiadesGH.v2[,c("time", "humExt.R", "tempExt", "radExt", "vientoVelo")])
                 time humExt.R  tempExt    radExt vientoVelo
1 2018-01-01 00:00:00       NA       NA        NA         NA
2 2018-01-01 00:30:00 36.78287 16.95125 -10.08125    3.68550
3 2018-01-01 01:00:00 38.56775 16.26350  -9.75000    2.38420
4 2018-01-01 01:30:00 38.76425 15.63470 -10.08125    2.71915
5 2018-01-01 02:00:00 39.61575 15.32030 -10.41250    3.70475
6 2018-01-01 02:30:00 37.48700 15.06485 -10.74375    2.51895

Based on these two references:

https://robjhyndman.com/hyndsight/seasonal-periods/

time series with 10 min frequency in R

I concluded that my ts() frequency should be 48, because there are 48 observations per day.

ts.freq1 <- ts(data = pleiadesGH.v2[,2:ncol(pleiadesGH.v2)],
           start = c(2018),
           frequency = 48)

However, the resulting ts() has the wrong time index, as shown below. The time axis should run from 2018 to 2019, not out to around 2400.

Time Series:
Start = c(2018, 1) 
End = c(2383, 1) 
Frequency = 48 
          humInt.R    humInt.E  tempInt   tempMac  humExt.R    humExt.E     radExt  tempExt vientoVelo
2018.000        NA          NA       NA        NA        NA          NA         NA       NA         NA
2018.021        NA          NA       NA        NA  36.78287 0.004410894  -10.08125 16.95125  3.6855000
2018.042        NA          NA       NA        NA  38.56775 0.004427114   -9.75000 16.26350  2.3842000
2018.062        NA          NA       NA        NA  38.76425 0.004273306  -10.08125 15.63470  2.7191500
2018.083        NA          NA       NA        NA  39.61575 0.004280005  -10.41250 15.32030  3.7047500
2018.104        NA          NA       NA        NA  37.48700 0.003982139  -10.74375 15.06485  2.5189500
2018.125        NA          NA       NA        NA  35.84950 0.003735063  -10.41250 14.77010  3.2235000
2018.146        NA          NA       NA        NA  36.68462 0.003697674   -8.75625 14.25920  1.4409500
2018.167        NA          NA       NA        NA  41.48250 0.003954404  -11.07500 13.39460  1.5064000
2018.188        NA          NA       NA        NA  42.54688 0.003968433   -9.41875 13.06055  3.6701000
2018.208        NA          NA       NA        NA  43.05450 0.003969581   -9.08750 12.88370  1.6103500
2018.229        NA          NA       NA        NA  44.11888 0.004000366   -9.41875 12.62825  1.3485500
2018.250        NA          NA       NA        NA  46.26400 0.004061953   -9.08750 12.13700  1.9491500
2018.271        NA          NA       NA        NA  46.88625 0.004084874   -9.08750 12.01910  2.0569500
2018.292        NA          NA       NA        NA  49.57175 0.004187059   

[Image: wrong plot due to time index]

I also tried this frequency:

ts.freq1 <- ts(data = pleiadesGH.v2[,2:ncol(pleiadesGH.v2)],
           start = c(2018),
           frequency =  365.25*24*60/30 )

which gives the following result:

Time Series:
Start = c(2018, 1) 
End = c(2018, 17521) 
Frequency = 17532 
          humInt.R    humInt.E  tempInt   tempMac  humExt.R    humExt.E     radExt  tempExt vientoVelo
2018.000        NA          NA       NA        NA        NA          NA         NA       NA         NA
2018.000        NA          NA       NA        NA  36.78287 0.004410894  -10.08125 16.95125  3.6855000
2018.000        NA          NA       NA        NA  38.56775 0.004427114   -9.75000 16.26350  2.3842000
2018.000        NA          NA       NA        NA  38.76425 0.004273306  -10.08125 15.63470  2.7191500
2018.000        NA          NA       NA        NA  39.61575 0.004280005  -10.41250 15.32030  3.7047500
2018.000        NA          NA       NA        NA  37.48700 0.003982139  -10.74375 15.06485  2.5189500
2018.000        NA          NA       NA        NA  35.84950 0.003735063  -10.41250 14.77010  3.2235000
2018.000        NA          NA       NA        NA  36.68462 0.003697674   -8.75625 14.25920  1.4409500
2018.000        NA          NA       NA        NA  41.48250 0.003954404  -11.07500 13.39460  1.5064000
2018.001        NA          NA       NA        NA  42.54688 0.003968433   -9.41875 13.06055  3.6701000
2018.001        NA          NA       NA        NA  43.05450 0.003969581   -9.08750 12.88370  1.6103500
2018.001        NA          NA       NA        NA  44.11888 0.004000366   -9.41875 12.62825  1.3485500
2018.001        NA          NA       NA        NA  46.26400 0.004061953   -9.08750 12.13700  1.9491500

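But this implicitly means that my seasonality is yearly, which is not what I want. In the plot below you can see that the time index is now correct, although the seasonality is wrong.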

[Image: good index, incorrect seasonality]
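One way to see why the yearly frequency defeats the purpose is to try a seasonal decomposition under both settings. The following is only a sketch on synthetic data: the row count of 17521 is read off the `End = c(2018, 17521)` line above, and the sine-plus-noise signal is a made-up stand-in for the real measurements. It uses `stl()` from base R's stats package, which needs more than two full periods.

```r
set.seed(1)
n_obs <- 17521                     # assumed row count of the series
t_idx <- seq_len(n_obs) - 1
# Synthetic signal with a 48-step (daily) cycle plus noise; not the real data
y <- sin(2 * pi * t_idx / 48) + rnorm(n_obs, sd = 0.1)

y_year <- ts(y, start = 2018, frequency = 17532)  # one "season" = one year
y_day  <- ts(y, start = 1,    frequency = 48)     # one "season" = one day

# Less than one yearly period is available, so stl() stops with an error;
# tryCatch() keeps the error message instead of aborting the script
yearly_fit <- tryCatch(stl(y_year, s.window = "periodic"),
                       error = function(e) conditionMessage(e))

# With frequency = 48 there are 365 daily periods to estimate from
daily_fit <- stl(y_day, s.window = "periodic")

print(yearly_fit)
print(class(daily_fit))
```

With a frequency of 17532 the series does not even cover one full season, so any method that estimates a seasonal pattern has nothing to work with, whereas a frequency of 48 gives it 365 repetitions of the daily cycle.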

What am I doing wrong?

The solution is as follows:

freq.daily <- 48 # 24 hours *  2 obs per hour

ts.daily <- ts(data = pleiadesGH.v2.interp[,2:ncol(pleiadesGH.v2)],
           start = c(1),
           frequency = freq.daily)

Time Series:
Start = c(1, 1) 
End = c(366, 1) 
Frequency = 48 
            humInt.R    humInt.E  tempInt   tempMac  humExt.R    humExt.E      radExt
  1.000000  74.56250 0.007699896 14.53500 13.625000  36.78287 0.004410894  -10.081250
  1.020833  74.56250 0.007699896 14.53500 13.625000  36.78287 0.004410894  -10.081250
  1.041667  74.56250 0.007699896 14.53500 13.625000  38.56775 0.004427114   -9.750000
  1.062500  74.56250 0.007699896 14.53500 13.625000  38.76425 0.004273306  -10.081250
  1.083333  74.56250 0.007699896 14.53500 13.625000  39.61575 0.004280005  -10.412500
  1.104167  74.56250 0.007699896 14.53500 13.625000  37.48700 0.003982139  -10.743750
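The `End` values in the printouts follow from simple arithmetic on the time index. A minimal sketch, assuming the series has 17521 rows (the count implied by `End = c(2018, 17521)` in the second attempt) and using a dummy vector in place of the real data:

```r
n_obs <- 17521            # assumed row count, read off End = c(2018, 17521)
dummy <- seq_len(n_obs)   # placeholder values, not the real measurements

# ts() advances the time index by 1/frequency per observation,
# so n_obs rows span (n_obs - 1) / 48 whole units of the index.
span_units <- (n_obs - 1) / 48

end_2018 <- end(ts(dummy, start = 2018, frequency = 48))  # 2018 + span
end_day1 <- end(ts(dummy, start = 1,    frequency = 48))  # 1 + span

print(span_units)
print(end_2018)
print(end_day1)
```

With `frequency = 48` each unit of the index is one day, so a `start` of 2018 is read as "day 2018" rather than "year 2018", and 365 days later the index reaches 2383. Starting at 1 gives an index that runs from day 1 to day 366.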

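This works because ts() does not store calendar dates. It only keeps a numeric start and a frequency, so with frequency = 48 each whole unit of the time index is one day, and starting at 1 simply counts days from the beginning of the series.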
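If calendar timestamps are needed again (for plot labels, for example), they can be rebuilt alongside the day-indexed series. This is a sketch under stated assumptions: the 17521-row count, the `2018-01-01 00:00:00` origin taken from the first row of the data, and the UTC time zone are all assumptions, and `dummy`, `origin`, and `stamps` are illustrative names.

```r
n_obs  <- 17521                                   # assumed row count
origin <- as.POSIXct("2018-01-01 00:00:00", tz = "UTC")  # assumed time zone

# One timestamp per observation, 30 minutes apart
stamps <- seq(from = origin, by = "30 min", length.out = n_obs)

# Day-indexed series as in the solution, with placeholder values
dummy <- ts(seq_len(n_obs), start = 1, frequency = 48)

# The same timestamps derived from the ts index: (day - 1) * 86400 seconds;
# round() guards against floating-point error in the fractional days
from_index <- origin + round((as.numeric(time(dummy)) - 1) * 86400)

# Length of the covered period in days
span_days <- as.numeric(difftime(stamps[n_obs], origin, units = "days"))
print(span_days)
print(all(as.numeric(from_index) == as.numeric(stamps)))
```

Keeping the timestamps in a separate vector leaves the ts object with a clean daily period for modelling, while the dates remain available for labelling.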