Combining year-month and day columns to form the index for time series data in R

I have rainfall data read from an Excel file. Its format is similar to:

Year-Month Day1 Day2 Day3
2020-01 0.5 0.6 0.8
2020-02 0 0 1.5
2020-03 5.2 1.0 10.5

I need it in the following format so that I can perform time series forecasting in R.

DATE RAINFALL
2020-01-01 0.5
2020-01-02 0.6
2020-01-03 0.8

The DATE column must be of class Date, and the RAINFALL column should contain the rainfall value for that particular day of the year.

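Reshape from 'wide' to 'long' with pivot_longer, unite the 'YEAR' and 'Day' columns to create the 'DATE' column, and then convert it to Date class with ymd from lubridate. Here the year-month column is named YEAR, as in the example data at the end.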

library(dplyr)
library(tidyr)
library(lubridate)
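df1 %>% 
    # wide to long: one row per YEAR/day pair; names_prefix strips 'Day' from the column names
    pivot_longer(cols = -YEAR, names_to = 'Day', 
       names_prefix = 'Day', values_to = 'RAINFALL') %>%
    # join year-month and day number into one string, e.g. '1901-01-1'
    unite(DATE, YEAR, Day, sep= '-') %>%
    # parse the string into Date class
    mutate(DATE = ymd(DATE))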

-output

# A tibble: 60 x 2
   DATE       RAINFALL
   <date>        <dbl>
 1 1901-01-01      2.7
 2 1901-01-02      0.4
 3 1901-01-03      4.7
 4 1901-01-04     10  
 5 1901-01-05     13  
 6 1901-01-06     16.9
 7 1901-01-07     19.2
 8 1901-01-08     18.3
 9 1901-01-09     15.7
10 1901-01-10     10.6
# … with 50 more rows
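
If the real spreadsheet has day columns up to Day31, shorter months will produce strings that are not real dates (for example '2021-02-29'). A minimal sketch of one way to handle that, assuming such cells should simply be dropped: ymd returns NA for strings it cannot parse, so those rows can be filtered out. The data frame df2 and its values are made up for illustration.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Hypothetical input: 2021-02 has no 29th day, so its Day29 cell is empty
df2 <- data.frame(YEAR = c("2021-01", "2021-02"),
                  Day28 = c(0.5, 1.2),
                  Day29 = c(0.0, NA))

long <- df2 %>%
    pivot_longer(cols = -YEAR, names_to = 'Day',
        names_prefix = 'Day', values_to = 'RAINFALL') %>%
    unite(DATE, YEAR, Day, sep = '-') %>%
    # quiet = TRUE suppresses the parse warning; impossible dates become NA
    mutate(DATE = ymd(DATE, quiet = TRUE)) %>%
    # keep only rows whose date actually exists
    filter(!is.na(DATE))

long
```

Whether dropping is appropriate depends on the data; a non-missing rainfall value attached to an impossible date would more likely point to an error in the source file.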

data

df1 <- structure(list(YEAR = c("1901-01", "1902-02", "1903-03", "1904-04", 
"1905-05"), Day1 = c(2.7, 4.1, 3.8, 3, 1.7), Day2 = c(0.4, 3.2, 
5.9, 4.6, 4), Day3 = c(4.7, 7.5, 7.6, 5.5, 7.4), Day4 = c(10, 
10.3, 7.1, 10.3, 9.3), Day5 = c(13, 10, 12.9, 13.6, 11.9), Day6 = c(16.9, 
15.1, 14.9, 16.3, 16.5), Day7 = c(19.2, 18.2, 17.6, 20.2, 20), 
    Day8 = c(18.3, 17.4, 17.3, 18.5, 17.6), Day9 = c(15.7, 15, 
    15.5, 13.9, 14.7), Day10 = c(10.6, 10.2, 12.1, 11.2, 8.4), 
    Day11 = c(4.9, 6.3, 6.9, 5.4, 5.5), Day12 = c(3.5, 3.5, 2.7, 
    4.8, 3.8)), class = "data.frame", row.names = c(NA, -5L))
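
For comparison, the same wide-to-long reshaping can be sketched in base R without the tidyverse packages. This is an alternative to the pivot_longer approach, not part of it. It uses the sample values from the question; the data frame name rain and the column name YearMonth are assumptions made for this example.

```r
# Sample data in the question's layout (names are illustrative)
rain <- data.frame(YearMonth = c("2020-01", "2020-02", "2020-03"),
                   Day1 = c(0.5, 0, 5.2),
                   Day2 = c(0.6, 0, 1.0),
                   Day3 = c(0.8, 1.5, 10.5))

day_cols <- grep("^Day", names(rain), value = TRUE)
day_num  <- as.integer(sub("^Day", "", day_cols))

out <- data.frame(
    # unlist() stacks the day columns one after another, so the year-month
    # vector repeats once per day column and each day number repeats once per row
    DATE = as.Date(sprintf("%s-%02d",
                           rep(rain$YearMonth, times = length(day_cols)),
                           rep(day_num, each = nrow(rain)))),
    RAINFALL = unlist(rain[day_cols], use.names = FALSE)
)

# Sort chronologically and reset the row names
out <- out[order(out$DATE), ]
rownames(out) <- NULL
out
```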