Combining year-month and day columns to form the index for time series data in R
I have rainfall data read from an Excel file. Its format looks like this:
Year-Month | Day1 | Day2 | Day3 |
---|---|---|---|
2020-01 | 0.5 | 0.6 | 0.8 |
2020-02 | 0 | 0 | 1.5 |
2020-03 | 5.2 | 1.0 | 10.5 |
I need it in the following format so that I can do time series forecasting in R.
DATE | RAINFALL |
---|---|
2020-01-01 | 0.5 |
2020-01-02 | 0.6 |
2020-01-03 | 0.8 |
The DATE column must be of class Date, and the RAINFALL column should contain the rainfall value for that specific day of the year.
We could use pivot_longer to reshape from 'wide' to 'long', unite the 'YEAR' and 'Day' columns to create the 'DATE' column, and then convert it to Date class with ymd from lubridate.
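library(dplyr)
library(tidyr)
library(lubridate)
df1 %>%
  # one row per (YEAR, Day); names_prefix strips 'Day' so only the number remains
  pivot_longer(cols = -YEAR, names_to = 'Day',
               names_prefix = 'Day', values_to = 'RAINFALL') %>%
  # paste year-month and day number together, e.g. "1901-01-1"
  unite(DATE, YEAR, Day, sep = '-') %>%
  # parse the string into a Date
  mutate(DATE = ymd(DATE))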
-output
# A tibble: 60 x 2
   DATE       RAINFALL
   <date>        <dbl>
 1 1901-01-01      2.7
 2 1901-01-02      0.4
 3 1901-01-03      4.7
 4 1901-01-04     10
 5 1901-01-05     13
 6 1901-01-06     16.9
 7 1901-01-07     19.2
 8 1901-01-08     18.3
 9 1901-01-09     15.7
10 1901-01-10     10.6
# … with 50 more rows
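One thing to watch for if the real sheet has a column for every possible day of the month: shorter months then yield combinations that are not real dates (for example 30 February), and ymd returns NA for those. The following is only a sketch of one way to handle that, using the same pipeline as above. The data frame df2 and its values are made up for illustration, and it assumes the cells for nonexistent days can simply be dropped.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Hypothetical input (not from the question): February 2021 has no day 29 or 30
df2 <- data.frame(YEAR  = c("2021-01", "2021-02"),
                  Day28 = c(1.2, 0.3),
                  Day29 = c(0.0, NA),
                  Day30 = c(2.5, NA))

out <- df2 %>%
  pivot_longer(cols = -YEAR, names_to = 'Day',
               names_prefix = 'Day', values_to = 'RAINFALL') %>%
  unite(DATE, YEAR, Day, sep = '-') %>%
  # quiet = TRUE suppresses the "failed to parse" warning; impossible dates become NA
  mutate(DATE = ymd(DATE, quiet = TRUE)) %>%
  # drop the rows whose date does not exist, then sort chronologically
  filter(!is.na(DATE)) %>%
  arrange(DATE)

out
```

Here the two February rows for days 29 and 30 are removed, leaving four rows in date order.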
data
df1 <- structure(list(YEAR = c("1901-01", "1902-02", "1903-03", "1904-04",
"1905-05"), Day1 = c(2.7, 4.1, 3.8, 3, 1.7), Day2 = c(0.4, 3.2,
5.9, 4.6, 4), Day3 = c(4.7, 7.5, 7.6, 5.5, 7.4), Day4 = c(10,
10.3, 7.1, 10.3, 9.3), Day5 = c(13, 10, 12.9, 13.6, 11.9), Day6 = c(16.9,
15.1, 14.9, 16.3, 16.5), Day7 = c(19.2, 18.2, 17.6, 20.2, 20),
Day8 = c(18.3, 17.4, 17.3, 18.5, 17.6), Day9 = c(15.7, 15,
15.5, 13.9, 14.7), Day10 = c(10.6, 10.2, 12.1, 11.2, 8.4),
Day11 = c(4.9, 6.3, 6.9, 5.4, 5.5), Day12 = c(3.5, 3.5, 2.7,
4.8, 3.8)), class = "data.frame", row.names = c(NA, -5L))
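For readers who would rather avoid the tidyverse dependency, the same reshaping can be sketched in base R. This is an alternative, not the method used in the answer above. It rebuilds the three sample rows from the question, and the object names rain, day_cols, day_num and long are chosen here purely for illustration.

```r
# Sample rows from the question; check.names = FALSE keeps the "Year-Month" header as is
rain <- data.frame(`Year-Month` = c("2020-01", "2020-02", "2020-03"),
                   Day1 = c(0.5, 0, 5.2),
                   Day2 = c(0.6, 0, 1.0),
                   Day3 = c(0.8, 1.5, 10.5),
                   check.names = FALSE)

# Day columns and the day numbers extracted from their names
day_cols <- setdiff(names(rain), "Year-Month")
day_num  <- as.integer(sub("^Day", "", day_cols))

long <- data.frame(
  # repeat each year-month once per day column and append the zero-padded day
  DATE = as.Date(sprintf("%s-%02d",
                         rep(rain[["Year-Month"]], each = length(day_cols)),
                         rep(day_num, times = nrow(rain))),
                 format = "%Y-%m-%d"),
  # transpose so that values are read row by row (all days of a month, then the next month)
  RAINFALL = as.vector(t(as.matrix(rain[day_cols])))
)

long
```

Passing an explicit format to as.Date means impossible dates come back as NA rather than raising an error, so they can be filtered out afterwards if needed.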