Convert yearly to monthly data in R
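I have yearly data in which the value changes partway through the year. I would like to turn it into monthly data that shows the change in value. Here is a snippet of my data.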
year value Start date End date
1985 35451 7/1/1985 3/20/1986
1986 45600 3/21/1986 12/23/1986
1987 46089 1/1/1987 10/31/1989
I would like all the months in columns and the years in rows (similar to the below, but without the break after Jun):
Jan Feb Mar Apr May Jun
1985 0 0 0 0 0 0
1986 35451 35451 38725 45600 45600 45600
Jul Aug Sep Oct Nov Dec
1985 35451 35451 35451 35451 35451 35451
1986 45600 45600 45600 45600 45600 45726
March and December of 1986 hold weighted averages, because the value changes during those months.
Thanks, much appreciated.
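Really, all you need here is seq.Date and xtabs (or your favorite variant), but it takes a fair amount of munging to make it work. This uses Hadleyverse packages, but you can rewrite it in base or data.table if you like: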
library(dplyr)
library(tidyr)
library(lubridate)

# df is assumed to hold the question's data with columns
# year, value, Start.date and End.date (dates as m/d/Y strings).

# Format dates as dates, then,
df %>% mutate(across(ends_with('date'), mdy)) %>%
    # evaluating each row separately,
    rowwise() %>%
    # create a list column with a month-wise sequence of dates for each.
    mutate(month = list(seq.Date(Start.date, End.date, by = 'month'))) %>%
    # Drop the row-wise grouping and expand the list column to long form,
    ungroup() %>%
    unnest(month) %>%
    # change year column to year of sequence, not label, and reduce month column to month.abb.
    mutate(year = year(month), month = month(month, label = TRUE)) %>%
    # For each year-month combination,
    group_by(year, month) %>%
    # take the mean of values, so each has only one row, then
    summarise(value = mean(value)) %>%
    # spread the result to wide form.
    spread(month, value, fill = 0)    # or xtabs(value ~ year + month, data = .)
# Source: local data frame [5 x 13]
# Groups: year [5]
#
# year Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
# (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl)
# 1 1985 0 0 0.0 0 0 0 35451 35451 35451 35451 35451 35451
# 2 1986 35451 35451 40525.5 45600 45600 45600 45600 45600 45600 45600 45600 45600
# 3 1987 46089 46089 46089.0 46089 46089 46089 46089 46089 46089 46089 46089 46089
# 4 1988 46089 46089 46089.0 46089 46089 46089 46089 46089 46089 46089 46089 46089
# 5 1989 46089 46089 46089.0 46089 46089 46089 46089 46089 46089 46089 0 0
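The pipeline above takes a plain mean of whichever values touch a month, which is why March 1986 comes out as 40525.5 rather than a day-weighted figure. If a weighted average is what you want, one possible base R sketch is to expand each interval to daily rows and average those within each month. The column names start and end and the hand-typed data frame are assumptions made for illustration, and days not covered by any interval (24 to 31 December 1986 here) are simply left out of the average.

```r
# The question's data, typed in by hand; column names are assumed.
df <- data.frame(
  value = c(35451, 45600, 46089),
  start = as.Date(c("1985-07-01", "1986-03-21", "1987-01-01")),
  end   = as.Date(c("1986-03-20", "1986-12-23", "1989-10-31"))
)

# One row per calendar day covered by each interval.
daily <- do.call(rbind, lapply(seq_len(nrow(df)), function(i) {
  data.frame(day   = seq(df$start[i], df$end[i], by = "day"),
             value = df$value[i])
}))

# Year and month of each day; month is a factor so columns stay in Jan..Dec order.
daily$year  <- as.integer(format(daily$day, "%Y"))
daily$month <- factor(month.abb[as.integer(format(daily$day, "%m"))],
                      levels = month.abb)

# Averaging daily values within a month weights each value by the
# number of days it applies in that month.
monthly <- tapply(daily$value, list(daily$year, daily$month), mean)

# Year-month cells with no covered days come back as NA; show them as 0.
monthly[is.na(monthly)] <- 0

round(monthly)
```

With this day count, March 1986 is (20 * 35451 + 11 * 45600) / 31, roughly 39052, which is near but not identical to the 38725 in the question, so the asker may have counted days slightly differently. December 1986 stays at 45600 because the uncovered days are ignored; the question's 45726 would require assigning those days the 1987 value.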