Generate a list of TimeSeries from a data frame
I have a data frame that looks like this:
# A tibble: 6 x 4
# Groups:   IND_LOC [1]
  year_month    total_this_month mean_this_month IND_LOC
  <S3: yearmon>            <dbl>           <dbl> <fct>
1 Jan 2013              3960268.         360024. 8_9
2 Feb 2013              3051909.         277446. 8_9
3 Mar 2013              3504636.         318603. 8_9
4 Apr 2013              3234451.         294041. 8_9
5 May 2013              3409146.         284096. 8_9
6 Jun 2013              3619219.         301602. 8_9
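The last column, 'IND_LOC', has 89 unique values (1_1, 1_2 ... 8_9).
I want to generate a list of time series, one per 'IND_LOC' value, with the following structure (this is just an example from a different dataset; substitute "$1_1" for "$Germany", and so on):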
> str(time_series)
List of 9
$ Germany : Time-Series [1:52] from 1960 to 2011: 684721 716424 749838 ...
$ Singapore : Time-Series [1:52] from 1960 to 2011: 7208 7795 8349 ...
$ Finland : Time-Series [1:37] from 1975 to 2011: 85842 86137 86344 ...
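Any help is much appreciated!
We can group by IND_LOC and use summarise to build a list column holding one ts object per group: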
library(dplyr)
library(lubridate)
df %>%
  group_by(IND_LOC) %>%
  summarise(time_series = list(ts(total_this_month,
    start = c(year(year_month[1]), month(year_month[1])),
    frequency = 12)))
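The summarise call above returns a tibble with a list column rather than the named list shown in the question. A minimal sketch of one way to get from one to the other follows. The toy data frame is made up for illustration, `res` is a hypothetical name, and the start is taken from the numeric value of the first yearmon instead of going through lubridate. It assumes the rows of each group are already in chronological order with no missing months.

```r
library(dplyr)
library(zoo)

# Hypothetical toy data standing in for `df`: two groups, three months each
df <- data.frame(
  year_month = as.yearmon(2013 + rep(0:2, 2) / 12),
  total_this_month = c(3960268, 3051909, 3504636,
                       3960268, 3051909, 3504636),
  IND_LOC = rep(c("8_9", "9_9"), each = 3)
)

# One row per IND_LOC, with the ts object stored in a list column
res <- df %>%
  group_by(IND_LOC) %>%
  summarise(time_series = list(ts(total_this_month,
    start = as.numeric(year_month[1]),  # a yearmon is year + (month - 1) / 12
    frequency = 12)))

# Name the list elements after the groups to get $`8_9`, $`9_9`, ...
time_series <- setNames(res$time_series, as.character(res$IND_LOC))
str(time_series)
```

If the rows may be out of order, sorting by year_month within each group before calling ts() would be needed, since ts() only looks at the position of each value.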
Another option is to use split and lapply, with zoo as a helper for the conversion to ts.
dat <- read.csv(text="year_month,total_this_month,mean_this_month,IND_LOC
Jan 2013,3960268,360024,8_9
Feb 2013,3051909,277446,8_9
Mar 2013,3504636,318603,8_9
Apr 2013,3234451,294041,8_9
May 2013,3409146,284096,8_9
Jun 2013,3619219,301602,8_9
Jan 2013,3960268,360024,9_9
Feb 2013,3051909,277446,9_9
Mar 2013,3504636,318603,9_9
Apr 2013,3234451,294041,9_9
May 2013,3409146,284096,9_9
Jun 2013,3619219,301602,9_9")
library(zoo)  # must be loaded first: as.yearmon() comes from zoo
dat$year_month <- as.yearmon(dat$year_month)
time_series <- lapply(split(dat, dat$IND_LOC),
  function(x) as.ts(zoo(x$total_this_month, x$year_month)))
str(time_series)
# List of 2
# $ 8_9: Time-Series [1:6] from 1 to 6: 3234451 3051909 3960268 3619219 3504636
# $ 9_9: Time-Series [1:6] from 1 to 6: 3234451 3051909 3960268 3619219 3504636
sapply(time_series, frequency)
# 8_9 9_9
# 12 12
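Besides frequency, it can be worth checking that each series also carries its calendar start, since that depends on year_month really being a yearmon when zoo() is called. A small sketch of such a check, using made-up data built directly as yearmon (the values and the `starts` name are only for illustration):

```r
library(zoo)

# Hypothetical data: two groups covering Jan to Jun 2013
dat <- data.frame(
  year_month = as.yearmon(2013 + rep(0:5, 2) / 12),
  total_this_month = rep(c(3960268, 3051909, 3504636,
                           3234451, 3409146, 3619219), 2),
  IND_LOC = rep(c("8_9", "9_9"), each = 6)
)

# Same split/lapply approach as in the answer
time_series <- lapply(split(dat, dat$IND_LOC),
  function(x) as.ts(zoo(x$total_this_month, x$year_month)))

# One row per series: start year and start period within the year
starts <- t(sapply(time_series, start))
starts
```

A start of year 2013, period 1 for every row would mean the yearmon index made it through the conversion; a start of 1 would suggest the index was still plain text or a factor at that point.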