Interpolating yearly to quarterly data linearly - error

Question

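I have yearly time series data for many countries, and I would like to interpolate it linearly to quarterly data using R or Python. What has been discussed on Stack Overflow so far does not answer my question.

I have been following the procedure that Jason Brownlee documents in detail here: https://machinelearningmastery.com/resample-interpolate-time-series-data-python/

My data looks like this: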

YEAR CH  FR   US
2005 200 700  500
2006 300 740  530
2007 450 760  600

As for the code, I adapted the blog's example to my needs:

def parser(x):
    return datetime.strptime('2005' + x, '%Y')

data = read_csv('data.csv', sep=';', header=0, parse_dates=[0], index_col=0, squeeze=True, date_parser = parser)

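I get a long error message that ends with:

ValueError: unconverted data remains: +x

1) If I do not add +x to the parser definition, every observation gets the same year. What is wrong with the parser?

2) Any ideas on how to handle several time series (i.e. CH, FR, US) at the same time? I would rather not split up my dataset just for this preparation step.

3) If anyone has a suggestion for how to do this in R, I would be very happy. All the procedures I found there seem long and do not give me what I actually need.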

Answer 1

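The following base R solution uses approxfun to create an interpolation function and then calls it on the years and their quarters. The interpolation method defaults to method = "linear".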

year_qtr <- function(x, years){
  f <- approxfun(years, x)
  n <- length(years)
  qtrs <- unlist(lapply(years[-n], function(y) y + (0:3)/4))
  qtrs <- c(qtrs, years[n])
  list(x = qtrs, y = f(qtrs))
}

year_qtr(df1$CH, df1$YEAR)
#$x
#[1] 2005.00 2005.25 2005.50 2005.75 2006.00 2006.25 2006.50
#[8] 2006.75 2007.00
#
#$y
#[1] 200.0 225.0 250.0 275.0 300.0 337.5 375.0 412.5 450.0
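
For readers working in Python, the same idea can be sketched with numpy.interp, which also interpolates linearly between known points. This is only a rough equivalent of the R function above, not part of the original answer; the function name year_qtr is reused for illustration and the CH values are typed in by hand.

```python
import numpy as np

def year_qtr(values, years):
    """Linearly interpolate yearly values onto a quarterly grid."""
    years = np.asarray(years, dtype=float)
    # For every year except the last, add the offsets 0, 0.25, 0.5, 0.75,
    # then append the last year itself (mirrors the R version).
    qtrs = np.concatenate(
        [y + np.arange(4) / 4 for y in years[:-1]] + [years[-1:]]
    )
    return qtrs, np.interp(qtrs, years, values)

# CH column from the question's sample data
qtrs, ch = year_qtr([200, 300, 450], [2005, 2006, 2007])
print(qtrs)
print(ch)
```

The interpolated values should match the $y output shown above.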

Data

df1 <- read.table(text = "
YEAR CH  FR   US
2005 200 700  500
2006 300 740  530
2007 450 760  600
", header = TRUE)

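On the Python side of the question, a minimal pandas sketch is given below; it is an illustration under stated assumptions, not the blog's method. It assumes a semicolon-separated file with the columns YEAR, CH, FR and US (inlined here through StringIO so the example runs on its own). Each year string is parsed by itself with format="%Y". Prepending a literal such as '2005' inside the parser leaves extra characters after the four-digit year, which is the kind of input that makes strptime report unconverted data. Resampling the whole DataFrame handles all countries in one step.

```python
from io import StringIO

import pandas as pd

# Stand-in for data.csv (assumed layout: semicolon-separated, YEAR first)
raw = StringIO(
    "YEAR;CH;FR;US\n"
    "2005;200;700;500\n"
    "2006;300;740;530\n"
    "2007;450;760;600\n"
)

# Read YEAR as text, then parse each value on its own as a four-digit year
df = pd.read_csv(raw, sep=";", dtype={"YEAR": str})
df["YEAR"] = pd.to_datetime(df["YEAR"], format="%Y")
df = df.set_index("YEAR")

# Upsample to quarter starts (new rows are NaN), then fill them linearly.
# All columns (CH, FR, US) are interpolated together.
quarterly = df.resample("QS").asfreq().interpolate(method="linear")
print(quarterly)
```

Because the quarters are evenly spaced, method="linear" gives the same numbers as interpolating on the fractional years in the R answer.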