Correctly interpreting the statsmodels.tsa.ar_model.ar_select_order result array to determine the optimal lag

Question

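Using statsmodels 0.12.0, I am trying to determine the optimal lag for a statsmodels.tsa.ar_model.AutoReg model. I am working with US population data at a monthly time step and passing a maximum lag of 12 to statsmodels.tsa.ar_model.ar_select_order for evaluation.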

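import pandas as pd
from statsmodels.tsa.ar_model import AutoReg, ar_select_order

# Raw string so that "\u" in the Windows path is not read as a Unicode escape
df = pd.read_csv(r'Data\uspopulation.csv', index_col='DATE', parse_dates=True)
df.index.freq = 'MS'  # monthly data, month-start frequency
train_data = df.iloc[:84]
test_data = df.iloc[84:]
modelp = ar_select_order(train_data['PopEst'], maxlag=12)
print(modelp.ar_lags)  # lags included in the selected model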

The code above returns a numpy array of [ 1 2 3 4 5 6 7 8 9 10 11 12], which I am interpreting as "The optimal lag p is 12" as per this Stack Overflow question. However, evaluating on some metrics (RMSE), I find that my AutoReg models fitted with maxlag=12 perform worse than lower-order models. By trial and error I found that the optimal lag is 8. So I am having difficulty interpreting the resulting numpy array. I have been reading the resources on statsmodels.com/ar_select_order and statsmodels.com/autoregressions, but they have not made it clearer.

Does anyone here have any ideas? I am new to this Python library and feeling a bit lost.

Answer 1

The code above returns a numpy array of [ 1 2 3 4 5 6 7 8 9 10 11 12], which I am interpreting as "The optimal lag p is 12" as per this Stack Overflow question.

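Yes, that is correct. It returns an array rather than just 12 because it can also search over models that do not include every lag, if you set glob=True. For example, [ 1 2 3 12] might be a common result for a monthly model with some annual seasonal pattern.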

However, evaluating on some metrics (RMSE) I find that my AutoReg fitted models with maxlag=12 are performing worse than lower order models. By trial and error I found that the optimal lag is 8. So I am having difficulty interpreting the resulting numpy array, I have been reading the resources on statsmodels.com/ar_select_order and statsmodels.com/autoregressions but they have not made it clearer.

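This function returns the model judged best by an information criterion. In particular, the default is BIC, the Bayesian information criterion. If you use a different criterion, such as minimizing out-of-sample RMSE, it is entirely possible that a different model will be judged optimal.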

python

time-series

statsmodels

autoregressive-models