使用 R 中的三种不同技术对 AR(1) 建模的问题： 1. 公式 2.for-loop 3.arima.sim

Question

（添加了可重现的示例）
我尝试建模 Y_t=0.6Y_t-1+V_t (V_t ~ N(0,1)) AR(1) 使用三种不同的技术。

1.公式Y_t=ρ^t[y₁/ρ + ∑_{j=2 到 t} (α+V_j)/ρ^{j Y_t=α+ρY_t-1+V_t的}]得到通过反向替换并使用累积和。

2.经典for循环

3.arima.sim模拟.

虽然我希望得到或多或少相同的东西，但有趣和意想不到的事情发生了：

N <- 1388; ro <- 0.6; a <- 0
set.seed(1)
v <- ts(rnorm(N,0,1))

# Formula technique
y1 <- ts(rep(0,N))  # y[1]:=0 defined
y1[-1] <- ro^(2:N) * (y1[1]/ro + cumsum((a+v[-1]) / ro^(2:N)))

# The classical "for" loop
y2 <- ts(rep(0,N))  # y2[1]:=0 defined    
for (t in 2:N){  y2[t] <- a + ro*y2[t-1]+v[t] }

# arima.sim simulation 
set.seed(1) 
y3 <- arima.sim(model=list(ar=0.6), n=1388, mean=0, sd=sqrt(1)) 
# change n in arima.sim accordingly such that n=N simultaneously with the N definition above


c(mean(y1),sd(y1))
c(mean(y2),sd(y2))
c(mean(y3),sd(y3))

N=1388（和 n=1388）给出：

[1] -0.03713488  1.26102077
[1] -0.03713488  1.26102077
[1] -0.01048798  1.28445899

N=1389（和 n=1389）给出：

[1] Inf NaN
[1] -0.03661779  1.26071373
[1] -0.01048798  1.28445899

标准偏差值符合预期：
StdDev(Y_t)=sqrt[sd(v)²/(1-ρ²) ]

sqrt(sd(v)^2/(1-(0.6)^2)) # 1.284229

公式技巧出现奇点：

c(mean(y1),sd(y1)) 定义为 N<=1388
c(mean(y1),sd(y1)) 对于 N=1389 和 N=1390
是 (Inf, NaN) c(mean(y1),sd(y1)) 对于 N>=1391 是 (NaN, NA)。

问题：
1.为什么N>=1989公式技术失败？
（尽管我们有一个 ρ=0.6 的平稳序列）

2. 为什么我们得到 c(mean(y1),sd(y1))#(Inf, NaN) for N=1389 and N=1390 and c(mean(y1),sd(y1))#(NaN, NA) 对于 N>=1391?

Answer 1

对于 N <- 1389; ro <- 0.6; a <- 0，cumsum((a+v[-1]) / ro^(2:N)) 的最后一个元素大于您 PC 上的最大可表示数，即您的 .Machine$double.xmax。（我在这里假设它最大为 1.797693e+308。因此它表示为 Inf。这导致 y1 的最后一个元素变成 Inf 以及 [= 上的任何合法算术运算17=] 和一个数字产量 Inf。包含 Inf 的向量的 sd 是 NaN:

N <- 1389; ro <- 0.6; a <- 0
set.seed(1)
v <- ts(rnorm(N,0,1))

# Formula technique
y1 <- ts(rep(0,N))  # y[1]:=0 defined
y1[-1] <- ro^(2:N) * (y1[1]/ro + cumsum((a+v[-1]) / ro^(2:N)))

tail((a+v[-1]) / ro^(2:N))
#[1] -8.183997e+306  1.049540e+307 -1.581977e+306 -3.189985e+307 -7.577105e+307
#[6]            Inf
tail(cumsum((a+v[-1]) / ro^(2:N)))
#[1] -1.814422e+307 -7.648815e+306 -9.230793e+306 -4.113064e+307 -1.169017e+308
#[6]            Inf
tail(y1)
#[1] -1.6598499 -0.4198324 -0.3039989 -0.8127365 -1.3859780        Inf

N <- 1390; ro <- 0.6; a <- 0。 (a+v[-1]) / ro^(2:N) 的最后一个值是可表示的，因此没有任何变化。

N <- 1390; ro <- 0.6; a <- 0
set.seed(1)
v <- ts(rnorm(N,0,1))

# Formula technique
y1 <- ts(rep(0,N))  # y[1]:=0 defined
y1[-1] <- ro^(2:N) * (y1[1]/ro + cumsum((a+v[-1]) / ro^(2:N)))

tail((a+v[-1]) / ro^(2:N))
#[1]  1.049540e+307 -1.581977e+306 -3.189985e+307 -7.577105e+307            Inf
#[6] -5.546904e+307
tail(cumsum((a+v[-1]) / ro^(2:N)))
#[1] -7.648815e+306 -9.230793e+306 -4.113064e+307 -1.169017e+308            Inf
#[6]            Inf
tail(y1)
#[1] -0.4198324 -0.3039989 -0.8127365 -1.3859780        Inf        Inf

N <- 1391; ro <- 0.6; a <- 0 (a+v[-1]) / ro^(2:N) 的最后一个值变为 -Inf（小于 .Machine$double.xmin）并且 cumsum 必须添加 Inf 和 -Inf，它给出了一个 NaN，这意味着不是一个数字。 NaN 的 mean 也不是数字， sd(NaN) 是 NA.

N <- 1391; ro <- 0.6; a <- 0
set.seed(1)
v <- ts(rnorm(N,0,1))

# Formula technique
y1 <- ts(rep(0,N))  # y[1]:=0 defined
y1[-1] <- ro^(2:N) * (y1[1]/ro + cumsum((a+v[-1]) / ro^(2:N)))


tail((a+v[-1]) / ro^(2:N))
#[1] -1.581977e+306 -3.189985e+307 -7.577105e+307            Inf -5.546904e+307
#[6]           -Inf
tail(cumsum((a+v[-1]) / ro^(2:N)))
#[1] -9.230793e+306 -4.113064e+307 -1.169017e+308            Inf            Inf
#[6]            NaN
tail(y1)
#[1] -0.3039989 -0.8127365 -1.3859780        Inf        Inf        NaN

使用 R 中的三种不同技术对 AR(1) 建模的问题： 1. 公式 2.for-loop 3.arima.sim

Problems with modeling AR(1) with three different techniques in R: 1. the formula 2.for-loop 3.arima.sim

modeling

for-loop

r

formula

autoregressive-models