Plot rolling coefficients and color based on P-Value

Question

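This is a bit tricky! I am running a rolling-window regression and collecting the coefficient from each window. My goal is to plot how the coefficient fluctuates over time. In addition, I would like each point to get a different color depending on whether the coefficient is statistically significant (say, at the 95% level).

What I currently have is: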

library(plm)
coeff <- NULL
for (e in 1:39) {   # 44 years total for each country
  # build the windowed panel frame for this iteration
  paneldata <- pdata.frame(
    rbind(
      subset(LaggedPannel, Country == "A")[e:(e + 5), ],
      subset(LaggedPannel, Country == "B")[e:(e + 5), ]),
    index = c("Country", "Year"))

  # get the coefficient from a pooled panel regression
  coef <- coef(summary(plm(Y ~ lag(Y, 1), data = paneldata, model = "pooling")))[2, 1]
  coeff <- c(coeff, coef)  # store coefficients
}
plot(coeff, type = "b", col = "red")

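This produces the following plot:

For example, suppose the second and fourth coefficients (the points in the plot) are not statistically significant; then they should be colored green.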

Data (LaggedPannel):

                 Age1     Age2     Age3
Australia-1973  261.156  255.699  249.954
Australia-1974  261.305  255.394  251.470
Australia-1975  258.160  253.543  250.538
Australia-1976  262.504  258.066  254.720
Australia-1977  240.086  260.846  258.418
Australia-1978  228.774  238.871  259.449
USA-1973       4100.257 4104.028 4107.409
USA-1974       4135.435 4118.422 4120.286
USA-1975       4171.648 4164.065 4134.525
USA-1976       4208.236 4187.196 4171.167
USA-1977       4240.832 4211.655 4189.650
USA-1978       4286.923 4255.092 4229.701

Answer 1

Here is some simulated data.

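library(tidyverse)
library(broom)

# simulate one data set with a weak linear relationship between x and y
simfun <- function(a = 0.1, B = 0.05, n = 200, x.sd = 1, e.sd = 1) {
  x <- rnorm(n, mean = 0, sd = x.sd) + runif(n)
  e <- rnorm(n, mean = 0, sd = e.sd)
  y <- a + B * x + e
  data.frame(x, y)
}

# fit a linear model and return its coefficient table as a tibble
statfun <- function(d) {
  tidy(lm(y ~ x, data = d))
}

# 50 simulated fits; keep only the slope row of each
simdata <- map(seq(50), ~ statfun(simfun())) %>%
  enframe() %>%
  unnest(value) %>%
  filter(term == "x")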

Next, determine which coefficients are "significant".

simdata <- simdata %>%
  mutate(index = row_number(),
         Significance = factor(p.value < 0.05))

If you want to use base plot, you can do:

Significance = simdata$Significance

plot(simdata$estimate, col = ifelse(Significance==TRUE, "blue", "red"), ylab = "coeff")
lines(simdata$estimate)
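
A minimal sketch of the same base-graphics idea using the colors asked for in the question (green for non-significant points), with a legend added. The `est` and `pval` vectors are made-up stand-ins for the stored coefficients and p-values, not values from the question's data.

```r
# Hypothetical stand-ins for the per-window estimates and p-values
est  <- c(0.42, 0.10, 0.35, 0.05, 0.50)
pval <- c(0.01, 0.30, 0.04, 0.60, 0.001)

# One color per point: red if significant at the 5% level, green otherwise
point_cols <- ifelse(pval < 0.05, "red", "green")

# Draw the connecting line first, then overlay the colored points
plot(est, type = "l", col = "grey40", xlab = "Window", ylab = "coeff")
points(est, col = point_cols, pch = 19)
legend("topright", legend = c("p < 0.05", "p >= 0.05"),
       col = c("red", "green"), pch = 19)
```

Drawing the line and the points in separate calls keeps the line a single neutral color while each point carries its own.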

Or, using ggplot2, you can:

ggplot(simdata, aes(name, estimate)) + geom_line() + geom_point(aes(color = Significance), shape = 1) +
  labs(x = "Index", y = "coeff") + theme_bw()

Answer 2

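Using an extra vector to store the p-values, and then coloring the points according to how they compare with the 0.05 significance level, also solves the problem. Specifically: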

library(plm)
coeff <- NULL
P_values <- NULL
for (e in 1:39) {   # 44 years total for each country
  # build the windowed panel frame for this iteration
  paneldata <- pdata.frame(
    rbind(
      subset(LaggedPannel, Country == "A")[e:(e + 5), ],
      subset(LaggedPannel, Country == "B")[e:(e + 5), ]),
    index = c("Country", "Year"))

  # fit the pooled panel regression once and reuse its coefficient table
  tab <- coef(summary(plm(Y ~ lag(Y, 1), data = paneldata, model = "pooling")))
  coeff <- c(coeff, tab[2, 1])        # coefficient estimate
  P_values <- c(P_values, tab[2, 4])  # its p-value
}
plot(coeff, type = "b", col = "red")  # previous plot

# new plot, colored by significance
plot(coeff, col = ifelse(P_values <= 0.05, "blue", "red"), ylab = "coef", type = "b")

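The only problem with this answer is that it becomes tedious if you want to track several variables, since you then need to create several empty vectors, and so on. It is not a quick approach, but it definitely works.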
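
One possible way to reduce that bookkeeping is to return a small data frame per window and bind them together at the end, so each term's estimate and p-value is kept without a separate vector per variable. The sketch below uses simulated data and `lm()` rather than the question's `LaggedPannel` and `plm()`; the names `dat`, `x1`, `x2`, `fit_window`, and the window width are assumptions made for illustration.

```r
set.seed(1)
# Simulated stand-in data (not the question's panel)
n   <- 44
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 0.5 * dat$x1 + rnorm(n)

width  <- 10                         # assumed window width
starts <- seq_len(n - width + 1)     # first row of each window

# Fit one window and return its full coefficient table as a data frame
fit_window <- function(s) {
  tab <- coef(summary(lm(y ~ x1 + x2, data = dat[s:(s + width - 1), ])))
  data.frame(window = s, term = rownames(tab),
             estimate = tab[, "Estimate"], p.value = tab[, "Pr(>|t|)"],
             row.names = NULL)
}

# One row per window and term
results <- do.call(rbind, lapply(starts, fit_window))

# Plot one term, colored by significance at the 5% level
x1_rows <- results[results$term == "x1", ]
plot(x1_rows$window, x1_rows$estimate, type = "l",
     xlab = "Window", ylab = "coeff")
points(x1_rows$window, x1_rows$estimate, pch = 19,
       col = ifelse(x1_rows$p.value < 0.05, "blue", "red"))
```

With the results in long form, plotting another term only requires changing the filter on `term`.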


Tags: plot, r, rolling-computation