Predicted vs. Actual plot

Question

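I'm new to R and statistics and haven't been able to figure out how to plot predicted vs. actual values after running a multiple linear regression. I have come across similar questions, but couldn't understand the code. I would greatly appreciate it if you could explain the code. This is what I have done so far: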

# Attach file containing variables and responses
q <- read.csv("C:/Users/A/Documents/Design.csv")
attach(q)
# Run a linear regression
model <- lm(qo~P+P1+P4+I)
# Summary of linear regression results
summary(model)

A predicted vs. actual plot would let me see graphically how well my regression fits my actual data.

Answer 1

It would be better if you provided a reproducible example, but here's one I made up:

set.seed(101)
dd <- data.frame(x=rnorm(100),y=rnorm(100),
                 z=rnorm(100))
dd$w <- with(dd,
     rnorm(100,mean=x+2*y+z,sd=0.5))

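Use the data argument (much better); you should almost never use attach():

# Fit the model, passing the data frame via the data argument
m <- lm(w~x+y+z, data=dd)
# Predicted values on the x-axis, observed response on the y-axis
plot(predict(m),dd$w,
     xlab="predicted",ylab="actual")
# 1:1 reference line: points on it are predicted exactly
abline(a=0,b=1)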
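
Since the question asked for an explanation of the code, here is a minimal sketch of what the plotted quantities are. It rebuilds the same made-up data and model, then checks how predict(m) relates to the fitted values, the residuals and R-squared. The names pred, res and r2 are just illustrative.

```r
# Rebuild the example data and model from this answer
set.seed(101)
dd <- data.frame(x = rnorm(100), y = rnorm(100), z = rnorm(100))
dd$w <- with(dd, rnorm(100, mean = x + 2*y + z, sd = 0.5))
m <- lm(w ~ x + y + z, data = dd)

# With no newdata argument, predict() returns the fitted values
# for the rows that were used to fit the model
pred <- predict(m)
all.equal(pred, fitted(m))

# Actual minus predicted is the residual, i.e. the vertical
# distance of each point from the 1:1 line in the plot
res <- dd$w - pred
all.equal(res, residuals(m))

# For a model with an intercept, the squared correlation between
# predicted and actual values equals the R-squared in summary(m)
r2 <- cor(pred, dd$w)^2
all.equal(r2, summary(m)$r.squared)
```

So the tighter the points cluster around the 1:1 line, the smaller the residuals and the higher the R-squared.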

Answer 2

Besides the predicted vs. actual plot, you can get a set of additional plots that help you visually assess the goodness of fit.

--- execute previous code by Ben Bolker ---

par(mfrow = c(2, 2))
plot(m)
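
If you only want some of these diagnostic plots, plot() on an lm object takes a which argument. A small sketch, regenerating the same made-up data so it runs on its own:

```r
# Rebuild the example data and model from Ben Bolker's answer
set.seed(101)
dd <- data.frame(x = rnorm(100), y = rnorm(100), z = rnorm(100))
dd$w <- with(dd, rnorm(100, mean = x + 2*y + z, sd = 0.5))
m <- lm(w ~ x + y + z, data = dd)

# plot(m) shows which = c(1, 2, 3, 5) by default:
#   1 = residuals vs fitted, 2 = normal Q-Q,
#   3 = scale-location,      5 = residuals vs leverage
op <- par(mfrow = c(1, 2))   # two panels side by side
plot(m, which = 1)           # look for curvature or funnel shapes
plot(m, which = 2)           # look for departures from the straight line
par(op)                      # restore the previous layout
```

The residuals vs fitted panel is essentially the predicted vs. actual plot with the 1:1 line rotated to horizontal, which can make patterns easier to spot.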

Answer 3

A tidy way of doing this is to use broom::augment():

library(tidyverse)
library(cowplot)
library(broom)

set.seed(101)
# Using Ben's data above:
dd <- data.frame(x=rnorm(100),y=rnorm(100),
                 z=rnorm(100))
dd$w <- with(dd,rnorm(100,mean=x+2*y+z,sd=0.5))

m <- lm(w~x+y+z,dd)

m %>% augment() %>% 
  ggplot()  + 
  geom_point(aes(.fitted, w)) + 
  geom_smooth(aes(.fitted, w), method = "lm", se = FALSE, color = "lightgrey") + 
  labs(x = "Fitted", y = "Actual") + 
  theme_bw()

This is particularly useful for deeply nested lists of regressions.

To illustrate, consider a nested list of regressions:

Reglist <- list()

Reglist$Reg1 <- dd %>% do(reg = lm(as.formula("w~x*y*z"), data = .)) %>% mutate( Name = "Type 1")
Reglist$Reg2 <- dd %>% do(reg = lm(as.formula("w~x+y*z"), data = .)) %>% mutate( Name = "Type 2")
Reglist$Reg3 <- dd %>% do(reg = lm(as.formula("w~x"), data = .)) %>% mutate( Name = "Type 3")
Reglist$Reg4 <- dd %>% do(reg = lm(as.formula("w~x+z"), data = .)) %>% mutate( Name = "Type 4")

This is where the power of the tidy plotting framework above comes into play:

Graph_Creator <- function(Reglist){

  Reglist %>% pull(reg) %>% .[[1]] %>% augment() %>% 
    ggplot()  + 
    geom_point(aes(.fitted, w)) + 
    geom_smooth(aes(.fitted, w), method = "lm", se = FALSE, color = "lightgrey") + 
    labs(x = "Fitted", y = "Actual", 
         title =  paste0("Regression Type: ", Reglist$Name) ) + 
    theme_bw()
}

Reglist %>% map(~Graph_Creator(.)) %>% 
  cowplot::plot_grid(plotlist = ., ncol = 1)
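
For comparison, here is a rough base-R sketch of the same idea, one fitted vs. actual panel per model, without the tidyverse dependencies. It assumes the models are kept in a plain named list; forms and models are illustrative names, and the formulas mirror the four regressions above.

```r
# Same made-up data as above
set.seed(101)
dd <- data.frame(x = rnorm(100), y = rnorm(100), z = rnorm(100))
dd$w <- with(dd, rnorm(100, mean = x + 2*y + z, sd = 0.5))

# Named list of formulas (illustrative), one per regression type
forms <- list("Type 1" = w ~ x*y*z,
              "Type 2" = w ~ x + y*z,
              "Type 3" = w ~ x,
              "Type 4" = w ~ x + z)

# Fit every formula on the same data frame
models <- lapply(forms, lm, data = dd)

op <- par(mfrow = c(2, 2))   # 2 x 2 grid of panels
for (nm in names(models)) {
  fit <- models[[nm]]
  # model.response() extracts the observed response from the model frame
  actual <- model.response(model.frame(fit))
  plot(fitted(fit), actual,
       xlab = "Fitted", ylab = "Actual",
       main = paste0("Regression Type: ", nm))
  abline(a = 0, b = 1, col = "lightgrey")   # 1:1 reference line
}
par(op)                      # restore the previous layout
```

Here the 1:1 line is drawn directly instead of using a smoother. For an in-sample least-squares fit with an intercept, regressing the actual values on the fitted values gives that same line.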

Answer 4

Same as @Ben Bolker's solution, but producing a ggplot object instead of using base R:

#first generate the dd data set using the code in Ben's solution, then... 

require(ggpubr)
m <- lm(w~x+y+z,dd)

ggscatter(x = "prediction",
          y = "actual",
          data = data.frame(prediction = predict(m),
                            actual = dd$w)) +
  geom_abline(intercept = 0,
              slope = 1)


Tags: plot, r, linear-regression