Model building and stuck on error message in R
I am trying to analyze the data set below
X wt solution process
1 21 1 1
2 36 2 1
3 25 3 1
4 18 4 1
5 22 5 1
6 26 1 2
7 38 2 2
8 27 3 2
9 17 4 2
10 26 5 2
11 16 1 3
12 25 2 3
13 22 3 3
14 18 4 3
15 21 5 3
16 28 1 4
17 35 2 4
18 27 3 4
19 20 4 4
20 24 5 4
The data are balanced, and I believe both effects are fixed. My code in R is as follows
str(wool)
##convert solution and process to factors
wool$solution<-as.factor(wool$solution)
wool$process<-as.factor(wool$process)
m1<-aov(wt~solution*process,data=wool)
plot(m1)
However, when I try to plot the model to check the assumptions, I get the following error:
not plotting observations with leverage one:
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
Error in qqnorm.default(rs, main = main, ylab = ylab23, ylim = ylim, ...) :
y is empty or has only NAs
I am not sure how to correct this. When I fit the model without the interaction,
m1<-aov(wt~solution+process,data=wool)
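everything works fine, but I need to analyze the interaction to see whether it is significant. Also, when I keep the factors as numeric it works, but these are definitely categorical factors: each one refers to a process and a solution used to treat the observation.
Any help is appreciated.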
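The model with the interaction effect consumes 19 degrees of freedom. The degrees of freedom for the error term in an analysis of variance are N - k, where N = the total number of observations and k = the number of group effects (here, the solution-by-process cells).
Since N is 20 and k, the number of groups including the interaction, is also 20, the model terms use 19 degrees of freedom (4 for solution, 3 for process, 12 for the interaction). The error term therefore has 0 degrees of freedom and every observation has a leverage of 1, meaning that the 19 parameters in the model plus the grand mean are completely determined by the 20 observations in the data frame. This explains the error message returned by the plot() function.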
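The degrees-of-freedom count above can be checked directly. The following is a minimal sketch, assuming the data layout shown in the question; the names wool_df, fit_full, cell_counts, n_obs, n_params, resid_df and lev are illustrative and not part of the original post.

```r
# Rebuild the 20 observations compactly (assumed to match the table in the
# question: solution cycles 1..5 within each process 1..4).
wool_df <- data.frame(
  wt = c(21, 36, 25, 18, 22, 26, 38, 27, 17, 26,
         16, 25, 22, 18, 21, 28, 35, 27, 20, 24),
  solution = factor(rep(1:5, times = 4)),
  process  = factor(rep(1:4, each = 5))
)

# Number of observations in each solution-by-process cell
cell_counts <- table(wool_df$solution, wool_df$process)
print(cell_counts)

# Full model with the interaction
fit_full <- aov(wt ~ solution * process, data = wool_df)

n_obs    <- nrow(wool_df)           # N
n_params <- length(coef(fit_full))  # estimated coefficients, including the intercept
resid_df <- df.residual(fit_full)   # N minus the number of estimated coefficients
lev      <- hatvalues(fit_full)     # leverage of each observation

print(c(N = n_obs, parameters = n_params, residual_df = resid_df))
print(range(lev))
```

With one observation per cell there is no replication, so nothing is left over to estimate the error variance once the interaction is in the model.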
rawData <- "X wt solution process
1 21 1 1
2 36 2 1
3 25 3 1
4 18 4 1
5 22 5 1
6 26 1 2
7 38 2 2
8 27 3 2
9 17 4 2
10 26 5 2
11 16 1 3
12 25 2 3
13 22 3 3
14 18 4 3
15 21 5 3
16 28 1 4
17 35 2 4
18 27 3 4
19 20 4 4
20 24 5 4"
wool <- read.table(text=rawData,header=TRUE)
str(wool)
##convert solution and process to factors
wool$solution<-as.factor(wool$solution)
wool$process<-as.factor(wool$process)
m1<-aov(wt~solution*process,data=wool)
summary(m1)
...and the output:
> m1<-aov(wt~solution*process,data=wool)
> summary(m1)
Df Sum Sq Mean Sq
solution 4 500.8 125.20
process 3 136.8 45.60
solution:process 12 87.2 7.27
The problem becomes clear when we print the analysis of variance table with anova(m1).
> anova(m1)
Analysis of Variance Table
Response: wt
Df Sum Sq Mean Sq F value Pr(>F)
solution 4 500.8 125.200
process 3 136.8 45.600
solution:process 12 87.2 7.267
Residuals 0 0.0
Warning message:
In anova.lm(m1) :
ANOVA F-tests on an essentially perfect fit are unreliable
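For contrast, here is a sketch of the additive model that the question reports as working. The object names (wool_df, fit_add, add_resid_df, add_resid_ss) are illustrative. In this model the 12 degrees of freedom that the interaction would use become the residual, which is why the F-tests and plots are available, but it also means the interaction is indistinguishable from error here.

```r
# Same data as in the question, rebuilt compactly (layout assumed from the table)
wool_df <- data.frame(
  wt = c(21, 36, 25, 18, 22, 26, 38, 27, 17, 26,
         16, 25, 22, 18, 21, 28, 35, 27, 20, 24),
  solution = factor(rep(1:5, times = 4)),
  process  = factor(rep(1:4, each = 5))
)

# Main effects only: 1 + 4 + 3 = 8 coefficients, leaving 20 - 8 = 12 residual df
fit_add <- aov(wt ~ solution + process, data = wool_df)

add_resid_df <- df.residual(fit_add)
add_resid_ss <- sum(residuals(fit_add)^2)

print(summary(fit_add))
print(c(residual_df = add_resid_df, residual_SS = add_resid_ss))
```

Because the design is balanced, the residual sum of squares of this model equals the 87.2 shown for solution:process in the table above.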
The lack of degrees of freedom is even more apparent when we fit the model with lm().
m2 <- lm(wt ~ solution*process,data=wool)
summary(m2)
...and the output:
> m2 <- lm(wt ~ solution*process,data=wool)
> summary(m2)
Call:
lm(formula = wt ~ solution * process, data = wool)
Residuals:
ALL 20 residuals are 0: no residual degrees of freedom!
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 21 NA NA NA
solution2 15 NA NA NA
solution3 4 NA NA NA
solution4 -3 NA NA NA
solution5 1 NA NA NA
process2 5 NA NA NA
process3 -5 NA NA NA
process4 7 NA NA NA
solution2:process2 -3 NA NA NA
solution3:process2 -3 NA NA NA
solution4:process2 -6 NA NA NA
solution5:process2 -1 NA NA NA
solution2:process3 -6 NA NA NA
solution3:process3 2 NA NA NA
solution4:process3 5 NA NA NA
solution5:process3 4 NA NA NA
solution2:process4 -8 NA NA NA
solution3:process4 -5 NA NA NA
solution4:process4 -5 NA NA NA
solution5:process4 -5 NA NA NA
Residual standard error: NaN on 0 degrees of freedom
Multiple R-squared: 1, Adjusted R-squared: NaN
F-statistic: NaN on 19 and 0 DF, p-value: NA
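A short sketch of what "no residual degrees of freedom" means in practice: the saturated model reproduces every observation exactly, and each cell value can be rebuilt from the coefficients. The names wool_df, fit_sat, max_abs_resid, b and cell_2_2 are illustrative only.

```r
# Same data as in the question, rebuilt compactly (layout assumed from the table)
wool_df <- data.frame(
  wt = c(21, 36, 25, 18, 22, 26, 38, 27, 17, 26,
         16, 25, 22, 18, 21, 28, 35, 27, 20, 24),
  solution = factor(rep(1:5, times = 4)),
  process  = factor(rep(1:4, each = 5))
)

fit_sat <- lm(wt ~ solution * process, data = wool_df)

# Largest absolute residual: zero up to floating-point error
max_abs_resid <- max(abs(residuals(fit_sat)))

# Rebuild the cell for solution 2, process 2 from the treatment-contrast
# coefficients: intercept + solution2 + process2 + solution2:process2
b <- coef(fit_sat)
cell_2_2 <- unname(b["(Intercept)"] + b["solution2"] +
                   b["process2"] + b["solution2:process2"])

print(max_abs_resid)
print(cell_2_2)  # compare with wt in row 7 of the data
```

Since the fit is exact, there is no residual variation from which to compute standard errors, which is why every Std. Error, t value and p-value in the summary is NA.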
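Interactions with continuous variables
Regarding the OP's question about why the code works when the analysis is run with continuous variables in lm(): with continuous variables, the interaction effect consumes a single degree of freedom, rather than (solutions - 1) * (processes - 1), or 12 degrees of freedom, for the interaction between two categorical variables.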
Again, we can demonstrate this with lm().
## as.numeric() on a factor returns the integer level codes, which match the original labels here
wool$solution<-as.numeric(wool$solution)
wool$process<-as.numeric(wool$process)
m3 <- lm(wt ~ solution*process,data=wool)
summary(m3)
anova(m3)
...and the output:
> summary(m3)
Call:
lm(formula = wt ~ solution * process, data = wool)
Residuals:
Min 1Q Median 3Q Max
-11.460 -3.757 -0.180 2.320 12.000
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 28.9000 8.1454 3.548 0.00268 **
solution -1.5000 2.4559 -0.611 0.54993
process -0.0100 2.9743 -0.003 0.99736
solution:process 0.0300 0.8968 0.033 0.97373
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 6.341 on 16 degrees of freedom
Multiple R-squared: 0.1123, Adjusted R-squared: -0.05409
F-statistic: 0.675 on 3 and 16 DF, p-value: 0.5798
> anova(m3)
Analysis of Variance Table
Response: wt
Df Sum Sq Mean Sq F value Pr(>F)
solution 1 81.22 81.225 2.0200 0.1744
process 1 0.16 0.160 0.0040 0.9505
solution:process 1 0.05 0.045 0.0011 0.9737
Residuals 16 643.37 40.211
>
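The difference in degrees of freedom can also be seen by counting design-matrix columns. This is a sketch under the same assumed data layout; wool_df, wool_num, X_factor, X_numeric, int_cols_factor and int_cols_numeric are illustrative names.

```r
# Same data as in the question, rebuilt compactly (layout assumed from the table)
wool_df <- data.frame(
  wt = c(21, 36, 25, 18, 22, 26, 38, 27, 17, 26,
         16, 25, 22, 18, 21, 28, 35, 27, 20, 24),
  solution = factor(rep(1:5, times = 4)),
  process  = factor(rep(1:4, each = 5))
)

# Numeric copies of the two predictors (level codes equal the labels here)
wool_num <- transform(wool_df,
                      solution = as.numeric(solution),
                      process  = as.numeric(process))

X_factor  <- model.matrix(~ solution * process, data = wool_df)
X_numeric <- model.matrix(~ solution * process, data = wool_num)

# The "assign" attribute maps each column to its term:
# 0 = intercept, 1 = solution, 2 = process, 3 = solution:process
int_cols_factor  <- sum(attr(X_factor,  "assign") == 3)
int_cols_numeric <- sum(attr(X_numeric, "assign") == 3)

print(c(factor_columns = ncol(X_factor), numeric_columns = ncol(X_numeric)))
print(c(factor_interaction = int_cols_factor, numeric_interaction = int_cols_numeric))
```

The numeric coding leaves 16 residual degrees of freedom, but it does so by assuming a linear trend across the solution and process codes, which is not meaningful for categorical treatments.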