具有对比的线性模型,包括所有可能的比较
linear models with contrasts including every possible comparison
有谁知道是否可以在 R 中使用 lmFit
或 lm
来计算具有分类变量的线性模型,同时包括类别之间所有可能的比较?例如在此处创建的测试数据中:
set.seed(25)
f <- gl(n = 3, k = 20, labels = c("control", "low", "high"))
mat <- model.matrix(~f, data = data.frame(f = f))
beta <- c(12, 3, 6) #these are the simulated regression coefficient
y <- rnorm(n = 60, mean = mat %*% beta, sd = 2)
m <- lm(y ~ f)
我得到摘要:
summary(m)
Call:
lm(formula = y ~ f)
Residuals:
Min 1Q Median 3Q Max
-4.3505 -1.6114 0.1608 1.1615 5.2010
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 11.4976 0.4629 24.840 < 2e-16 ***
flow 3.0370 0.6546 4.639 2.09e-05 ***
fhigh 6.1630 0.6546 9.415 3.27e-13 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 2.07 on 57 degrees of freedom
Multiple R-squared: 0.6086, Adjusted R-squared: 0.5949
F-statistic: 44.32 on 2 and 57 DF, p-value: 2.446e-12
这是因为对比项 ("contr.treatment") 将 "high" 与 "control" 以及 "low" 与 "control" 进行比较。
是否也可以得到 "high" 和 "low" 之间的比较?
如果您使用 aov
而不是 lm
,您可以使用 stats
包中的 TukeyHSD
函数:
fit <- aov(y ~ f)
TukeyHSD(fit)
# Tukey multiple comparisons of means
# 95% family-wise confidence level
# Fit: aov(formula = y ~ f)
# $f
# diff lwr upr p adj
# low-control 3.036957 1.461707 4.612207 6.15e-05
# high-control 6.163009 4.587759 7.738259 0.00e+00
# high-low 3.126052 1.550802 4.701302 3.81e-05
如果要使用 lm
对象,可以使用 mosaic
包中的 TukeyHSD
函数:
library(mosaic)
TukeyHSD(m)
或者,正如@ben-bolker 所建议的那样,
library(emmeans)
e1 <- emmeans(m, specs = "f")
pairs(e1)
# contrast estimate SE df t.ratio p.value
# control - low -3.036957 0.6546036 57 -4.639 0.0001
# control - high -6.163009 0.6546036 57 -9.415 <.0001
# low - high -3.126052 0.6546036 57 -4.775 <.0001
# P value adjustment: tukey method for comparing a family of 3 estimates
与lmFit
:
library(limma)
design <- model.matrix(~0 + f)
colnames(design) <- levels(f)
fit <- lmFit(y, design)
contrast.matrix <- makeContrasts(control-low, control-high, low-high,
levels = design)
fit2 <- contrasts.fit(fit, contrast.matrix)
fit2 <- eBayes(fit2)
round(t(rbind(fit2$coefficients, fit2$t, fit2$p.value)), 5)
# [,1] [,2] [,3]
# control - low -3.03696 -4.63938 2e-05
# control - high -6.16301 -9.41487 0e+00
# low - high -3.12605 -4.77549 1e-05
有关详细信息,另请参阅 。
有谁知道是否可以在 R 中使用 lmFit
或 lm
来计算具有分类变量的线性模型,同时包括类别之间所有可能的比较?例如在此处创建的测试数据中:
set.seed(25)
f <- gl(n = 3, k = 20, labels = c("control", "low", "high"))
mat <- model.matrix(~f, data = data.frame(f = f))
beta <- c(12, 3, 6) #these are the simulated regression coefficient
y <- rnorm(n = 60, mean = mat %*% beta, sd = 2)
m <- lm(y ~ f)
我得到摘要:
summary(m)
Call:
lm(formula = y ~ f)
Residuals:
Min 1Q Median 3Q Max
-4.3505 -1.6114 0.1608 1.1615 5.2010
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 11.4976 0.4629 24.840 < 2e-16 ***
flow 3.0370 0.6546 4.639 2.09e-05 ***
fhigh 6.1630 0.6546 9.415 3.27e-13 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 2.07 on 57 degrees of freedom
Multiple R-squared: 0.6086, Adjusted R-squared: 0.5949
F-statistic: 44.32 on 2 and 57 DF, p-value: 2.446e-12
这是因为对比项 ("contr.treatment") 将 "high" 与 "control" 以及 "low" 与 "control" 进行比较。
是否也可以得到 "high" 和 "low" 之间的比较?
如果您使用 aov
而不是 lm
,您可以使用 stats
包中的 TukeyHSD
函数:
fit <- aov(y ~ f)
TukeyHSD(fit)
# Tukey multiple comparisons of means
# 95% family-wise confidence level
# Fit: aov(formula = y ~ f)
# $f
# diff lwr upr p adj
# low-control 3.036957 1.461707 4.612207 6.15e-05
# high-control 6.163009 4.587759 7.738259 0.00e+00
# high-low 3.126052 1.550802 4.701302 3.81e-05
如果要使用 lm
对象,可以使用 mosaic
包中的 TukeyHSD
函数:
library(mosaic)
TukeyHSD(m)
或者,正如@ben-bolker 所建议的那样,
library(emmeans)
e1 <- emmeans(m, specs = "f")
pairs(e1)
# contrast estimate SE df t.ratio p.value
# control - low -3.036957 0.6546036 57 -4.639 0.0001
# control - high -6.163009 0.6546036 57 -9.415 <.0001
# low - high -3.126052 0.6546036 57 -4.775 <.0001
# P value adjustment: tukey method for comparing a family of 3 estimates
与lmFit
:
library(limma)
design <- model.matrix(~0 + f)
colnames(design) <- levels(f)
fit <- lmFit(y, design)
contrast.matrix <- makeContrasts(control-low, control-high, low-high,
levels = design)
fit2 <- contrasts.fit(fit, contrast.matrix)
fit2 <- eBayes(fit2)
round(t(rbind(fit2$coefficients, fit2$t, fit2$p.value)), 5)
# [,1] [,2] [,3]
# control - low -3.03696 -4.63938 2e-05
# control - high -6.16301 -9.41487 0e+00
# low - high -3.12605 -4.77549 1e-05
有关详细信息,另请参阅