Return variance from Bartlett's test of homogeneity of variance in R
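Bartlett's test lets you test whether the variances are the same across groups. The stats package in R provides the function bartlett.test. Here is an example using a dataset that ships with R.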
bt <- bartlett.test(count ~ spray, data = InsectSprays)
How can I get the actual variances from bartlett.test? I can't seem to find them in the bt object.
names(bt)
[1] "data.name" "method" "p.value" "parameter" "statistic"
You can compute the variances yourself with var(). One way is to use summaryBy.
library(doBy)
summaryBy(count~spray, data=InsectSprays, FUN=var)
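For comparison, the same per-group variances can be obtained without the doBy package. This is only a minimal base R sketch; the variable name group_var is made up for illustration.

```r
# Per-group sample variances using base R only (no doBy needed).
# `group_var` is an illustrative name, not something from the question.
group_var <- with(InsectSprays, tapply(count, spray, var))
group_var

# The same result as a data frame, via the formula interface of aggregate().
aggregate(count ~ spray, data = InsectSprays, FUN = var)
```

Either form gives one variance per spray level, which is the quantity the question hopes to read off the test object.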
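However, you would expect bartlett.test to report the variance of each group. By analogy, running t.test in R also gives the mean of each group. So, can we extract the per-group variances from bartlett.test in R?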
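To illustrate the t.test analogy, here is a small sketch using the built-in sleep dataset (my choice of example data, not part of the original question). The object returned by t.test carries an estimate component, which is what seems to be missing from bartlett.test.

```r
# t.test() returns an "htest" object that includes the group means.
tt <- t.test(extra ~ group, data = sleep)
tt$estimate          # one mean per group

# The Bartlett result has the same class but, as names(bt) above shows,
# no `estimate` component.
bt <- bartlett.test(count ~ spray, data = InsectSprays)
class(bt)
```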
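No, unfortunately you can't.
The variances are not part of the structure of the returned object.
I read the function's source code. You could extract them by rewriting it slightly, but that is far more work than the solution you already have.
You can see what I mean: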
# File src/library/stats/R/bartlett.test.R
# Part of the R package, https://www.R-project.org
#
# Copyright (C) 1995-2015 The R Core Team
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# A copy of the GNU General Public License is available at
# https://www.R-project.org/Licenses/
bartlett.test <- function(x, ...) UseMethod("bartlett.test")

bartlett.test.default <-
function(x, g, ...)
{
    LM <- FALSE
    if (is.list(x)) {
        if (length(x) < 2L)
            stop("'x' must be a list with at least 2 elements")
        DNAME <- deparse(substitute(x))
        if (all(sapply(x, function(obj) inherits(obj, "lm"))))
            LM <- TRUE
        else
            x <- lapply(x, function(x) x <- x[is.finite(x)])
        k <- length(x)
    }
    else {
        if (length(x) != length(g))
            stop("'x' and 'g' must have the same length")
        DNAME <- paste(deparse(substitute(x)), "and",
                       deparse(substitute(g)))
        OK <- complete.cases(x, g)
        x <- x[OK]
        g <- factor(g[OK])
        k <- nlevels(g)
        if (k < 2)
            stop("all observations are in the same group")
        x <- split(x, g)
    }

    if (LM) {
        n <- sapply(x, function(obj) obj$df.resid)
        v <- sapply(x, function(obj) sum(obj$residuals^2))
    } else {
        n <- sapply(x, "length") - 1
        if (any(n <= 0))
            stop("there must be at least 2 observations in each group")
        v <- sapply(x, "var")
    }

    n.total <- sum(n)
    v.total <- sum(n * v) / n.total
    STATISTIC <- ((n.total * log(v.total) - sum(n * log(v))) /
                  (1 + (sum(1 / n) - 1 / n.total) / (3 * (k - 1))))
    PARAMETER <- k - 1
    PVAL <- pchisq(STATISTIC, PARAMETER, lower.tail = FALSE)
    names(STATISTIC) <- "Bartlett's K-squared"
    names(PARAMETER) <- "df"

    RVAL <- list(statistic = STATISTIC,
                 parameter = PARAMETER,
                 p.value = PVAL,
                 data.name = DNAME,
                 method = "Bartlett test of homogeneity of variances")
    class(RVAL) <- "htest"
    return(RVAL)
}

bartlett.test.formula <-
function(formula, data, subset, na.action, ...)
{
    if(missing(formula) || (length(formula) != 3L))
        stop("'formula' missing or incorrect")
    m <- match.call(expand.dots = FALSE)
    if(is.matrix(eval(m$data, parent.frame())))
        m$data <- as.data.frame(data)
    ## need stats:: for non-standard evaluation
    m[[1L]] <- quote(stats::model.frame)
    mf <- eval(m, parent.frame())
    if(length(mf) != 2L)
        stop("'formula' should be of the form response ~ group")
    DNAME <- paste(names(mf), collapse = " by ")
    names(mf) <- NULL
    y <- do.call("bartlett.test", as.list(mf))
    y$data.name <- DNAME
    y
}
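To make the role of the per-group variances in that source concrete, here is a sketch that repeats the non-lm branch by hand on InsectSprays. It reuses the source's own variable names (n, v, k, n.total, v.total); K2 and p are names I made up for the results.

```r
# Split the response by group, as bartlett.test.default does internally.
x <- split(InsectSprays$count, InsectSprays$spray)
k <- length(x)

n <- sapply(x, length) - 1          # degrees of freedom per group
v <- sapply(x, var)                 # per-group sample variances
n.total <- sum(n)
v.total <- sum(n * v) / n.total     # pooled variance

# Bartlett's K-squared and its chi-squared p-value (illustrative names).
K2 <- (n.total * log(v.total) - sum(n * log(v))) /
  (1 + (sum(1 / n) - 1 / n.total) / (3 * (k - 1)))
p <- pchisq(K2, k - 1, lower.tail = FALSE)

v    # the variances the test uses but does not return
```

The vector v here is exactly what the function computes and then discards, which is why getting it back requires either recomputing it or modifying the function.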
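This is slightly easier to hack than Hack-R suggests, but they're right that something like sapply(split(InsectSprays, InsectSprays$spray), function(x) var(x$count)) (doing it all in base R) is probably easier.
The technique shown here is fragile, because it depends on the exact form of the built-in function; if the function changes even slightly in a future version of R, it will stop working. It would be safer to dump() the whole function, modify it to your liking, and then source() the result.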
bb <- stats:::bartlett.test.default
bb2 <- body(bb)
## add a line to save the variances
bb2[[12]] <- quote(ESTIMATE <- v)
## add the variances to the return list
bb2[[13]] <- quote(RVAL <- list(statistic = STATISTIC, parameter = PARAMETER, estimate = ESTIMATE, p.value = PVAL, data.name = DNAME, method = "Bartlett test of homogeneity of variances"))
## restore the rest of the function
bb2[14:15] <- body(bb)[13:14]
body(bb) <- bb2
Now put it back into the stats namespace:
assignInNamespace("bartlett.test.default",bb,pos="package:stats")
Test:
(bt <- bartlett.test(count~spray,data=InsectSprays))
## Bartlett test of homogeneity of variances
##
## data: count by spray
## Bartlett's K-squared = 25.96, df = 5, p-value = 9.085e-05
## sample estimates:
##         A         B         C         D         E         F
## 22.272727 18.242424  3.901515  6.265152  3.000000 38.606061
You can retrieve the values via bt$estimate.
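If modifying the stats namespace feels too invasive, a wrapper can attach the variances to the result instead. This is only a sketch under my own assumptions: bartlett_with_var is a hypothetical helper, not part of stats, and it handles only the vector-plus-grouping-factor form of the call.

```r
# Hypothetical helper (not part of the stats package): run the test and
# add the per-group sample variances as an `estimate` component.
bartlett_with_var <- function(x, g) {
  res <- stats::bartlett.test(x, g)
  # Drop incomplete cases the same way the default method does.
  ok <- stats::complete.cases(x, g)
  res$estimate <- sapply(split(x[ok], factor(g[ok])), stats::var)
  # Note: res$data.name will read "x and g" because of how the
  # default method deparses its arguments.
  res
}

bt2 <- bartlett_with_var(InsectSprays$count, InsectSprays$spray)
bt2$estimate
```

Because the result keeps class "htest", printing bt2 should show a "sample estimates" section, and the built-in function is left untouched.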
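It might be worth proposing this on r-devel as an enhancement to bartlett.test: the only counterargument I can think of is that the output would get unwieldy if many groups are being tested.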