Subset does not work when done within cor.test in R
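I have a df with 3 columns:
- column_1: numeric
- column_2: numeric
- column_3: a factor variable with two groups, A and B
I want to compute a Spearman correlation test between column 1 and column 2, but only within each group (that is, the correlation should be computed only between the observations of column 1 and column 2 that belong to group A, and likewise for group B).
So I am using these lines of code: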
cor.test(df$column_1, df$column_2, alternative = ("two.sided"),
subset(df, column_3==c("group_A")),
data = df, method = c("spearm"))
cor.test(df$column_1, df$column_2, alternative = ("two.sided"),
subset(df, column_3==c("group_B")),
data = df, method = c("spearm"))
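The thing is, I get the same result from both tests, so I guess the subset call is not doing anything, because if I subset the groups beforehand, like this: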
x <- subset(df, column_3==c("group_A"))
y <- subset(df, column_3==c("group_B"))
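and then run cor.test on x and y separately, I get different results. Does anyone know what is going on here?
PS: I get the following warning, but I don't think it is related to the question I am asking: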
Warning message:
"In cor.test.default(cor_itir$Nart, cor_itir$Medida, alternative = "two.sided", :cannot compute exact p-value with ties"
Use with and subset:
with(subset(df, column_3 == "group_A"),
     cor.test(column_1, column_2, alternative = "two.sided",
              method = "spearman"))
with(subset(df, column_3 == "group_B"),
     cor.test(column_1, column_2, alternative = "two.sided",
              method = "spearman"))
Edit
Adding data:
df <- data.frame(column_1=1:10,column_2=c(1:5,6,4,3,11,9),column_3=rep(c("group_A","group_B"),each=5))
> with(subset(df, column_3==c("group_A")),
+ cor.test(column_1, column_2, alternative = ("two.sided"),
+ method = c("spearman")))
Spearman's rank correlation rho
data: column_1 and column_2
S = 4.4409e-15, p-value = 0.01667
alternative hypothesis: true rho is not equal to 0
sample estimates:
rho
1
> with(subset(df, column_3==c("group_B")),
+ cor.test(column_1, column_2, alternative = ("two.sided"),
+ method = c("spearman")))
Spearman's rank correlation rho
data: column_1 and column_2
S = 10, p-value = 0.45
alternative hypothesis: true rho is not equal to 0
sample estimates:
rho
0.5
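A likely reason the original calls returned identical results: the vectors df$column_1 and df$column_2 handed to cor.test are always the full columns, and the default (non-formula) method has no data or subset arguments, so the extra arguments cannot restrict the rows. As a minimal sketch (the name is_a is only illustrative and not from the question), indexing the vectors directly gives the group result, while the unindexed call gives the full-data result. The ties warning in the question looks like a separate matter; passing exact = FALSE asks for the approximate p-value instead of the exact one.

```r
# Same example data as in the edit above
df <- data.frame(column_1 = 1:10,
                 column_2 = c(1:5, 6, 4, 3, 11, 9),
                 column_3 = rep(c("group_A", "group_B"), each = 5))

# Logical index for the rows of group A (illustrative helper name)
is_a <- df$column_3 == "group_A"

# Full columns: every row is used, whatever else is passed along
full <- cor.test(df$column_1, df$column_2,
                 method = "spearman", exact = FALSE)

# Indexed vectors: only the rows of group A are used
grp_a <- cor.test(df$column_1[is_a], df$column_2[is_a],
                  method = "spearman")

full$estimate
grp_a$estimate
```

The two estimates differ, which is consistent with the question: subsetting has to happen on the vectors themselves (or through the formula interface), not through an extra argument to the default method.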
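By using the df$... extractors, specifying data=, and also using subset() as a standalone function, you are overcomplicating things a bit. I believe you can get the same result using something like this: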
# here's some example data with different correlations between each group
df <- data.frame(column_1=1:10,column_2=c(1:5,6,4,3,11,9),column_3=rep(c("a","b"),each=5))
Then just specify your formula, your data= and your subset= inline:
cor.test(~ column_1 + column_2, alternative="two.sided", method="spearman", data=df, subset=(column_3=="a"))
cor.test(~ column_1 + column_2, alternative="two.sided", method="spearman", data=df, subset=(column_3=="b"))
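As a quick sanity check, here is a small sketch comparing the formula interface with the with(subset(...)) approach on the example data. The object names res_formula and res_with are made up for illustration. Note that cor.test defaults to method = "pearson", so method = "spearman" has to be passed explicitly to get the test asked about in the question.

```r
df <- data.frame(column_1 = 1:10,
                 column_2 = c(1:5, 6, 4, 3, 11, 9),
                 column_3 = rep(c("a", "b"), each = 5))

# Formula interface: subset= is evaluated inside data=
res_formula <- cor.test(~ column_1 + column_2, data = df,
                        subset = (column_3 == "a"),
                        method = "spearman")

# Subset first, then call the default method on the two columns
res_with <- with(subset(df, column_3 == "a"),
                 cor.test(column_1, column_2, method = "spearman"))

# Both routes should give the same rho for group "a"
all.equal(unname(res_formula$estimate), unname(res_with$estimate))
```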
Or use by to do it all at once:
by(df, df$column_3, FUN = function(x) cor.test(~ column_1 + column_2, data = x, method = "spearman"))
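If only the coefficients are needed rather than the full printed tests, a variation on the same idea is to split the data frame by group and collect the estimates. This is a sketch using split and lapply in place of by; the names tests and rho are illustrative.

```r
df <- data.frame(column_1 = 1:10,
                 column_2 = c(1:5, 6, 4, 3, 11, 9),
                 column_3 = rep(c("a", "b"), each = 5))

# One Spearman test per level of column_3
tests <- lapply(split(df, df$column_3), function(d) {
  cor.test(~ column_1 + column_2, data = d, method = "spearman")
})

# Pull rho out of each "htest" object into a named vector
rho <- sapply(tests, function(h) unname(h$estimate))
rho
```

On the example data this gives one rho per group, matching the values in the console output shown earlier in the thread (1 for the first group and 0.5 for the second).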