Subset does not work when done within cor.test in R

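I have a df with 3 columns:

I want to run a Spearman correlation test between column 1 and column 2, but only within groups (so the correlation is computed only between the observations of column 1 and column 2 that belong to group A, and likewise for group B). So I am using these lines of code: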

cor.test(df$column_1, df$column_2, alternative = ("two.sided"), 
     subset(df, column_3==c("group_A")),
     data = df, method = c("spearm"))
cor.test(df$column_1, df$column_2, alternative = ("two.sided"), 
         subset(df, column_3==c("group_B")),
         data = df, method = c("spearm"))

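The thing is, I get the same result from both tests, so I guess the subset function is not working, because if I subset the groups beforehand, like this: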

x <- subset(df, column_3==c("group_A"))
y <- subset(df, column_3==c("group_B"))

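and then run cor.test on x and y separately, I get different results. Does anyone know what is going on?

PS: I get the following warning, but I don't think it is related to the problem I am asking about: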

Warning message:
"In cor.test.default(cor_itir$Nart, cor_itir$Medida, alternative = "two.sided",  :cannot compute exact p-value with ties"

Use with and subset:

with(subset(df, column_3==c("group_A")),
     cor.test(column_1, column_2, alternative = ("two.sided"), 
     method = c("spearm")))

with(subset(df, column_3==c("group_B")),
     cor.test(column_1, column_2, alternative = ("two.sided"), 
              method = c("spearm")))

Edit

Adding data

df <- data.frame(column_1=1:10,column_2=c(1:5,6,4,3,11,9),column_3=rep(c("group_A","group_B"),each=5))

> with(subset(df, column_3==c("group_A")),
+      cor.test(column_1, column_2, alternative = ("two.sided"), 
+               method = c("spearman")))

    Spearman's rank correlation rho

data:  column_1 and column_2
S = 4.4409e-15, p-value = 0.01667
alternative hypothesis: true rho is not equal to 0
sample estimates:
rho 
  1 


> with(subset(df, column_3==c("group_B")),
+      cor.test(column_1, column_2, alternative = ("two.sided"), 
+               method = c("spearman")))

    Spearman's rank correlation rho

data:  column_1 and column_2
S = 10, p-value = 0.45
alternative hypothesis: true rho is not equal to 0
sample estimates:
rho 
0.5
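On the warning quoted in the question: it means the data contain tied values, so R cannot compute the exact Spearman p-value and falls back to an approximation. It is indeed unrelated to the subsetting problem. A minimal sketch, using made-up vectors x and y (not the asker's data), of requesting the approximation explicitly with exact = FALSE so that the warning is not raised:

```r
# Illustrative data only; y contains tied values (2 and 5 each appear twice)
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2, 2, 3, 5, 5, 7)

# exact = FALSE asks for the approximate p-value up front,
# so cor.test does not warn about ties
res <- cor.test(x, y, method = "spearman", exact = FALSE)
res$estimate
res$p.value
```

The estimate of rho is the same either way; only the way the p-value is computed changes.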

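You are making things a bit too complicated by using the df$... extractors, specifying data=, and using subset() as a standalone function all at once. When x and y are passed as vectors, cor.test has no data= or subset= arguments, so the full columns are correlated both times. You can get the same result, I believe, with something like this: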

# here's some example data with different correlations between each group
df <- data.frame(column_1=1:10,column_2=c(1:5,6,4,3,11,9),column_3=rep(c("a","b"),each=5))

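Then just specify your formula, your data= and your subset= inline (method="spearman" is added here to match the question, since the default is Pearson):

cor.test(~ column_1 + column_2, alternative="two.sided", method="spearman", data=df, subset=(column_3=="a"))

cor.test(~ column_1 + column_2, alternative="two.sided", method="spearman", data=df, subset=(column_3=="b"))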
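As a quick sanity check, the sketch below compares the formula interface with subset= against subsetting first and passing plain vectors. It reuses the example data from the edit above; the names res_formula, res_vectors and group_b are only illustrative.

```r
df <- data.frame(
  column_1 = 1:10,
  column_2 = c(1:5, 6, 4, 3, 11, 9),
  column_3 = rep(c("group_A", "group_B"), each = 5)
)

# Formula interface: subset= is evaluated inside data=
res_formula <- cor.test(~ column_1 + column_2, data = df,
                        subset = column_3 == "group_B",
                        method = "spearman")

# Subset first, then pass the already-filtered vectors
group_b <- subset(df, column_3 == "group_B")
res_vectors <- cor.test(group_b$column_1, group_b$column_2,
                        method = "spearman")

# Both routes should agree on rho and on the p-value
all.equal(unname(res_formula$estimate), unname(res_vectors$estimate))
all.equal(res_formula$p.value, res_vectors$p.value)
```

Both comparisons should return TRUE, which is the behaviour the original df$... call was expected to have.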

Or use by to do it all in one go:

by(df, df$column_3, FUN = function(x) cor.test(~ column_1 + column_2, data = x, method = "spearman"))
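If only the estimates and p-values are needed rather than the full printed tests, one possible variation is to split the data frame by group and collect the numbers with sapply. This is a sketch using split() instead of by(), with the example data from the edit above; the name per_group is hypothetical.

```r
df <- data.frame(
  column_1 = 1:10,
  column_2 = c(1:5, 6, 4, 3, 11, 9),
  column_3 = rep(c("group_A", "group_B"), each = 5)
)

# One Spearman test per group; keep only rho and the p-value
per_group <- sapply(split(df, df$column_3), function(g) {
  res <- cor.test(~ column_1 + column_2, data = g, method = "spearman")
  c(rho = unname(res$estimate), p.value = res$p.value)
})

# Matrix with one column per group and rows "rho" and "p.value"
per_group
```

The result is a small matrix that is easier to reuse in later code than the list of htest objects returned by by().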