Error with Heatmap in R

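I am trying to make a heatmap in R. Basically, there are two surveys, and I am trying to map whether or not each person answered each question. I was able to make one for the first survey using the code listed below: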

x1 <- c(0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
x2 <- c(0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1)
x3 <- c(0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0)
x4 <- c(0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0)
x5 <- c(0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0)
x6 <- c(0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0)

x <- rbind(x1, x2, x3, x4, x5, x6) 
hv <- heatmap(t(x), col = c("Forestgreen", "Darkorange2"), margins = c(4, 12), Colv = NA, Rowv = NA, scale = "column", xlab ="Person", ylab ="", main = "",  labCol=c("1", "2", "3", "4", "5", "6"))
legend("topright", c("Non-Missing", "Missing"), col=c("Forestgreen", "Darkorange2"), bty="n", fill=c("Forestgreen", "Darkorange2"))

While the heatmap this produces looks fine, the one I am trying to create for the second survey is off. See the code below:

y1 <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0)
y2 <- c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
y3 <- rep(c(0, 1), c(34, 2))
y4 <- c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
y5 <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0)
y6 <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
y7 <- rep(c(0, 1), each=18)
y8 <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
y9 <- c(0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1)
y10 <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
y11 <- c(0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
y12 <- c(0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
y13 <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0)


y <- rbind(y1, y2, y3, y4, y5, y6, y7, y8, y9, y10, y11, y12, y13) 
hv <- heatmap(t(y), col = c("Forestgreen", "Darkorange2"), margins = c(4, 12), Colv = NA, Rowv = NA, scale = "column", xlab ="Person", ylab ="", main = "")
legend("topright", c("Non-Missing", "Missing"), col=c("Forestgreen", "Darkorange2"), bty="n", fill=c("Forestgreen", "Darkorange2"))

I do not understand why there is a white line at y2, especially since the first heatmap had no such problem. Any insight would be helpful. Thanks!

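As mentioned in the comments, the problem here is that the values in y2 are all 1. You have told the heatmap function to scale by column (scale = "column"). Since there is no variance in the second column, it cannot be scaled, so you get nothing for it. The heatmap function could arguably throw an error or a warning here, but for whatever reason it does not.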
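A quick way to spot this situation before plotting is to compute the standard deviation of each column and look for zeros. The sketch below uses a small made-up matrix; `m`, `q1`, `q2`, and `q3` are illustrative names, not objects from the question.

```r
# Hypothetical data: rows are respondents, columns are questions.
m <- cbind(q1 = c(0, 1, 0, 1),
           q2 = c(1, 1, 1, 1),   # constant, like y2 in the question
           q3 = c(0, 0, 1, 0))

# Per-column standard deviation, the quantity that column scaling divides by.
col_sd <- apply(m, 2, sd)

# Columns with zero standard deviation cannot be scaled.
constant_cols <- names(col_sd)[col_sd == 0]
print(constant_cols)
```

Any column reported here would be expected to come out as NaN after column scaling.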

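The good news is that this is easy to fix. If you change the scaling from "column" to "none", the problem resolves itself. Interestingly, the other columns also appear to be wrong when scale = "column". I am not sure why, especially since the problem goes away when you introduce some variance into y2.
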
y1 <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0)
y2 <- c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
y3 <- rep(c(0, 1), c(34, 2))
y4 <- c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
y5 <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0)
y6 <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
y7 <- rep(c(0, 1), each=18)
y8 <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
y9 <- c(0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1)
y10 <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
y11 <- c(0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
y12 <- c(0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
y13 <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0)

y <- rbind(y1, y2, y3, y4, y5, y6, y7, y8, y9, y10, y11, y12, y13) 

hv <- heatmap(t(y), col = c("Forestgreen", "Darkorange2"), margins = c(4, 12), Colv = NA, Rowv = NA, scale = "none", xlab ="Person", ylab ="", main = "")
legend("topright", c("Non-Missing", "Missing"), col=c("Forestgreen", "Darkorange2"), bty="n", fill=c("Forestgreen", "Darkorange2"))
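Part of the reason the other columns look off under scale = "column" may be that standardizing recodes 0 and 1 differently in each column, so a two-color palette no longer maps cleanly onto the original values. That is a guess rather than something confirmed in this thread, but the recoding itself is easy to see on made-up data (`b`, `rare`, and `half` are illustrative names):

```r
# Hypothetical binary columns with different proportions of 1s.
b <- cbind(rare = c(0, 0, 0, 1),
           half = c(0, 0, 1, 1))

# Center each column and divide by its standard deviation.
s <- scale(b)

# Row 4 is a 1 in both columns, yet its scaled value differs by column.
ones_scaled <- s[4, ]
print(ones_scaled)
```

With scale = "none" the matrix keeps its original 0/1 values, so this recoding does not occur.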

The help for the scale argument of heatmap states:

character indicating if the values should be centered and scaled in either the row direction or the column direction, or none. The default is "row" if symm false, and "none" otherwise.

The centering and scaling of columns or rows is done by the following code inside the heatmap function:

else if (scale == "column") {
    x <- sweep(x, 2L, colMeans(x, na.rm = na.rm), check.margin = FALSE)
    sx <- apply(x, 2L, sd, na.rm = na.rm)
    x <- sweep(x, 2L, sx, "/", check.margin = FALSE)
}

This can be demonstrated nicely with some smaller example data.

x1 <- c(1,2,3)
x2 <- c(4,5,4)
x3 <- c(1,1,1)

data_mat <- cbind(x1,x2,x3)
print(data_mat)
     x1 x2 x3
[1,]  1  4  1
[2,]  2  5  1
[3,]  3  4  1
data_mat <- sweep(x = data_mat,MARGIN = 2,STATS = colMeans(data_mat))
print(data_mat)
     x1         x2 x3
[1,] -1 -0.3333333  0
[2,]  0  0.6666667  0
[3,]  1 -0.3333333  0
sd_data_mat <- apply(X = data_mat, MARGIN = 2, FUN = sd)
print(sd_data_mat)
       x1        x2        x3 
1.0000000 0.5773503 0.0000000 
data_mat <- sweep(x = data_mat,MARGIN = 2,STATS = sd_data_mat,FUN = "/")
print(data_mat)
     x1         x2  x3
[1,] -1 -0.5773503 NaN
[2,]  0  1.1547005 NaN
[3,]  1 -0.5773503 NaN

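You can see that x3 ends up as NaN, because you are dividing 0 by 0. These NaN values are then passed on to the plotting step, which is why that column is missing from the heatmap.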
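For this example, the manual steps above give the same result as base R's scale() with its default arguments, so the NaN column can be reproduced in one call. A minimal sketch reusing the example values; the names `raw_mat`, `centered`, `manual`, and `auto` are introduced here for illustration.

```r
raw_mat <- cbind(x1 = c(1, 2, 3), x2 = c(4, 5, 4), x3 = c(1, 1, 1))

# Manual centering and scaling, following the heatmap source shown earlier.
centered <- sweep(raw_mat, 2, colMeans(raw_mat))
manual <- sweep(centered, 2, apply(centered, 2, sd), "/")

# scale() centers each column and divides it by its standard deviation.
auto <- scale(raw_mat)

# The constant column is NaN either way; the other columns agree.
print(all(is.nan(auto[, "x3"])))
print(all.equal(manual[, 1:2], auto[, 1:2], check.attributes = FALSE))
```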

When heatmap applies the scaling, the second column becomes all NaN, which you can check with:

y_scaled <- scale(t(y))

This is because there is no variance in that column (i.e. all observations are 1).

One way to fix it is to fill in values there by hand:

y_scaled[is.nan(y_scaled)] <- 1

hv <- heatmap(y_scaled, col = c("Forestgreen", "Darkorange2"), margins = c(4, 12), Colv = NA, Rowv = NA, scale = "none", xlab ="Person", ylab ="", main = "")

This is reasonable here in particular because all you want to display is a simple 1/0 categorical variable.