默认情况下,R 热图如何对行进行排序?

How does R heatmap order rows by default?

R heatmap() 文档说 RowvColv(即行和列排序参数):

If either is missing, as by default, then the ordering of the corresponding dendrogram is by the mean value of the rows/columns, i.e., in the case of rows, Rowv <- rowMeans(x, na.rm = na.rm).

我以为就这么简单,但现在我想默认排序算法中肯定还有更多东西。

让我们有这个相关矩阵:

m = matrix(nrow=7, ncol = 7, c(1,0.578090870728824,0.504272263365781,0.526539138953634,0.523049273011785,0.503296777916728,0.638770769734758,0.578090870728824,1,0.59985543029105,0.663649941610205,0.630998114483389,0.66814547270115,0.596161809036262,0.504272263365781,0.59985543029105,1,0.62468477053142,0.632715952452297,0.599037620726669,0.607925540860012,0.526539138953634,0.663649941610205,0.62468477053142,1,0.7100707346884,0.738094117424525,0.639668277558577,0.523049273011785,0.630998114483389,0.632715952452297,0.7100707346884,1,0.651331659193182,0.64138213322125,0.503296777916728,0.66814547270115,0.599037620726669,0.738094117424525,0.651331659193182,1,0.612326706593738,0.638770769734758,0.596161809036262,0.607925540860012,0.639668277558577,0.64138213322125,0.612326706593738,1))

m
          [,1]      [,2]      [,3]      [,4]      [,5]      [,6]      [,7]
[1,] 1.0000000 0.5780909 0.5042723 0.5265391 0.5230493 0.5032968 0.6387708
[2,] 0.5780909 1.0000000 0.5998554 0.6636499 0.6309981 0.6681455 0.5961618
[3,] 0.5042723 0.5998554 1.0000000 0.6246848 0.6327160 0.5990376 0.6079255
[4,] 0.5265391 0.6636499 0.6246848 1.0000000 0.7100707 0.7380941 0.6396683
[5,] 0.5230493 0.6309981 0.6327160 0.7100707 1.0000000 0.6513317 0.6413821
[6,] 0.5032968 0.6681455 0.5990376 0.7380941 0.6513317 1.0000000 0.6123267
[7,] 0.6387708 0.5961618 0.6079255 0.6396683 0.6413821 0.6123267 1.0000000

heatmap(m) 输出为:

行(和列)顺序为:1、3、7、5、2、6、4

但是,我希望顺序是:

order(rowMeans(m))
1 3 7 2 6 5 4

怎么样?

我想这可能与树状图的聚类方式有关。但仍然不确定:如果我首先将 4 和 6 分组,然后可能使用 6x6 矩阵,其中一个 row/column 是原始第 4 行和第 6 行的平均值(?),它仍然不应该改变相互顺序例如第 2 行和第 5 行,应该吗?

非常感谢您的任何提示!

来自heatmap帮助你可以阅读:

Typically, reordering of the rows and columns according to some set of values (row or column means) within the restrictions imposed by the dendrogram is carried out.

事实上,使用 Rowmeans/Colmeans 的重新排序应用于集群。这是在内部分两步完成的。我将在每个步骤中绘制树状图,以显示簇是如何重新排列的。

hcr <- hclust(dist(m))
ddr <- as.dendrogram(hcr)
plot(ddr)

现在,如果您按 rowmenas 对树状图重新排序,我们会得到相同的 OP 顺序。

Rowv <- rowMeans(m, na.rm = T)
ddr <- reorder(ddr, Rowv)
plot(ddr)

当然可以更改此顺序,如果您提供新的聚类函数或排序函数。这里我使用默认的:hclustreorder.