Merging of clusters
I have a matrix that describes groups of objects.
n <- 6 # number of objects
group <- matrix(c(1,2,1,4,1,3,6,3,5,3,NA,NA,2,NA,2,NA,NA,6,NA,6,NA,NA,NA,NA,4,NA,NA,NA,NA,5),5,6)
colnames(group) <- colnames(group, do.NULL = FALSE, prefix = "obj.")
rownames(group) <- rownames(group, do.NULL = FALSE, prefix = "step.")
group # an n-1 by n matrix
#        obj.1 obj.2 obj.3 obj.4 obj.5 obj.6
# step.1     1     3    NA    NA    NA    NA
# step.2     2     6    NA    NA    NA    NA
# step.3     1     3     2     6    NA    NA
# step.4     4     5    NA    NA    NA    NA
# step.5     1     3     2     6     4     5
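I want to create a matrix that records which clusters are merged at each step. It should match the merge component of the object returned by the hclust function.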
merge <- matrix(c(-1, -2, 1, -4, 3, -3, -6, 2, -5, 4), 5, 2)
merge
#      [,1] [,2]
# [1,]   -1   -3
# [2,]   -2   -6
# [3,]    1    2
# [4,]   -4   -5
# [5,]    3    4
merge is an n-1 by 2 matrix. Row i of merge describes the merging of clusters at step i of the clustering. If an element j in the row is negative, then observation -j was merged at this stage. If j is positive then the merge was with the cluster formed at the (earlier) stage j of the algorithm. Thus negative entries in merge indicate agglomerations of singletons, and positive entries indicate agglomerations of non-singletons.
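To make the sign convention concrete, here is a minimal sketch that walks a merge matrix row by row and rebuilds the members of each cluster. The names `m` and `members` are introduced here for illustration only; `m` holds the same values as the merge matrix above.

```r
# Same values as the merge matrix in the question (hypothetical name `m`)
m <- matrix(c(-1, -2, 1, -4, 3, -3, -6, 2, -5, 4), 5, 2)

# members[[i]] will hold the objects in the cluster formed at step i
members <- vector("list", nrow(m))
for (i in seq_len(nrow(m))) {
  parts <- lapply(m[i, ], function(j) {
    # negative entry: a single observation; positive entry: an earlier cluster
    if (j < 0) -j else members[[j]]
  })
  members[[i]] <- unlist(parts)
}

members[[3]]  # 1 3 2 6
members[[5]]  # 1 3 2 6 4 5
```

Each element of `members` reproduces the non-NA entries of the corresponding row of group, which is the relationship the question wants to invert.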
I haven't found a simple solution. Is there a function for this?
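Basically, you have a set of groups (one per row)...
group
#        obj.1 obj.2 obj.3 obj.4 obj.5 obj.6
# step.1     1     3    NA    NA    NA    NA
# step.2     2     6    NA    NA    NA    NA
# step.3     1     3     2     6    NA    NA
# step.4     4     5    NA    NA    NA    NA
# step.5     1     3     2     6     4     5
...and you want to know which two earlier rows (or single objects) were merged to form the current row.
I would start by creating a matrix that indicates whether each object is present in a given row: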
(hasObs <- sapply(seq_len(ncol(group)), function(i) rowSums(!is.na(group) & group == i)))
#        [,1] [,2] [,3] [,4] [,5] [,6]
# step.1    1    0    1    0    0    0
# step.2    0    1    0    0    0    1
# step.3    1    1    1    0    0    1
# step.4    0    0    0    1    1    0
# step.5    1    1    1    1    1    1
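I would then use it to build a matrix in which element (i, j) is the most recent row before row i in which object j appears, or -j if there is no such row. (The row labels in the printed result are carried over from the names of pos for the first column and do not identify the steps; the rows correspond to steps 1 to 5 in order.)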
(prevObs <- sapply(seq_len(ncol(hasObs)), function(i) {
  pos <- which(head(hasObs, -1)[,i] == 1)
  rep(c(-i, pos), diff(c(0, pos, nrow(hasObs))))
}))
#        [,1] [,2] [,3] [,4] [,5] [,6]
#          -1   -2   -3   -4   -5   -6
# step.1    1   -2    1   -4   -5   -6
# step.1    1    2    1   -4   -5    2
# step.3    3    3    3   -4   -5    3
# step.3    3    3    3    4    4    3
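The rep/diff step is the least obvious part, so here is a small sketch of it for a single object. It uses the presence pattern of object 1 from hasObs; the names `present`, `runs`, and `prev` are illustrative and not part of the answer.

```r
# Presence of object 1 across the five steps (first column of hasObs)
present <- c(1, 0, 1, 0, 1)

# Rows (excluding the last) in which the object appears
pos <- which(head(present, -1) == 1)       # 1 3

# How many rows each "most recent row" value stays in effect:
# -1 applies to row 1, row 1 applies to rows 2-3, row 3 applies to rows 4-5
runs <- diff(c(0, pos, length(present)))   # 1 2 2

prev <- rep(c(-1, pos), runs)
prev                                       # -1 1 1 3 3
```

The last row is dropped by head() because an appearance in the final row can never be a "previous" row for anything.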
Now it is easy to determine which rows were merged to form the current row:
t(apply(hasObs*prevObs, 1, function(x) unique(x[x != 0])))
#        [,1] [,2]
# step.1   -1   -3
# step.2   -2   -6
# step.3    1    2
# step.4   -4   -5
# step.5    3    4
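The first row merges the single objects 1 and 3, the next row merges the single objects 2 and 6, the third row merges the first two groups (rows 1 and 2), the fourth row merges the single objects 4 and 5, and the fifth row merges the groups from rows 3 and 4.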
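The steps above can be collected into one helper. This is only a sketch: the function name `groups_to_merge` is hypothetical, it assumes the input has the shape shown in the question (at least two rows, NA padding), and it assumes every row is the union of exactly two earlier clusters or single objects, so that each row yields exactly two values.

```r
# Hypothetical helper wrapping the steps above; not an existing R function
groups_to_merge <- function(group) {
  n <- ncol(group)
  # hasObs[i, j] is 1 if object j belongs to the group in row i
  hasObs <- sapply(seq_len(n), function(j) rowSums(!is.na(group) & group == j))
  # prevObs[i, j]: latest row before i containing object j, or -j if none
  prevObs <- sapply(seq_len(n), function(j) {
    pos <- unname(which(head(hasObs, -1)[, j] == 1))  # unname() drops the stray row labels
    rep(c(-j, pos), diff(c(0, pos, nrow(hasObs))))
  })
  # Keep, for each row, the distinct predecessors of the objects it contains
  res <- t(apply(hasObs * prevObs, 1, function(x) unique(x[x != 0])))
  dimnames(res) <- NULL
  res
}

group <- matrix(c(1,2,1,4,1,3,6,3,5,3,NA,NA,2,NA,2,NA,NA,6,NA,6,
                  NA,NA,NA,NA,4,NA,NA,NA,NA,5), 5, 6)
expected <- matrix(c(-1, -2, 1, -4, 3, -3, -6, 2, -5, 4), 5, 2)
all.equal(groups_to_merge(group), expected)  # TRUE
```

If a row does not reduce to exactly two values, apply() returns a list rather than a matrix and t() fails, which makes invalid input easy to spot.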