Fix KMeans cluster position

Question

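I am trying to use KMeans to cluster RGB colors and automatically count how many pixels of each group appear in an image. To do this, I set the initial centroid positions to the colors I want to classify and run KMeans from sklearn.

The problem is that, depending on the image, the algorithm's output changes the order of the initial centroid vector, so when I count the elements, the counts are attributed to the wrong colors.

This usually happens when one or more of the initial centroid colors are not present in the image. In that case, I would like the count for that color to be 0.

Does anyone know how to keep the order of the initial centroids fixed in the KMeans prediction output?

The code is below: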

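    import numpy as np
    import cv2 as cv
    from collections import Counter
    from sklearn.cluster import KMeans

    # Initial centroids: the reference colors to classify (RGB)
    centroid_start = np.array([[0, 0, 0],        # Black
                               [38, 64, 87],     # Col1
                               [43, 68, 98],     # Col2
                               [23, 42, 45],     # Col3
                               [160, 62, 0],     # Col4
                               [153, 82, 33],    # Col5
                               [198, 130, 109],  # Col6
                               [100, 105, 79],   # Col7
                               [220, 138, 22]    # Col8
                               ], np.float64)
    # img is assumed to be an HSV image loaded earlier
    image = cv.cvtColor(img, cv.COLOR_HSV2RGB)
    # Flatten the image to one row per pixel
    reshape = image.reshape((image.shape[0] * image.shape[1], 3))
    # n_init=1 because the initial centroids are given explicitly
    cluster = KMeans(n_clusters=centroid_start.shape[0], init=centroid_start, n_init=1).fit(reshape)
    # Count how many pixels received each cluster label
    pixels = Counter(cluster.labels_)
    print(pixels)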

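The problem is: when I inspect the 'pixels' variable, 0 does not necessarily correspond to Black, 1 does not necessarily correspond to Col1, and so on.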

Answer 1

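If you do not want the colors to migrate, you probably should not use k-means. Instead, just compute the pairwise distances between your colors and the image pixels, then select, for each pixel, the color with the smallest distance.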
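A minimal sketch of that nearest-color approach, using only NumPy. The helper name `count_nearest_colors` and the tiny demo image are hypothetical, introduced here for illustration; the palette is the one from the question. Because `np.bincount` is called with `minlength`, a color that never appears in the image gets a count of 0, and the counts stay in the palette's order.

```python
import numpy as np

# Reference colors from the question (RGB), in the order the counts should follow.
palette = np.array([
    [0, 0, 0],        # Black
    [38, 64, 87],     # Col1
    [43, 68, 98],     # Col2
    [23, 42, 45],     # Col3
    [160, 62, 0],     # Col4
    [153, 82, 33],    # Col5
    [198, 130, 109],  # Col6
    [100, 105, 79],   # Col7
    [220, 138, 22],   # Col8
], dtype=np.float64)


def count_nearest_colors(pixels, palette):
    """Assign each pixel to its nearest palette color (hypothetical helper).

    Returns (labels, counts), where counts[i] is the number of pixels
    assigned to palette[i], including zeros for unused colors.
    """
    # Cast to float first so uint8 subtraction cannot wrap around.
    pixels = np.asarray(pixels, dtype=np.float64).reshape(-1, 3)
    # Squared Euclidean distances, shape (n_pixels, n_colors).
    dists = ((pixels[:, None, :] - palette[None, :, :]) ** 2).sum(axis=2)
    labels = dists.argmin(axis=1)
    counts = np.bincount(labels, minlength=len(palette))
    return labels, counts


# Hypothetical 2x2 RGB image, for illustration only.
demo = np.array([[[0, 0, 0], [2, 1, 3]],
                 [[220, 138, 22], [150, 80, 30]]], dtype=np.uint8)
labels, counts = count_nearest_colors(demo, palette)
print(counts)
```

The broadcasted distance array holds one value per pixel per color, so for very large images it may be worth processing the pixels in chunks.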

If you do want the initial colors to migrate, then you must accept that some initial cluster centers (colors) may disappear or may migrate to something very different from the initial color. One option is to reorder the rows of the cluster_centers_ attribute (and possibly labels_) of the KMeans object. Another - probably safer - option is to compute a mapping of fitted cluster centers to your original colors (again using pairwise distances), then translate the results of your subsequent k-means classification. If you want to do it all in one step, you could subclass KMeans or wrap it by creating your own class derived from BaseEstimator.
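The mapping option could look roughly like the sketch below. The function names and the small arrays are hypothetical stand-ins: in real use, `fitted` would be `cluster.cluster_centers_` and `labels` would be `cluster.labels_` from a fitted KMeans. Each fitted center is mapped to its nearest original color, which is a many-to-one mapping; if you need a strict one-to-one assignment, a different matching strategy would be required.

```python
import numpy as np


def map_centers_to_palette(fitted_centers, palette):
    """Return, for each fitted center, the index of the nearest original color."""
    fitted_centers = np.asarray(fitted_centers, dtype=np.float64)
    palette = np.asarray(palette, dtype=np.float64)
    # Squared Euclidean distances, shape (n_centers, n_colors).
    dists = ((fitted_centers[:, None, :] - palette[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)


def counts_in_palette_order(labels, center_to_color, n_colors):
    """Translate k-means labels to palette indices and count pixels per color."""
    translated = center_to_color[np.asarray(labels)]
    # minlength keeps a zero entry for colors that no center mapped to.
    return np.bincount(translated, minlength=n_colors)


# Hypothetical stand-ins for the original colors and a fitted model's output.
original = np.array([[0, 0, 0], [160, 62, 0], [220, 138, 22]], dtype=np.float64)
fitted = np.array([[215, 140, 25], [3, 2, 1], [5, 5, 5]], dtype=np.float64)
labels = np.array([0, 0, 1, 2, 2, 2])

center_to_color = map_centers_to_palette(fitted, original)
counts = counts_in_palette_order(labels, center_to_color, len(original))
print(center_to_color, counts)
```

In this toy data two fitted centers both land on the first color, so their pixels are pooled, and the second color, which no center stayed close to, is reported with a count of 0.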

python

cluster-analysis

machine-learning

k-means

scikit-learn