k-means clustering mapping

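I am using sklearn's K-Means clustering and want to use the trained K-Means model to replace each computed cluster label with the value of its centroid.

The code I am using is as follows: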

import numpy as np
from sklearn.cluster import KMeans

# Initialize the K-Means clustering model
kmeans_conv1 = KMeans(n_clusters=5)

# Train the model on the training data (compute the k-means clustering)
kmeans_conv1.fit(conv1_nonzero.reshape(-1, 1))

# Number of clusters used
kmeans_conv1.n_clusters
# 5

# Get the centroids
kmeans_conv1.cluster_centers_
'''
array([[-0.05669265],
       [ 0.06742188],
       [-0.08835593],
       [ 0.03749201],
       [ 0.0896403 ]], dtype=float32)
'''


# Cluster label of each data point
kmeans_conv1.labels_

# Distinct labels that were assigned
set(kmeans_conv1.labels_)
# {0, 1, 2, 3, 4}

# Keep the cluster label of each data point
clustered_labels = kmeans_conv1.labels_

Currently, I map the labels to centroid values with an if-else chain, as shown below:

new_clusters = []


for clabel in clustered_labels:
    if clabel == 0:
        new_clusters.append(kmeans_conv1.cluster_centers_[0][0])
    elif clabel == 1:
        new_clusters.append(kmeans_conv1.cluster_centers_[1][0])
    elif clabel == 2:
        new_clusters.append(kmeans_conv1.cluster_centers_[2][0])
    elif clabel == 3:
        new_clusters.append(kmeans_conv1.cluster_centers_[3][0])
    elif clabel == 4:
        new_clusters.append(kmeans_conv1.cluster_centers_[4][0])

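In the end, I want the 'new_clusters' list or np.array variable to contain the centroid values instead of the cluster labels.

However, is there a better way to achieve this without using if-else conditions?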

This is enough, because each label is already the row index of its centroid:

new_clusters = []
for clabel in clustered_labels:
    new_clusters.append(
        kmeans_conv1.cluster_centers_[clabel][0]
    )
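
The loop can also be dropped entirely, since NumPy accepts an array of integers as an index. The following is a minimal sketch, not taken from the original posts: the centroids are the values printed in the question, while the `clustered_labels` array is a made-up stand-in for `kmeans_conv1.labels_`.

```python
import numpy as np

# Centroids as printed in the question, shape (5, 1)
cluster_centers = np.array(
    [[-0.05669265], [0.06742188], [-0.08835593], [0.03749201], [0.0896403]],
    dtype=np.float32,
)

# Hypothetical labels standing in for kmeans_conv1.labels_
clustered_labels = np.array([0, 3, 3, 1, 4, 2, 0])

# Integer-array indexing: take row `label`, column 0, for every label at once
new_clusters = cluster_centers[clustered_labels, 0]

print(new_clusters.shape)  # (7,)
```

With the fitted model from the question, the equivalent expression would be kmeans_conv1.cluster_centers_[clustered_labels, 0], which returns an np.array rather than a list.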

I found this approach, which uses np.select:

# Conditions for the first conv layer: one boolean mask per cluster label
cond_conv1 = [clustered_labels == k for k in range(kmeans_conv1.n_clusters)]

# Centroid value of each cluster
val_conv1 = kmeans_conv1.cluster_centers_[:, 0]

# Replace each label with its centroid value
# (np.select falls back to its default of 0 where no condition matches)
new_weights_conv1 = np.select(cond_conv1, val_conv1)
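
If the same mapping is needed for values that were not part of the training data, one possible approach is to call `predict` on the fitted model and index the centroids with the returned labels. This is only a sketch under stated assumptions: the synthetic `train_values` and `new_values` arrays are invented for illustration (the real `conv1_nonzero` is not shown in the question), and `n_init` and `random_state` are set only to make the run repeatable.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic 1-D data standing in for conv1_nonzero (an assumption)
rng = np.random.default_rng(0)
train_values = rng.normal(0.0, 0.05, size=500).astype(np.float32)

# Fit on a column vector, as in the question
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0)
kmeans.fit(train_values.reshape(-1, 1))

# Values the model has not seen during fitting (also synthetic)
new_values = rng.normal(0.0, 0.05, size=8).astype(np.float32)

# predict() returns the index of the closest centroid for each sample
new_labels = kmeans.predict(new_values.reshape(-1, 1))

# Map every label to its centroid value in one indexing step
quantized = kmeans.cluster_centers_[new_labels, 0]

print(quantized.shape)  # (8,)
```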