K-means clustering color not changing

Question

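I'm using Jupyter Notebook. Here is the kernel information:

Python 3.5.2 |Anaconda 4.1.1 (64-bit)| (default, Jul 2 2016, 17:53:06) [GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]

I'm using k-means clustering. When I cluster, the only color used is blue. That isn't a big problem with the current setup, but I need to scale this up, so the clusters need different colors. I followed a tutorial, so I don't understand all of the code 100%. The code is below.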

import numpy as np
import matplotlib.pyplot as plt
from matplotlib import style
style.use("ggplot")
from sklearn.cluster import KMeans

x = [1,5,1.5,8,1,9]
y = [2,8,1.8,8,.6,11]

plt.scatter(x,y)
plt.show()

X = np.array([[1,2],[5,8],[1.5,1.8],[8,8],[1,.6],[9,11]])

kmeans = KMeans(n_clusters=2)
kmeans.fit(X)

centroids = kmeans.cluster_centers_
labels = kmeans.labels_

print(centroids)
print(labels)

colors = ['r','b','y','g','c','m']

for i in range(len(X)):
    print("coordinate:",X[i], "label:", labels[i])
    plt.plot(X[i][0], X[i][1], colors[labels[i]], markersize = 10)

plt.scatter(centroids[:, 0],centroids[:, 1], marker = "x", s=150, linewidths = 5, zorder = 10)

plt.show()

plt.scatter(x,y)
plt.scatter(centroids[:, 0],centroids[:, 1], marker = "x", s=150, linewidths = 5, zorder = 10)

plt.show()

I think my problem is in this part.

colors = ['r','b','y','g','c','m']

for i in range(len(X)):
    print("coordinate:",X[i], "label:", labels[i])
    plt.plot(X[i][0], X[i][1], colors[labels[i]], markersize = 10)

Answer 1

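I was indeed wrong. My previous solution was incorrect. I finally got a proper look at what the labels and centroids return, and I think this should do what you're asking.

You can pass a sequence as the color= argument, so there is no need for a for loop.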

colors = ['r','b','y','g','c','m']
plt.scatter(x,y, color=[colors[l_] for l_ in labels], label=labels)
plt.scatter(centroids[:, 0],centroids[:, 1], color=[c for c in colors[:len(centroids)]], marker = "x", s=150, linewidths = 5, zorder = 10)
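
For reference, the same idea can be wrapped in two small functions. This is only a sketch: cluster_colors and plot_clusters are hypothetical helper names, not part of matplotlib or scikit-learn, and it assumes the labels are integers 0..k-1 (as kmeans.labels_ are) and that the palette has at least k entries.

```python
DEFAULT_PALETTE = ('r', 'b', 'y', 'g', 'c', 'm')


def cluster_colors(labels, n_clusters, palette=DEFAULT_PALETTE):
    """Return (point_colors, centroid_colors) for the given cluster labels.

    Hypothetical helper: point i gets palette[labels[i]], and centroid k
    gets palette[k], so each centroid matches the points assigned to it.
    """
    point_colors = [palette[label] for label in labels]
    centroid_colors = list(palette[:n_clusters])
    return point_colors, centroid_colors


def plot_clusters(points, labels, centroids, palette=DEFAULT_PALETTE):
    """Draw points and centroids with one color per cluster (hypothetical helper)."""
    # Imported here so cluster_colors can be used without matplotlib installed.
    import matplotlib.pyplot as plt

    point_colors, centroid_colors = cluster_colors(labels, len(centroids), palette)

    # A single scatter call colors every point; no per-point loop is needed.
    plt.scatter([p[0] for p in points], [p[1] for p in points], color=point_colors)
    plt.scatter([c[0] for c in centroids], [c[1] for c in centroids],
                color=centroid_colors, marker="x", s=150, linewidths=5, zorder=10)
    plt.show()
```

With the variables from the question, the call would be plot_clusters(X, kmeans.labels_, kmeans.cluster_centers_).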

Answer 2

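With K-means, you want each cluster to have a different color. If you have 2 clusters, the labels of your kmeans model are stored in an array in kmeans.labels_, which looks like [1 1 1 1 0 0 1 0 0 0 1 0 0...]. To use specific colors, iterate over this array before any of your plotting code and build a list that sets the color of each point: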

colors = []
for i in kmeans.labels_:
  if i == 0:
    colors.append('blue')
  elif i == 1:
    colors.append('orange')

If you want to use a predefined Seaborn palette for your colors, you can index into the palette in the same loop. For example, to use the 'deep' palette:

import seaborn as sns

palette = sns.color_palette('deep')
colors = []
for i in kmeans.labels_:
  if i == 0:
    colors.append(palette[0])
  elif i == 1:
    colors.append(palette[1])

If you have 3 colors, you need to add another elif for i == 2, and so on.
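
As the number of clusters grows, the if/elif chain can be replaced by indexing the palette with the label itself. The following is a minimal sketch, assuming the labels are non-negative integers as in kmeans.labels_; colors_for_labels is a hypothetical helper name, and the label and RGB values are made up for illustration.

```python
def colors_for_labels(labels, palette):
    """Return palette[label] for every label, wrapping around if the palette is short."""
    return [palette[label % len(palette)] for label in labels]


# Stand-in for kmeans.labels_ from a 3-cluster fit (made-up values).
labels = [1, 1, 0, 2, 0, 2]

# Named colors, as in the if/elif version.
colors = colors_for_labels(labels, ['blue', 'orange', 'green'])
print(colors)

# RGB tuples work the same way; these values are placeholders, not a real palette.
rgb_palette = [(0.1, 0.2, 0.8), (0.9, 0.5, 0.1), (0.2, 0.7, 0.3)]
rgb_colors = colors_for_labels(labels, rgb_palette)
```

Because sns.color_palette returns a list of RGB tuples, its result can be passed as the palette argument directly. The wrap-around means two clusters share a color once there are more clusters than palette entries, so pick a palette at least as long as n_clusters if every cluster must be distinct.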

Then, when you create the plot, just set the c parameter to the colors list you built:

plt.scatter(df['x'], df['y'], c = colors)
plt.show()

python

matplotlib

k-means

jupyter-notebook