How to get the features/attributes of a data point if its distance from the cluster's center is known?

Question

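I have a DataFrame X with columns A, B, and C. I applied kMeans clustering with n_clusters=4 and, for each cluster center, obtained the Euclidean distances of the 10 nearest data points. For example, for the ith cluster, I did this: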

#getting 10 nearest points from ith cluster center
print(np.sort(kmeans.transform(X)[:, i])[: 10])
#output:-
array([0.06096257, 0.07785726, 0.09155965, 0.09301038, 0.09741242,
   0.1016601 , 0.10242911, 0.10314227, 0.10775149, 0.10895064])

Now I want to get the features A, B, and C of these 10 data points. How can I do this?

Answer 1

If you want the indices of the smallest values, use argsort instead of sort.

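Mapping the sorted distances back to the points afterwards is messy: np.sort discards the row positions, and several points can share the same distance.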
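
A minimal sketch of the argsort approach, assuming X is a pandas DataFrame and kmeans is a fitted scikit-learn KMeans as in the question. The random data, the seed, and the names distances, nearest_idx, and nearest_points are illustrative and not from the original post.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Illustrative data standing in for the asker's DataFrame X (assumption).
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.random((200, 3)), columns=["A", "B", "C"])

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)

i = 0  # index of the cluster center of interest

# Distance of every row of X to the ith cluster center.
distances = kmeans.transform(X)[:, i]

# argsort returns row positions ordered by increasing distance,
# so the first 10 entries are the positions of the 10 nearest points.
nearest_idx = np.argsort(distances)[:10]

# Positional lookup recovers the features A, B, and C of those points.
nearest_points = X.iloc[nearest_idx]

print(nearest_points)
print(distances[nearest_idx])  # same values as np.sort(distances)[:10]
```

iloc is used because argsort yields positions rather than index labels, so this also works when X does not have a default 0..n-1 index.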

cluster-analysis

machine-learning

k-means

python-3.x