How is the parameter "weights" used in KNeighborsClassifier?

Question

In the sklearn documentation, the weights="distance" option of KNeighborsClassifier is explained as follows:

‘distance’ : weight points by the inverse of their distance. in this case, closer neighbors of a query point will have a greater influence than neighbors which are further away.

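Weighting the neighboring points and then computing the prediction as the weighted average of those points makes sense to me, for example with KNeighborsRegressor. However, I can't see how weights would be used in a classification algorithm. According to the book The Elements of Statistical Learning, KNN classification is based on majority voting. Isn't it?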

Answer 1

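During classification, the weights are used when computing the mode of the neighbors' labels: instead of counting how often each class appears among the neighbors, the classifier sums the weights of the neighbors in each class and predicts the class with the largest total. With uniform weights this reduces to ordinary majority voting.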
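The weighted vote described above can be sketched in a few lines of NumPy. This is only an illustration, not scikit-learn's code: the helper name `weighted_vote` and the sample labels and distances are made up for the example, and it assumes no neighbor lies at distance zero.

```python
import numpy as np

def weighted_vote(labels, distances):
    """Pick the class whose neighbors have the largest summed 1/distance weight.

    Hypothetical helper for illustration; assumes all distances are > 0.
    """
    labels = np.asarray(labels)
    weights = 1.0 / np.asarray(distances, dtype=float)
    classes = np.unique(labels)
    # Sum the weights per class instead of counting occurrences.
    totals = np.array([weights[labels == c].sum() for c in classes])
    winner = classes[np.argmax(totals)]
    return winner, dict(zip(classes.tolist(), totals.tolist()))

# Two far neighbors of class 0 and one close neighbor of class 1.
neighbor_labels = [0, 0, 1]
neighbor_distances = [2.0, 2.0, 0.5]

winner, totals = weighted_vote(neighbor_labels, neighbor_distances)
print(winner, totals)
```

A plain majority vote over these three neighbors would pick class 0 (two votes to one), while the inverse-distance vote picks class 1, because the single close neighbor carries weight 2.0 against a combined 1.0 for the two distant ones.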

For more details, see here for the actual implementation.

An example from the documentation:

>>> from sklearn.utils.extmath import weighted_mode
>>> x = [4, 1, 4, 2, 4, 2]
>>> weights = [1, 1, 1, 1, 1, 1]
>>> weighted_mode(x, weights)
(array([4.]), array([3.]))

The value 4 appears three times: with uniform weights, the result is simply the mode of the distribution.

>>> weights = [1, 3, 0.5, 1.5, 1, 2]  # deweight the 4's
>>> weighted_mode(x, weights)
(array([2.]), array([3.5]))

The value 2 has the highest score: it appears twice, with weights 1.5 and 2, which sum to 3.5.

You can view the implementation here.
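To see the effect in the classifier itself, the following sketch fits KNeighborsClassifier twice on a tiny one-dimensional toy dataset, once with weights="uniform" and once with weights="distance". The data points and the query value are invented for illustration and are chosen so that the two settings disagree.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Toy data (made up for this example): two class-0 points, two class-1 points.
X = np.array([[1.5], [2.0], [4.0], [10.0]])
y = np.array([0, 0, 1, 1])

# The 3 nearest neighbors of 3.5 are 4.0 (class 1, distance 0.5),
# 2.0 (class 0, distance 1.5) and 1.5 (class 0, distance 2.0).
query = np.array([[3.5]])

uniform_clf = KNeighborsClassifier(n_neighbors=3, weights="uniform").fit(X, y)
distance_clf = KNeighborsClassifier(n_neighbors=3, weights="distance").fit(X, y)

uniform_pred = uniform_clf.predict(query)[0]
distance_pred = distance_clf.predict(query)[0]
print("uniform:", uniform_pred, "distance:", distance_pred)
```

With uniform weights the two class-0 neighbors outvote the single class-1 neighbor. With distance weights the class-1 neighbor contributes 1/0.5 = 2.0, more than the class-0 total of 1/1.5 + 1/2.0 (about 1.17), so the prediction flips to class 1.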

python

machine-learning

knn

scikit-learn