Different weights for different data points? (logistic regression from scratch)

I'm trying to implement logistic regression from scratch. Here's where I'm confused: initially we get a single random value for the weight, but as training proceeds I find that the final result contains multiple weights (matching the number of data points in the training set). I'm completely lost here, because prediction works fine, yet I don't see how multiple weights for a single feature can make sense. I've also marked my question in the comments in the code below.

import numpy as np

np.random.seed(100)

class LogisticRegression:

    def sigmoid(self, z):
        return 1 / (1 + np.e**(-z))

    def cost_function(self, X, y, weights):
        z = X * weights
        predict_1 = y * np.log(self.sigmoid(z))
        predict_0 = (1 - y) * np.log(1 - self.sigmoid(z))
        return -sum(predict_1 + predict_0) / len(X)

    def fit(self, X, y, epochs=250, lr=0.05):
        loss = []
        weights = np.random.rand()    # Initially weights here is a single number...
        N = len(X)

        for _ in range(epochs):
            # Gradient Descent
            y_hat = self.sigmoid(X * weights)
            weights -= lr * X * (y_hat - y) / N    # ...but then the number of weights
                                                   # becomes equal to the number of
                                                   # data points at this line...
            # Saving Progress
            loss.append(self.cost_function(X, y, weights))

        self.weights = weights
        self.loss = loss
        print('weights:', weights)     # ...which causes us to get a different
                                       # weight for each data point.
                                       # How can I plot the final logistic curve
                                       # if I end up with multiple final weights?

    def predict(self, X):
        # Predicting with the sigmoid function
        z = X * self.weights
        # Returning a binary result
        return [1 if i > 0.5 else 0 for i in self.sigmoid(z)]

clf = LogisticRegression()
clf.fit(X, y)
clf.predict(X)

The weight increment should include a sum over the data points. See this page for more details on the backpropagation derivation.

So the weight update should be:

weights -= lr * sum(X*(y_hat - y)) / N

instead of:

weights -= lr * X*(y_hat - y) / N
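
The difference is plain NumPy broadcasting: without the sum, `X*(y_hat - y)` is a length-N array, and subtracting it from a scalar turns `weights` into a length-N array, one entry per data point. A minimal sketch with made-up numbers:

```python
import numpy as np

X = np.array([0.5, 1.0, 1.5])   # three 1-D data points
y = np.array([0, 0, 1])
weights = 0.3                   # a single scalar weight
lr, N = 0.05, len(X)

y_hat = 1 / (1 + np.exp(-(X * weights)))

# Original update: the gradient term broadcasts to shape (3,),
# so the scalar weight silently becomes a length-3 array.
broken = weights - lr * X * (y_hat - y) / N
print(broken.shape)             # (3,)

# Corrected update: summing first keeps the result a scalar.
fixed = weights - lr * np.sum(X * (y_hat - y)) / N
print(np.shape(fixed))          # ()
```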

With this, you get a single weight, as expected.
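
Putting it together, here is a runnable sketch of the corrected class on a tiny made-up 1-D dataset (the data, `epochs`, and `lr` values are illustrative only):

```python
import numpy as np

np.random.seed(100)

class LogisticRegression:
    def sigmoid(self, z):
        return 1 / (1 + np.exp(-z))

    def fit(self, X, y, epochs=250, lr=0.05):
        weights = np.random.rand()      # single scalar weight
        N = len(X)
        for _ in range(epochs):
            y_hat = self.sigmoid(X * weights)
            # Summed gradient: the update stays a scalar.
            weights -= lr * np.sum(X * (y_hat - y)) / N
        self.weights = weights

    def predict(self, X):
        return [1 if p > 0.5 else 0 for p in self.sigmoid(X * self.weights)]

# Toy data: negative points labelled 0, positive points labelled 1
X = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
y = np.array([0, 0, 0, 1, 1, 1])

clf = LogisticRegression()
clf.fit(X, y)
print(np.shape(clf.weights))    # () -- one final weight, as expected
print(clf.predict(X))           # [0, 0, 0, 1, 1, 1]
```

Since `self.weights` is now a single number, the final logistic curve can be plotted directly as `sigmoid(x * clf.weights)` over a range of x values.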