Different weights for different data points? (logistic regression from scratch)

I'm trying to implement logistic regression from scratch. Here's where I'm confused: initially we get a single random value for the weight, but as training proceeds I find that the final result contains multiple weights (matching the number of data points in the training set). I'm completely lost here, because prediction works fine, yet I don't see how multiple weights for a single feature can make sense. I've also marked my question in the comments in the code below.

import numpy as np

np.random.seed(100)

class LogisticRegression:

    def sigmoid(self, z):
        return 1 / (1 + np.e**(-z))

    def cost_function(self, X, y, weights):
        z = X * weights
        predict_1 = y * np.log(self.sigmoid(z))
        predict_0 = (1 - y) * np.log(1 - self.sigmoid(z))
        return -sum(predict_1 + predict_0) / len(X)

    def fit(self, X, y, epochs=250, lr=0.05):
        loss = []
        weights = np.random.rand()    # Initially weights here is a single number...
        N = len(X)

        for _ in range(epochs):
            # Gradient Descent
            y_hat = self.sigmoid(X * weights)
            weights -= lr * X * (y_hat - y) / N    # ...but then the number of weights
                                                   # becomes equal to the number of
                                                   # data points at this line...
            # Saving Progress
            loss.append(self.cost_function(X, y, weights))

        self.weights = weights
        self.loss = loss
        print('weights:', weights)     # ...which causes us to get a different
                                       # weight for each data point.
                                       # How can I plot the final logistic curve
                                       # if I end up with multiple final weights?

    def predict(self, X):
        # Predicting with the sigmoid function
        z = X * self.weights
        # Returning a binary result
        return [1 if i > 0.5 else 0 for i in self.sigmoid(z)]

clf = LogisticRegression()
clf.fit(X, y)
clf.predict(X)

The weight increment should include a sum over the data points. See this page for more details on the backpropagation derivation.

So the weight update should be:

weights -= lr * sum(X*(y_hat - y)) / N

instead of:

weights -= lr * X*(y_hat - y) / N
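
The difference is plain NumPy broadcasting: without the sum, `X*(y_hat - y)` is a length-N array, and subtracting it from a scalar turns `weights` into a length-N array, one entry per data point. A minimal sketch with made-up numbers:

```python
import numpy as np

X = np.array([0.5, 1.0, 1.5])   # three 1-D data points
y = np.array([0, 0, 1])
weights = 0.3                   # a single scalar weight
lr, N = 0.05, len(X)

y_hat = 1 / (1 + np.exp(-(X * weights)))

# Original update: the gradient term broadcasts to shape (3,),
# so the scalar weight silently becomes a length-3 array.
broken = weights - lr * X * (y_hat - y) / N
print(broken.shape)             # (3,)

# Corrected update: summing first keeps the result a scalar.
fixed = weights - lr * np.sum(X * (y_hat - y)) / N
print(np.shape(fixed))          # ()
```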

With this, you get a single weight, as expected.
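
Putting it together, here is a runnable sketch of the corrected class on a tiny made-up 1-D dataset (the data, `epochs`, and `lr` values are illustrative only):

```python
import numpy as np

np.random.seed(100)

class LogisticRegression:
    def sigmoid(self, z):
        return 1 / (1 + np.exp(-z))

    def fit(self, X, y, epochs=250, lr=0.05):
        weights = np.random.rand()      # single scalar weight
        N = len(X)
        for _ in range(epochs):
            y_hat = self.sigmoid(X * weights)
            # Summed gradient: the update stays a scalar.
            weights -= lr * np.sum(X * (y_hat - y)) / N
        self.weights = weights

    def predict(self, X):
        return [1 if p > 0.5 else 0 for p in self.sigmoid(X * self.weights)]

# Toy data: negative points labelled 0, positive points labelled 1
X = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
y = np.array([0, 0, 0, 1, 1, 1])

clf = LogisticRegression()
clf.fit(X, y)
print(np.shape(clf.weights))    # () -- one final weight, as expected
print(clf.predict(X))           # [0, 0, 0, 1, 1, 1]
```

Since `self.weights` is now a single number, the final logistic curve can be plotted directly as `sigmoid(x * clf.weights)` over a range of x values.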