Different weights for different data points? (logistic regression from scratch)
I am trying to implement logistic regression from scratch, and I am confused about one thing. Initially we get a single random value for the weight, but as training proceeds, the final result gives multiple weights (matching the number of data points in the training set). I am completely lost here, because prediction works fine, yet I don't see how it makes sense to have multiple weights for a single feature. I have also marked my question in the code below.
import numpy as np

np.random.seed(100)

class LogisticRegression:
    def sigmoid(self, z):
        return 1 / (1 + np.e**(-z))

    def cost_function(self, X, y, weights):
        z = X * weights
        predict_1 = y * np.log(self.sigmoid(z))
        predict_0 = (1 - y) * np.log(1 - self.sigmoid(z))
        return -sum(predict_1 + predict_0) / len(X)

    def fit(self, X, y, epochs=250, lr=0.05):
        loss = []
        weights = np.random.rand()  # Initially weights here is a single number...
        N = len(X)
        for _ in range(epochs):
            # Gradient Descent
            y_hat = self.sigmoid(X * weights)
            weights -= lr * X * (y_hat - y) / N  # ...but then the number of weights
                                                 # becomes equal to the number of
                                                 # data points at this line...
            # Saving Progress
            loss.append(self.cost_function(X, y, weights))
        self.weights = weights
        self.loss = loss
        print('weights:', weights)  # ...which gives us a different weight
                                    # for each data point.
                                    # How can I plot the final logistic curve
                                    # if I end up with multiple final weights?

    def predict(self, X):
        # Predict with the sigmoid function
        z = X * self.weights
        # Return binary results
        return [1 if i > 0.5 else 0 for i in self.sigmoid(z)]

clf = LogisticRegression()
clf.fit(X, y)
clf.predict(X)
The weight update should include a sum over the data points. See this page for more details on the backpropagation derivation.

So the weights should be updated as:

weights -= lr * sum(X*(y_hat - y)) / N

instead of:

weights -= lr * X*(y_hat - y) / N

With this, you get only a single weight, as expected.
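The fix can be checked on a tiny synthetic dataset (the data and variable names below are illustrative, not from the question). Summing the per-sample gradients collapses the update to a scalar, so the weight stays a single number throughout training:

```python
import numpy as np

np.random.seed(100)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy 1-D data: negative inputs labelled 0, positive inputs labelled 1
X = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
y = np.array([0, 0, 0, 1, 1, 1])

weight = np.random.rand()   # a single scalar weight
N = len(X)
for _ in range(250):
    y_hat = sigmoid(X * weight)
    # np.sum reduces the per-sample gradients to one number,
    # so `weight` remains a scalar after every update
    weight -= 0.05 * np.sum(X * (y_hat - y)) / N

preds = [1 if p > 0.5 else 0 for p in sigmoid(X * weight)]
```

Without the `np.sum`, `weight - lr * X * (y_hat - y) / N` broadcasts the scalar against the length-N arrays, which is exactly how the original code silently turned one weight into N of them.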