How to get coefficients of multinomial logistic regression?

Question

I need to calculate the coefficients of a multinomial logistic regression using sklearn:

X=

x1          x2          x3   x4         x5    x6
0.300000    0.100000    0.0  0.0000     0.5   0.0
0.000000    0.006000    0.0  0.0000     0.2   0.0
0.010000    0.678000    0.0  0.0000     2.0   0.0
0.000000    0.333000    1.0  12.3966    0.1   4.0
0.200000    0.005000    1.0  0.4050     1.0   0.0
0.000000    0.340000    1.0  15.7025    0.5   0.0
0.000000    0.440000    1.0  8.2645     0.0   4.0
0.500000    0.055000    1.0  18.1818    0.0   4.0

The values of y are categorical, in the range [1; 4].

y =

This is what I did:

import pandas as pd
from sklearn import linear_model
from sklearn.metrics import mean_squared_error, r2_score
import numpy as np

h = .02

logreg = linear_model.LogisticRegression(C=1e5)

logreg.fit(X, y)

# print the coefficients
print(logreg.intercept_)
print(logreg.coef_)

However, I get 6 columns in the output of logreg.intercept_ and 6 columns in the output of logreg.coef_. How can I get 1 coefficient per feature, e.g. the values a to f?

y = a*x1 + b*x2 + c*x3 + d*x4 + e*x5 + f*x6

Also, I am probably doing something wrong, because y_pred = logreg.predict(X) gives me the value 1 for all rows.

Answer 1

Check the online documentation:

coef_ : array, shape (1, n_features) or (n_classes, n_features)

Coefficient of the features in the decision function.

coef_ is of shape (1, n_features) when the given problem is binary.

As @Xochipilli already mentioned in the comments, you will have (n_classes, n_features) coefficients, which in your case is (4, 6), plus 4 intercepts (one per class).
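To make those shapes concrete, here is a minimal sketch on synthetic data. The random features and the evenly cycled labels 1 to 4 are assumptions made purely for illustration, not the asker's data. Each row of coef_ holds the six per-feature coefficients (the a to f of the question) for one class, so there is one linear equation per class rather than a single one.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the question's data (assumption): 6 features, labels 1..4.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = np.arange(200) % 4 + 1  # guarantees that all four classes are present

logreg = LogisticRegression(max_iter=1000)
logreg.fit(X, y)

print(logreg.classes_)          # class label belonging to each row of coef_
print(logreg.coef_.shape)       # (n_classes, n_features)
print(logreg.intercept_.shape)  # (n_classes,)

# Row i of coef_ and entry i of intercept_ define the linear score for classes_[i].
for label, row, b in zip(logreg.classes_, logreg.coef_, logreg.intercept_):
    print(label, np.round(row, 3), round(float(b), 3))

# The per-class scores are X @ coef_.T + intercept_; predict() picks the largest.
scores = X @ logreg.coef_.T + logreg.intercept_
```

The scores array computed by hand at the end should match logreg.decision_function(X), which is one way to check how the coefficients are being used.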

Probably I am doing something wrong, because y_pred = logreg.predict(X) gives me the value of 1 for all rows.

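Yes, you should not try to predict on the same data that was used to train the model. Split your data into training and test datasets, train your model on the training dataset, and check its accuracy on the test dataset.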
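The split described above could look roughly like the following sketch. The data is synthetic (an assumption for illustration: 400 rows, with the first feature shifted by the class label so there is something to learn), and the 25% test size and random_state value are arbitrary choices.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic data (assumption): 6 features, labels 1..4, first feature depends on the label.
rng = np.random.default_rng(0)
y = np.arange(400) % 4 + 1
X = rng.normal(size=(400, 6))
X[:, 0] += 2 * y

# Hold out 25% of the rows; stratify keeps the class proportions equal in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

logreg = LogisticRegression(max_iter=1000)
logreg.fit(X_train, y_train)

# Mean accuracy on rows the model has not seen during fitting.
acc = logreg.score(X_test, y_test)
print(acc)
```

With only 8 rows, as in the question, a held-out test set would be very small, so the resulting accuracy estimate would be noisy.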

python

pandas

scikit-learn

logistic-regression