Enormous and weird error in scikit-learn in Python Logistic Regression?

Question

This question is about logistic regression in Python's scikit-learn.

Here is the most important part of my code:

predictions = logistic_regression.predict(X_test)
prediction=logistic_regression.predict_proba(X_test)[:,:]
prediction=pd.DataFrame(data=predictions, 
                         columns=['Prob of Bad credit (0)','Prob of Good credit (1)'])
prediction.head(10)

Yesterday this code gave me the result I expected (the table headers were different, but the result was the same):

[screenshot of the expected table]

But today, and I have no idea why, running the same code again gives me an error:

ValueError: Shape of passed values is (300, 1), indices imply (300, 2)

How can it work yesterday and fail today? What can I do? Below is a screenshot of the full error:

[screenshot of the full traceback]

The predictions look like this:

print(predictions)

[1 1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 0 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]

I don't want 1 or 0 in the table. I want the probability of 1 and of 0 for each row, as in the example in the screenshot.

The same table appears at the end of the predictions in the source below, which uses the same code and works: https://www.kaggle.com/neisha/heart-disease-prediction-using-logistic-regression

Answer 1

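I think the error occurs because predictions has only one column (it holds one class label per sample), but you pass two column names: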

prediction=pd.DataFrame(data=predictions, 
                         columns=['Prob of Bad credit (0)','Prob of Good credit (1)'])
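The mismatch can be reproduced without a model at all. The following is a minimal sketch in which labels is a made-up stand-in for the output of predict() (an assumption, since the original data is not shown); it only illustrates that a one-dimensional array cannot fill two named columns.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the output of predict(): one label per sample.
labels = np.ones(300, dtype=int)

cols = ['Prob of Bad credit (0)', 'Prob of Good credit (1)']

try:
    # One value per row, but two column names: pandas rejects this.
    pd.DataFrame(data=labels, columns=cols)
    raised = False
except ValueError as err:
    raised = True
    print(err)
```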

Compare with the Kaggle code you linked:

y_pred_prob=logreg.predict_proba(x_test)[:,:]
y_pred_prob_df=pd.DataFrame(data=y_pred_prob, columns=['Prob of no heart disease (0)','Prob of Heart Disease (1)'])
y_pred_prob_df.head()

I think you should change your code to:

prediction_df = pd.DataFrame(data=prediction,  
                         columns=['Prob of Bad credit (0)','Prob of Good credit (1)'])

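Note that data should be prediction (the predict_proba output, with one column per class), not predictions (the predict output, with one label per sample).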
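To see the difference between the two arrays end to end, here is a small sketch. The synthetic data, the 700/300 split, and the names labels and probs are assumptions made for illustration; only predict, predict_proba, and the column names come from the question.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data (assumption; the question's dataset is not shown).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
y = (X[:, 0] + rng.normal(scale=0.5, size=1000) > 0).astype(int)
X_train, X_test = X[:700], X[700:]
y_train = y[:700]

logistic_regression = LogisticRegression().fit(X_train, y_train)

# predict() returns one class label per sample.
labels = logistic_regression.predict(X_test)
# predict_proba() returns one probability per class, per sample.
probs = logistic_regression.predict_proba(X_test)

# Two columns of data now match the two column names.
prediction_df = pd.DataFrame(
    data=probs,
    columns=['Prob of Bad credit (0)', 'Prob of Good credit (1)'],
)
print(labels.shape, probs.shape)
print(prediction_df.head(10))
```

The columns of predict_proba follow the order of logistic_regression.classes_, so with labels 0 and 1 the first column is the probability of class 0.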

python

regression

shapes

pandas