"Not enough values to unpack" 在 sklearn.fit

Question

Here is the code snippet:

import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.linear_model import LogisticRegressionCV

skf = StratifiedKFold(n_splits=5)
skf_1 = skf.split(titanic_dataset, surv_titanic)

ls_1 = np.logspace(-1.0, 2.0, num=500)

clf = LogisticRegressionCV(Cs=ls_1, cv = skf_1, scoring = "roc_auc", n_jobs=-1, random_state=17)

clf_model = clf.fit(x_train, y_train)

It raises:

---------------------------------------------------------------------------

ValueError                                Traceback (most recent call last)

<ipython-input-130-b99a5912ff5a> in <module>
----> 1 clf_model = clf.fit(x_train, y_train)

H:\Anaconda_3\lib\site-packages\sklearn\linear_model\_logistic.py in fit(self, X, y, sample_weight)
   2098         #  (n_classes, n_folds, n_Cs . n_l1_ratios) or
   2099         #  (1, n_folds, n_Cs . n_l1_ratios)
-> 2100         coefs_paths, Cs, scores, n_iter_ = zip(*fold_coefs_)
   2101         self.Cs_ = Cs[0]
   2102         if multi_class == 'multinomial':

ValueError: not enough values to unpack (expected 4, got 0)

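The training and test datasets were prepared beforehand, and they work fine with other classifiers.

Such a generic error message tells me nothing. What is wrong here?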

Answer 1

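In short, the problem is that you are passing the result of skf.split(titanic_dataset, surv_titanic) to the cv parameter of LogisticRegressionCV, when you should pass the StratifiedKFold(n_splits=5) object itself.

Below is code that reproduces your error, followed by two alternative ways to do what I think you are trying to do.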
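As a minimal sketch of the difference between the two objects (the toy arrays are made up for illustration and are not from the question): the splitter can produce folds on demand, while the object returned by split() is an iterator that can only be walked once. That one-shot behaviour is my assumption about why no folds reach the fit in your session, for example after re-running the cell.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Hypothetical toy data: 20 samples, two balanced classes
X = np.arange(40).reshape(20, 2)
y = np.array([0, 1] * 10)

skf = StratifiedKFold(n_splits=5)

# split() hands back an iterator over (train_indices, test_indices) pairs
splits = skf.split(X, y)

first_pass = list(splits)    # consumes the iterator: 5 folds
second_pass = list(splits)   # already exhausted: nothing left

# The splitter itself can always produce a fresh set of folds
fresh_pass = list(skf.split(X, y))

print(len(first_pass), len(second_pass), len(fresh_pass))
```

Passing the splitter (or an integer) as cv lets the estimator call split() itself on the data it is actually fitted on, so the folds always match X and y.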

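import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegressionCV
from sklearn.model_selection import StratifiedKFold

# Some example data
data = load_breast_cancer()
X = data['data']
y = data['target']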

# Set up the stratifiedKFold
skf = StratifiedKFold(n_splits=5)

# Don't do this... only here to reproduce the error.
# skf.split() returns a one-shot generator of (train, test) index arrays.
skf_indices = skf.split(X, y)

# Some regularization
ls_1 = np.logspace(-1.0, 2.0, num=5)

# This creates your error
clf_error = LogisticRegressionCV(Cs=ls_1,
                                 cv = skf_indices,
                                 scoring = "roc_auc",
                                 n_jobs=-1,
                                 random_state=17)

# Error created by passing result of skf.split to cv
clf_model = clf_error.fit(X, y)

# This is probably what you meant to do
clf_using_skf = LogisticRegressionCV(Cs=ls_1,
                                     cv = skf, 
                                     scoring = "roc_auc", 
                                     n_jobs=-1,
                                     random_state=17, 
                                     max_iter=1_000)

# This will now fit without the error
clf_model_skf = clf_using_skf.fit(X, y)

# This is the easiest method, and from the docs also does the
# same thing as StratifiedKFold
# https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegressionCV.html
clf_easiest = LogisticRegressionCV(Cs=ls_1,
                                   cv = 5,
                                   scoring = "roc_auc",
                                   n_jobs=-1,
                                   random_state=17,
                                   max_iter=1_000)

# This will now fit without the error
clf_model_easiest = clf_easiest.fit(X, y)
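
For what it's worth, the message itself can be reproduced in plain Python. The line in the traceback unpacks zip(*fold_coefs_) into four names; if there are no per-fold results at all, there is nothing to unpack. The helper below is a hypothetical stand-in for that step with made-up placeholder values, not scikit-learn's actual code.

```python
def unpack_fold_results(fold_results):
    """Simplified stand-in for the unpacking step shown in the traceback."""
    # Transpose a list of per-fold 4-tuples into four per-field tuples
    coefs_paths, Cs, scores, n_iter = zip(*fold_results)
    return coefs_paths, Cs, scores, n_iter


# One 4-tuple per fold (placeholder values): unpacking works
two_folds = [("coef_a", "Cs_a", "score_a", 10),
             ("coef_b", "Cs_b", "score_b", 12)]
unpacked = unpack_fold_results(two_folds)

# No folds at all: zip(*[]) yields nothing, so the unpacking fails
try:
    unpack_fold_results([])
    message = None
except ValueError as exc:
    message = str(exc)

print(message)
```

So "expected 4, got 0" is a hint that the cross-validation step produced zero folds, which points at the cv argument rather than at the training data.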