AttributeError: 'str' object has no attribute 'decode' in fitting Logistic Regression Model
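I am currently trying to build a binary classifier using logistic regression, and right now I am determining feature importance. I have already preprocessed the data (one-hot encoding and sampling) and run it through XGBoost and RandomForestClassifier without any problems.
However, fitting a LogisticRegression model fails. This is the code in my notebook: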
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression

# Logistic Regression: define and fit the model
model = LogisticRegression()
model.fit(np.array(X_over), np.array(y_over))
# get importance (coefficients of the binary decision function)
importance = model.coef_[0]
# summarize feature importance
df_imp = pd.DataFrame({'feature': list(X_over.columns), 'importance': importance})
display(df_imp.sort_values('importance', ascending=False).head(20))
# plot feature importance
plt.bar(list(X_over.columns), importance)
plt.show()
It raises the following error:
...
~\AppData\Local\Continuum\anaconda3\lib\site-packages\joblib\parallel.py in <listcomp>(.0)
223 with parallel_backend(self._backend, n_jobs=self._n_jobs):
224 return [func(*args, **kwargs)
--> 225 for func, args, kwargs in self.items]
226
227 def __len__(self):
~\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\linear_model\_logistic.py in _logistic_regression_path(X, y, pos_class, Cs, fit_intercept, max_iter, tol, verbose, solver, coef, class_weight, dual, penalty, intercept_scaling, multi_class, random_state, check_input, max_squared_sum, sample_weight, l1_ratio)
762 n_iter_i = _check_optimize_result(
763 solver, opt_res, max_iter,
--> 764 extra_warning_msg=_LOGISTIC_SOLVER_CONVERGENCE_MSG)
765 w0, loss = opt_res.x, opt_res.fun
766 elif solver == 'newton-cg':
~\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\utils\optimize.py in _check_optimize_result(solver, result, max_iter, extra_warning_msg)
241 " https://scikit-learn.org/stable/modules/"
242 "preprocessing.html"
--> 243 ).format(solver, result.status, result.message.decode("latin1"))
244 if extra_warning_msg is not None:
245 warning_msg += "\n" + extra_warning_msg
AttributeError: 'str' object has no attribute 'decode'
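I searched on Google, and most replies say this error occurs because scikit-learn tries to decode a string that has already been decoded. But I don't know how to fix it in my case. I made sure that all of my data is integer or float64 and that there are no strings.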
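The problem is fixed in the latest version of scikit-learn (0.24.1 at the time of writing), which wraps part of the code in the try/except block shown below. The file is
optimize.py -> _check_optimize_result(solver, result, max_iter=None,
extra_warning_msg=None)
and the relevant code is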
if solver == "lbfgs":
    if result.status != 0:
        try:
            # The message is already decoded in scipy>=1.6.0
            result_message = result.message.decode("latin1")
        except AttributeError:
            result_message = result.message
        warning_msg = (
            "{} failed to converge (status={}):\n{}.\n\n"
            "Increase the number of iterations (max_iter) "
            "or scale the data as shown in:\n"
            " https://scikit-learn.org/stable/modules/"
            "preprocessing.html"
        ).format(solver, result.status, result_message)
Before the fix, it was simply
if solver == "lbfgs":
    if result.status != 0:
        warning_msg = (
            "{} failed to converge (status={}):\n{}.\n\n"
            "Increase the number of iterations (max_iter) "
            "or scale the data as shown in:\n"
            " https://scikit-learn.org/stable/modules/"
            "preprocessing.html"
        ).format(solver, result.status, result.message.decode("latin1"))
So upgrading scikit-learn solves the problem.
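The decode-with-fallback idea in that fix can be tried in isolation. The sketch below is only an illustration: `message_as_text` is a hypothetical helper name, not part of scikit-learn, and the message text is a placeholder rather than real solver output.

```python
def message_as_text(message, encoding="latin1"):
    """Return `message` as str whether it arrives as bytes or as str."""
    try:
        # bytes objects have .decode(); this is the pre-fix code path.
        return message.decode(encoding)
    except AttributeError:
        # str has no .decode() in Python 3, so the message is already text.
        return message


# Placeholder messages (not real solver output), one per input type.
print(message_as_text(b"example message"))  # bytes input is decoded
print(message_as_text("example message"))   # str input is returned unchanged
```

Calling `.decode()` directly on the second value is what produces the AttributeError from the traceback.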
I tried upgrading scikit-learn with the following command, but it still did not resolve the AttributeError: 'str' object has no attribute 'decode' error:
pip install scikit-learn -U
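One possible reason an upgrade seems to have no effect (an assumption, not something confirmed in this thread) is that `pip` installed into a different environment than the one the notebook kernel uses, or that the kernel was not restarted afterwards. A minimal sketch for checking what the running interpreter actually sees, where `installed_version` is a hypothetical helper name:

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# The interpreter running this code; a `pip` found on PATH may belong to another one.
print("interpreter:", sys.executable)
for name in ("scikit-learn", "scipy"):
    print(name, installed_version(name))
```

If the scikit-learn version printed here is older than expected, running the upgrade through that same interpreter (`python -m pip install -U scikit-learn`, or `%pip install -U scikit-learn` inside the notebook) and then restarting the kernel may help.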
In the end, the following snippet solved the problem by setting the solver to liblinear:
model = LogisticRegression(solver='liblinear')
The bug is triggered with solver='lbfgs'.
Switching to 'sag' avoids it.
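Since the failing line only runs when lbfgs reports non-convergence (result.status != 0), another option is the one the warning text itself suggests: scale the data and raise max_iter so that lbfgs converges. The sketch below uses synthetic data from `make_classification` as a stand-in for `X_over` and `y_over`; the column scaling factors and `max_iter=1000` are arbitrary illustrative choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real data, with deliberately mismatched column scales.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X = X * np.array([1.0, 10.0, 100.0, 1000.0, 10000.0])

# Standardize the features, then fit lbfgs with a larger iteration budget.
model = make_pipeline(
    StandardScaler(),
    LogisticRegression(solver="lbfgs", max_iter=1000),
)
model.fit(X, y)

# Coefficients are on the standardized scale, one per feature.
importance = model.named_steps["logisticregression"].coef_[0]
print(importance)
```

Scaling also makes the coefficients more comparable across features, which matters when they are read as feature importance.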