Error when making predictions on un-pickled classifier

Question

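I'm building a text classification program that takes thousands of emails as input. For convenience, I decided to save the classifier to a pickle file once training finishes, so that I don't have to retrain it on later runs of the program.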

import os
import pickle
from sklearn.naive_bayes import GaussianNB

path = 'classifier.pkl'
clf = GaussianNB()
if not os.path.exists(path):
    # no saved classifier yet: train one and pickle it
    clf.fit(x_train, y_train)
    with open(path, 'wb') as f:
        pickle.dump(clf, f)
else:
    print('<classifier found!>')
    # load the previously pickled classifier
    with open(path, 'rb') as input_file:
        clf = pickle.load(input_file)
pred = clf.predict(x_test) # the error occurs on this line

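The prediction works on the first run (when the classifier is not loaded from the file). On the next execution, however, it gives me this error: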

ValueError: operands could not be broadcast together with shapes (3516,379) (376,)

The shapes of x_train and x_test are (14062, 379) and (3516, 379), respectively.

Any help would be appreciated.

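Edit: I tried desertnaut's suggestion of pickling pred = clf.predict(x_test) and reusing it in later runs of the program. The accuracy score I get from those runs seems to be about half of what it was when the classifier was first trained.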
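
One hedged reading of the traceback above: the `(376,)` operand has the shape of a per-feature array stored inside the loaded classifier, which would suggest the saved model was fit on 376 columns while the current x_test has 379. Whether the feature extraction step really produces a different number of columns between runs is an assumption, since that code is not shown. The sketch below uses NumPy with made-up arrays (`stored_means` is a hypothetical stand-in for the classifier's fitted parameters) to reproduce the same kind of message.

```python
import numpy as np

# Made-up data with the shapes from the traceback
x_test = np.zeros((3516, 379))   # current test matrix: 379 columns
stored_means = np.zeros(376)     # hypothetical per-feature parameters from an older fit


def broadcast_error(a, b):
    """Return the ValueError message raised by a - b, or None if it succeeds."""
    try:
        a - b
    except ValueError as exc:
        return str(exc)
    return None


# 379 columns against 376 stored values cannot be broadcast
message = broadcast_error(x_test, stored_means)
print(message)

# With matching column counts the subtraction works
print(broadcast_error(x_test, np.zeros(379)))
```

If this reading is right, comparing the column count of the matrix used at fit time with the one used at predict time would be the first thing to check.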

Answer 1

I couldn't figure out why pickling didn't work. However, joblib, which sklearn uses for persistence, seems to work fine.

import os
import joblib  # sklearn.externals.joblib was removed in scikit-learn 0.23

if not os.path.exists(path):
    clf = clf.fit(x_train, y_train)
    joblib.dump(clf, path)
else:
    clf = joblib.load(path)
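
As a minimal sketch of the same load-or-train pattern with a guard added: the helper below (`load_or_train` is a hypothetical name, not part of sklearn or joblib) only reuses the saved classifier when its fitted feature count matches the current training matrix, and retrains otherwise. It assumes the stale file was fit on a different number of columns, which the question does not confirm, and it uses random data in place of the email features.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.naive_bayes import GaussianNB


def load_or_train(path, x_train, y_train):
    """Return (classifier, was_trained); reuse the saved file only if shapes match."""
    n_features = x_train.shape[1]
    if os.path.exists(path):
        clf = joblib.load(path)
        # theta_ holds the per-class feature means, shape (n_classes, n_features)
        if clf.theta_.shape[1] == n_features:
            return clf, False
    # No usable saved classifier: fit a new one and overwrite the file
    clf = GaussianNB().fit(x_train, y_train)
    joblib.dump(clf, path)
    return clf, True


# Random stand-ins for the feature matrices (assumption: 376 vs 379 columns)
rng = np.random.default_rng(0)
x_old = rng.normal(size=(40, 376))
x_new = rng.normal(size=(40, 379))
y = np.array([0, 1] * 20)

path = os.path.join(tempfile.mkdtemp(), "classifier.pkl")
_, trained_first = load_or_train(path, x_old, y)  # no file yet: trains and saves
_, trained_again = load_or_train(path, x_old, y)  # file matches: loaded from disk
clf, retrained = load_or_train(path, x_new, y)    # column count differs: retrained
pred = clf.predict(x_new)
```

Retraining on a mismatch avoids the broadcast error but does not explain it. If the column count changes between runs, the step that builds the feature matrix probably needs to be saved alongside the classifier as well, though that step is outside what the question shows.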

pickle

python-3.x

scikit-learn