When do I have to use the fit method of scikit-learn?

Question

I don't understand when I have to use the fit method of scikit-learn.

On this page: http://machinelearningmastery.com/automate-machine-learning-workflows-pipelines-python-scikit-learn/ there is an example of a pipeline + StandardScaler. The fit method is not used.

But on this other one: http://scikit-learn.org/stable/auto_examples/svm/plot_rbf_parameters.html there is also a StandardScaler, together with a fit method.

Here is my code, Pipeline + RobustScaler:

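import numpy as np
import scipy.io as sio
from sklearn import preprocessing
from sklearn.model_selection import StratifiedShuffleSplit, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

result_list = []

for name in ["AWA", "Rem", "S1", "S2", "SWS", "SX", "ALL"]:
    # Load the feature matrix and the labels from the same .mat file
    data = sio.loadmat('/home/{}_E.mat'.format(name))
    x = data['x']
    y = np.ravel(data['y'])

    print(name, x.shape, y.shape)
    print("")

    # Create a pipeline: robust scaling followed by an RBF-kernel SVM
    clf = make_pipeline(preprocessing.RobustScaler(), SVC(cache_size=1000, kernel='rbf'))

    ################### 10x20 SSS ##################################
    print("10x20")
    xSSSmean20 = []
    for i in range(10):
        # 20 stratified shuffle splits per repetition (sklearn.model_selection API)
        sss = StratifiedShuffleSplit(n_splits=20, test_size=0.1, random_state=i)
        scoresSSS = cross_val_score(clf, x, y, cv=sss)

        xSSSmean20.append(scoresSSS.mean())

    result_list.append(xSSSmean20)

    print("")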

Answer 1

To train your classifier, you have to fit it to your training data set.
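As a minimal sketch of what an explicit fit looks like with a pipeline like the one in the question: the synthetic data from make_classification and the 150/50 split are assumptions made only for illustration, standing in for the .mat files.

```python
from sklearn.datasets import make_classification
from sklearn.exceptions import NotFittedError
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import RobustScaler
from sklearn.svm import SVC

# Synthetic stand-in data (an assumption; the question loads .mat files instead).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]

clf = make_pipeline(RobustScaler(), SVC(kernel="rbf"))

# Before fit(), the pipeline has learned nothing and refuses to predict.
try:
    clf.predict(X_test)
    fitted_before = True
except NotFittedError:
    fitted_before = False

# fit() learns the scaler statistics and the SVM from the training data only.
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print(fitted_before, accuracy)
```

Calling fit on the pipeline fits every step in order, so the scaler never needs a separate fit call.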

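The first link does this too; the fact that fit does not appear explicitly in the snippet does not mean it is never called:

cross_val_score takes the model as its estimator argument and fits it to the data itself, once for each cross-validation split.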
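Roughly speaking, the hidden fitting can be sketched as a manual loop. This is a simplified illustration under assumptions, not the library's actual implementation: the synthetic data and the 5 splits are placeholders, and the loop only mirrors the default scoring behaviour.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedShuffleSplit, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import RobustScaler
from sklearn.svm import SVC

# Synthetic stand-in data (an assumption for illustration).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
clf = make_pipeline(RobustScaler(), SVC(kernel="rbf"))
cv = StratifiedShuffleSplit(n_splits=5, test_size=0.1, random_state=0)

# No explicit fit here: cross_val_score fits a copy of clf on every split.
auto_scores = cross_val_score(clf, X, y, cv=cv)

# The same thing written out by hand.
manual_scores = []
for train_idx, test_idx in cv.split(X, y):
    model = clone(clf)                       # fresh, unfitted copy
    model.fit(X[train_idx], y[train_idx])    # the "hidden" fit call
    manual_scores.append(model.score(X[test_idx], y[test_idx]))
manual_scores = np.array(manual_scores)

print(np.allclose(auto_scores, manual_scores))
```

Because the fitting happens on clones, the original clf object stays unfitted after cross_val_score returns; call clf.fit(X, y) yourself if you want a trained model to predict with afterwards.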

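Have a look at the implementation of cross_val_score and try to understand how it works, rather than using it without knowing what it does.

The documentation of the function and its implementation on GitHub are good references.

One piece of advice:

When you don't understand something, try digging into the code. You will learn a lot!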

svm

python-3.x

scikit-learn