TypeError: estimator should be an estimator implementing 'fit' method

I am working on this problem from Stepik:

One tree is good, but where is the guarantee that it is the best one, or at least close to it? One way to find a more or less optimal set of tree parameters is to iterate over a set of trees with different parameters and choose the most suitable one. For this purpose there is the GridSearchCV class, which iterates over every combination of the parameters specified for the model, trains it on the data, and performs cross-validation. Afterwards, the model with the best parameters is stored in the .best_estimator_ attribute. The task is to iterate over all the trees on the iris data with the following parameters:

- maximum depth: from 1 to 10 levels
- minimum number of samples for a split: from 2 to 10
- minimum number of samples per leaf: from 1 to 10

and store the best tree in the variable best_tree. Name the GridSearchCV variable search.

Here is my solution:

import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.datasets import load_iris


iris = load_iris()
X = iris.data
y = iris.target

parameters = {'max_depth': range(1, 10), 'min_samples_split': range(2, 10), 'min_samples_leaf': range(1, 10)}
search = GridSearchCV(iris, parameters)

search.fit(X, y)

best_tree = search.estimator

Why does this error occur?

Traceback (most recent call last):
  File "jailed_code", line 22, in <module>
    search.fit(X, y)
  File "/home/stepic/instances/master-plugins/sandbox/python3/lib/python3.6/site-packages/sklearn/model_selection/_search.py", line 595, in fit
    self.estimator, scoring=self.scoring)
  File "/home/stepic/instances/master-plugins/sandbox/python3/lib/python3.6/site-packages/sklearn/metrics/scorer.py", line 342, in _check_multimetric_scoring
    scorers = {"score": check_scoring(estimator, scoring=scoring)}
  File "/home/stepic/instances/master-plugins/sandbox/python3/lib/python3.6/site-packages/sklearn/metrics/scorer.py", line 274, in check_scoring
    "'fit' method, %r was passed" % estimator)
TypeError: estimator should be an estimator implementing 'fit' method, {'data': array([[5.1, 3.5, 1.4, 0.2],
       [4.9, 3. , 1.4, 0.2],
       [4.7, 3.2, 1.3, 0.2],
       [4.6, 3.1, 1.5, 0.2],
       [5. , 3.6, 1.4, 0.2],
       [5.4, 3.9, 1.7, 0.4],
       [4.6, 3.4, 1.4, 0.3],
       [5. , 3.4, 1.5, 0.2],
       ...

You have not passed an estimator to GridSearchCV. Its first argument must be an instance of the estimator you want to tune, but you passed iris, which is the dataset and not an estimator.
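As a rough illustration of what the error message is complaining about, the sketch below simply checks which of the two objects has a fit method. The variable names iris_has_fit and tree_has_fit are made up for this example, and the check is only an approximation of what scikit-learn verifies internally.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()               # dict-like container holding the data arrays
tree = DecisionTreeClassifier()  # an actual estimator

# Hypothetical helper variables, used only for this illustration.
iris_has_fit = hasattr(iris, "fit")
tree_has_fit = hasattr(tree, "fit")

print(type(iris).__name__, iris_has_fit)
print(type(tree).__name__, tree_has_fit)
```

Only the object that has a fit method can be used as the first argument of GridSearchCV.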

You passed the dataset instead of an estimator. If you haven't already, take a look at https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html

This should work:

import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.datasets import load_iris

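# Load the iris features and labels.
iris = load_iris()
X = iris.data
y = iris.target

# range() excludes its upper bound, so use 11 to cover values up to 10.
parameters = {'max_depth': range(1, 11), 'min_samples_split': range(2, 11), 'min_samples_leaf': range(1, 11)}
# The first argument must be an estimator instance, not the dataset.
search = GridSearchCV(estimator=DecisionTreeClassifier(),
                      param_grid=parameters)

search.fit(X, y)

# The refitted tree with the best cross-validated score, as the task requires.
best_tree = search.best_estimator_

search.cv_results_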
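One more thing worth checking: the code in the question ends with best_tree = search.estimator. As far as I can tell, that attribute is just the unfitted estimator that was passed in, while the tuned and refitted model lives in search.best_estimator_ (with the default refit=True). Here is a minimal sketch of the difference. The grid is deliberately reduced to a single parameter so that it runs quickly, and random_state=0 is my own addition to make the run repeatable.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Reduced grid, for illustration only (the task asks for three parameters).
small_grid = {"max_depth": range(1, 4)}
search = GridSearchCV(estimator=DecisionTreeClassifier(random_state=0),
                      param_grid=small_grid)
search.fit(X, y)

template = search.estimator         # the unfitted estimator that was passed in
best_tree = search.best_estimator_  # refitted on all data with the best parameters

# A fitted decision tree exposes the tree_ attribute; an unfitted one does not.
print(hasattr(template, "tree_"))
print(hasattr(best_tree, "tree_"))
print(search.best_params_)
```

The chosen parameters are available in search.best_params_ and the corresponding mean cross-validated score in search.best_score_.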