Random Forest Improve Accuracy

Question

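I am using RandomForestClassifier for an object detection problem. Even though I know my random state should be zero by default, my accuracy is very poor. Is there any way to find out the best values for the n_estimators and random_state parameters?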

from sklearn.ensemble import RandomForestClassifier
RF_model = RandomForestClassifier(n_estimators = 250, random_state = 120)

Answer 1

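To determine the best parameters for your model, you can use a process called grid search. Sklearn provides a class for this, GridSearchCV. Here is a code example showing how to use it with a random forest classifier.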

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# provide an iterable of candidate values for each parameter to be tested
parameters = {'n_estimators': [100, 250, 500, 750]}
clf = GridSearchCV(RandomForestClassifier(), parameters)
clf.fit(X, y)  # X and y are your training data and targets
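
To read the result of the search, here is a minimal, self-contained sketch. It assumes a synthetic dataset from make_classification as a stand-in for your own X and y; the dataset, the small grid of values, cv=3 and the train/test split are illustrative assumptions, not part of the original question.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the real data (assumption, for illustration only)
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Small grid so the sketch runs quickly; widen it for real data
parameters = {'n_estimators': [10, 50, 100]}

# Fix random_state on the estimator so the search itself is repeatable
search = GridSearchCV(RandomForestClassifier(random_state=0), parameters, cv=3)
search.fit(X_train, y_train)

best_params = search.best_params_   # parameter combination with the best mean CV score
best_cv_score = search.best_score_  # that mean cross-validated accuracy
test_accuracy = search.score(X_test, y_test)  # accuracy of the refit best model on held-out data

print(best_params, best_cv_score, test_accuracy)
```

Scoring on data held out from the search gives a less optimistic estimate than best_score_, since the latter was used to pick the winner.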

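Note that your question specifically asks for the best values of both n_estimators and random_state. I did not include random_state in the grid search because that parameter is normally used to make results reproducible, not to improve them (its default in sklearn is None, not 0). The sklearn Glossary has some additional reading on this parameter.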
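
As a rough illustration of the reproducibility point, the sketch below fits the same forest twice with the same random_state and checks that the predicted probabilities are identical. The synthetic data and the helper name fit_predict_proba are assumptions made up for this example; the seed 120 is simply the one from the question.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the real data (assumption, for illustration only)
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

def fit_predict_proba(seed):
    """Hypothetical helper: fit a small forest with the given seed and
    return its class probabilities on the training data."""
    model = RandomForestClassifier(n_estimators=20, random_state=seed)
    return model.fit(X, y).predict_proba(X)

# Same seed, same data: the two fitted forests give identical outputs
same_seed_match = np.array_equal(fit_predict_proba(120), fit_predict_proba(120))
print(same_seed_match)
```

A seed that happens to score well on one split is not a better model, which is why random_state is fixed for repeatability and not tuned.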

python

random

classification