Force the random forest classifier to use all the input features?

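To perform a binary prediction, I have 5 features that I want to use in my random forest classifier, and two of them are not being used at all. I understand that selecting the useful features is the whole point of machine learning, but the other three features may contain biased data, and I want to make sure that all my features are used with equal weight when running my classifier. I can't find a straight answer to this question. I am using sklearn in Python for this work. Any comments/suggestions would be greatly appreciated.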

You can ask for all features to be considered at every split of a random forest classifier by setting max_features = None.

From the docs:

max_features : int, float, string or None, optional (default=“auto”)

The number of features to consider when looking for the best split:

If int, then consider max_features features at each split.

If float, then max_features is a fraction and int(max_features * n_features) features are considered at each split.

If “auto”, then max_features=sqrt(n_features).

If “sqrt”, then max_features=sqrt(n_features) (same as “auto”).

If “log2”, then max_features=log2(n_features).

If None, then max_features=n_features.
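As a minimal sketch of the max_features = None setting described above, the snippet below fits a forest on synthetic data. The dataset from make_classification, the sample size, and the variable names are assumptions made for illustration; they are not the asker's data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the asker's data (assumption): 5 features, binary target,
# 3 informative features and 2 pure-noise features.
X, y = make_classification(
    n_samples=500,
    n_features=5,
    n_informative=3,
    n_redundant=0,
    random_state=0,
)

# max_features=None makes every split search over all 5 features.
clf = RandomForestClassifier(n_estimators=100, max_features=None, random_state=0)
clf.fit(X, y)

# Each fitted tree stores the resolved number of features considered per split.
print(clf.estimators_[0].max_features_)

# Normalized importance of each feature across the forest.
print(clf.feature_importances_)
```

Note that considering every feature at each split does not guarantee that every feature is actually chosen: a feature that never yields the best split can still end up with an importance close to zero.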

The answer to a related question may help explain this and provide some context.

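What can help you is setting the parameter max_features = 1, so that each node picks a single (uniformly distributed) random feature and is forced to use it. However, you also need to limit the depth of the trees, because otherwise they will keep adding nodes until they reach one of the main features.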
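A rough sketch of this suggestion, assuming synthetic data from make_classification and an arbitrary depth limit of 4 (both chosen only for illustration), might look like this:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the asker's data (assumption): 5 features, binary target.
X, y = make_classification(
    n_samples=500,
    n_features=5,
    n_informative=3,
    n_redundant=0,
    random_state=0,
)

# max_features=1: each split evaluates one randomly drawn feature.
# max_depth=4 is an illustrative cap so that trees cannot keep growing
# until the dominant features take over.
forest = RandomForestClassifier(
    n_estimators=200,
    max_features=1,
    max_depth=4,
    random_state=0,
)
forest.fit(X, y)

# Inspect how the importance is spread across the 5 features.
for index, importance in enumerate(forest.feature_importances_):
    print(f"feature {index}: {importance:.3f}")
```

The depth limit is a tuning choice: shallower trees spread splits more evenly across features, usually at some cost in accuracy, so it is worth comparing a few values with cross-validation.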

python

classification

machine-learning

random-forest

scikit-learn