How to plot feature_importance for DecisionTreeClassifier?

Question

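I need to plot feature_importances for a DecisionTreeClassifier. I have already selected the features and reached the target result, but my teacher told me to plot feature_importances to see the weights of the contributing factors. I have no idea how to do it.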

from sklearn.tree import DecisionTreeClassifier

model = DecisionTreeClassifier(random_state=12345, max_depth=8, class_weight='balanced')
model.fit(features_train, target_train)
model.feature_importances_

It gives me:

array([0.02927077, 0.3551379 , 0.01647181, ..., 0.03705096, 0.        ,
       0.01626676])

Why is it not attached to something like max_depth, and is just an array of numbers?

Answer 1

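Feature importance refers to a class of techniques for assigning scores to the input features of a predictive model, indicating the relative importance of each feature when making a prediction.

Feature importance scores can be calculated both for problems that involve predicting a numerical value (regression) and for problems that involve predicting a class label (classification).

Load the feature importances into a pandas Series indexed by your DataFrame's column names, then use its plot method.

From scikit-learn: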

Feature importances are provided by the fitted attribute feature_importances_ and they are computed as the mean and standard deviation of accumulation of the impurity decrease within each tree.

How are feature_importances in RandomForestClassifier determined?
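For a single decision tree, the impurity-based importance of a feature is the total impurity decrease over all splits that use that feature, normalized so the scores sum to 1. The sketch below recomputes it by hand from the fitted tree's `tree_` attribute and compares the result with `feature_importances_`. The synthetic dataset and the variable names `X`, `y`, and `manual` are assumptions made for illustration; they are not the asker's data.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic data (assumption): 6 features, 3 of them informative
X, y = make_classification(n_samples=300, n_features=6, n_informative=3, random_state=0)
model = DecisionTreeClassifier(random_state=0, max_depth=4).fit(X, y)

tree = model.tree_
manual = np.zeros(X.shape[1])
for node in range(tree.node_count):
    left, right = tree.children_left[node], tree.children_right[node]
    if left == -1:  # -1 marks a leaf: no split, so no impurity decrease
        continue
    # Weighted impurity of the node minus the weighted impurity of its two children
    decrease = (
        tree.weighted_n_node_samples[node] * tree.impurity[node]
        - tree.weighted_n_node_samples[left] * tree.impurity[left]
        - tree.weighted_n_node_samples[right] * tree.impurity[right]
    )
    # Credit the decrease to the feature this node splits on
    manual[tree.feature[node]] += decrease

manual /= manual.sum()  # normalize so the scores sum to 1

print(manual)
print(model.feature_importances_)
```

This also shows why the result is a bare array: each position corresponds to one input column, in the same order as the columns passed to `fit`.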

For your example:

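import pandas as pd

# Index by the training feature columns so each score is labelled with its feature name
feat_importances = pd.Series(model.feature_importances_, index=features_train.columns)
feat_importances.nlargest(5).plot(kind='barh')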
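A self-contained version of the same idea, as a minimal sketch: it fits the model from the question on synthetic data and draws a sorted horizontal bar chart. The dataset, the column names `feature_0` ... `feature_7`, and the axis labels are assumptions; replace `features_train` and `target_train` with your own data.

```python
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for features_train / target_train (assumption, not the asker's data)
X, y = make_classification(n_samples=500, n_features=8, n_informative=4, random_state=12345)
features_train = pd.DataFrame(X, columns=[f"feature_{i}" for i in range(X.shape[1])])
target_train = pd.Series(y, name="target")

model = DecisionTreeClassifier(random_state=12345, max_depth=8, class_weight="balanced")
model.fit(features_train, target_train)

# Label each score with its column name, then sort ascending:
# barh draws the first row at the bottom, so the largest bar ends up on top
feat_importances = pd.Series(model.feature_importances_, index=features_train.columns)
feat_importances = feat_importances.sort_values()

fig, ax = plt.subplots(figsize=(8, 6))
feat_importances.plot(kind="barh", ax=ax)
ax.set_xlabel("Impurity-based importance")
ax.set_title("DecisionTreeClassifier feature importances")
fig.tight_layout()
```

Call `plt.show()` to display the figure interactively, or `fig.savefig(...)` to write it to a file.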

More ways to plot feature importances -

Answer 2

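Feature importance reflects how much each factor affects the outcome variable. The larger the value, the greater the effect on the result. That is why you get an array: it holds one score per feature, in the same order as the columns of the training data. To plot it, you can do this: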

import matplotlib.pyplot as plt
import pandas as pd

# One row per feature, labelled with the training column names
feat_importances = pd.DataFrame(model.feature_importances_, index=features_train.columns, columns=["Importance"])
feat_importances.sort_values(by='Importance', ascending=False, inplace=True)
feat_importances.plot(kind='bar', figsize=(8, 6))
plt.show()

python

machine-learning

decision-tree