How to fix the warning "X does not have valid feature names, but IsolationForest was fitted with feature names"

Question

Here is my code:

import numpy as np
import pandas as pd
import seaborn as sns
from sklearn.ensemble import IsolationForest

data = pd.read_csv('marks1.csv', encoding='latin-1',
                   on_bad_lines='skip', index_col=0, header=0
                   )

random_state = np.random.RandomState(42)

model = IsolationForest(n_estimators=100, max_samples='auto', contamination=0.2,
                        random_state=random_state)

model.fit(data[['Mark']])

random_state = np.random.RandomState(42)

data['scores'] = model.decision_function(data[['Mark']])

data['anomaly_score'] = model.predict(data[['Mark']])

data[data['anomaly_score'] == -1].head()

Error:

C:\Program Files\Python39\lib\site-packages\sklearn\base.py:450: UserWarning: X does not have valid feature names, but IsolationForest was fitted with feature names
  warnings.warn(

Answer 1

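It depends on which sklearn version you are using. In versions from 1.0 onward, a model trained on a DataFrame with column names stores them in a feature_names_in_ attribute. There is a bug in this version that raises this warning when a DataFrame is used for training: https://github.com/scikit-learn/scikit-learn/issues/21577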

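I am not yet up to date on the current best practice here, so I cannot say definitively how this should be set up. For now I have simply worked around it in my own code: I convert my DataFrame to a NumPy array before training.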

df.to_numpy()
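
Applied to the code in the question, the workaround might look like the sketch below. It is only an illustration: the sample marks are a hypothetical stand-in for marks1.csv, and the warnings.catch_warnings block is there just to check whether any feature-name warning is emitted. The idea is to convert the column to a NumPy array once and use that same array for fit, decision_function, and predict, so the model never stores feature names.

```python
import warnings

import pandas as pd
from sklearn.ensemble import IsolationForest

# Hypothetical stand-in for the 'Mark' column of marks1.csv.
data = pd.DataFrame({"Mark": [55, 60, 62, 58, 61, 59, 5, 98, 57, 63]})

# Convert once; a plain array carries no column names.
X = data[["Mark"]].to_numpy()

model = IsolationForest(n_estimators=100, max_samples="auto",
                        contamination=0.2, random_state=42)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    model.fit(X)
    data["scores"] = model.decision_function(X)
    data["anomaly_score"] = model.predict(X)

# Keep only warnings that mention feature names.
feature_name_warnings = [w for w in caught if "feature names" in str(w.message)]
print(len(feature_name_warnings))

# Rows flagged as anomalies are labelled -1.
print(data[data["anomaly_score"] == -1].head())
```

The trade-off is that the model no longer checks column names at prediction time, so keeping the column order consistent becomes the caller's responsibility.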

