Can't figure out how to clear NaNs in Random Forest

I get ValueError: Input contains NaN when I run the code below:

from sklearn.ensemble import RandomForestRegressor
rf = RandomForestRegressor(n_estimators = 1000, random_state = 42)
rf.fit(train_features, train_labels);

I ran the check below, which reports that there are no NaN or infinite values, but other loops do show them in the train_features array:

np.any(np.isnan(train_features))

I also ran the following, but it did not change the error I get:

train_features = np.nan_to_num(train_features)
train_labels = np.nan_to_num(train_labels)

Please help!

Edit: adding the full relevant code:

import numpy as np
import pandas as pd

features = pd.read_csv(x)
labels = np.array(features['Actuals'])
features = features.drop('Actuals', axis = 1)
feature_list = list(features.columns)
features = np.array(features)

from sklearn.model_selection import train_test_split
train_features, test_features, train_labels, test_labels = train_test_split(features, labels, test_size = 0.25, random_state = 42)

from sklearn.ensemble import RandomForestRegressor
rf = RandomForestRegressor(n_estimators = 1000, random_state = 42)
rf.fit(train_features, train_labels);

From what I can see in your code, you only check for nan, not for inf. There may be a better way to do this with numpy, but the pandas approach should work:

with pd.option_context('mode.use_inf_as_na', True):
    # Shows how many nan or inf values each column has.
    # Note: 'mode.use_inf_as_na' is deprecated as of pandas 2.1.
    print(pd.DataFrame(train_features).isnull().sum())
    print(pd.DataFrame(train_labels).isnull().sum())

With this, you can determine whether there are any nan or inf values. You can then fill them with fillna.
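
As a rough sketch of the check-then-fill idea, the snippet below finds non-finite cells with np.isfinite (which is False for NaN, +inf and -inf), converts inf to NaN, and then calls fillna. The small train_features array here is a made-up stand-in for the real data, and filling with the column median is only one possible choice, not something the question prescribes.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for train_features, with one NaN and one inf.
train_features = np.array([
    [1.0, 2.0, 3.0],
    [4.0, np.nan, 6.0],
    [7.0, 8.0, np.inf],
    [10.0, 11.0, 12.0],
])

# np.isfinite is False for NaN, +inf and -inf, so one call covers both cases.
bad_mask = ~np.isfinite(train_features)
bad_positions = np.argwhere(bad_mask)  # (row, column) pairs of offending cells

# Treat inf as missing, then count the missing values per column.
df = pd.DataFrame(train_features).replace([np.inf, -np.inf], np.nan)
bad_per_column = df.isnull().sum()

# Assumption: fill each column with its median (NaN values are skipped).
cleaned = df.fillna(df.median()).to_numpy()

print(bad_positions)
print(bad_per_column)
print(np.isfinite(cleaned).all())
```

For train_labels, dropping the rows with a missing target is often preferable to filling them, since a filled label is an invented training value.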