sklearn, ValueError: could not convert string to float, even if I'm not using strings
I am trying to fit a random forest with sklearn.
Every time I run my algorithm, I get this error:
ValueError: could not convert string to float: '#DIV/0!'
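Searching on Stack Overflow, I found that this might be happening because I am trying to divide by zero. To avoid it, I multiplied every value in the data frame by 100 and then replaced every 0 with 1: given the scale of the new values, a 1 should be negligible, or at least that was my thinking. The code I used is: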
df = df.mul(100)
df = df.replace(0, 1)
What happens is that when I now try to fit my RF, I get a new error:
ValueError: could not convert string to float: '-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932'
I am 100% sure that I am not using any strings as values in my dataset.
Here is a small sample:
So my question now is: how do I fix this?
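A possible explanation for the second error, sketched under assumptions: if a column was read as text (dtype object) because it contains a spreadsheet marker such as '#DIV/0!', then multiplying it by 100 repeats each string 100 times instead of scaling a number. The column names and values below are hypothetical and only illustrate that behaviour.

```python
import pandas as pd

# Hypothetical data: column "a" holds text because of the '#DIV/0!' cell,
# so pandas stores the whole column with dtype object.
df = pd.DataFrame({
    "a": pd.Series(["-30.68464932", "#DIV/0!"], dtype=object),
    "b": [1.5, 0.0],
})

print(df.dtypes)  # "a" is object, "b" is float64

# On an object column, * 100 is Python string repetition, not arithmetic.
scaled = df.mul(100)
repeated = scaled.loc[0, "a"]

# Number of times the original string appears back to back.
print(len(repeated) // len("-30.68464932"))
```

If this is what happened, the dataset does contain strings after all, and checking df.dtypes before any arithmetic would reveal them.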
Edit
Using df.info() I found that there was a column of dtype object. I fixed this with the following line:
df = df.apply(lambda col: pd.to_numeric(col, errors='coerce'))
Now all values have dtype float64.
The problem is that I now get a new error:
ValueError: Input contains NaN, infinity or a value too large for dtype('float32').
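The NaN in this error is a direct consequence of errors='coerce': any cell that cannot be parsed as a number is replaced with NaN. A minimal sketch with hypothetical column names and values:

```python
import pandas as pd

# Hypothetical data: "x" was read as text because of one '#DIV/0!' cell.
raw = pd.DataFrame({
    "x": pd.Series(["1.5", "#DIV/0!", "3"], dtype=object),
    "y": [0.1, 0.2, 0.3],
})

# Convert every column to numbers; unparseable cells become NaN.
numeric = raw.apply(lambda col: pd.to_numeric(col, errors="coerce"))

# Count the NaN values introduced per column.
nan_counts = numeric.isna().sum()

print(numeric.dtypes)
print(nan_counts)
```

Counting the NaN values per column this way shows how many cells were silently coerced, which is worth checking before deciding whether to drop or impute them.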
OK, after some further research I found a second one-liner that solved my problem: the fit now succeeds.
df = df[~df.isin([np.nan, np.inf, -np.inf]).any(axis=1)]
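For illustration, here is what that row filter does on a small hypothetical frame, next to an alternative spelling that should give the same result. Note that both drop whole rows, so it may be worth checking how many rows remain before fitting.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one NaN and one infinite value.
df = pd.DataFrame({
    "x": [1.0, np.nan, 3.0, 4.0],
    "y": [0.5, 0.2, np.inf, 0.1],
})

# Flag rows that contain NaN or +/- infinity in any column.
bad_rows = df.isin([np.nan, np.inf, -np.inf]).any(axis=1)
clean = df[~bad_rows]

# Alternative: turn infinities into NaN, then drop rows with NaN.
alt = df.replace([np.inf, -np.inf], np.nan).dropna()

print(f"kept {len(clean)} of {len(df)} rows")
```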