Applying forward fill on NaNs at the top of a column using df.fillna?
This is my DataFrame:
Id_Student  English  History  Mathmatic
         1     66.0      NaN       80.0
         2      NaN     66.0        NaN
         3      NaN      NaN        NaN
         4     55.0     94.0       94.0
I want to fill the missing values with this method:
mdf1 = mdf.fillna(method='ffill')
But it does not seem to help when the first value of a column is NaN. The first value in the History column is still NaN:
Id_Student  English  History  Mathmatic
         1     66.0      NaN       80.0
         2     66.0     66.0       80.0
         3     66.0     66.0       80.0
         4     55.0     94.0       94.0
         5     55.0     85.0       85.0
Any ideas for solving this kind of problem? Cheers, mates.
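I think this is expected behavior. ffill replaces each NaN with the last valid value above it, so if a column has no value in its first row, the NaNs stay in place down to the first non-NaN value.
You can chain another fillna to replace the NaNs that ffill cannot fill: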
mdf1 = mdf.ffill().fillna(0)
# same as the older spelling below (the method argument of fillna is deprecated since pandas 2.1)
# mdf1 = mdf.fillna(method='ffill').fillna(0)
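To make the cause concrete, here is a minimal sketch on a small hypothetical Series `s` (my own toy data, not from the question). Forward fill only copies earlier values downward, so a NaN with nothing above it has no source to copy from:

```python
import numpy as np
import pandas as pd

# Hypothetical Series: one leading NaN and one interior NaN.
s = pd.Series([np.nan, 66.0, np.nan, 94.0])

# The interior NaN takes the value above it; the leading NaN is left alone.
filled = s.ffill()

# A second fillna handles whatever ffill could not reach.
fallback = s.ffill().fillna(0)

print(filled.tolist())
print(fallback.tolist())
```

The constant 0 is just one possible fallback; any scalar that makes sense for the data would work the same way.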
The same problem occurs with bfill (back fill) when the last rows contain NaNs. There, too, you can chain fillna or another method:
print(mdf)
   Id_Student  English  History  Mathmatic
0           1     66.0      NaN        NaN
1           2      NaN     66.0        NaN
2           3      NaN      NaN        NaN
3           4     55.0     94.0       94.0
4           5      NaN     10.0        NaN
5           6      NaN      NaN       20.0
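If you want to reproduce the outputs in this answer, the frame printed above can be rebuilt roughly like this. The constructor call is my reconstruction from the printed values, not code from the original post:

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame from the values printed above.
# The column name "Mathmatic" is kept exactly as spelled in the question.
mdf = pd.DataFrame({
    "Id_Student": [1, 2, 3, 4, 5, 6],
    "English": [66.0, np.nan, np.nan, 55.0, np.nan, np.nan],
    "History": [np.nan, 66.0, np.nan, 94.0, 10.0, np.nan],
    "Mathmatic": [np.nan, np.nan, np.nan, 94.0, np.nan, 20.0],
})

print(mdf)
```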
print(mdf.ffill())
   Id_Student  English  History  Mathmatic
0           1     66.0      NaN        NaN
1           2     66.0     66.0        NaN
2           3     66.0     66.0        NaN
3           4     55.0     94.0       94.0
4           5     55.0     10.0       94.0
5           6     55.0     10.0       20.0
print(mdf.bfill())
   Id_Student  English  History  Mathmatic
0           1     66.0     66.0       94.0
1           2     55.0     66.0       94.0
2           3     55.0     94.0       94.0
3           4     55.0     94.0       94.0
4           5      NaN     10.0       20.0
5           6      NaN      NaN       20.0
Replace all remaining NaNs with a scalar:
mdf1 = mdf.ffill().fillna(0)
print(mdf1)
   Id_Student  English  History  Mathmatic
0           1     66.0      0.0        0.0
1           2     66.0     66.0        0.0
2           3     66.0     66.0        0.0
3           4     55.0     94.0       94.0
4           5     55.0     10.0       94.0
5           6     55.0     10.0       20.0
mdf1 = mdf.bfill().fillna(0)
print(mdf1)
   Id_Student  English  History  Mathmatic
0           1     66.0     66.0       94.0
1           2     55.0     66.0       94.0
2           3     55.0     94.0       94.0
3           4     55.0     94.0       94.0
4           5      0.0     10.0       20.0
5           6      0.0      0.0       20.0
Or fill the remaining NaNs with the other method, for example ffill first and then bfill (or the reverse order):
mdf1 = mdf.ffill().bfill()
print(mdf1)
   Id_Student  English  History  Mathmatic
0           1     66.0     66.0       94.0
1           2     66.0     66.0       94.0
2           3     66.0     66.0       94.0
3           4     55.0     94.0       94.0
4           5     55.0     10.0       94.0
5           6     55.0     10.0       20.0
mdf1 = mdf.bfill().ffill()
print(mdf1)
   Id_Student  English  History  Mathmatic
0           1     66.0     66.0       94.0
1           2     55.0     66.0       94.0
2           3     55.0     94.0       94.0
3           4     55.0     94.0       94.0
4           5     55.0     10.0       20.0
5           6     55.0     10.0       20.0
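The order of the two calls matters for gaps in the middle of a column, which is why the two results above differ. A small sketch on a hypothetical Series `s` (toy data of my own, not from the question) shows the difference:

```python
import numpy as np
import pandas as pd

# Hypothetical Series with a leading, an interior, and a trailing NaN.
s = pd.Series([np.nan, 1.0, np.nan, 3.0, np.nan])

# ffill runs first, so the interior gap takes the previous value (1.0);
# bfill then only has the leading NaN left to fill.
forward_first = s.ffill().bfill()

# bfill runs first, so the interior gap takes the next value (3.0);
# ffill then only has the trailing NaN left to fill.
backward_first = s.bfill().ffill()

print(forward_first.tolist())
print(backward_first.tolist())
```

Note that if a column contains only NaN, neither order fills it, so a final fillna with a scalar may still be needed.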