Pandas: How to fill N/A values with the mean of the previous and next non-null values

Question

My DataFrame contains some N/A values:

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 1, 3],
                   'B': [1, 1, 1, 3],
                   'C': [1, np.nan, 3, 5],
                   'D': [2, np.nan, np.nan, 6]})
print(df)

    A   B   C   D
0   1   1   1.0 2.0
1   1   1   NaN NaN
2   1   1   3.0 NaN
3   3   3   5.0 6.0

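How can I fill each N/A value with the mean of the previous non-null value and the next non-null value in its column? For example, the second value in column C should be filled with (1+3)/2 = 2.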

Expected output:

    A   B   C   D
0   1   1   1.0 2.0
1   1   1   2.0 4.0
2   1   1   3.0 4.0
3   3   3   5.0 6.0

Thanks!

Answer 1

Use ffill and bfill to replace the NaNs by forward filling and backward filling, then concat the two results, groupby the index, and aggregate with mean:

df1 = pd.concat([df.ffill(), df.bfill()]).groupby(level=0).mean()
print(df1)
   A  B    C    D
0  1  1  1.0  2.0
1  1  1  2.0  4.0
2  1  1  3.0  4.0
3  3  3  5.0  6.0
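
As a self-contained sketch of the one-liner above, the steps can be split apart and compared against the expected output from the question. The names `stacked` and `expected` are introduced here only for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 1, 3],
                   'B': [1, 1, 1, 3],
                   'C': [1, np.nan, 3, 5],
                   'D': [2, np.nan, np.nan, 6]})

# Stack the forward-filled and backward-filled frames. Every original
# index label now appears twice, once per fill direction.
stacked = pd.concat([df.ffill(), df.bfill()])

# Group by the index (level=0) and average the two candidates per cell.
df1 = stacked.groupby(level=0).mean()

# Hypothetical frame holding the expected output from the question.
expected = pd.DataFrame({'A': [1, 1, 1, 3],
                         'B': [1, 1, 1, 3],
                         'C': [1.0, 2.0, 3.0, 5.0],
                         'D': [2.0, 4.0, 4.0, 6.0]})

# check_dtype=False because mean() may return floats for A and B.
pd.testing.assert_frame_equal(df1, expected, check_dtype=False)
```

Depending on the pandas version, columns A and B may be displayed as floats (1.0 rather than 1) after the mean, which is why the comparison ignores dtypes.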

Details:

print(df.ffill())
   A  B    C    D
0  1  1  1.0  2.0
1  1  1  1.0  2.0
2  1  1  3.0  2.0
3  3  3  5.0  6.0

print(df.bfill())
   A  B    C    D
0  1  1  1.0  2.0
1  1  1  3.0  6.0
2  1  1  3.0  6.0
3  3  3  5.0  6.0
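
One case the question's data does not cover is a NaN in the first or last row, where only one neighbour exists. The sketch below uses a made-up single-column frame `s` (column `X`, and the names `strict` and `lenient`, are hypothetical) to show how the concat/groupby approach behaves there, next to a plain arithmetic mean of the two fills for comparison.

```python
import numpy as np
import pandas as pd

# Made-up data with gaps at both ends and one in the middle.
s = pd.DataFrame({'X': [np.nan, 2.0, np.nan, 6.0, np.nan]})

# Plain arithmetic mean of the two fills: NaN propagates wherever one
# fill direction has no neighbour to copy (first and last rows here).
strict = (s.ffill() + s.bfill()) / 2

# concat + groupby mean: mean() skips NaN, so an edge gap takes the
# single available neighbour instead of staying NaN.
lenient = pd.concat([s.ffill(), s.bfill()]).groupby(level=0).mean()

print(strict)
print(lenient)
```

Which behaviour is preferable depends on whether edge gaps should stay missing or be filled from the one neighbour that exists. The interior gap is filled with 4.0 by both variants.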
