How to do a linear interpolation in pandas taking values of X into account?

Question

I have a data frame with two columns: X and Y. Some of the values in Y are missing (np.nan).

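I would like to fill the NaNs using linear interpolation. In more detail, I want to sort the data frame by X, and any missing value of Y should be a "linear mixture" of the two neighbouring values of Y (one corresponding to the smaller X and the other to the larger X).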

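If the X value of a missing Y is closer to one of the two X values that have an available Y, then the filled value of Y should be close to the corresponding Y. How can this be done efficiently and elegantly in pandas?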

Please note that, as far as I understand, pandas.Series.interpolate does not do what I need.

Answer 1

Set up the data frame:

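import numpy as np
import pandas as pd

x = [0, 1, 3, 4, 7, 9, 11, 122, 123, 128]
# np.nan marks the missing values (the np.NaN alias was removed in NumPy 2.0)
y = [2, 8, 12, np.nan, 22, 31, 34, np.nan, 43, 48]

df = pd.DataFrame({"x": x, "y": y})
print(df)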

     x     y
0    0   2.0
1    1   8.0
2    3  12.0
3    4   NaN
4    7  22.0
5    9  31.0
6   11  34.0
7  122   NaN
8  123  43.0
9  128  48.0

Set column 'x' as the index:

df = df.set_index('x')

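Then call interpolate with method set to 'index', so that the index values (here the X values) are used as the spacing between points: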

df.y = df.y.interpolate(method='index')

This results in:

df

        y
x   
0      2.000000
1      8.000000
3     12.000000
4     14.500000
7     22.000000
9     31.000000
11    34.000000
122   42.919643
123   43.000000
128   48.000000
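
For comparison, here is a minimal sketch that wraps the steps above in a helper and contrasts them with the default method='linear', which ignores X and treats the rows as equally spaced. The function name interpolate_y_by_x and its parameters are hypothetical, introduced only for illustration; the sort_index call is an added assumption to cover input that is not already ordered by X. The np.interp call is just an independent cross-check on the known points.

```python
import numpy as np
import pandas as pd


def interpolate_y_by_x(df, x_col="x", y_col="y"):
    """Fill NaNs in y_col by linear interpolation over x_col (hypothetical helper)."""
    # Sort by x so that neighbours in the index are also neighbours in x.
    s = df.set_index(x_col)[y_col].sort_index()
    # method="index" uses the numeric index values as the x-coordinates.
    filled = s.interpolate(method="index")
    # Return a frame with x as an ordinary column again.
    return filled.reset_index()


df = pd.DataFrame({
    "x": [0, 1, 3, 4, 7, 9, 11, 122, 123, 128],
    "y": [2, 8, 12, np.nan, 22, 31, 34, np.nan, 43, 48],
})

by_index = interpolate_y_by_x(df)

# The default method="linear" ignores x and treats rows as equally spaced.
by_position = df["y"].interpolate()

# Independent cross-check: piecewise-linear interpolation through the known points.
known = df.dropna(subset=["y"])
expected = np.interp(df["x"], known["x"], known["y"])

print(by_index.loc[[3, 7]])      # x-aware fill for the two missing rows
print(by_position.loc[[3, 7]])   # position-based fill for the same rows
```

At x = 4 the x-aware fill is 12 + (4 - 3) / (7 - 3) * (22 - 12) = 14.5, whereas the position-based default gives the midpoint (12 + 22) / 2 = 17.0. The gap is larger at x = 122, which sits right next to x = 123: roughly 42.92 with method='index' versus 38.5 with the default.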

python

interpolation

pandas