Pandas interpolate doesn't give monotonic results

Question

I have the following data, and I want to use spline interpolation to fill in the last 4 values (I know this is really extrapolation):

import numpy as np

x = [
    18.792571,
    19.170139,
    19.370556,
    19.393820,
    19.239932,
    18.908891,
    18.400699,
    17.892507,
    17.384314,
    16.876122,
    16.367930,
    15.859737,    
    np.nan,
    np.nan,
    np.nan,
    np.nan
]

I am running pandas interpolation and something very strange happens. The code

import pandas as pd

pd.Series(x).interpolate(
    method="spline", 
    order=1
)

returns

0     18.792571
1     19.170139
2     19.370556
3     19.393820
4     19.239932
5     18.908891
6     18.400699
7     17.892507
8     17.384314
9     16.876122
10    16.367930
11    15.859737
12    16.103099
13    15.790022
14    15.476945
15    15.163868
dtype: float64

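So although the trend of the data is clearly downward, the result jumps upward at the very first filled index (12). When I run the same computation with scipy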

import scipy.interpolate as inp

# Keep only the known values (drops the trailing NaNs)
train_x = [v for v in x if not np.isnan(v)]
s = inp.InterpolatedUnivariateSpline(range(len(train_x)), train_x, k=1)
ynew = s(range(len(x)))
ynew[12:]

I get

array([15.351544, 14.843351, 14.335158, 13.826965])

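In this case the extrapolated values do not jump upward, so the result makes sense to me.

So my questions are:

Why do pandas and scipy give different results?
How can I make pandas interpolate give the result I get with scipy?
Why does this upward jump happen in pandas?

Thanks in advance!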

Edit

With scipy interp1d the discrepancy with pandas remains:

s = inp.interp1d(range(len(train_x)), train_x, kind=1, fill_value='extrapolate')
ynew = s(range(len(x)))
ynew[12:]

which gives

array([15.351544, 14.843351, 14.335158, 13.826965])

Answer 1

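Maybe not an answer, just a few comments:

Pandas uses scipy.interpolate.interp1d rather than InterpolatedUnivariateSpline. I believe the two differ slightly in implementation.
I would try scipy.interpolate.interp1d to see whether pandas and scipy then match.
Interpolation is meant for filling in data between known points. What you have is closer to extrapolation. These methods can be used to extrapolate, but I would expect that to produce odd behavior such as the upward jump.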

Answer 2

Actually, pandas uses UnivariateSpline, so to reproduce the pandas result with scipy we can run the following:

import scipy.interpolate as inp

# Keep only the known values (drops the trailing NaNs)
train_x = [v for v in x if not np.isnan(v)]
s = inp.UnivariateSpline(x=range(len(train_x)), y=train_x, k=1)
ynew = s(range(len(x)))
ynew[12:]

This gives

array([16.10309945, 15.79002222, 15.47694498, 15.16386774])
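The upward jump can be explained by the default smoothing. A minimal sketch, assuming SciPy's documented default that an unset s equals the number of data points: with that much slack, the degree-1 smoothing spline for this data appears to need no interior knots, so it collapses to an ordinary least-squares straight line over all 12 points. That line sits above the last known value at index 12. The names y, t, t_new, smooth and line are illustrative only.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Known values from the question (the trailing NaNs are left out)
y = np.array([
    18.792571, 19.170139, 19.370556, 19.393820, 19.239932, 18.908891,
    18.400699, 17.892507, 17.384314, 16.876122, 16.367930, 15.859737,
])
t = np.arange(len(y))       # positions 0..11
t_new = np.arange(12, 16)   # positions of the four missing values

# Default smoothing: s is left unset, as in the scipy call above
smooth = UnivariateSpline(t, y, k=1)
print(smooth.get_knots())   # knots the smoothing spline ended up using
print(smooth(t_new))

# Ordinary least-squares straight line through the same points
slope, intercept = np.polyfit(t, y, 1)
line = slope * t_new + intercept
print(line)
```

If the two printed arrays agree, the "spline" here is just a global linear fit, which does not pass through the last data point and therefore can land above it.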

The way to make the extrapolated values keep decreasing, in this case, is to pass s=0:

pd.Series(x).interpolate(
    method="spline", 
    order=1,
    s=0
)

which returns:

0     18.792571
1     19.170139
2     19.370556
3     19.393820
4     19.239932
5     18.908891
6     18.400699
7     17.892507
8     17.384314
9     16.876122
10    16.367930
11    15.859737
12    15.351544
13    14.843351
14    14.335158
15    13.826965
dtype: float64
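As a cross-check of why s=0 helps, here is a small scipy-only sketch. It assumes, per the SciPy documentation, that InterpolatedUnivariateSpline is equivalent to UnivariateSpline with s=0, so the spline passes through every known point. With k=1 the values past the end should then simply continue the slope of the last segment. The names y, t, t_new, exact, interp and manual are illustrative only.

```python
import numpy as np
from scipy.interpolate import InterpolatedUnivariateSpline, UnivariateSpline

# Known values from the question (the trailing NaNs are left out)
y = np.array([
    18.792571, 19.170139, 19.370556, 19.393820, 19.239932, 18.908891,
    18.400699, 17.892507, 17.384314, 16.876122, 16.367930, 15.859737,
])
t = np.arange(len(y))       # positions 0..11
t_new = np.arange(12, 16)   # positions of the four missing values

# s=0 disables smoothing, so the spline goes through every data point
exact = UnivariateSpline(t, y, k=1, s=0)
interp = InterpolatedUnivariateSpline(t, y, k=1)

# Manual extension of the last linear segment (between indices 10 and 11)
last_slope = y[-1] - y[-2]
manual = y[-1] + last_slope * np.arange(1, 5)

print(exact(t_new))
print(interp(t_new))
print(manual)
```

All three arrays should agree with the values at indices 12 to 15 in the pandas output above.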
