Interpolate to a new x-axis by group in pandas

I have a DataFrame of observations that looks like this (the index is out of order because the dataset was sorted by date):

                     date   depth   temp    salt
25799 1962-10-25 08:00:00 -242.34  5.470  12.900
25798 1962-10-25 08:00:00 -227.19  5.460  12.840
25797 1962-10-25 08:00:00 -201.93  5.500  12.750
25796 1962-10-25 08:00:00 -176.68  5.530  12.610
25795 1962-10-25 08:00:00 -151.43  5.500  12.370
                      ...     ...    ...     ...
21617 2019-02-06 07:59:00  -20.00  3.780   7.532
21616 2019-02-06 07:59:00  -15.40  3.771   7.531
21615 2019-02-06 07:59:00  -10.00  3.759   7.529
21614 2019-02-06 07:59:00   -5.40  3.747   7.528
21613 2019-02-06 07:59:00      -0  5.748   7.528

For each date, I need to interpolate the temperature and salinity values onto the depth levels -230, -225, ..., -5, -0:

interp_dep = [val for val in range(-230, 5, 5)]

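The topic "Python Pandas interpolate with new x-axis" doesn't work for me, because I am not working with the index. The interpolation within a groupby described in another topic is not suitable either: I need to interpolate the values per date group, but onto new depth values. I'm a bit unsure how to approach this. Thanks for any help.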
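To make the goal concrete, here is a minimal sketch of the interpolation for a single date. It uses only the five rows of the 1962-10-25 08:00:00 profile visible in the table above (the rest of that profile is elided, so this is a partial illustration), and the names `depth`, `temp` and `new_depth` are chosen here for the example. It assumes linear interpolation with `np.interp` is acceptable.

```python
import numpy as np

# First five rows of the 1962-10-25 08:00:00 profile from the table above.
# The remaining rows are not shown in the question, so this is only partial.
depth = np.array([-242.34, -227.19, -201.93, -176.68, -151.43])
temp = np.array([5.470, 5.460, 5.500, 5.530, 5.500])

# Target depths that lie inside the range covered by these five rows.
new_depth = np.arange(-230, -150, 5)  # -230, -225, ..., -155

# np.interp needs increasing sample points; the depths already run from
# deepest (most negative) to shallowest, so no sorting is needed here.
new_temp = np.interp(new_depth, depth, temp)

for d, t in zip(new_depth, new_temp):
    print(f"{int(d):5d}  {t:.3f}")
```

The same call would have to be repeated for `salt` and for every date, which is where a groupby over `date` comes in.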

After clearing up some misunderstandings in the comments, I think I can offer a solution.

Example output for the measurement time point 1968-05-08 23:59:00:

Out[20]: 
                     depth      temp       salt
1968-05-08 23:59:00   -235  5.766500  12.716000
1968-05-08 23:59:00   -230  5.764000  12.706000
1968-05-08 23:59:00   -225  5.761500  12.696000
1968-05-08 23:59:00   -220  5.752000  12.683000
1968-05-08 23:59:00   -215  5.732000  12.665500
1968-05-08 23:59:00   -210  5.712000  12.648000
1968-05-08 23:59:00   -205  5.692000  12.630500
1968-05-08 23:59:00   -200  5.675200  12.607200
               ...
1968-05-08 23:59:00    -40  2.985000   7.580000
1968-05-08 23:59:00    -35  3.282500   7.550000
1968-05-08 23:59:00    -30  3.580000   7.520000
1968-05-08 23:59:00    -25  3.797500   7.502500
1968-05-08 23:59:00    -20  4.015000   7.485000
1968-05-08 23:59:00    -15  4.232500   7.467500
1968-05-08 23:59:00    -10  4.450000   7.450000
1968-05-08 23:59:00     -5  4.516667   7.450000

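Solution

For my final code I changed the index to datetime objects, used groupby() to iterate over all measurement time points, and then interpolated with np.interp(). The results are collected in a list of pandas DataFrames and finally concatenated into a single DataFrame with interpolated_df = pd.concat(results):
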
import pandas as pd
from pathlib import Path
import numpy as np


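# Load the observations and use the measurement time as a DatetimeIndex.
df = pd.read_csv(Path("C:/Test/obs.csv"))
df['date'] = pd.to_datetime(df['date'])
df_nid = df.set_index('date')

# To test with a single time point first:
# timepoints = df_nid.loc['1968-05-08 23'].groupby('date')
timepoints = df_nid.groupby('date')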

results = []
# Target depths: -235, -230, ..., -5 (47 levels)
x = np.arange(-235, 0, 5)

for name, group in timepoints:
    # np.interp expects increasing x-coordinates, so sort by depth first.
    group = group.sort_values('depth')
    xp = group['depth']

    # Note: outside the measured depth range np.interp repeats the edge value.
    yp_temp = group['temp']
    y_temp = np.interp(x, xp, yp_temp)

    yp_salt = group['salt']
    y_salt = np.interp(x, xp, yp_salt)

    data = {'depth': x,
            'temp': y_temp,
            'salt': y_salt}
    # Repeat the current timestamp once per target depth to use as the index.
    idx = [name] * len(x)
    results.append(pd.DataFrame(data=data, index=idx))

interpolated_df = pd.concat(results)
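
One thing to be aware of: by default np.interp() does not extrapolate. For target depths outside the measured range of a profile it repeats the nearest edge value. If that is not wanted, the `left` and `right` arguments of np.interp() can mark those depths as NaN instead. The following is a self-contained sketch of that variant, not a replacement for the code above. The helper `interpolate_profile` and the small `obs` frame are made up for illustration; only the column names `date`, `depth`, `temp` and `salt` come from the question.

```python
import numpy as np
import pandas as pd


def interpolate_profile(profile, new_depth, columns=("temp", "salt")):
    """Interpolate all rows of one date onto new_depth (hypothetical helper)."""
    # np.interp expects increasing x-coordinates.
    profile = profile.sort_values("depth")
    out = {"depth": new_depth}
    for col in columns:
        # left/right=np.nan marks target depths outside the measured range
        # instead of repeating the edge value (the np.interp default).
        out[col] = np.interp(new_depth, profile["depth"], profile[col],
                             left=np.nan, right=np.nan)
    return pd.DataFrame(out)


# Made-up observations for two dates, purely for illustration.
obs = pd.DataFrame({
    "date": pd.to_datetime(["2000-01-01"] * 3 + ["2000-01-02"] * 3),
    "depth": [-20.0, -10.0, 0.0, 0.0, -30.0, -15.0],
    "temp": [4.0, 5.0, 6.0, 7.0, 4.0, 5.5],
    "salt": [8.0, 7.5, 7.0, 7.0, 8.5, 7.75],
})

new_depth = np.arange(-30, 5, 5)  # -30, -25, ..., 0

# One interpolated DataFrame per date, keyed by the group's timestamp.
pieces = {date: interpolate_profile(group, new_depth)
          for date, group in obs.groupby("date")}

# The dict keys become the outer index level; drop the inner row counter
# so that each row is indexed by its date only.
interpolated = pd.concat(pieces).droplevel(1).rename_axis("date")
print(interpolated)
```

In this made-up data the first date only reaches down to -20, so its rows for -30 and -25 come out as NaN rather than copies of the -20 values. Whether NaN or the edge value is the better choice depends on how the result is used downstream.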