Transform on multiple columns to interpolate/copy missing values

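I am trying to fill missing values in a pandas DataFrame by either interpolating or copying the last known value within a group (identified by trip). My data looks like this: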

    brake   speed   trip
0   0.0     NaN     1
1   1.0     NaN     1
2   NaN     1.264   1
3   NaN     0.000   1
4   0.0     NaN     1
5   NaN     1.264   1
6   NaN     6.704   1
7   1.0     NaN     1
8   0.0     NaN     1
9   NaN     11.746  2
10  1.0     NaN     2
11  0.0     NaN     2
12  NaN     16.961  3
13  1.0     NaN     3
14  NaN     11.832  3
15  0.0     NaN     3
16  NaN     17.082  3
17  NaN     22.435  3
18  NaN     28.707  3
19  NaN     34.216  3

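I found a related solution, but I need brake to simply copy the last known value while speed is interpolated (my actual dataset has 12 columns, each of which needs one of these treatments).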

You can apply a different method to each column. For example:

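# interpolate speed linearly within each trip
df['speed'] = df.groupby('trip')['speed'].transform(lambda x: x.interpolate())
# fill brake with the last known value within each trip
# (ffill() replaces fillna(method='ffill'), which is deprecated in recent pandas)
df['brake'] = df.groupby('trip')['brake'].transform(lambda x: x.ffill())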

>>> df
    brake    speed  trip
0     0.0      NaN     1
1     1.0      NaN     1
2     1.0   1.2640     1
3     1.0   0.0000     1
4     0.0   0.6320     1
5     0.0   1.2640     1
6     0.0   6.7040     1
7     1.0   6.7040     1
8     0.0   6.7040     1
9     NaN  11.7460     2
10    1.0  11.7460     2
11    0.0  11.7460     2
12    NaN  16.9610     3
13    1.0  14.3965     3
14    1.0  11.8320     3
15    0.0  14.4570     3
16    0.0  17.0820     3
17    0.0  22.4350     3
18    0.0  28.7070     3
19    0.0  34.2160     3

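Note that this still leaves NaN in brake wherever the first row of a trip has no "last known value", and NaN in speed wherever the first few rows of a trip are NaN. You can replace those with fillna() however you see fit.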
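Since the real dataset has 12 columns, writing one line per column gets repetitive. One possible way to scale the idea is a mapping from column name to fill strategy, applied in a loop. The sketch below is only an illustration: the `fill_methods` dict and the shortened sample frame are made up for the example, and the final back-fill of leading NaNs is just one assumption about what "as you see fit" might mean.

```python
import numpy as np
import pandas as pd

# Shortened sample in the same shape as the question's data (illustrative only).
df = pd.DataFrame({
    "brake": [0.0, 1.0, np.nan, np.nan, 0.0, np.nan, 1.0, 0.0],
    "speed": [np.nan, np.nan, 1.264, 0.0, np.nan, 11.746, np.nan, np.nan],
    "trip":  [1, 1, 1, 1, 1, 2, 2, 2],
})

# Hypothetical mapping: which fill strategy each column should get.
fill_methods = {
    "speed": lambda s: s.interpolate(),  # linear interpolation within a trip
    "brake": lambda s: s.ffill(),        # copy the last known value within a trip
}

# Apply each strategy per trip; values never leak across trip boundaries.
for col, method in fill_methods.items():
    df[col] = df.groupby("trip")[col].transform(method)

# Optional (an assumption, not part of the original answer): back-fill the
# NaNs that remain at the start of each trip with the first known value.
filled = df.copy()
for col in fill_methods:
    filled[col] = filled.groupby("trip")[col].transform(lambda s: s.bfill())

print(df)
print(filled)
```

Keeping the strategies in a dict means adding a thirteenth column is a one-line change. Whether back-filling the leading values makes sense depends on the data; a constant via fillna() may be more appropriate for some columns.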