Mapping Rows of Two Pandas DataFrames
I am working with two DataFrames:
df1
altitude
132 16.324794
133 16.027025
134 15.738367
135 15.462613
136 15.195307
137 14.934009
138 14.682448
139 14.440509
140 14.207593
141 14.070644
df2
altitude density east_wind north_wind
0 5 0.020567 39.714397 6.795392
1 7 0.016871 41.171996 6.852655
2 9 0.013839 42.629594 6.909918
3 11 0.011351 44.087193 6.967182
4 13 0.009311 45.544791 7.024445
5 15 0.007638 47.003028 7.079618
6 17 0.006263 48.303168 7.340789
7 19 0.005129 48.942837 8.478684
8 21 0.004201 49.588021 9.587021
9 23 0.003433 50.797853 11.256209
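I want to map each altitude in df1 to the closest altitude in df2, and then combine the density, east_wind, and north_wind values from that df2 row with the df1 row in a new DataFrame.
Expected result: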
altitude density east_wind north_wind
132 16.324794 0.006263 48.303168 7.340789
136 15.195307 0.007638 47.003028 7.079618
137 14.934009 0.007638 47.003028 7.079618
Please advise.
You can use idxmin. First, get the index of the closest df2.altitude for each row of df1:
df1['df2_idx'] = df1.altitude.apply(lambda x: df2.altitude.sub(x).abs().idxmin())
# altitude df2_idx
# 132 16.324794 6
# 133 16.027025 6
# 134 15.738367 5
# 135 15.462613 5
# 136 15.195307 5
# 137 14.934009 5
# 138 14.682448 5
# 139 14.440509 5
# 140 14.207593 5
# 141 14.070644 5
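To make the lambda above less opaque, here is a minimal sketch of what it computes for a single df1 value (16.324794, the row labelled 132). The names df2_altitude, x, distances, and nearest_idx are illustrative and do not appear in the original answer; only the altitude column of df2 is reproduced.

```python
import pandas as pd

# Only the altitude column of df2, with its default 0..9 index.
df2_altitude = pd.Series([5, 7, 9, 11, 13, 15, 17, 19, 21, 23], name="altitude")

x = 16.324794  # the df1 altitude at index 132

# Absolute distance from x to every df2 altitude.
distances = df2_altitude.sub(x).abs()

# idxmin returns the index label of the smallest distance,
# i.e. the df2 row whose altitude is closest to x.
nearest_idx = distances.idxmin()
print(nearest_idx)  # 6 (altitude 17 is closer to 16.32 than altitude 15)
```

If two df2 altitudes are equally close, idxmin returns the first one it encounters.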
Then merge on df1.df2_idx and df2.index:
df1.merge(df2.drop('altitude', axis=1), left_on='df2_idx', right_index=True).drop('df2_idx', axis=1)
# altitude density east_wind north_wind
# 132 16.324794 0.006263 48.303168 7.340789
# 133 16.027025 0.006263 48.303168 7.340789
# 134 15.738367 0.007638 47.003028 7.079618
# 135 15.462613 0.007638 47.003028 7.079618
# 136 15.195307 0.007638 47.003028 7.079618
# 137 14.934009 0.007638 47.003028 7.079618
# 138 14.682448 0.007638 47.003028 7.079618
# 139 14.440509 0.007638 47.003028 7.079618
# 140 14.207593 0.007638 47.003028 7.079618
# 141 14.070644 0.007638 47.003028 7.079618
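For anyone who wants to run this end to end, the sketch below rebuilds both frames from the data in the question and applies the two steps above. It also shows pd.merge_asof with direction="nearest" as a possible alternative; that is a different technique, not part of the answer above. It assumes the key can be sorted and that df2.altitude can be cast to float so both key columns share a dtype. The names nearest, via_idxmin, left, right, and via_asof are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame(
    {"altitude": [16.324794, 16.027025, 15.738367, 15.462613, 15.195307,
                  14.934009, 14.682448, 14.440509, 14.207593, 14.070644]},
    index=range(132, 142),
)
df2 = pd.DataFrame({
    "altitude": [5, 7, 9, 11, 13, 15, 17, 19, 21, 23],
    "density": [0.020567, 0.016871, 0.013839, 0.011351, 0.009311,
                0.007638, 0.006263, 0.005129, 0.004201, 0.003433],
    "east_wind": [39.714397, 41.171996, 42.629594, 44.087193, 45.544791,
                  47.003028, 48.303168, 48.942837, 49.588021, 50.797853],
    "north_wind": [6.795392, 6.852655, 6.909918, 6.967182, 7.024445,
                   7.079618, 7.340789, 8.478684, 9.587021, 11.256209],
})

# Approach from the answer: nearest df2 index per row, then merge on it.
# assign() is used so that df1 itself is not modified.
nearest = df1["altitude"].apply(lambda x: df2["altitude"].sub(x).abs().idxmin())
via_idxmin = (
    df1.assign(df2_idx=nearest)
    .merge(df2.drop(columns="altitude"), left_on="df2_idx", right_index=True)
    .drop(columns="df2_idx")
)

# Alternative (an assumption, not from the answer): merge_asof with
# direction="nearest". It needs both frames sorted by the key and the
# key columns to have the same dtype, so df2.altitude is cast to float.
left = df1.reset_index().sort_values("altitude")
right = (
    df2.astype({"altitude": "float64"})
    .rename(columns={"altitude": "df2_altitude"})
    .sort_values("df2_altitude")
)
via_asof = (
    pd.merge_asof(left, right, left_on="altitude",
                  right_on="df2_altitude", direction="nearest")
    .set_index("index")      # restore df1's original index labels
    .rename_axis(None)
    .sort_index()
    .drop(columns="df2_altitude")
)

print(via_idxmin)
print(via_asof)
```

The apply-based version scans all of df2 for every row of df1, which is fine at this size. merge_asof may scale better on large frames, but it does not necessarily break exact ties the same way as idxmin, so compare the two on your own data before switching.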