Insert a pandas MultiIndex DataFrame into another MultiIndex DataFrame at a particular location
I have a parent_df and a child_df, as shown below.
parent_df:
x y colA
x1 y1 A1
x1 y2 A2
x2 y1 A3
x2 y2 A4
child_df:
p q colB colC
p1 q1 B1 C1
p1 q2 B2 C2
p2 q1 B3 C3
p2 q2 B4 C4
I want to modify parent_df, or create a new parent_df, by placing child_df into parent_df at one specific row, (x2, y1), like this:
parent_df:
x   y   p   q   colA  colB  colC
x1  y1          A1    NA    NA
x1  y2          A2    NA    NA
x2  y1  p1  q1  A3    B1    C1
        p1  q2  A3    B2    C2
        p2  q1  A3    B3    C3
        p2  q2  A3    B4    C4
x2  y2          A4    NA    NA
Is there a way to do this?
I think you need merge with sort_index:
import pandas as pd

print(parent_df)
      colA
x  y
x1 y1   A1
   y2   A2
x2 y1   A3
   y2   A4
print(child_df)
      colB colC
p  q
p1 q1   B1   C1
   q2   B2   C2
p2 q1   B3   C3
   q2   B4   C4
# create new columns holding the target parent key (x2, y1)
child_df['x'] = 'x2'
child_df['y'] = 'y1'
# move p and q out of the index, then index by the new columns
child_df = child_df.reset_index().set_index(['x','y'])
print(child_df)
        p   q colB colC
x  y
x2 y1  p1  q1   B1   C1
   y1  p1  q2   B2   C2
   y1  p2  q1   B3   C3
   y1  p2  q2   B4   C4
# outer merge on the shared (x, y) index keeps parent rows that have no child
df = pd.merge(parent_df, child_df, left_index=True, right_index=True, how='outer')
# replace NaN in the p, q columns with '', append them to the index and sort it
df = df.fillna({'p':'','q':''}).set_index(['p','q'], append=True).sort_index()
print(df)
            colA colB colC
x  y  p  q
x1 y1         A1  NaN  NaN
   y2         A2  NaN  NaN
x2 y1 p1 q1   A3   B1   C1
         q2   A3   B2   C2
      p2 q1   A3   B3   C3
         q2   A3   B4   C4
   y2         A4  NaN  NaN
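If you need this for more than one key, the same steps can be wrapped in a small function. The following is only a sketch: the name insert_child and its key argument are made up for illustration and are not part of pandas, and the two frames are rebuilt here from the tables in the question so the snippet runs on its own. It assumes the index levels are strings, so that '' sorts ahead of the child labels.

```python
import pandas as pd

# Rebuild the example frames from the question.
parent_df = pd.DataFrame(
    {"colA": ["A1", "A2", "A3", "A4"]},
    index=pd.MultiIndex.from_product(
        [["x1", "x2"], ["y1", "y2"]], names=["x", "y"]
    ),
)
child_df = pd.DataFrame(
    {"colB": ["B1", "B2", "B3", "B4"], "colC": ["C1", "C2", "C3", "C4"]},
    index=pd.MultiIndex.from_product(
        [["p1", "p2"], ["q1", "q2"]], names=["p", "q"]
    ),
)


def insert_child(parent, child, key):
    """Hypothetical helper: attach `child` under the parent row at `key`."""
    parent_levels = list(parent.index.names)
    child_levels = list(child.index.names)
    # Tag every child row with the parent key; `child` itself is not modified.
    tagged = child.reset_index().assign(**dict(zip(parent_levels, key)))
    tagged = tagged.set_index(parent_levels)
    # Outer merge on the shared index keeps parent rows that have no child.
    merged = parent.merge(tagged, left_index=True, right_index=True, how="outer")
    # Parent rows without a child get '' in the child index levels.
    merged = merged.fillna({name: "" for name in child_levels})
    return merged.set_index(child_levels, append=True).sort_index()


result = insert_child(parent_df, child_df, ("x2", "y1"))
print(result)
```

Unlike the step-by-step version, this does not add the x and y columns to the original child_df, so it can be called again with a different key.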