Pandas: add a value to the inner level of a hierarchical index

I have a Pandas DataFrame with a hierarchical index (MultiIndex). I created this DataFrame by grouping on the values of "cousub" and "year":
annualMed = df.groupby(["cousub", "year"])[["ratio", "sr_val_transfer"]].median().round(2)
print(annualMed.head(8))

                      ratio  sr_val_transfer
cousub          year                        
Allen Park city 2013   0.51          75000.0
                2014   0.47          85950.0
                2015   0.47          95030.0
                2016   0.45         102500.0
Belleville city 2013   0.49         113900.0
                2014   0.55         114750.0
                2015   0.53         149000.0
                2016   0.48         121500.0    

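I want to add an "Overall" value to the "year" level, which I can then populate with values based on grouping by "cousub" alone, i.e. excluding "year". I would like the result to look like this: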

                      ratio  sr_val_transfer
cousub          year                        
Allen Park city 2013   0.51          75000.0
                2014   0.47          85950.0
                2015   0.47          95030.0
                2016   0.45         102500.0
             Overall   0.50          90000.0
Belleville city 2013   0.49         113900.0
                2014   0.55         114750.0
                2015   0.53         149000.0
                2016   0.48         121500.0 
             Overall   0.50         135000.0

How can I add this new item to the "year" level of the MultiIndex?

If you just want to add these two rows explicitly, you can use loc and specify every MultiIndex level in the key.

df.loc[('Allen Park city', 'Overall'), :] = (0.50, 90000.)
df.loc[('Belleville city', 'Overall'), :] = (0.50, 135000.)

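However, this gets tedious if you have a whole list of cities to add this row for. In that case, you could append (concatenate) another DataFrame that holds the overall values and do a little index manipulation.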

# DataFrame.append was removed in pandas 2.0; pd.concat does the same job here.
(pd.concat([df.reset_index(),
            pd.DataFrame([['Allen Park city', 'Overall', 0.5, 90000.],
                          ['Belleville city', 'Overall', 0.5, 135000.]],
                         columns=list(df.index.names) + list(df.columns))])
   .set_index(list(df.index.names))
   .sort_index())
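
The question also asks how to fill the "Overall" rows from a grouping on "cousub" alone. The following is a minimal sketch of one way to do that, not the answerer's code. It assumes a raw DataFrame with the question's column names, uses made-up sample values, and casts "year" to strings so that the level holds a single type next to the "Overall" label.

```python
import pandas as pd

# Hypothetical raw data: column names follow the question, values are made up.
df = pd.DataFrame({
    "cousub": ["Allen Park city"] * 4 + ["Belleville city"] * 4,
    "year": [2013, 2013, 2014, 2014] * 2,
    "ratio": [0.50, 0.52, 0.46, 0.48, 0.48, 0.50, 0.54, 0.56],
    "sr_val_transfer": [70000.0, 80000.0, 85000.0, 87000.0,
                        110000.0, 118000.0, 114000.0, 116000.0],
})
cols = ["ratio", "sr_val_transfer"]

# Per-year medians, as in the question, with "year" cast to str
# (an assumption of this sketch) so the level can also hold "Overall".
annual = (df.assign(year=df["year"].astype(str))
            .groupby(["cousub", "year"])[cols]
            .median()
            .round(2))

# Medians over all years, grouped by "cousub" alone.
overall = df.groupby("cousub")[cols].median().round(2)

# Add a second index level to the overall rows whose only label is "Overall".
overall.index = pd.MultiIndex.from_product(
    [overall.index, ["Overall"]], names=annual.index.names
)

# Stack both frames and sort so each "Overall" row follows its city's years.
result = pd.concat([annual, overall]).sort_index()
print(result)
```

Casting the years to strings is a choice made for this sketch: a level that mixes integers with the string "Overall" may not sort reliably, whereas the digit strings sort ahead of "Overall".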

Demo

Method 1 (small case)

>>> df.loc[('Allen Park city', 'Overall'), :] = (0.50, 90000.)

>>> df.loc[('Belleville city', 'Overall'), :] = (0.50, 135000.)

>>> df.sort_index()

                         ratio  sr_val_transfer
cousub          year                           
Allen Park city 2013      0.51          75000.0
                2014      0.47          85950.0
                2015      0.47          95030.0
                2016      0.45         102500.0
                Overall   0.50          90000.0
Belleville city 2013      0.49         113900.0
                2014      0.55         114750.0
                2015      0.53         149000.0
                2016      0.48         121500.0
                Overall   0.50         135000.0

Method 2 (larger case)

>>> # DataFrame.append was removed in pandas 2.0; pd.concat does the same job here.
>>> (pd.concat([df.reset_index(),
                pd.DataFrame([['Allen Park city', 'Overall', 0.5, 90000.],
                              ['Belleville city', 'Overall', 0.5, 135000.]],
                             columns=list(df.index.names) + list(df.columns))])
       .set_index(list(df.index.names))
       .sort_index())

                         ratio  sr_val_transfer
cousub          year                           
Allen Park city 2013      0.51          75000.0
                2014      0.47          85950.0
                2015      0.47          95030.0
                2016      0.45         102500.0
                Overall   0.50          90000.0
Belleville city 2013      0.49         113900.0
                2014      0.55         114750.0
                2015      0.53         149000.0
                2016      0.48         121500.0
                Overall   0.50         135000.0