Creating additional levels of headers (pandas) in Python

Question

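I'm new to programming, but I'm currently working with DataFrames. I'm trying to stack my current DataFrame into a specific "design". The files I'm working with are fairly large and contain a lot of data, but I can't get stack() to arrange the data the way I want, and the resulting shape is a mess. I need help with how to define a MultiIndex and create more header levels.

I hope you can help me. Here is an example.

What I get from my code (before stack()):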

    Exports      NaN      NaN      NaN      Net Exports       NaN      NaN  
0      Total   Sweden   Norway  Germany        Total   Sweden   Norway    
1     1032.8      358    239.7    435.1        636.8    274.1      9.7   
2     1198.8    556.4    211.8    430.6        846.3    522.6     -1.1

After stack():

     Exports            Total
     NaN               Sweden
     NaN               Norway
     NaN              Germany
     Net Exports        Total
     NaN               Sweden
     NaN               Norway
     NaN              Germany
     NaN                  GWh
1    Exports           1032.8
     NaN                  358
     NaN                239.7
     NaN                435.1
     Net Exports        636.8
     NaN                274.1
     NaN                  9.7
     NaN                  353

Thanks in advance for your help.

Answer 1

I think you need:

print (r.head())
    Unnamed: 18 Unnamed: 19 Unnamed: 20 Unnamed: 21   Unnamed: 22 Unnamed: 23  \
0       Exports         NaN         NaN         NaN  Net Exports          NaN   
2         Total      Sweden      Norway     Germany         Total      Sweden   
189      1032.8         358       239.7       435.1         636.8       274.1   
190      1198.8       556.4       211.8       430.6         846.3       522.6   
191       982.7       159.3       166.2       657.2         276.3      -156.8   

    Unnamed: 24 Unnamed: 25     Unit:  
0           NaN         NaN       NaN  
2        Norway     Germany       GWh  
189         9.7         353   January  
190        -1.1       324.8  February  
191      -105.9         539     March

# Use the 'Unit:' column as the index
r = r.set_index('Unit:')
# Build a MultiIndex from the first and second rows;
# NaNs in the first row are replaced by forward filling (ffill)
r.columns = pd.MultiIndex.from_arrays([r.iloc[0].ffill(), r.iloc[1]], names=(None, None))
# Drop the first and second rows, which now serve as the column header
r = r.iloc[2:]

print (r.head())
         Exports                       Net Exports                       
           Total Sweden Norway Germany        Total Sweden Norway Germany
Unit:                                                                    
January   1032.8    358  239.7   435.1        636.8  274.1    9.7     353
February  1198.8  556.4  211.8   430.6        846.3  522.6   -1.1   324.8
March      982.7  159.3  166.2   657.2        276.3 -156.8 -105.9     539
April      962.3   22.1     62   878.2       -268.6 -741.3 -352.9   825.6
May        951.2   13.5   15.9   921.8       -511.5 -885.2 -496.4   870.1

print (r.stack().head(10))
                 Exports Net Exports 
Unit:                                
January  Germany   435.1          353
         Norway    239.7          9.7
         Sweden      358        274.1
         Total    1032.8        636.8
February Germany   430.6        324.8
         Norway    211.8         -1.1
         Sweden    556.4        522.6
         Total    1198.8        846.3
March    Germany   657.2          539
         Norway    166.2       -105.9
