Cumulative percentage of pandas data frame

I have a data frame like the one below, with specific IDs (code) and, for each distance (Dist_km), a length (Shape_Leng) and an area (Shape_Area).

     code  Dist_km    Shape_Leng    Shape_Area
0   M0017      5.0  57516.601608  5.076465e+07   
1   M0017     10.0  94037.663673  4.638184e+07   
2   M0017     15.0  39106.310470  1.426327e+07   
3   M0017     20.0    138.038115  6.464380e+02   
4   M0017     30.0  12158.395200  4.102351e+06   
5   M0073      5.0  51922.847698  3.375080e+07   
6   M0073     10.0  75543.660382  5.966612e+07   
7   M0073     15.0  55277.027428  3.423961e+07   
8   M0073     20.0  26945.782055  2.584022e+07   
9   M0073     25.0   4052.670711  6.904536e+05   
10  M0333      5.0  30090.687597  5.468791e+07   
11  M0333     10.0  55946.815385  5.768929e+07   
12  M0333     15.0  65026.329732  4.008600e+07   
13  M0333     20.0  59014.487216  2.994337e+07   
14  M0333     25.0  17423.635441  6.358991e+06  

Using:

# Running total of area within each code, in square metres
mrb['cum_area_sqm'] = mrb.groupby('code')['Shape_Area'].cumsum()
# Convert square metres to hectares
mrb['cum_area_ha'] = mrb['cum_area_sqm'] / 10000
mrb_cumsum = mrb.groupby(['code', 'Dist_km']).agg({'cum_area_ha': 'sum'})

I have managed to convert the data frame into the format below:

               cum_area_ha
code  Dist_km              
M0017 5.0       5076.464548
      10.0      9714.648238
      15.0     11140.974881
      20.0     11141.039525
      30.0     11551.274623
M0073 5.0       3375.080465
      10.0      9341.692680
      15.0     12765.654064
      20.0     15349.676332
      25.0     15418.721691
M0333 5.0       5468.790981
      10.0     11237.720454
      15.0     15246.320869
      20.0     18240.658255
      25.0     18876.557351 

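However, I now want the cumulative percentage of these areas for each code across Dist_km, reaching 100% at the last distance.

So, for example, for M0017 I would like something like the following.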

               cum_area_ha   cum_area_pc
code  Dist_km              
M0017 5.0       5076.464548    43.95
      10.0      9714.648238    84.10
      15.0     11140.974881    96.45
      20.0     11141.039525    96.45
      30.0     11551.274623   100.00

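You can divide each element by the last cum_area_ha in the same code group. The result is a fraction between 0 and 1; multiply by 100 to get a percentage.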

mrb_cumsum.div(mrb_cumsum.groupby(level=0).last())
Out[97]: 
               cum_area_ha
code  Dist_km             
M0017 5.0         0.439472
      10.0        0.841002
      15.0        0.964480
      20.0        0.964486
      30.0        1.000000
M0073 5.0         0.218895
      10.0        0.605867
      15.0        0.827932
      20.0        0.995522
      25.0        1.000000
M0333 5.0         0.289713
      10.0        0.595327
      15.0        0.807685
      20.0        0.966313
      25.0        1.000000
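
If you want the percentage as an extra column next to cum_area_ha, as in the question's desired output, one possible sketch is below. It is not the code from the answer above: it swaps the div/last step for groupby(...).transform('last'), and it assumes the rows are already sorted by Dist_km within each code, so the last cumulative value in a group is that group's total. The column name cum_area_pc comes from the question; group_total and result are names made up for this example, and only the M0017 rows are used as sample data.

```python
import pandas as pd

# Sample rows for one code, copied from the question's table.
mrb = pd.DataFrame({
    "code": ["M0017"] * 5,
    "Dist_km": [5.0, 10.0, 15.0, 20.0, 30.0],
    "Shape_Area": [5.076465e+07, 4.638184e+07, 1.426327e+07,
                   6.464380e+02, 4.102351e+06],
})

# Cumulative area per code, converted from square metres to hectares.
mrb["cum_area_ha"] = mrb.groupby("code")["Shape_Area"].cumsum() / 10000

# Last cumulative value of each group, repeated on every row of that group.
# Assumption: rows are sorted by Dist_km within each code.
group_total = mrb.groupby("code")["cum_area_ha"].transform("last")

# Share of the group total, as a percentage rounded to two decimals.
mrb["cum_area_pc"] = (mrb["cum_area_ha"] / group_total * 100).round(2)

result = mrb.set_index(["code", "Dist_km"])[["cum_area_ha", "cum_area_pc"]]
print(result)
```

Because transform returns a Series aligned with the original rows, the division needs no index alignment, and the same lines should work unchanged when the frame holds several codes.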