Apply weight formula over rows of Dataframe Pandas

Question

I have a DataFrame df1, shown below. I copy it to df2 so that df1 is preserved, and then I use df3 to hold the calculations on df2.

df2=df1.copy()

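I want to calculate a weight, e.g. Weight(A) = Price(A) / Sum(row_Prices), and return it below the price in df2, so that for each row I get 3 rows of data: a price row, a std row and a weight row. I also want to calculate the standard deviation of the row, which I suppose takes a similar form.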

I tried

df3 = df2.iloc[1:,1:].div(df2.iloc[1:,1:].sum(axis=1), axis=0)

to get the weights and then print df3, but it does not work.

To get 2 rows for each date I tried stacking with .stack(), but I am probably doing it wrong. Help! Thanks

                       A      B      C        D     E
2006-04-27 00:00:00                                    
2006-04-28 00:00:00  69.62  69.62  6.518   65.09  69.62
2006-05-01 00:00:00   71.5   71.5  6.522   65.16   71.5
2006-05-02 00:00:00  72.34  72.34  6.669   66.55  72.34
2006-05-03 00:00:00  70.22  70.22  6.662   66.46  70.22
2006-05-04 00:00:00  68.32  68.32  6.758   67.48  68.32
2006-05-05 00:00:00     68     68  6.805   67.99     68
2006-05-08 00:00:00  67.88  67.88  6.768   67.56  67.88

I would like the output to look like this:

                            A      B      C        D     E
2006-04-27 00:00:00

2006-04-28 00:00:00                                    
            price        69.62  69.62  6.518   65.09  69.62
            weight
            std
2006-05-01 00:00:00  
            price         71.5   71.5  6.522   65.16   71.5
            weight
            std
2006-05-02 00:00:00   
            price        72.34  72.34  6.669   66.55  72.34
            weight
            std

Answer 1

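As far as I know, there is no quick and easy way to do what you want. You need to compute all the data first and then combine it into a DataFrame that uses a MultiIndex: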

import pandas as pd

# Build the weight and std DataFrames (df holds the prices, indexed by date)
cols = list('ABCDE')
weight = pd.DataFrame([df[col] / df.sum(axis=1) for col in df], index=cols).T
std = pd.DataFrame([df.std(axis=1) for col in df], index=cols).T

# Build an empty MultiIndex DataFrame with levels (kind, date)
mindex = pd.MultiIndex.from_product([['price', 'weight', 'std'], df.index])
new_df = pd.DataFrame(index=mindex, columns=cols)

# Insert the data (.loc replaces the removed .ix indexer)
new_df.loc['price'] = df.values
new_df.loc['weight'] = weight.values
new_df.loc['std'] = std.values

# Swap levels so the date comes first, then sort
new_df = new_df.swaplevel(0, 1).sort_index()

The resulting new_df should look like this:

2006-04-28 price      69.62     69.62      6.518     65.09     69.62
           std      27.7829   27.7829    27.7829   27.7829   27.7829
           weight  0.248228  0.248228  0.0232397  0.232076  0.248228
2006-05-01 price       71.5      71.5      6.522     65.16      71.5
           std      28.4828   28.4828    28.4828   28.4828   28.4828
           weight  0.249841  0.249841  0.0227897  0.227687  0.249841
2006-05-02 price      72.34     72.34      6.669     66.55     72.34
           std      28.8308   28.8308    28.8308   28.8308   28.8308
           weight  0.249243  0.249243  0.0229776  0.229294  0.249243
2006-05-03 price      70.22     70.22      6.662     66.46     70.22
           std      28.0509   28.0509    28.0509   28.0509   28.0509
           weight  0.247443  0.247443  0.0234758  0.234194  0.247443
2006-05-04 price      68.32     68.32      6.758     67.48     68.32
           std      27.4399   27.4399    27.4399   27.4399   27.4399
           weight  0.244701  0.244701   0.024205  0.241692  0.244701
2006-05-05 price         68        68      6.805     67.99        68
           std      27.3661   27.3661    27.3661   27.3661   27.3661
           weight  0.243907  0.243907  0.0244086  0.243871  0.243907
2006-05-08 price      67.88     67.88      6.768     67.56     67.88
           std      27.2947   27.2947    27.2947   27.2947   27.2947
           weight  0.244201  0.244201  0.0243481   0.24305  0.244201

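As a side note, I was not sure which standard deviation you wanted to compute, so I simply assumed it is the standard deviation of the prices in each row (a single value repeated across the row).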
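For what it is worth, the same price/std/weight layout can probably be built without pre-allocating an empty frame, by handing the three frames to pd.concat with keys. The following is only a minimal sketch, not a definitive implementation. It assumes the prices are already numeric, that the empty 2006-04-27 row has been dropped, and it hard-codes the first three rows from the question as sample data. The row-wise std uses the pandas default (sample standard deviation, ddof=1), which is the same assumption as in the side note above.

```python
import pandas as pd

# Sample prices copied from the question (the empty 2006-04-27 row is left out).
df = pd.DataFrame(
    {
        "A": [69.62, 71.50, 72.34],
        "B": [69.62, 71.50, 72.34],
        "C": [6.518, 6.522, 6.669],
        "D": [65.09, 65.16, 66.55],
        "E": [69.62, 71.50, 72.34],
    },
    index=pd.to_datetime(["2006-04-28", "2006-05-01", "2006-05-02"]),
)

# Weight of each price within its row: price / row total.
weight = df.div(df.sum(axis=1), axis=0)

# Row-wise sample standard deviation (ddof=1), repeated in every column.
std = pd.DataFrame({col: df.std(axis=1) for col in df.columns})

# Stack the three frames under labelled keys, then move the date to the outer level.
new_df = (
    pd.concat([df, weight, std], keys=["price", "weight", "std"])
    .swaplevel(0, 1)
    .sort_index()
)
print(new_df)
```

Because sort_index orders the inner level alphabetically, each date shows its rows as price, std, weight, just like the output above. Unlike the pre-allocated frame, the result keeps a float dtype, so the printed formatting may differ slightly.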
