Exclude a column from calculated value

Question

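I'm new to the library, and I'm trying to figure out how to add columns to a pivot table that hold the mean and standard deviation of each row's data for the last three months of transaction data.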

Here's the code that sets up the pivot table:

previousThreeMonths = [prev_month_for_analysis, prev_month2_for_analysis, prev_month3_for_analysis]
dfPreviousThreeMonths = df[df['Month'].isin(previousThreeMonths)]

ptHistoricalConsumption = dfPreviousThreeMonths.pivot_table(dfPreviousThreeMonths,
                                                            index=['Customer Part #'],
                                                            columns=['Month'],
                                                            aggfunc={'Qty Shp':np.sum}
                                                            )

ptHistoricalConsumption['Mean'] = ptHistoricalConsumption.mean(numeric_only=True, axis=1)
ptHistoricalConsumption['Std Dev'] = ptHistoricalConsumption.std(numeric_only=True, axis=1)
ptHistoricalConsumption

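The resulting pivot table looks like this:

The problem is that the Std Dev column includes the Mean column in its calculation, whereas I only want it to use the raw data from the previous three months. For example, the Std Dev for part number 2225 should be 11.269, not 9.2.

I'm sure there's a better way to do this and I'm just missing something.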

Answer 1

One way is to temporarily drop the Mean column before calling .std():

ptHistoricalConsumption['Std Dev'] = ptHistoricalConsumption.drop('Mean', axis=1).std(numeric_only=True, axis=1)

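This doesn't permanently remove it from the DataFrame; drop() returns a copy, so the column is only missing from the copy that is fed to .std().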
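As a rough illustration of the same idea, here is a minimal sketch on made-up data. The values and month labels are hypothetical, and for simplicity the pivot uses values= and aggfunc="sum", which gives flat column labels rather than the layout the question's dict-style aggfunc would produce. It shows two options: remembering the month columns before adding any derived columns, and dropping the derived columns from a copy.

```python
import pandas as pd

# Hypothetical toy data: column names mirror the question, values are made up.
df = pd.DataFrame({
    "Customer Part #": [2225, 2225, 2225, 3310, 3310, 3310],
    "Month": ["2023-01", "2023-02", "2023-03"] * 2,
    "Qty Shp": [10, 20, 30, 5, 5, 8],
})

pt = df.pivot_table(index="Customer Part #", columns="Month",
                    values="Qty Shp", aggfunc="sum")

# Option 1: snapshot the month columns before adding derived columns,
# then compute every statistic from that fixed set of columns.
month_cols = list(pt.columns)
pt["Mean"] = pt[month_cols].mean(axis=1)
pt["Std Dev"] = pt[month_cols].std(axis=1)

# Option 2 (the approach from the answer): drop the derived columns from a
# copy only; pt itself still contains "Mean" and "Std Dev" afterwards.
std_via_drop = pt.drop(columns=["Mean", "Std Dev"]).std(axis=1)

print(pt)
```

Both options should give the same standard deviation here. The snapshot variant may be easier to maintain if more derived columns are added later, since nothing needs to be dropped again.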


python

statistics

pivot-table

dataframe

pandas