Percentage value for monthly data with pandas

I have some sample data:

date        Product  Sales
2020-01-01.  Dell.    4
2020-01-01.  Apple.   6
2020-01-01.  Lenovo.  5
2020-01-02.  Dell.    2
2020-01-02.  Apple.   4
2020-01-02.  Lenovo.  3

I want to create another column named 'percentage monthly sale', computed as (a product's monthly sales / total sales of all products in that month) * 100.

The output should look like this:

date        Product  Sales. Percentage_monthly_sale
2020-01-01.  Dell.    4.      26.6 (4/15 *100)
2020-01-01.  Apple.   6.      40.0. (6/15*100)
2020-01-01.  Lenovo.  5.      33.3.  (5/15 *100)
2020-01-02.  Dell.    2.      22.2 (2/9 *100)
2020-01-02.  Apple.   4.      44.4 (4/9 *100)
2020-01-02.  Lenovo.  3.      33.3 (3/9 *100)

Use groupby transform with pd.Grouper to compute the group sums, then divide the Sales series by those sums and multiply by 100:

(This reproduces the desired output as shown in the question, which is grouped per day.)

df['date'] = pd.to_datetime(df['date'])

daily_sums = df.groupby(
    pd.Grouper(key='date', freq='1D')
)['Sales'].transform('sum')

df['Percentage_daily_sale'] = df['Sales'] / daily_sums * 100

Output:

        date  Product  Sales  Percentage_daily_sale
0 2020-01-01    Dell.      4              26.666667
1 2020-01-01   Apple.      6              40.000000
2 2020-01-01  Lenovo.      5              33.333333
3 2020-01-02    Dell.      2              22.222222
4 2020-01-02   Apple.      4              44.444444
5 2020-01-02  Lenovo.      3              33.333333
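
For anyone who wants to run the daily version end to end, here is a minimal self-contained sketch. The DataFrame construction is an assumption based on the sample data in the question (with the trailing dots in Product kept as they appear there); the rest follows the snippet above.

```python
import pandas as pd

# Sample data rebuilt from the question (an assumption about how it was loaded).
df = pd.DataFrame({
    "date": ["2020-01-01"] * 3 + ["2020-01-02"] * 3,
    "Product": ["Dell.", "Apple.", "Lenovo."] * 2,
    "Sales": [4, 6, 5, 2, 4, 3],
})
df["date"] = pd.to_datetime(df["date"])

# transform('sum') returns one value per original row (the total of that
# row's group), so it can be divided into the Sales column directly.
daily_sums = df.groupby(
    pd.Grouper(key="date", freq="1D")
)["Sales"].transform("sum")

df["Percentage_daily_sale"] = df["Sales"] / daily_sums * 100

print(df)
```

Because transform keeps the original index, no merge or reindexing step is needed, and the percentages within each day add up to 100.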

To get each product's share of the monthly sales instead:

(This is the behavior described in the text of the question.)

df['date'] = pd.to_datetime(df['date'])

# Note: pandas 2.2+ deprecates the 'M' offset alias in favor of 'ME' (month end).
monthly_product_total = df.groupby(
    [pd.Grouper(key='date', freq='1M'), 'Product']
)['Sales'].transform('sum')

monthly_total = df.groupby(
    pd.Grouper(key='date', freq='1M')
)['Sales'].transform('sum')

df['Percentage_Monthly_sale'] = monthly_product_total / monthly_total * 100

Output:

        date  Product  Sales  Percentage_Monthly_sale
0 2020-01-01    Dell.      4                25.000000
1 2020-01-01   Apple.      6                41.666667
2 2020-01-01  Lenovo.      5                33.333333
3 2020-01-02    Dell.      2                25.000000
4 2020-01-02   Apple.      4                41.666667
5 2020-01-02  Lenovo.      3                33.333333
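
As a possible alternative to the frequency string in pd.Grouper, the month can be derived with Series.dt.to_period and used as the grouping key. This is only a sketch under the assumption that the data looks like the sample in the question; the variable name `month` is introduced here for illustration and is not part of the answer above.

```python
import pandas as pd

# Sample data rebuilt from the question (an assumption about how it was loaded).
df = pd.DataFrame({
    "date": ["2020-01-01"] * 3 + ["2020-01-02"] * 3,
    "Product": ["Dell.", "Apple.", "Lenovo."] * 2,
    "Sales": [4, 6, 5, 2, 4, 3],
})
df["date"] = pd.to_datetime(df["date"])

# Hypothetical helper key: the calendar month of each row, e.g. 2020-01.
month = df["date"].dt.to_period("M")

# Sales of each product within its month, repeated on every row of that product.
monthly_product_total = df.groupby([month, "Product"])["Sales"].transform("sum")

# Sales of all products within the month, repeated on every row of that month.
monthly_total = df.groupby(month)["Sales"].transform("sum")

df["Percentage_Monthly_sale"] = monthly_product_total / monthly_total * 100

print(df)
```

With the sample data, January totals 24 units, of which Dell has 6, Apple 10, and Lenovo 8, so every Dell row gets 25.0 and so on, matching the table above.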

You can use groupby transform together with a lambda function:

df['Percentage_daily_sale'] = df.groupby(
    ['date'])['Sales'].transform(lambda x: (x/x.sum()) * 100)

Output:

          date  Product  Sales  Percentage_daily_sale
0  2020-01-01.    Dell.      4                  26.67
1  2020-01-01.   Apple.      6                  40.00
2  2020-01-01.  Lenovo.      5                  33.33
3  2020-01-02.    Dell.      2                  22.22
4  2020-01-02.   Apple.      4                  44.44
5  2020-01-02.  Lenovo.      3                  33.33
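
The table above shows two decimal places, while the lambda on its own produces full-precision floats such as 26.666667. The sketch below assumes the displayed values were obtained by rounding and adds a .round(2) step; that step, and the DataFrame construction, are assumptions and not part of the answer as posted.

```python
import pandas as pd

# Sample data rebuilt from the question; dates are left as plain strings,
# which is enough for grouping by identical values.
df = pd.DataFrame({
    "date": ["2020-01-01"] * 3 + ["2020-01-02"] * 3,
    "Product": ["Dell.", "Apple.", "Lenovo."] * 2,
    "Sales": [4, 6, 5, 2, 4, 3],
})

# Within each date group, divide every sale by the group's total.
pct = df.groupby("date")["Sales"].transform(lambda x: x / x.sum() * 100)

# Assumed extra step: round to two decimals for display.
df["Percentage_daily_sale"] = pct.round(2)

print(df)
```

Rounding before storing loses precision, so if the column feeds further calculations it may be preferable to keep the unrounded values and round only when printing.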