Percentage value for monthly data with pandas
I have some sample data:
date        Product  Sales
2020-01-01  Dell     4
2020-01-01  Apple    6
2020-01-01  Lenovo   5
2020-01-02  Dell     2
2020-01-02  Apple    4
2020-01-02  Lenovo   3
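I want to create another column named 'percentage monthly sale', computed as (monthly sales of the product / total sales of all products in that month) * 100.
The output should look like this: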
date        Product  Sales  Percentage_monthly_sale
2020-01-01  Dell     4      26.6  (4/15 * 100)
2020-01-01  Apple    6      40.0  (6/15 * 100)
2020-01-01  Lenovo   5      33.3  (5/15 * 100)
2020-01-02  Dell     2      22.2  (2/9 * 100)
2020-01-02  Apple    4      44.4  (4/9 * 100)
2020-01-02  Lenovo   3      33.3  (3/9 * 100)
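Use groupby transform with pd.Grouper to compute the sums, then divide the series and multiply by 100.
This reproduces the desired output shown in the question, which is a percentage of each day's total: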
import pandas as pd

df['date'] = pd.to_datetime(df['date'])
# Total sales per calendar day, broadcast back onto every row
daily_sums = df.groupby(
    pd.Grouper(key='date', freq='1D')
)['Sales'].transform('sum')
df['Percentage_daily_sale'] = df['Sales'] / daily_sums * 100
   date        Product  Sales  Percentage_daily_sale
0  2020-01-01  Dell.    4      26.666667
1  2020-01-01  Apple.   6      40.000000
2  2020-01-01  Lenovo.  5      33.333333
3  2020-01-02  Dell.    2      22.222222
4  2020-01-02  Apple.   4      44.444444
5  2020-01-02  Lenovo.  3      33.333333
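For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame construction is an assumption about how the sample data was loaded (product names are written without the trailing dots), so treat it as an illustration rather than the exact setup used above.

```python
import pandas as pd

# Assumed construction of the sample data from the question
df = pd.DataFrame({
    "date": ["2020-01-01"] * 3 + ["2020-01-02"] * 3,
    "Product": ["Dell", "Apple", "Lenovo"] * 2,
    "Sales": [4, 6, 5, 2, 4, 3],
})
df["date"] = pd.to_datetime(df["date"])

# transform('sum') returns one value per original row (the total of that
# row's group), so it lines up with df["Sales"] for element-wise division.
daily_sums = df.groupby(pd.Grouper(key="date", freq="D"))["Sales"].transform("sum")
df["Percentage_daily_sale"] = df["Sales"] / daily_sums * 100

print(df)
```

The key point is that transform keeps the original index, unlike agg, which would collapse each day to a single row.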
To get each product's share of the monthly sales instead (the behavior described in the question's text):
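df['date'] = pd.to_datetime(df['date'])
# Note: on pandas 2.2+, the 'M' alias is deprecated in favor of 'ME' (month end)
# Sales of each product within each month
monthly_product_total = df.groupby(
    [pd.Grouper(key='date', freq='1M'), 'Product']
)['Sales'].transform('sum')
# Sales of all products within each month
monthly_total = df.groupby(
    pd.Grouper(key='date', freq='1M')
)['Sales'].transform('sum')
df['Percentage_Monthly_sale'] = monthly_product_total / monthly_total * 100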
   date        Product  Sales  Percentage_Monthly_sale
0  2020-01-01  Dell.    4      25.000000
1  2020-01-01  Apple.   6      41.666667
2  2020-01-01  Lenovo.  5      33.333333
3  2020-01-02  Dell.    2      25.000000
4  2020-01-02  Apple.   4      41.666667
5  2020-01-02  Lenovo.  3      33.333333
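If the frequency alias is a concern across pandas versions, one possible alternative is to group on a monthly period derived from the date column. This is a sketch of a swapped-in technique (Series.dt.to_period), not what the answer above uses, and the DataFrame construction is again an assumption.

```python
import pandas as pd

# Assumed construction of the sample data from the question
df = pd.DataFrame({
    "date": ["2020-01-01"] * 3 + ["2020-01-02"] * 3,
    "Product": ["Dell", "Apple", "Lenovo"] * 2,
    "Sales": [4, 6, 5, 2, 4, 3],
})
df["date"] = pd.to_datetime(df["date"])

# One Period per row, e.g. 2020-01, used purely as a grouping key
month = df["date"].dt.to_period("M")

# Numerator: sales of this product in this month
product_month_total = df.groupby([month, "Product"])["Sales"].transform("sum")
# Denominator: sales of all products in this month
month_total = df.groupby(month)["Sales"].transform("sum")

df["Percentage_Monthly_sale"] = product_month_total / month_total * 100
print(df)
```

With the sample data, both days fall in January 2020, so each product gets the same monthly share on both of its rows (for example Dell: (4 + 2) / 24 * 100 = 25).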
You can use groupby transform with a lambda function:
df['Percentage_daily_sale'] = df.groupby('date')['Sales'].transform(
    lambda x: x / x.sum() * 100
)
Output:
   date         Product  Sales  Percentage_daily_sale
0  2020-01-01.  Dell.    4      26.67
1  2020-01-01.  Apple.   6      40.00
2  2020-01-01.  Lenovo.  5      33.33
3  2020-01-02.  Dell.    2      22.22
4  2020-01-02.  Apple.   4      44.44
5  2020-01-02.  Lenovo.  3      33.33
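As a rough check, the lambda form and the lambda-free form (divide by transform('sum')) should give the same numbers. The sketch below compares them and rounds explicitly; the explicit .round(2) is an assumption about how the two-decimal values above were obtained, since the snippet itself does not round. The DataFrame construction is likewise assumed.

```python
import pandas as pd

# Assumed construction of the sample data; dates are left as plain strings,
# which is enough when grouping by exact date values.
df = pd.DataFrame({
    "date": ["2020-01-01"] * 3 + ["2020-01-02"] * 3,
    "Product": ["Dell", "Apple", "Lenovo"] * 2,
    "Sales": [4, 6, 5, 2, 4, 3],
})

# Lambda form: the function is called once per group
with_lambda = df.groupby("date")["Sales"].transform(lambda x: x / x.sum() * 100)

# Lambda-free form: a built-in aggregation followed by vectorized arithmetic
vectorized = df["Sales"] / df.groupby("date")["Sales"].transform("sum") * 100

# Round only for display; keep full precision if the values feed later math
df["Percentage_daily_sale"] = vectorized.round(2)
print(df)
```

On large frames with many groups, the lambda-free form usually avoids per-group Python calls, which is why it is often preferred.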