Understanding Correlation Between Columns in a Pandas DataFrame
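I have a dataset containing the daily sales of two products over the first 10 days after release. The DataFrame below shows, for each product, the number of single items and the number of dozens sold per day. It is assumed that no dozen of a product is sold before at least one single item has been sold. For each Period_ID, both products have an expected number of dozens to be sold (columns Prod_A_Doz and Prod_B_Doz).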
import pandas as pd

d = {'Period_ID':['A12']*10, 'Prod_A_Doz':[1.2]*10, 'Prod_B_Doz':[2.4]*10, 'A_Singles':[0,0,0,1,1,2,2,3,3,4], 'B_Singles':[0,0,1,1,2,2,3,3,4,4],
     'A_Dozens':[0,0,0,0,0,0,0,1,1,1], 'B_Dozens':[0,0,0,0,0,0,1,1,2,2]}
df = pd.DataFrame(data=d)
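Question
I want to build a descriptive analysis. One of my questions is: on average, how many single items of each product are sold before the first, second, ..., tenth dozen is sold?
Note that df.Period_ID.nunique() = 1568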
Edit: with a dataset of daily sales, as opposed to the cumulative sales above, and using Pankaj Joshi's solution with a slight change,
print(f'Average number of single items before {index + 1} dozen = {df1.A_Singles[:val+1].mean():0.2f}')
d = {'Period_ID':['A12']*10, 'Prod_A_Doz':[1.2]*10, 'Prod_B_Doz':[2.4]*10, 'A_Singles':[0,0,0,1,0,1,0,1,0,1], 'B_Singles':[0,0,1,0,1,0,1,0,1,0],
     'A_Dozens':[0,0,0,0,0,0,0,1,0,0], 'B_Dozens':[0,0,0,0,0,0,1,0,1,0]}
df1 = pd.DataFrame(data=d)
# For product A
Average number of single items before 1 dozen = 0.38
# For product B
6
Average number of single items before 1 dozen = 0.43
8
Average number of single items before 2 dozen = 0.44
However, I want this to be counted from the last dozen sale, so rather than 0.44 it should be 0.5.
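The "count from the last dozen sale" requirement can be sketched as a windowed mean. This is only an illustration under a few assumptions: the data is daily (not cumulative), each window includes the day of the dozen sale itself (mirroring the `val+1` slice above), and a day with several dozens sold counts as a single sale event. The helper name `avg_singles_per_dozen_sale` is hypothetical.

```python
import pandas as pd

def avg_singles_per_dozen_sale(singles, dozens):
    """Mean daily singles in each window that ends on a day with a dozen sale.

    A window starts on the day after the previous dozen sale (or on the first
    day) and includes the day of the sale itself.
    """
    singles = pd.Series(singles).reset_index(drop=True)
    dozens = pd.Series(dozens).reset_index(drop=True)
    means = []
    start = 0
    # Positions of the days on which at least one dozen was sold
    for pos in dozens.index[dozens > 0]:
        means.append(singles.iloc[start:pos + 1].mean())
        start = pos + 1  # the next window begins after this sale
    return means

# Daily data for product B from the question
b_singles = [0, 0, 1, 0, 1, 0, 1, 0, 1, 0]
b_dozens = [0, 0, 0, 0, 0, 0, 1, 0, 1, 0]

result = avg_singles_per_dozen_sale(b_singles, b_dozens)
for k, m in enumerate(result, start=1):
    print(f'Average number of single items before {k} dozen = {m:0.2f}')
# prints 0.43 for the first dozen and 0.50 for the second
```

Days after the final dozen sale fall into no window and are ignored in this sketch.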
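The goal is that, once I have this information for each Period_ID, I will take the average over all df.Period_ID.nunique() (= 1568) periods and try to optimise the expected number of dozens sold for each product, given in the columns Prod_A_Doz and Prod_B_Doz.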
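One possible way to go from per-period figures to an average over all periods is a grouped computation. The sketch below assumes daily data, numbers the dozen sales within each Period_ID, and then averages by sale number. The second period `'B07'` and its values are made up for illustration, and `per_sale_means` is a hypothetical helper.

```python
import pandas as pd

# 'A12' comes from the question; 'B07' is an invented second period.
df = pd.DataFrame({
    'Period_ID': ['A12'] * 10 + ['B07'] * 10,
    'A_Singles': [0, 0, 0, 1, 0, 1, 0, 1, 0, 1] + [1, 1, 0, 1, 1, 0, 0, 1, 0, 0],
    'A_Dozens':  [0, 0, 0, 0, 0, 0, 0, 1, 0, 0] + [0, 0, 0, 1, 0, 0, 0, 1, 0, 0],
})

def per_sale_means(df, singles_col, dozens_col):
    """Mean daily singles per (Period_ID, dozen-sale number)."""
    sold = (df[dozens_col] > 0).astype(int)
    by_period = sold.groupby(df['Period_ID'])
    # Number of dozen sales strictly before each row, within its period
    window = by_period.cumsum() - sold
    # Rows after the last sale of a period belong to no window
    keep = window < by_period.transform('sum')
    return (df[keep]
            .assign(sale_no=window[keep] + 1)
            .groupby(['Period_ID', 'sale_no'])[singles_col]
            .mean())

per_period = per_sale_means(df, 'A_Singles', 'A_Dozens')
# Average over periods for the 1st, 2nd, ... dozen sale
overall = per_period.groupby(level='sale_no').mean()
print(per_period)
print(overall)
```

A period that never reaches a given sale number simply does not contribute to that average, which may or may not be the desired treatment for the full 1568-period dataset.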
Any help would be appreciated.
Here is how I would approach it:
import pandas as pd

d = {'Period_ID':['A12']*10, 'Prod_A_Doz':[1.2]*10, 'Prod_B_Doz':[2.4]*10, 'A_Singles':[0,0,0,1,1,2,2,3,3,4], 'B_Singles':[0,0,1,1,2,2,3,3,4,4],
     'A_Dozens':[0,0,0,0,0,0,0,1,1,1], 'B_Dozens':[0,0,0,0,0,0,1,1,2,2]}
df1 = pd.DataFrame(data=d)

for per_id in df1.Period_ID.unique():
    print(per_id)
    # Reset the index so that labels equal row positions within this period
    df_temp = df1[df1.Period_ID == per_id].reset_index(drop=True)
    for prod in ('A', 'B'):
        print(f'# For product {prod}')
        singles = df_temp[f'{prod}_Singles']
        # Rows on which this product's dozens column is non-zero
        for index, val in enumerate(df_temp.index[df_temp[f'{prod}_Dozens'] > 0]):
            print(val)
            # Mean of the singles column over the rows before row val
            print(f'Average number of single items before {index + 1} dozen = {singles.iloc[:val].mean():0.2f}')
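The sample data in this answer is cumulative, so a test such as `A_Dozens > 0` stays true on every day after the first dozen is sold. If daily figures are wanted instead, one option (a sketch, assuming rows are already sorted by day within each Period_ID) is to difference the cumulative columns per period:

```python
import pandas as pd

d = {'Period_ID': ['A12'] * 10,
     'A_Singles': [0, 0, 0, 1, 1, 2, 2, 3, 3, 4],
     'A_Dozens':  [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]}
cumulative = pd.DataFrame(data=d)

cols = ['A_Singles', 'A_Dozens']
daily = cumulative.copy()
# diff() leaves NaN on the first row of each period; there the daily value
# equals the cumulative value, so fill from the original columns.
daily[cols] = (cumulative.groupby('Period_ID')[cols].diff()
               .fillna(cumulative[cols])
               .astype(int))

print(daily['A_Singles'].tolist())  # [0, 0, 0, 1, 0, 1, 0, 1, 0, 1]
print(daily['A_Dozens'].tolist())   # [0, 0, 0, 0, 0, 0, 0, 1, 0, 0]
```

After this conversion, `daily['A_Dozens'] > 0` flags only the day on which a dozen was actually sold (row 7 here) rather than rows 7, 8 and 9.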