Split Revenue for Fiscal Quarters that Straddle Calendar Quarters in Python

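I have sales data (revenue and units) by customer, by product (type, ID, description), and by "fiscal quarter id". The fiscal quarters are specific to this company and are irregular (i.e., they do not all contain exactly the same number of days).

I want (I think?) to "split" each row into two effective observations/transactions, so that the appropriate share of units and revenue is allocated to each of the two regular calendar quarters that the fiscal quarter straddles.

I also have a table (df2) that maps each of the company's fiscal quarters to its calendar start and end dates.

Tiny sample: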

df1 = pd.DataFrame({'fisc_q_id': ['2013Q1', '2013Q2'], 
                   'cust':['Faux Corp', 'Notaco'], 
                   'prod_id':['ABC-123', 'DEF-456'], 
                   'revenue':[100, 400]})

df2 = pd.DataFrame({'fisc_q_id': ['2013Q1', '2013Q2'], 
                    'fq_start':['2012-07-29', '2012-10-28'], 
                    'fq_end':['2012-10-27', '2013-01-26']})

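The desired output would be four rows, each keeping the original "fiscal quarter ID" but with added columns containing the appropriate calendar quarter and the revenue allocated to that quarter.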
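To make the target concrete, here is a minimal sketch of the allocation for a single row. It assumes that the "appropriate share" is proportional to the number of calendar days of the fiscal quarter that fall in each calendar quarter, counting both end dates (that weighting is my assumption). `split_by_days` is a hypothetical helper name, not part of pandas.

```python
import pandas as pd


def split_by_days(fq_start, fq_end, revenue):
    """Allocate revenue to calendar quarters in proportion to days (assumed rule)."""
    # Every day of the fiscal quarter, both endpoints included
    days = pd.date_range(fq_start, fq_end, freq="D")
    # Number of those days that fall in each calendar quarter
    counts = days.to_period("Q").value_counts().sort_index()
    return (revenue * counts / counts.sum()).round(2)


# Fiscal 2013Q1 from the sample: 64 days in 2012Q3, 27 days in 2012Q4
shares = split_by_days("2012-07-29", "2012-10-27", 100)
print(shares)  # 2012Q3 -> 70.33, 2012Q4 -> 29.67
```

Counting days this way is simple but slow for large tables, since it materialises every date; it is only meant to pin down the numbers expected in the output.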

I have a few ideas about how this might work, but my solution, if I can even find one, will surely be inelegant compared with the ones you all can offer.

IIUC

import numpy as np
import pandas as pd

# Merge the dataframes on the fiscal quarter id
df3 = df1.merge(df2, on='fisc_q_id')
# Coerce the dates to datetime
df3['fq_start'] = pd.to_datetime(df3['fq_start'])
df3['fq_end'] = pd.to_datetime(df3['fq_end'])
# Calendar quarters containing the start and end dates
df3['fq_startquarter'] = df3['fq_start'].dt.to_period('Q')
df3['fq_endquarter'] = df3['fq_end'].dt.to_period('Q')
# Last day (at midnight) of the calendar quarter in which the fiscal quarter starts
end_q = df3['fq_startquarter'].dt.end_time.dt.normalize()
# Day counts on either side of that boundary, counting both end dates
df3['days1'] = (end_q - df3['fq_start']).dt.days + 1
df3['days2'] = (df3['fq_end'] - end_q).dt.days
df3['days0'] = df3['days1'] + df3['days2']
# Melt the two quarter columns into one row per calendar quarter
df4 = pd.melt(df3, id_vars=['fisc_q_id', 'cust', 'prod_id', 'revenue', 'fq_start', 'fq_end', 'days1', 'days2', 'days0'], value_name='CalendarQuarter')
df4 = df4.sort_values(by=['fisc_q_id', 'prod_id', 'CalendarQuarter'])
# Pick the day count that belongs to each melted row (start quarter vs. end quarter)
df4['daysp'] = np.where(df4['variable'] == 'fq_startquarter', df4['days1'], df4['days2'])
# Allocate revenue in proportion to the days in each calendar quarter
df4['revenuep'] = (df4['revenue'] * df4['daysp'] / df4['days0']).round(0)
# Drop the helper columns that are no longer required
df4 = df4.drop(columns=['variable', 'days1', 'days2', 'days0', 'daysp'])

It's convoluted. There is surely room to method-chain this, which would make it tidier and possibly faster.
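As a rough illustration of what a method-chained version might look like, here is a minimal sketch built on the sample frames from the question. It assumes every fiscal quarter straddles exactly two calendar quarters and that revenue is split in proportion to inclusive day counts. The intermediate column names (`cut`, `days_first`, `days_total`, `start_q`, `end_q`, `part`, `cal_quarter`, `revenue_alloc`) are my own and purely illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'fisc_q_id': ['2013Q1', '2013Q2'],
                    'cust': ['Faux Corp', 'Notaco'],
                    'prod_id': ['ABC-123', 'DEF-456'],
                    'revenue': [100, 400]})
df2 = pd.DataFrame({'fisc_q_id': ['2013Q1', '2013Q2'],
                    'fq_start': ['2012-07-29', '2012-10-28'],
                    'fq_end': ['2012-10-27', '2013-01-26']})

out = (
    df1.merge(df2, on='fisc_q_id')
    .assign(
        fq_start=lambda d: pd.to_datetime(d['fq_start']),
        fq_end=lambda d: pd.to_datetime(d['fq_end']),
        # Last day of the calendar quarter in which the fiscal quarter starts
        cut=lambda d: d['fq_start'].dt.to_period('Q').dt.end_time.dt.normalize(),
        # Inclusive day counts: first calendar quarter, and the whole fiscal quarter
        days_first=lambda d: (d['cut'] - d['fq_start']).dt.days + 1,
        days_total=lambda d: (d['fq_end'] - d['fq_start']).dt.days + 1,
        start_q=lambda d: d['fq_start'].dt.to_period('Q').astype(str),
        end_q=lambda d: d['fq_end'].dt.to_period('Q').astype(str),
    )
    # One row per (fiscal quarter, calendar quarter) pair
    .melt(id_vars=['fisc_q_id', 'cust', 'prod_id', 'revenue', 'days_first', 'days_total'],
          value_vars=['start_q', 'end_q'], var_name='part', value_name='cal_quarter')
    .assign(
        days_in_q=lambda d: np.where(d['part'].eq('start_q'),
                                     d['days_first'],
                                     d['days_total'] - d['days_first']),
        revenue_alloc=lambda d: (d['revenue'] * d['days_in_q'] / d['days_total']).round(2),
    )
    .sort_values(['fisc_q_id', 'cal_quarter'])
    .loc[:, ['fisc_q_id', 'cust', 'prod_id', 'cal_quarter', 'revenue_alloc']]
    .reset_index(drop=True)
)
print(out)
```

Keying the day count off the melted `part` column, rather than off row position within a group, means the result does not depend on sort order or on `prod_id` being unique. A fiscal quarter that sits entirely inside one calendar quarter, or spans three, would need extra handling that this sketch does not attempt.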