Recursion: account value with distributions
Update: I'm not sure whether this is feasible without some form of loop, but np.where will not work here. If the answer is "you can't", then so be it. If it can be done, it may use something from scipy.signal.
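I'd like to vectorize the loop in the code below, but I'm not sure how, because of the recursive nature of the output.
A walkthrough of my current setup:
Take a starting amount ($1 million) and a quarterly dollar distribution ($5,000):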
dist = 5000.
v0 = float(1e6)
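Generate some random security/account returns (decimal form) at monthly frequency:
# on pandas >= 2.2 the month-end alias is 'ME'; 'M' is deprecated there
r = pd.Series(np.random.rand(12) * .01,
              index=pd.date_range('2017', freq='M', periods=12))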
Create an empty Series that will hold the monthly account values:
value = pd.Series(np.empty_like(r), index=r.index)
Add a "start month" to value. This label will contain v0.
from pandas.tseries import offsets
# pd.concat replaces Series.append, which was removed in pandas 2.0
value = pd.concat(
    [value, pd.Series(v0, index=[value.index[0] - offsets.MonthEnd(1)])]
).sort_index()
The loop I'd like to get rid of is here:
for date in value.index[1:]:
    if date.is_quarter_end:
        value.loc[date] = value.loc[date - offsets.MonthEnd(1)] \
                          * (1 + r.loc[date]) - dist
    else:
        value.loc[date] = value.loc[date - offsets.MonthEnd(1)] \
                          * (1 + r.loc[date])
Combined code:
import pandas as pd
from pandas.tseries import offsets
import numpy as np

dist = 5000.
v0 = float(1e6)
# on pandas >= 2.2 the month-end alias is 'ME'; 'M' is deprecated there
r = pd.Series(np.random.rand(12) * .01, index=pd.date_range('2017', freq='M', periods=12))
value = pd.Series(np.empty_like(r), index=r.index)
# pd.concat replaces Series.append, which was removed in pandas 2.0
value = pd.concat(
    [value, pd.Series(v0, index=[value.index[0] - offsets.MonthEnd(1)])]
).sort_index()
for date in value.index[1:]:
    if date.is_quarter_end:
        value.loc[date] = value.loc[date - offsets.MonthEnd(1)] * (1 + r.loc[date]) - dist
    else:
        value.loc[date] = value.loc[date - offsets.MonthEnd(1)] * (1 + r.loc[date])
In pseudocode, all the loop does is:
for each date in index of value:
    if the date is not a quarter end:
        multiply previous value by (1 + r) for that month
    if the date is a quarter end:
        multiply previous value by (1 + r) for that month and subtract dist
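The problem is that I currently don't see how vectorization is possible, because each successive value depends on whether a distribution was made in an earlier month. I get the desired result, but the loop is inefficient for higher-frequency data or longer time periods.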
OK... here is my attempt.
import numpy as np
import pandas as pd

# Define a generator for accumulating deposits and returns
def gen(lst):
    acu = 0
    for r, v in lst:
        yield acu * (1 + r) + v
        acu *= (1 + r)
        acu += v

dist = 5000.
v0 = float(1e6)
random_returns = np.random.rand(12) * 0.1
# Create the index.
index = pd.date_range('2016-12-31', freq='M', periods=13)
# Generate a return so that the value at i equals the return from i-1 to i
r = pd.Series(np.insert(random_returns, 0, 0), index=index, name='Return')
# Generate series with deposits and withdrawals
w = [-dist if is_q_end else 0 for is_q_end in index[1:].is_quarter_end]
d = pd.Series(np.insert(w, 0, v0), index=index, name='Movements')
df = pd.concat([r, d], axis=1)
df['Value'] = list(gen(zip(df['Return'], df['Movements'])))
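The generator above is a running fold over (return, movement) pairs. As a minimal sketch of the same idea, assuming Python 3.8+ (for the `initial` argument), it could also be written with `itertools.accumulate`. The function name `account_values` and the toy inputs are made up for illustration and are not part of the original answer.

```python
from itertools import accumulate

def account_values(returns, movements):
    """Running balance: grow the previous balance by (1 + r), then add the movement."""
    pairs = zip(returns, movements)
    # accumulate passes the previous balance and the next (r, v) pair to the lambda.
    # initial=0.0 plays the role of `acu = 0` in gen; [1:] drops that seed value.
    balances = accumulate(
        pairs, lambda acu, rv: acu * (1 + rv[0]) + rv[1], initial=0.0
    )
    return list(balances)[1:]

# Hypothetical toy inputs: deposit 100, then two periods at 10%,
# with a withdrawal of 5 at the end of the second period.
print(account_values([0.0, 0.1, 0.1], [100.0, 0.0, -5.0]))
```

This is still a Python-level loop under the hood, so it should not be expected to beat the generator by much; it only makes the fold structure explicit.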
Now, your code:
# Generate some random security/account returns (decimal form) at monthly freq:
r = pd.Series(random_returns,
              index=pd.date_range('2017', freq='M', periods=12))
# Create an empty Series that will hold the monthly account values:
value = pd.Series(np.empty_like(r), index=r.index)
# Add a "start month" to value. This label will contain v0.
from pandas.tseries import offsets
# pd.concat replaces Series.append, which was removed in pandas 2.0
value = pd.concat(
    [value, pd.Series(v0, index=[value.index[0] - offsets.MonthEnd(1)])]
).sort_index()
# The loop I'd like to get rid of is here:
def loopy(value):
    for date in value.index[1:]:
        if date.is_quarter_end:
            value.loc[date] = value.loc[date - offsets.MonthEnd(1)] \
                              * (1 + r.loc[date]) - dist
        else:
            value.loc[date] = value.loc[date - offsets.MonthEnd(1)] \
                              * (1 + r.loc[date])
    return value
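And the comparison and timing:
# r was redefined above with 12 entries, so compare against the aligned 13-row columns in df
(loopy(value) == list(gen(zip(df['Return'], df['Movements'])))).all()
Out[11]: True
Both return the same result.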
%timeit list(gen(zip(r, d)))
%timeit loopy(value)
10000 loops, best of 3: 72.4 µs per loop
100 loops, best of 3: 5.37 ms per loop
And it seems to be quite a bit faster: about 72 µs versus 5.4 ms per call in this run. Hope it helps.
You can use the following code:
cum_r = (1 + r).cumprod()
result = cum_r * v0
for date in r.index[r.index.is_quarter_end]:
    result.loc[date:] -= cum_r.loc[date:] * (dist / cum_r.loc[date])
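You would make:
- 1 cumulative product of the monthly returns
- 1 multiplication of a vector by the scalar v0
- n multiplications of a vector by the scalar dist / cum_r.loc[date]
- n vector subtractions
where n is the number of quarter ends.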
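To see why each quarter end is corrected by `cum_r[date:] * (dist / cum_r.loc[date])`, it may help to write the loop as a recurrence and unroll it. The symbols below are introduced only for this sketch and are not from the original answer: $V_t$ is the account value after month $t$ (so $V_0$ is `v0`), $r_t$ the monthly return, $d$ is `dist`, $q_t$ flags quarter ends, and $C_t$ corresponds to `cum_r`.

```latex
\begin{align}
V_t &= (1 + r_t)\,V_{t-1} - d\,q_t,
  \qquad q_t = \begin{cases} 1 & \text{if month } t \text{ is a quarter end} \\ 0 & \text{otherwise} \end{cases} \\
C_t &= \prod_{i=1}^{t} (1 + r_i), \qquad C_0 = 1 \\
V_t &= C_t\,V_0 - d \sum_{s=1}^{t} q_s\,\frac{C_t}{C_s}
     = C_t \left( V_0 - d \sum_{s=1}^{t} \frac{q_s}{C_s} \right)
\end{align}
```

The third line follows by induction on $t$: multiplying the expression for $V_{t-1}$ by $(1 + r_t)$ turns every $C_{t-1}$ into $C_t$, and the new term $d\,q_t$ equals $d\,q_t\,C_t/C_t$. Each quarter end $s$ therefore contributes $d\,C_t/C_s$ to every month $t \ge s$, which is the quantity the loop subtracts from `result[date:]`.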
Based on this code, we can optimize further:
cum_r = (1 + r).cumprod()
t = (r.index.is_quarter_end / cum_r).cumsum()
result = cum_r * (v0 - dist * t)
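That is:
- 1 cumulative product, (1 + r).cumprod()
- 1 division between two series, r.index.is_quarter_end / cum_r
- 1 cumulative sum of the above division
- 1 multiplication of the above sum by the scalar dist
- 1 subtraction of dist * t from the scalar v0
- 1 element-wise multiplication of cum_r by v0 - dist * t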
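As a rough check, the steps listed above can be compared against the explicit month-by-month loop. This is a sketch in plain NumPy rather than pandas; the function names `loop_values` and `vectorized_values`, the fixed seed, and the hand-built quarter-end mask are assumptions made for illustration.

```python
import numpy as np

def loop_values(r, is_q_end, v0, dist):
    """Reference implementation: the explicit month-by-month recurrence."""
    out = np.empty(len(r))
    prev = v0
    for i, (ret, q) in enumerate(zip(r, is_q_end)):
        # grow by the month's return, then subtract dist at quarter ends
        prev = prev * (1 + ret) - (dist if q else 0.0)
        out[i] = prev
    return out

def vectorized_values(r, is_q_end, v0, dist):
    """Closed form: cum_r * (v0 - dist * cumsum(is_q_end / cum_r))."""
    cum_r = np.cumprod(1 + np.asarray(r, dtype=float))
    t = np.cumsum(np.asarray(is_q_end, dtype=float) / cum_r)
    return cum_r * (v0 - dist * t)

rng = np.random.default_rng(0)            # arbitrary fixed seed
r = rng.random(12) * 0.01                 # 12 monthly returns, as in the question
is_q_end = np.arange(1, 13) % 3 == 0      # months 3, 6, 9 and 12 are quarter ends
a = loop_values(r, is_q_end, 1e6, 5000.0)
b = vectorized_values(r, is_q_end, 1e6, 5000.0)
print(np.allclose(a, b))
```

The two results are computed in a different order of floating-point operations, so they may differ in the last few digits; a tolerance-based comparison such as `np.allclose` is safer than `==` here. For very long series, dividing by a cumulative product that has grown very large or very small could also cost some precision.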