Vectorizing a calculation with numpy when each value requires the previously calculated value
I am trying to compute the EMA using a specific formula from Investopedia, which looks like
EmaToday = (ValueToday * (Smoothing / (1 + Days)))
+ (EmaYesterday * (1 - (Smoothing / (1 + Days))))
We can simplify this:
Smoothing and Days are constants.
Let's call (Smoothing / (1 + Days)) 'M'.
The simplified equation becomes:
EmaToday = ((ValueToday - EmaYesterday) * M) + EmaYesterday
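As a quick sanity check on the simplification above, here is a minimal sketch that compares the two forms numerically. The function names and the values Smoothing = 2 and Days = 10 are illustrative assumptions, not part of the original question.

```python
import numpy as np

def ema_step_original(value_today, ema_yesterday, m):
    # Investopedia form: weighted average of today's value and yesterday's EMA
    return value_today * m + ema_yesterday * (1 - m)

def ema_step_simplified(value_today, ema_yesterday, m):
    # Simplified form: move yesterday's EMA toward today's value by a fraction m
    return (value_today - ema_yesterday) * m + ema_yesterday

# Illustrative constants (assumed for this example only)
smoothing, days = 2.0, 10
m = smoothing / (1 + days)

rng = np.random.default_rng(0)
values = rng.random(5)
previous = rng.random(5)

print(np.allclose(ema_step_original(values, previous, m),
                  ema_step_simplified(values, previous, m)))
```

Both forms expand to ValueToday * M + EmaYesterday * (1 - M), so they agree up to floating-point rounding.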
We can do this in plain Python with a loop, as follows:
# Initialize a 1-D numpy array to hold the calculated ema values
emaTodayArray = np.zeros(valueTodayArray.size - Days, dtype=np.float32)
ema = emaYesterday
# Calculate ema
for i, valueToday in enumerate(np.nditer(valueList)):
    ema = ((valueToday - ema) * M) + ema
    emaTodayArray[i] = ema
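emaTodayArray then contains all the calculated EMA values.
I am having a hard time figuring out how to vectorize this completely, because each new calculation requires the emaYesterday value.
If full vectorization with numpy is possible in the first place, I would really appreciate it if someone could show me how.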
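Note: I had to fill in some dummy values to make your code run; please check that they are correct.
The loop can be vectorized with the transformation ema[i] ~> ema'[i] = ema[i] x (1-M)^-i, after which it becomes a cumsum.
This is implemented below as ema_pp_naive.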
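To spell out why the transformation turns the recurrence into a cumulative sum, here is a short derivation. The symbols $e_i$ (the EMA after step $i$, with $e_0$ = emaYesterday) and $v_i$ (the $i$-th value) are notation introduced for this sketch only.

```latex
\begin{aligned}
e_i &= (1-M)\,e_{i-1} + M\,v_i \\
\frac{e_i}{(1-M)^{i}} &= \frac{e_{i-1}}{(1-M)^{i-1}} + \frac{M\,v_i}{(1-M)^{i}}
  && \text{(divide by } (1-M)^{i}\text{)} \\
e'_i &= e'_{i-1} + M\,v_i\,(1-M)^{-i}
  && \text{with } e'_i := e_i\,(1-M)^{-i} \\
e'_i &= e_0 + \sum_{k=1}^{i} M\,v_k\,(1-M)^{-k} \\
e_i &= (1-M)^{i}\left(e_0 + \sum_{k=1}^{i} M\,v_k\,(1-M)^{-k}\right)
\end{aligned}
```

The sum on the right is a plain cumulative sum of the rescaled inputs, which is what `cumsum` computes in one vectorized pass.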
The problem with this method is that for moderately large i (~10^3), the (1-M)^-i term can overflow, rendering the result useless.
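To make the overflow concrete, the following sketch finds the first index at which the scaling factor stops being representable in float64. The choice M = 0.5 is an assumption made for illustration; smaller M pushes the overflow point further out, larger M brings it closer.

```python
import numpy as np

m = 0.5                      # illustrative smoothing factor (assumed)
i = np.arange(0, 2001)

# (1 - M)^-i, the factor used by the naive cumsum approach
with np.errstate(over="ignore"):
    scaling = (1.0 / (1.0 - m)) ** i

# Index of the first entry that overflowed to infinity
first_overflow = int(np.argmax(np.isinf(scaling)))
print(first_overflow)        # 1024, since 2.0**1024 exceeds the float64 range
```

Once an entry of the scaling array is infinite, the later products and quotients turn into inf or nan, which matches the "naive method correct: False" lines in the sample runs.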
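We can circumvent this problem by working in log space (using np.logaddexp for the summation). ema_pp_safe is quite a bit more expensive than the naive method, but still more than 10x faster than the original loop. In my quick-and-dirty testing, this gave correct results for a million terms and beyond.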
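For reuse, the log-space idea can be wrapped in a function. This is only a sketch under stated assumptions: the names `ema_logspace` and `ema_loop` are hypothetical, 0 < M < 1, and all inputs (including the starting EMA) are strictly positive, because the method takes their logarithm.

```python
import numpy as np

def ema_logspace(values, ema0, m):
    """EMA via a log-space cumulative sum (assumes values > 0, ema0 > 0, 0 < m < 1)."""
    values = np.asarray(values, dtype=np.float64)
    # Term 0 carries the starting EMA; the rest are the weighted inputs
    terms = np.concatenate([[ema0], values * m])
    # log((1 - m)^i) for i = 0..K
    log_decay = np.log1p(-m) * np.arange(terms.size)
    # Running log-sum of terms[k] * (1 - m)^-k, then scale back by (1 - m)^i
    log_ema = np.logaddexp.accumulate(np.log(terms) - log_decay) + log_decay
    return np.exp(log_ema[1:])

def ema_loop(values, ema0, m):
    """Reference implementation using the plain recurrence."""
    out = np.empty(len(values))
    ema = ema0
    for i, v in enumerate(values):
        ema = (v - ema) * m + ema
        out[i] = ema
    return out

rng = np.random.default_rng(1)
prices = rng.random(5000) + 0.5   # strictly positive test data
print(np.allclose(ema_logspace(prices, 1.0, 0.3), ema_loop(prices, 1.0, 0.3)))
```

Data containing zeros or negative values would need a different treatment (for example shifting the series by a constant first), since `np.log` is not defined there.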
Code:
import numpy as np
K = 1000
Days = 0
emaYesterday = np.random.random()
valueTodayArray = np.random.random(K)
M = np.random.random()
valueList = valueTodayArray
import time
T = []
T.append(time.perf_counter())
# Initialize an empty numpy array to hold calculated ema values
emaTodayArray = np.zeros((valueTodayArray.size - Days), dtype=np.float32)
ema = emaYesterday
# Calculate ema
for i, valueToday in enumerate(np.nditer(valueList)):
    ema = ((valueToday - ema) * M) + ema
    emaTodayArray[i] = ema
T.append(time.perf_counter())
scaling = np.broadcast_to(1/(1-M), valueTodayArray.size+1).cumprod()
ema_pp_naive = ((np.concatenate([[emaYesterday], valueTodayArray * M]) * scaling).cumsum() / scaling)[1:]
T.append(time.perf_counter())
logscaling = np.log(1-M)*np.arange(valueTodayArray.size+1)
log_ema_pp = np.logaddexp.accumulate(np.log(np.concatenate([[emaYesterday], valueTodayArray * M])) - logscaling) + logscaling
ema_pp_safe = np.exp(log_ema_pp[1:])
T.append(time.perf_counter())
print(f'K = {K}')
print('naive method correct:', np.allclose(ema_pp_naive, emaTodayArray))
print('safe method correct:', np.allclose(ema_pp_safe, emaTodayArray))
print('OP {:.3f} ms naive {:.3f} ms safe {:.3f} ms'.format(*np.diff(T)*1000))
Sample runs:
K = 100
naive method correct: True
safe method correct: True
OP 0.236 ms naive 0.061 ms safe 0.053 ms
K = 1000
naive method correct: False
safe method correct: True
OP 2.397 ms naive 0.224 ms safe 0.183 ms
K = 1000000
naive method correct: False
safe method correct: True
OP 2145.956 ms naive 18.342 ms safe 108.528 ms