Turning loop comprehensions into numpy form
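Can I convert the standard deviation calculation into vectorized form, the same way I did for the y_mean and xy_mean calculations? I don't want to compute the standard deviation with a for loop or with a function that uses a lot of RAM. I am trying to use np.convolve() to compute the standard deviation, std.
Variables: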
import numpy as np

number = 5
PC_list= np.array([457.334015,424.440002,394.795990,408.903992,398.821014,402.152008,435.790985,423.204987,411.574005,
404.424988,399.519989,377.181000,375.467010,386.944000,383.614990,375.071991,359.511993,328.865997,
320.510010,330.079010,336.187012,352.940002,365.026001,361.562012,362.299011,378.549011,390.414001,
400.869995,394.773010,382.556000])
Original Python functions:
y_mean = sum(PC_list[i:i+number])/number
xy_mean = sum([x * (i + 1) for i, x in enumerate(PC_list[i:i+number])])/number
std = (sum([(k - y_mean)**2 for k in PC_list[i:i+number]])/(number-1))**0.5
NumPy version:
y_mean = (np.convolve(PC_list, np.ones(shape=(number)), mode='valid')/number)[:-1]
xy_mean = (np.convolve(PC_list, np.arange(number, 0, -1), mode='valid'))[:-1]
std = ?
You can use np.lib.stride_tricks.as_strided and np.std with ddof=1:
>>> np.std(
...     np.lib.stride_tricks.as_strided(
...         PC_list,
...         shape=(PC_list.shape[0] - number + 1, number),
...         strides=PC_list.strides*2
...     ),
...     axis=-1,
...     ddof=1
... )
array([25.35313557, 11.6209317 , 16.32415133, 15.46019574, 15.29513506,
14.02947067, 14.68620846, 17.04664993, 16.38348865, 12.9925946 ,
9.58525968, 5.32623099, 10.61466493, 23.71209646, 27.85489139,
23.31091745, 14.78211757, 12.11214834, 17.90301391, 15.42895731,
11.7602241 , 9.27171536, 12.57714149, 17.25865608, 15.2717403 ,
9.02825105])
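As a side note, NumPy 1.20 and later also provide np.lib.stride_tricks.sliding_window_view, which builds the same kind of windowed view without computing shape and strides by hand. The following is only a minimal sketch of that variant, not part of the original answer; the name data and the shortened sample (the first eight values of PC_list) are assumptions chosen to keep the example small.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

number = 5
# Illustrative sample: the first eight values of PC_list from the question.
data = np.array([457.334015, 424.440002, 394.795990, 408.903992,
                 398.821014, 402.152008, 435.790985, 423.204987])

# Read-only view of shape (len(data) - number + 1, number); no data is copied.
windows = sliding_window_view(data, number)

# Sample standard deviation (divisor number - 1) of every window.
rolling_std = np.std(windows, axis=-1, ddof=1)
print(rolling_std)
```

Because the result is a read-only view, it avoids the risk of writing outside the array's memory that comes with a mistyped as_strided call.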
Alternatively, you can use pandas.Series.rolling.std, pandas.Series.dropna, then pandas.Series.to_numpy:
>>> pd.Series(PC_list).rolling(number).std().dropna().to_numpy()
array([25.35313557, 11.6209317 , 16.32415133, 15.46019574, 15.29513506,
14.02947067, 14.68620846, 17.04664993, 16.38348865, 12.9925946 ,
9.58525968, 5.32623099, 10.61466493, 23.71209646, 27.85489139,
23.31091745, 14.78211757, 12.11214834, 17.90301391, 15.42895731,
11.7602241 , 9.27171536, 12.57714149, 17.25865608, 15.2717403 ,
9.02825105])
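Since the question specifically mentions np.convolve(), here is a rough sketch of how a rolling sample standard deviation could be built from two convolutions, using the identity sum((x - mean)^2) = sum(x^2) - (sum(x))^2 / n. This is an assumption-laden illustration rather than the method from the answer above: the names data, shifted, sum_x and sum_x2 are made up for the example, and this formula is generally less numerically robust than computing np.std on windows, which is why the data is centred first and the variance is clipped at zero.

```python
import numpy as np

number = 5
# Illustrative sample: the first eight values of PC_list from the question.
data = np.array([457.334015, 424.440002, 394.795990, 408.903992,
                 398.821014, 402.152008, 435.790985, 423.204987])

window = np.ones(number)

# Subtracting a constant does not change the standard deviation,
# but it reduces cancellation error in the formula below.
shifted = data - data.mean()

# Rolling sums of x and x**2 over each window of length `number`.
sum_x = np.convolve(shifted, window, mode='valid')
sum_x2 = np.convolve(shifted ** 2, window, mode='valid')

# sum((x - mean)**2) == sum(x**2) - sum(x)**2 / n, divided by n - 1.
variance = (sum_x2 - sum_x ** 2 / number) / (number - 1)

# Clip tiny negative values caused by floating-point rounding.
rolling_std = np.sqrt(np.maximum(variance, 0.0))
print(rolling_std)
```

This keeps memory use at a few arrays of the input's length and needs no windowed view at all.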
Explanation:
np.lib.stride_tricks.as_strided is used to reshape the array in a special way, so that each row is one rolling window:
>>> np.lib.stride_tricks.as_strided(
...     PC_list,
...     shape=(PC_list.shape[0] - number + 1, number),
...     strides=PC_list.strides*2
... )
array([[457.334015, 424.440002, 394.79599 , 408.903992, 398.821014], #index: 0,1,2,3,4
[424.440002, 394.79599 , 408.903992, 398.821014, 402.152008], #index: 1,2,3,4,5
[394.79599 , 408.903992, 398.821014, 402.152008, 435.790985], #index: 2,3,4,5,6
[408.903992, 398.821014, 402.152008, 435.790985, 423.204987], # ... and so on
[398.821014, 402.152008, 435.790985, 423.204987, 411.574005],
[402.152008, 435.790985, 423.204987, 411.574005, 404.424988],
[435.790985, 423.204987, 411.574005, 404.424988, 399.519989],
[423.204987, 411.574005, 404.424988, 399.519989, 377.181 ],
[411.574005, 404.424988, 399.519989, 377.181 , 375.46701 ],
[404.424988, 399.519989, 377.181 , 375.46701 , 386.944 ],
[399.519989, 377.181 , 375.46701 , 386.944 , 383.61499 ],
[377.181 , 375.46701 , 386.944 , 383.61499 , 375.071991],
[375.46701 , 386.944 , 383.61499 , 375.071991, 359.511993],
[386.944 , 383.61499 , 375.071991, 359.511993, 328.865997],
[383.61499 , 375.071991, 359.511993, 328.865997, 320.51001 ],
[375.071991, 359.511993, 328.865997, 320.51001 , 330.07901 ],
[359.511993, 328.865997, 320.51001 , 330.07901 , 336.187012],
[328.865997, 320.51001 , 330.07901 , 336.187012, 352.940002],
[320.51001 , 330.07901 , 336.187012, 352.940002, 365.026001],
[330.07901 , 336.187012, 352.940002, 365.026001, 361.562012],
[336.187012, 352.940002, 365.026001, 361.562012, 362.299011],
[352.940002, 365.026001, 361.562012, 362.299011, 378.549011],
[365.026001, 361.562012, 362.299011, 378.549011, 390.414001],
[361.562012, 362.299011, 378.549011, 390.414001, 400.869995],
[362.299011, 378.549011, 390.414001, 400.869995, 394.77301 ],
[378.549011, 390.414001, 400.869995, 394.77301 , 382.556 ]])
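Now, if we take the std of the above array along the last axis, we get the rolling std. By default numpy uses ddof=0, i.e. Delta Degrees of Freedom = 0, which means that for number samples the divisor is number - 0. Since you want the divisor to be number - 1, you need ddof=1.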
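To make the effect of ddof concrete, the short sketch below computes both divisors by hand for a single window and compares them with np.std. The variable names (window, squared_dev, population_std, sample_std) are illustrative only; the values are the first five entries of PC_list.

```python
import numpy as np

# First window of PC_list (indices 0 to 4).
window = np.array([457.334015, 424.440002, 394.795990, 408.903992, 398.821014])
n = window.size

# Sum of squared deviations from the window mean.
squared_dev = ((window - window.mean()) ** 2).sum()

population_std = np.sqrt(squared_dev / n)    # divisor n - 0, numpy's default (ddof=0)
sample_std = np.sqrt(squared_dev / (n - 1))  # divisor n - 1, what the question asks for (ddof=1)

print(np.isclose(population_std, np.std(window)))      # compares with ddof=0
print(np.isclose(sample_std, np.std(window, ddof=1)))  # compares with ddof=1
```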