Turning loop comprehensions into numpy form

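Can I convert my standard deviation calculation to a vectorized form, the same way I did for the y_mean and xy_mean calculations? I don't want to use a for loop, or any function that takes up a lot of RAM, to compute the standard deviation. I am trying to use np.convolve() to compute the standard deviation std.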

Variables:

import numpy as np

number = 5
PC_list= np.array([457.334015,424.440002,394.795990,408.903992,398.821014,402.152008,435.790985,423.204987,411.574005,
404.424988,399.519989,377.181000,375.467010,386.944000,383.614990,375.071991,359.511993,328.865997,
320.510010,330.079010,336.187012,352.940002,365.026001,361.562012,362.299011,378.549011,390.414001,
400.869995,394.773010,382.556000])

Original Python version (evaluated for each window start index i):

y_mean = sum(PC_list[i:i+number])/number
xy_mean = sum([x * (i + 1) for i, x in enumerate(PC_list[i:i+number])])/number
std = (sum([(k - y_mean)**2 for k in PC_list[i:i+number]])/(number-1))**0.5

NumPy version:

y_mean = (np.convolve(PC_list, np.ones(shape=(number)), mode='valid')/number)[:-1]
xy_mean = (np.convolve(PC_list, np.arange(number, 0, -1), mode='valid'))[:-1]
std = ?

You can use np.lib.stride_tricks.as_strided together with np.std and ddof=1:

>>> np.std(
        np.lib.stride_tricks.as_strided(
            PC_list, 
            shape=(PC_list.shape[0] - number + 1, number), 
            strides=PC_list.strides*2
        ), 
        axis=-1, 
        ddof=1
    )
array([25.35313557, 11.6209317 , 16.32415133, 15.46019574, 15.29513506,
       14.02947067, 14.68620846, 17.04664993, 16.38348865, 12.9925946 ,
        9.58525968,  5.32623099, 10.61466493, 23.71209646, 27.85489139,
       23.31091745, 14.78211757, 12.11214834, 17.90301391, 15.42895731,
       11.7602241 ,  9.27171536, 12.57714149, 17.25865608, 15.2717403 ,
        9.02825105])
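
As a hedged side note: on NumPy 1.20 or newer, np.lib.stride_tricks.sliding_window_view builds the same kind of windowed view without you having to work out shape and strides by hand. The sketch below is only an illustration; it assumes that NumPy version and uses just the first eight values of PC_list (the name prices is made up for the example).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

number = 5
# First eight values of PC_list from the question, to keep the example short
prices = np.array([457.334015, 424.440002, 394.795990, 408.903992,
                   398.821014, 402.152008, 435.790985, 423.204987])

# Read-only view of shape (len(prices) - number + 1, number); no data is copied
windows = sliding_window_view(prices, number)

# Sample standard deviation (divisor number - 1) of each window
rolling_std = windows.std(axis=-1, ddof=1)
print(rolling_std)
```

The four values printed should agree with the first four entries of the output above.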

Alternatively, you can use pandas.Series.rolling with std, then pandas.Series.dropna and pandas.Series.to_numpy:

>>> pd.Series(PC_list).rolling(number).std().dropna().to_numpy()
 
array([25.35313557, 11.6209317 , 16.32415133, 15.46019574, 15.29513506,
       14.02947067, 14.68620846, 17.04664993, 16.38348865, 12.9925946 ,
        9.58525968,  5.32623099, 10.61466493, 23.71209646, 27.85489139,
       23.31091745, 14.78211757, 12.11214834, 17.90301391, 15.42895731,
       11.7602241 ,  9.27171536, 12.57714149, 17.25865608, 15.2717403 ,
        9.02825105])
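
Since the question asks about np.convolve specifically, here is a rough sketch of that route as well. It relies on the identity sum((x - mean)**2) = sum(x**2) - sum(x)**2 / n, so two convolutions with a kernel of ones are enough. The names prices, centered, window_sum and window_sum_sq are made up for this example, and only the first eight values of PC_list are used. This formula can lose precision through cancellation, which is why the data is shifted by its overall mean first; treat it as an illustration rather than a drop-in replacement.

```python
import numpy as np

number = 5
# First eight values of PC_list from the question, to keep the example short
prices = np.array([457.334015, 424.440002, 394.795990, 408.903992,
                   398.821014, 402.152008, 435.790985, 423.204987])

kernel = np.ones(number)

# A constant shift does not change the standard deviation, and smaller
# values reduce the cancellation error in the subtraction below
centered = prices - prices.mean()

window_sum = np.convolve(centered, kernel, mode='valid')          # sum of x per window
window_sum_sq = np.convolve(centered ** 2, kernel, mode='valid')  # sum of x**2 per window

# Sample variance: (sum(x**2) - sum(x)**2 / n) / (n - 1)
variance = (window_sum_sq - window_sum ** 2 / number) / (number - 1)

# Rounding can leave tiny negative values for near-constant windows
rolling_std = np.sqrt(np.maximum(variance, 0.0))
print(rolling_std)
```

This returns one value per full window. The y_mean and xy_mean arrays in the question drop the last window with [:-1], so apply the same slice here if the arrays need to line up.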

Explanation: np.lib.stride_tricks.as_strided is used here to reshape the array in a special way that resembles rolling windows:

>>> np.lib.stride_tricks.as_strided(
            PC_list, 
            shape=(PC_list.shape[0] - number + 1, number), 
            strides=PC_list.strides*2
        )

array([[457.334015, 424.440002, 394.79599 , 408.903992, 398.821014],   #index: 0,1,2,3,4
       [424.440002, 394.79599 , 408.903992, 398.821014, 402.152008],   #index: 1,2,3,4,5
       [394.79599 , 408.903992, 398.821014, 402.152008, 435.790985],   #index: 2,3,4,5,6
       [408.903992, 398.821014, 402.152008, 435.790985, 423.204987],   # ... and so on
       [398.821014, 402.152008, 435.790985, 423.204987, 411.574005],
       [402.152008, 435.790985, 423.204987, 411.574005, 404.424988],
       [435.790985, 423.204987, 411.574005, 404.424988, 399.519989],
       [423.204987, 411.574005, 404.424988, 399.519989, 377.181   ],
       [411.574005, 404.424988, 399.519989, 377.181   , 375.46701 ],
       [404.424988, 399.519989, 377.181   , 375.46701 , 386.944   ],
       [399.519989, 377.181   , 375.46701 , 386.944   , 383.61499 ],
       [377.181   , 375.46701 , 386.944   , 383.61499 , 375.071991],
       [375.46701 , 386.944   , 383.61499 , 375.071991, 359.511993],
       [386.944   , 383.61499 , 375.071991, 359.511993, 328.865997],
       [383.61499 , 375.071991, 359.511993, 328.865997, 320.51001 ],
       [375.071991, 359.511993, 328.865997, 320.51001 , 330.07901 ],
       [359.511993, 328.865997, 320.51001 , 330.07901 , 336.187012],
       [328.865997, 320.51001 , 330.07901 , 336.187012, 352.940002],
       [320.51001 , 330.07901 , 336.187012, 352.940002, 365.026001],
       [330.07901 , 336.187012, 352.940002, 365.026001, 361.562012],
       [336.187012, 352.940002, 365.026001, 361.562012, 362.299011],
       [352.940002, 365.026001, 361.562012, 362.299011, 378.549011],
       [365.026001, 361.562012, 362.299011, 378.549011, 390.414001],
       [361.562012, 362.299011, 378.549011, 390.414001, 400.869995],
       [362.299011, 378.549011, 390.414001, 400.869995, 394.77301 ],
       [378.549011, 390.414001, 400.869995, 394.77301 , 382.556   ]])
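
To see why this does not cost extra memory, it may help to look at a tiny array. The sketch below uses a hypothetical six-element array a and a window of 3 (both chosen only for illustration). It shows that both strides equal the size of one element, so stepping one row down or one column right moves one element forward in the same buffer.

```python
import numpy as np

a = np.arange(6, dtype=np.float64)   # 0., 1., ..., 5.
window = 3

print(a.strides)   # (8,): each float64 element is 8 bytes

view = np.lib.stride_tricks.as_strided(
    a,
    shape=(a.shape[0] - window + 1, window),
    strides=a.strides * 2,   # (8, 8): one element per row step and per column step
    writeable=False,         # guard against accidental writes through the view
)
print(view)
# [[0. 1. 2.]
#  [1. 2. 3.]
#  [2. 3. 4.]
#  [3. 4. 5.]]

print(np.shares_memory(a, view))   # True: the windows are a view, not a copy
```

Note that as_strided does not check bounds, so a shape larger than PC_list.shape[0] - number + 1 rows would read memory outside the array.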

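Now, if we take the std of the above array along the last axis, we get the rolling std. By default numpy uses ddof=0, i.e. Delta Degrees of Freedom = 0, which means that for number samples the divisor is number - 0. Since you want the divisor to be number - 1, you need ddof=1.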
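
As a quick check of the divisor, the sketch below computes both variants by hand for the first window of PC_list and compares them with np.std. The variable names are made up for the example.

```python
import numpy as np

# First window of PC_list (number = 5)
window = np.array([457.334015, 424.440002, 394.795990, 408.903992, 398.821014])
n = window.size

squared_dev = ((window - window.mean()) ** 2).sum()

population_std = np.sqrt(squared_dev / n)     # divisor n, i.e. ddof=0
sample_std = np.sqrt(squared_dev / (n - 1))   # divisor n - 1, i.e. ddof=1

print(np.isclose(population_std, np.std(window)))      # True: default is ddof=0
print(np.isclose(sample_std, np.std(window, ddof=1)))  # True
print(sample_std)   # about 25.3531, the first value of the rolling std above
```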