Turning a for-loop function into vectorized form with NumPy
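I am trying to make my program faster by using NumPy arrays, but every attempt to rewrite the vanilla Python in vectorized form gives me errors. How can I vectorize the code so that I don't have to use the for loop below? Inside the for loop I have linear regression and standard deviation formulas that depend on the values of PC_list.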
PC_list= [457.334015,424.440002,394.795990,408.903992,398.821014,402.152008,435.790985,423.204987,411.574005,
404.424988,399.519989,377.181000,375.467010,386.944000,383.614990,375.071991,359.511993,328.865997,
320.510010,330.079010,336.187012,352.940002,365.026001,361.562012,362.299011,378.549011,390.414001,
400.869995,394.773010,382.556000]
# number (window length), Upper and Lower (std multipliers) are defined elsewhere.
# x_mean and x_squared_mean are used for the linear regression: they are the
# means of x and x**2 for x = 1..number (closed-form sums divided by number).
x_mean = (number + 1) / 2
x_squared_mean = (number + 1) * (2 * number + 1) / 6
# Result lists (assumed to start empty)
upper, lower, Boundary_x_list = [], [], []
for i in range(len(PC_list) - number):
    window = PC_list[i:i + number]
    y_mean = sum(window) / number
    # j is the position inside the window, kept separate from the loop index i
    xy_mean = sum(y * (j + 1) for j, y in enumerate(window)) / number
    # Linear regression slope (m) and vertical shift (b)
    m = (x_mean * y_mean - xy_mean) / (x_mean ** 2 - x_squared_mean)
    b = y_mean - m * x_mean
    # Sample standard deviation: sqrt(sum((value - y_mean)**2) / (number - 1))
    std = (sum((k - y_mean) ** 2 for k in window) / (number - 1)) ** 0.5
    # Upper and lower boundary calculations
    Upper_Boundary = round(m * i + b + Upper * std, 1)
    Lower_Boundary = round(m * i + b + Lower * std, 1)
    # Append the upper and lower boundary to their lists
    upper.append(Upper_Boundary)
    lower.append(Lower_Boundary)
    # Boundary x position appended to a list for graphing
    Boundary_x = number + i
    Boundary_x_list.append(Boundary_x)
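There is a good implementation of simple linear regression with Python and NumPy here: Simple Linear Regression in Python
The first thing I would suggest is converting the original dataset into a NumPy array.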
import numpy as np
# The prices are the dependent variable y
y = np.array([457.334015,424.440002,394.795990,408.903992,398.821014,402.152008,435.790985,423.204987,411.574005,
404.424988,399.519989,377.181000,375.467010,386.944000,383.614990,375.071991,359.511993,328.865997,
320.510010,330.079010,336.187012,352.940002,365.026001,361.562012,362.299011,378.549011,390.414001,
400.869995,394.773010,382.556000])
# The independent variable x is the position of each price: 1, 2, ..., len(y)
x = np.arange(1, y.size + 1)
# Calculating the mean of an array is trivial
x_mean = x.mean()
y_mean = y.mean()
# Values are squared first, then averaged (x_squared_mean in the question)
x_squared_mean = np.power(x, 2).mean()
# Slope (m): covariance of x and y divided by the variance of x
m = np.sum((x - x_mean) * (y - y_mean)) / np.sum(np.power(x - x_mean, 2))
# Intercept (b)
b = y_mean - m * x_mean
# Regression line
reg_line = b + m * x
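This is just an example, but in general the first step is to convert your data to a NumPy array, after which you have access to all the loop-free functions that are implemented in C.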
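To connect this back to the loop in the question, here is a minimal sketch of how the rolling calculation might be vectorized. It assumes NumPy 1.20 or newer (for `sliding_window_view`). The function name `rolling_bounds` and the parameters `upper_k` and `lower_k` (standing in for `Upper` and `Lower`) are hypothetical names chosen for illustration. `x_mean` and `x_squared_mean` are computed as the actual means of the positions 1..number, which is what the slope formula expects.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def rolling_bounds(prices, number, upper_k, lower_k):
    """Vectorized sketch of the rolling regression bounds (hypothetical helper)."""
    y = np.asarray(prices, dtype=float)
    # One row per iteration of the original loop: i in range(len(y) - number)
    windows = sliding_window_view(y, number)[: y.size - number]

    # Positions inside each window: 1, 2, ..., number
    x = np.arange(1, number + 1)
    x_mean = x.mean()
    x_squared_mean = (x ** 2).mean()

    # Per-window means, computed for all windows at once
    y_mean = windows.mean(axis=1)
    xy_mean = (windows * x).mean(axis=1)

    # Slope and intercept, same formulas as in the loop
    m = (x_mean * y_mean - xy_mean) / (x_mean ** 2 - x_squared_mean)
    b = y_mean - m * x_mean

    # Sample standard deviation (divides by number - 1)
    std = windows.std(axis=1, ddof=1)

    # The loop evaluates the line at x = i, so do the same here
    i = np.arange(windows.shape[0])
    fitted = m * i + b
    upper = np.round(fitted + upper_k * std, 1)
    lower = np.round(fitted + lower_k * std, 1)
    boundary_x = number + i
    return upper, lower, boundary_x
```

Calling `rolling_bounds(PC_list, number, Upper, Lower)` returns three arrays that play the role of `upper`, `lower` and `Boundary_x_list`. The windows are views on the original array, so no per-window copies are made until the arithmetic runs.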