scipy.optimize 的最小化函数没有给出正确的答案
The minimize function of scipy.optimize doesn't give the right answer
我正在尝试使用 Scipy 的最小化函数解决最小化问题。 objective 函数只是两个具有不同均值和方差的多元正态分布的比率。我希望找到函数 g_func 的最大值,这相当于找到函数 g_optimization 的最小值。另外,我添加了 x[0] = 0 的约束。这里,x 是一个包含 8 个元素的向量。 objective函数g_optimization如下:
import numpy as np
from scipy.optimize import minimize
# Set up mean and variance for two MVN distributions
n_trait = 8
sigma = np.full((n_trait, n_trait),0.0005)
np.fill_diagonal(sigma,0.005)
omega = np.full((n_trait, n_trait),0.0000236)
np.fill_diagonal(omega,0.0486)
sigma_pos = np.linalg.inv(np.linalg.inv(sigma)+np.linalg.inv(omega))
mu_pos = np.array([-0.01288244,0.08732091,0.01049617,0.0860966,0.10055626,0.07952922,0.04363669,-0.0061975])
mu_pri = 0
sigma_pri = omega
#objective function
def g_func(beta,mu_sim_pos):
g1 = ((np.linalg.det(sigma_pri))**(1/2))/((np.linalg.det(sigma_pos))**(1/2))
g2 = (-1/2)*np.linalg.multi_dot([np.transpose(beta-mu_sim_pos),np.linalg.inv(sigma_pos),beta-mu_sim_pos])
g3 = (1/2)*np.linalg.multi_dot([np.transpose(beta-mu_pri),np.linalg.inv(sigma_pri),beta-mu_pri])
g = g1*np.exp(g2+g3)
return g
def g_optimization(beta,mu_sim_pos):
return -1*g_func(beta,mu_sim_pos)
#optimization
start_point = np.full(8,0)
cons = ({'type': 'eq',
'fun' : lambda x: np.array([x[0]])})
anws = minimize (g_optimization, [start_point], args=(mu_pos),
constraints=cons, options={'maxiter': 50}, tol=0.001)
anws
迭代两次后优化停止,函数给出的最小值为0,在点np.array([0,10.32837891,-1.62396508,10.13790152,12.38752653,9.11615259,3.53201544,-4.22115517 ]).这不可能是真的,因为即使我们将起点np.zeros(8)代入g_optimization函数,给出的结果是-657.0041125829354,小于0。所以提供的解决方案绝对不是最小的.
g_optimization(np.zeros(8),mu_pos) #gives solution of -657.0041125829354
我不确定我哪里出错了。
我会尝试不同的求解器。例如 L-BFGS-B
效果很好。
您可以查看所有选项 here.
anws = minimize (g_optimization, [start_point], args=(mu_pos), method='L-BFGS-B',
constraints=cons, options={'maxiter': 50}, tol=0.001)
print(anws)
# success: True
# message: b'CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH'
# fun: -21688.00879938617
# x: array([-0.0101048, 0.09937778, 0.01543875, 0.0980401, 0.11383878, 0.09086455, 0.05164822, -0.00280081])
编辑:
L-BFGS-B
无法处理一般约束 h(x)=0
,只能处理变量上的边界框:
Bounds on variables for L-BFGS-B, TNC, SLSQP, Powell, and trust-constr methods. There are two ways to specify the bounds:
Instance of Bounds class.
Sequence of (min, max) pairs for each element in x. None is used to specify no bound.
在您的情况下,您必须定义 8 对下限和上限。
对于 x[0] 你必须严格限制因为该方法不能处理 x_low == x_high
.
bounds = [(None, None)] * 8
bounds[0] = (0, 0.00001)
anws = minimize (g_optimization, [start_point], args=(mu_pos), method='L-BFGS-B', bounds=bounds,
options={'maxiter': 50}, tol=0.001)
# fun: -21467.48153792194
# x: array([0., 0.10039832, 0.01641271, 0.0990599, 0.11486735, 0.09188037, 0.05264228, -0.00183697])
另一种方法是从优化问题中排除值 x[0]:
def g_optimization(beta,mu_sim_pos):
beta2 = np.empty(8)
beta2[0] = 0
beta2[1:] = beta
return -1*g_func(beta2, mu_sim_pos)
start_point = np.zeros(7) # exclude x[0]
anws = minimize(g_optimization, [start_point], args=(mu_pos), method='L-BFGS-B',
options={'maxiter': 50}, tol=0.001)
# fun: -21467.47686079844
# x: array([0.10041797, 0.01648995, 0.09908046, 0.11487707, 0.09190585, 0.05269467, -0.00174722])
# ^ missing x[0]
我正在尝试使用 Scipy 的最小化函数解决最小化问题。 objective 函数只是两个具有不同均值和方差的多元正态分布的比率。我希望找到函数 g_func 的最大值,这相当于找到函数 g_optimization 的最小值。另外,我添加了 x[0] = 0 的约束。这里,x 是一个包含 8 个元素的向量。 objective函数g_optimization如下:
import numpy as np
from scipy.optimize import minimize
# Set up mean and variance for two MVN distributions
n_trait = 8
sigma = np.full((n_trait, n_trait),0.0005)
np.fill_diagonal(sigma,0.005)
omega = np.full((n_trait, n_trait),0.0000236)
np.fill_diagonal(omega,0.0486)
sigma_pos = np.linalg.inv(np.linalg.inv(sigma)+np.linalg.inv(omega))
mu_pos = np.array([-0.01288244,0.08732091,0.01049617,0.0860966,0.10055626,0.07952922,0.04363669,-0.0061975])
mu_pri = 0
sigma_pri = omega
#objective function
def g_func(beta,mu_sim_pos):
g1 = ((np.linalg.det(sigma_pri))**(1/2))/((np.linalg.det(sigma_pos))**(1/2))
g2 = (-1/2)*np.linalg.multi_dot([np.transpose(beta-mu_sim_pos),np.linalg.inv(sigma_pos),beta-mu_sim_pos])
g3 = (1/2)*np.linalg.multi_dot([np.transpose(beta-mu_pri),np.linalg.inv(sigma_pri),beta-mu_pri])
g = g1*np.exp(g2+g3)
return g
def g_optimization(beta,mu_sim_pos):
return -1*g_func(beta,mu_sim_pos)
#optimization
start_point = np.full(8,0)
cons = ({'type': 'eq',
'fun' : lambda x: np.array([x[0]])})
anws = minimize (g_optimization, [start_point], args=(mu_pos),
constraints=cons, options={'maxiter': 50}, tol=0.001)
anws
迭代两次后优化停止,函数给出的最小值为0,在点np.array([0,10.32837891,-1.62396508,10.13790152,12.38752653,9.11615259,3.53201544,-4.22115517 ]).这不可能是真的,因为即使我们将起点np.zeros(8)代入g_optimization函数,给出的结果是-657.0041125829354,小于0。所以提供的解决方案绝对不是最小的.
g_optimization(np.zeros(8),mu_pos) #gives solution of -657.0041125829354
我不确定我哪里出错了。
我会尝试不同的求解器。例如 L-BFGS-B
效果很好。
您可以查看所有选项 here.
anws = minimize (g_optimization, [start_point], args=(mu_pos), method='L-BFGS-B',
constraints=cons, options={'maxiter': 50}, tol=0.001)
print(anws)
# success: True
# message: b'CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH'
# fun: -21688.00879938617
# x: array([-0.0101048, 0.09937778, 0.01543875, 0.0980401, 0.11383878, 0.09086455, 0.05164822, -0.00280081])
编辑:
L-BFGS-B
无法处理一般约束 h(x)=0
,只能处理变量上的边界框:
Bounds on variables for L-BFGS-B, TNC, SLSQP, Powell, and trust-constr methods. There are two ways to specify the bounds: Instance of Bounds class. Sequence of (min, max) pairs for each element in x. None is used to specify no bound.
在您的情况下,您必须定义 8 对下限和上限。
对于 x[0] 你必须严格限制因为该方法不能处理 x_low == x_high
.
bounds = [(None, None)] * 8
bounds[0] = (0, 0.00001)
anws = minimize (g_optimization, [start_point], args=(mu_pos), method='L-BFGS-B', bounds=bounds,
options={'maxiter': 50}, tol=0.001)
# fun: -21467.48153792194
# x: array([0., 0.10039832, 0.01641271, 0.0990599, 0.11486735, 0.09188037, 0.05264228, -0.00183697])
另一种方法是从优化问题中排除值 x[0]:
def g_optimization(beta,mu_sim_pos):
beta2 = np.empty(8)
beta2[0] = 0
beta2[1:] = beta
return -1*g_func(beta2, mu_sim_pos)
start_point = np.zeros(7) # exclude x[0]
anws = minimize(g_optimization, [start_point], args=(mu_pos), method='L-BFGS-B',
options={'maxiter': 50}, tol=0.001)
# fun: -21467.47686079844
# x: array([0.10041797, 0.01648995, 0.09908046, 0.11487707, 0.09190585, 0.05269467, -0.00174722])
# ^ missing x[0]