Is there a function in sklearn that efficiently solves a large linear regression with an L2 norm?
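I need to solve a very large linear regression with an L2 norm (y = xw, y.shape = [5, 1], x.shape = [5, 100K+]).
I have already tried sklearn.linear_model.Ridge, but it is too slow (it takes more than 30 minutes).
Is there another function in sklearn that can solve a large linear regression problem efficiently?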
Try a different solver, for example one of the iterative ones, and set max_iter lower or tol higher. From the documentation:
‘sparse_cg’ uses the conjugate gradient solver as found in scipy.sparse.linalg.cg. As an iterative algorithm, this solver is more
appropriate than ‘cholesky’ for large-scale data (possibility to set
tol and max_iter).
‘lsqr’ uses the dedicated regularized least-squares routine scipy.sparse.linalg.lsqr. It is the fastest and uses an iterative
procedure.
‘sag’ uses a Stochastic Average Gradient descent, and ‘saga’ uses its improved, unbiased version named SAGA. Both methods also use an
iterative procedure, and are often faster than other solvers when both
n_samples and n_features are large. Note that ‘sag’ and ‘saga’ fast
convergence is only guaranteed on features with approximately the same
scale. You can preprocess the data with a scaler from
sklearn.preprocessing.
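As a rough sketch of the suggestion above, the snippet below fits Ridge with an iterative solver and loosened stopping criteria. The data is random and only mimics the shapes from the question, and the helper name fit_ridge and the specific values of alpha, max_iter and tol are assumptions chosen for illustration, not tuned recommendations.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Synthetic stand-in for the data in the question (assumed shapes only).
rng = np.random.default_rng(0)
n_samples, n_features = 5, 100_000
X = rng.standard_normal((n_samples, n_features))
y = rng.standard_normal(n_samples)  # 1-D target, so coef_ is 1-D as well


def fit_ridge(X, y, solver="lsqr", alpha=1.0, max_iter=100, tol=1e-3):
    """Hypothetical helper: fit Ridge with an iterative solver.

    A lower max_iter or a higher tol trades accuracy for speed.
    """
    model = Ridge(alpha=alpha, solver=solver, max_iter=max_iter, tol=tol)
    model.fit(X, y)
    return model


# Try the two iterative solvers named in the quoted documentation.
model_lsqr = fit_ridge(X, y, solver="lsqr")
model_cg = fit_ridge(X, y, solver="sparse_cg")

coef_lsqr = model_lsqr.coef_
coef_cg = model_cg.coef_
```

Which solver is actually fastest depends on the data, so it is worth timing a few of them on the real matrix before settling on one.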