Linear least-squares solution for 3d inputs
Question
Say I have two arrays with the following shapes:
- y.shape is (z, b). Picture it as a collection of z (b,) y vectors.
- x.shape is (z, b, c). Picture it as a collection of z (b, c) multivariate x matrices.
My intent is to find the z independent vectors of least-squares coefficient solutions. I.e., the first solution comes from regressing y[0] on x[0], where those inputs have shapes (b,) and (b, c) respectively. (b observations, c features.) The result would have shape (z, c).
Some example data
np.random.seed(123)
x = np.random.randn(190, 20, 3)
y = np.random.randn(190, 20) # Assumes no intercept term
# First vector of coefficients
np.linalg.lstsq(x[0], y[0])[0]
# array([-0.12823781, -0.3055392 , 0.11602805])
# Last vector of coefficients
np.linalg.lstsq(x[-1], y[-1])[0]
# array([-0.02777503, -0.20425779, 0.22874169])
NumPy's least-squares solver lstsq can't operate on these directly. (My intended result has shape (190, 3): 190 vectors of 3 coefficients each, where each (3,) vector is one coefficient set from a regression with n=20 observations.)
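The obvious fallback is a Python loop over the first axis; a minimal sketch using the x and y above:
# Naive per-slice loop: exactly the z independent regressions described above
out = np.empty((190, 3))
for i in range(190):
    out[i] = np.linalg.lstsq(x[i], y[i])[0]
# out.shape == (190, 3)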
Is there a workaround to wrap the coefficient matrices into one result array? I'm thinking of the matrix formulation β = (XᵀX)⁻¹ Xᵀy, which for 1d y and 2d x is just:
def coefs(y, x):
    return np.dot(np.linalg.inv(np.dot(x.T, x)), np.dot(x.T, y))
but I'm having trouble getting this to work with the 2d y and 3d x above.
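For a single slice, though, it does match lstsq; a quick sanity check with the example data above:
# Single-slice check: normal equations vs. lstsq (up to floating-point error)
print(coefs(y[0], x[0]))
# expected to be close to: [-0.12823781, -0.3055392 ,  0.11602805]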
Lastly, I'm curious why lstsq has trouble here. Is there a simple answer as to why the inputs must be at most 2d?
Here is a demo of:
- the problem mentioned in my comment
- a mostly empirical analysis of looped lstsq vs. one-step embedded lstsq
- (with a somewhat surprising result at the end, to be taken with a grain of salt):
Code
import numpy as np
import scipy.sparse as sp
from sklearn.datasets import make_regression
from time import perf_counter as pc
np.set_printoptions(edgeitems=3, infstr='inf',
                    linewidth=160, nanstr='nan', precision=1,
                    suppress=False, threshold=1000, formatter=None)
""" Create task """
Z, B, C = 4, 3, 2
Zs = []
Bs = []
for i in range(Z):
    X, y = make_regression(n_samples=B, n_features=C, random_state=i)
    Zs.append(X)
    Bs.append(y)
Zs = np.array(Zs)
Bs = np.array(Bs)
""" Independent looping """
print('LOOPED CALLS')
start = pc()
result = np.empty((Z, C))
for z in range(Z):
    result[z] = np.linalg.lstsq(Zs[z], Bs[z])[0]
end = pc()
print('lhs-shape: ', Zs.shape)
print('lhs-dense-fill-ratio: ', np.count_nonzero(Zs) / np.prod(Zs.shape))  # np.product is removed in NumPy 2.0
print('used time: ', end-start)
print(result)
""" Embedding in one """
print('EMBEDDING INTO ONE CALL')
Zs_ = sp.block_diag([Zs[i] for i in range(Z)]).todense() # convenient to use scipy.sparse
# oops: there is a dense-one too:
# -> scipy.linalg.block_diag
Bs_ = Bs.flatten()
start = pc() # one could argue if transform above should be timed too!
result_ = np.linalg.lstsq(Zs_, Bs_)[0]
end = pc()
print('lhs-shape: ', Zs_.shape)
print('lhs-dense-fill-ratio: ', np.count_nonzero(Zs_) / np.prod(Zs_.shape))
print('used time: ', end-start)
print(result_)
Output
LOOPED CALLS
lhs-shape: (4, 3, 2)
lhs-dense-fill-ratio: 1.0
used time: 0.0005415275241778155
[[ 89.2 43.8]
[ 68.5 41.9]
[ 61.9 20.5]
[ 5.1 44.1]]
EMBEDDING INTO ONE CALL
lhs-shape: (12, 8)
lhs-dense-fill-ratio: 0.25
used time: 0.00015907748341232328
[ 89.2 43.8 68.5 41.9 61.9 20.5 5.1 44.1]
lstsq problem dimensions for each case
While the original data looks like:
[[[ 2.2 1. ]
[-1. 1.9]
[ 0.4 1.8]]
[[-1.1 -0.5]
[-2.3 0.9]
[-0.6 1.6]]
[[ 1.6 -2.1]
[-0.1 -0.4]
[-0.8 -1.8]]
[[-0.3 -0.4]
[ 0.1 -1.9]
[ 1.8 0.4]]]
[[ 242.7 -5.4 112.9]
[ -95.7 -121.4 26.2]
[ 57.9 -12. -88.8]
[ -17.1 -81.6 28.4]]
and each individual solve looks like:
LHS
[[ 2.2 1. ]
[-1. 1.9]
[ 0.4 1.8]]
RHS
[ 242.7 -5.4 112.9]
the embedded problem (one solve step) looks like this:
LHS
[[ 2.2 1. 0. 0. 0. 0. 0. 0. ]
[-1. 1.9 0. 0. 0. 0. 0. 0. ]
[ 0.4 1.8 0. 0. 0. 0. 0. 0. ]
[ 0. 0. -1.1 -0.5 0. 0. 0. 0. ]
[ 0. 0. -2.3 0.9 0. 0. 0. 0. ]
[ 0. 0. -0.6 1.6 0. 0. 0. 0. ]
[ 0. 0. 0. 0. 1.6 -2.1 0. 0. ]
[ 0. 0. 0. 0. -0.1 -0.4 0. 0. ]
[ 0. 0. 0. 0. -0.8 -1.8 0. 0. ]
[ 0. 0. 0. 0. 0. 0. -0.3 -0.4]
[ 0. 0. 0. 0. 0. 0. 0.1 -1.9]
[ 0. 0. 0. 0. 0. 0. 1.8 0.4]]
RHS
[ 242.7 -5.4 112.9 -95.7 -121.4 26.2 57.9 -12. -88.8 -17.1 -81.6 28.4]
Given lstsq's assumed standard form, there is no way to embed this independence assumption without introducing a lot of zeros!
lstsq is:
- unable to exploit sparsity, because the core algorithm is a dense one
  - look at the transformed shape: it will be heavy in terms of both memory and computation! (see the sketch after this list)
- unable to use information from fit 0 to speed up anything in fit 1
  - the fits are independent after all; in theory there is no information gain
- able to vectorize a lot (but that does not help in general)
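A back-of-the-envelope sketch of that memory point, assuming float64 storage (8 bytes per entry) and the question's shapes:
# Cost of the dense block-diagonal embedding for Z=190, B=20, C=3
Z, B, C = 190, 20, 3
dense_entries = (Z * B) * (Z * C)       # the (3800, 570) embedded LHS
nonzeros = Z * B * C                    # only the diagonal blocks are nonzero
print(nonzeros / dense_entries)         # ~0.0053, the fill ratio reported below
print(dense_entries * 8 / 1e6, 'MB')    # ~17.3 MB, vs. ~0.09 MB of actual data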
Your example shapes
Output trimmed for your specific shapes; this time a sparse solver is tested too:
Added code (at the end)
import scipy.sparse.linalg  # lsmr lives here; `import scipy.sparse` alone may not expose sp.linalg

print('EMBEDDING -> sparse-solver')
Zs_ = sp.csc_matrix(Zs_) # sparse!
start = pc()
result__ = sp.linalg.lsmr(Zs_, Bs_)[0]
end = pc()
print('lhs-shape: ', Zs_.shape)
print('lhs-dense-fill-ratio: ', Zs_.nnz / np.prod(Zs_.shape))
print('used time: ', end-start)
print(result__)
Output
LOOPED CALLS
lhs-shape: (190, 20, 3)
lhs-dense-fill-ratio: 1.0
used time: 0.01716980329027777
[[ 11.9 31.8 29.6]
 ...
 [ 44.8 28.2 62.3]]
EMBEDDING INTO ONE CALL
lhs-shape: (3800, 570)
lhs-dense-fill-ratio: 0.00526315789474
used time: 0.6774500271820254
[ 11.9 31.8 29.6 ... 44.8 28.2 62.3]
EMBEDDING -> sparse-solver
lhs-shape: (3800, 570)
lhs-dense-fill-ratio: 0.00526315789474
used time: 0.0038423098412817547 # a bit of a surprise
[ 11.9 31.8 29.6 ... 44.8 28.2 62.3]
Conclusion
In general: solve independently!
In some cases the task above is solved faster with the sparse-solver approach, but analysis is hard here since we are comparing two completely different algorithms (direct vs. iterative), and the results may change dramatically for other data.
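A small practical footnote on the embedded variants (assuming Z and C were set to the question's 190 and 3): the stacked solution vector is just the z per-problem coefficient vectors concatenated, so one reshape recovers the (z, c) layout the question asked for.
# lsmr/lstsq on the block-diagonal system return a flat (Z*C,) vector
coefs_embedded = np.asarray(result__).reshape(Z, C)   # (190, 3) here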
Here is the linear-algebra solution, with speed right on par with @sascha's looped version for smaller arrays.
print('Matrix formulation')
start = pc()
result = np.squeeze(np.matmul(np.linalg.inv(np.matmul(Zs.swapaxes(1,2), Zs)),
                              np.matmul(Zs.swapaxes(1,2), np.atleast_3d(Bs))))
end = pc()
print('used time: ', end-start)
print(result)
Output:
Matrix formulation
used time: 0.00015713176480858237
[[ 89.2 43.8]
[ 68.5 41.9]
[ 61.9 20.5]
[ 5.1 44.1]]
However, for larger inputs @sascha's answer easily wins out, especially as the size of the third dimension grows (the number of exogenous variables/features).
Z, B, C = 400, 300, 20
Zs = []
Bs = []
for i in range(Z):
    X, y = make_regression(n_samples=B, n_features=C, random_state=i)
    Zs.append(X)
    Bs.append(y)
Zs = np.array(Zs)
Bs = np.array(Bs)
# --------
print('Matrix formulation')
start = pc()
result = np.squeeze(np.matmul(np.linalg.inv(np.matmul(Zs.swapaxes(1,2), Zs)),
                              np.matmul(Zs.swapaxes(1,2), np.atleast_3d(Bs))))
end = pc()
print('used time: ', end-start)
print(result)
# --------
print('Looped calls')
start = pc()
result = np.empty((Z, C))
for z in range(Z):
    result[z] = np.linalg.lstsq(Zs[z], Bs[z])[0]
end = pc()
print('used time: ', end-start)
print(result)
Output:
Matrix formulation
used time: 0.24000779996413257
[[ 1.2e+01 1.3e-15 6.3e+01 ..., -8.9e-15 5.3e-15 -1.1e-14]
[ 5.8e+01 2.7e-14 -4.8e-15 ..., 8.5e+01 -1.5e-14 1.8e-14]
[ 1.2e+01 -1.2e-14 4.4e-16 ..., 6.0e-15 8.6e+01 6.0e+01]
...,
[ 2.9e-15 6.6e+01 1.1e-15 ..., 9.8e+01 -2.9e-14 8.4e+01]
[ 2.8e+01 6.1e+01 -1.2e-14 ..., -2.5e-14 6.3e+01 5.9e+01]
[ 7.0e+01 3.3e-16 8.4e+00 ..., 4.1e+01 -6.2e-15 5.8e+01]]
Looped calls
used time: 0.17400113389658145
[[ 1.2e+01 7.1e-15 6.3e+01 ..., -2.8e-14 1.1e-14 -4.8e-14]
[ 5.8e+01 -5.7e-14 -4.9e-14 ..., 8.5e+01 -5.3e-15 6.8e-14]
[ 1.2e+01 3.6e-14 4.5e-14 ..., -3.6e-15 8.6e+01 6.0e+01]
...,
[ 6.3e-14 6.6e+01 -1.4e-13 ..., 9.8e+01 2.8e-14 8.4e+01]
[ 2.8e+01 6.1e+01 -2.1e-14 ..., -1.4e-14 6.3e+01 5.9e+01]
[ 7.0e+01 -1.1e-13 8.4e+00 ..., 4.1e+01 -9.4e-14 5.8e+01]]
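A variation on the matrix formulation above, sketched under the same Zs/Bs setup: np.linalg.solve broadcasts over the leading axis just like matmul, so the same batched normal equations can be solved without forming explicit inverses, which is generally better conditioned and typically at least as fast.
# Batched normal equations via np.linalg.solve (broadcasts over axis 0)
XtX = np.matmul(Zs.swapaxes(1, 2), Zs)              # (Z, C, C)
Xty = np.matmul(Zs.swapaxes(1, 2), Bs[..., None])   # (Z, C, 1)
result = np.linalg.solve(XtX, Xty)[..., 0]          # (Z, C)
print(result)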