How to vectorize a for loop when an iterator is used to access data?
Suppose I have
data = rand(10000,1); % 10000x1 double
x = 8;
y = 10;
offset = 5000; % x,y and offset are scalar.
Currently, I have implemented the logic as follows:
tempData=zeros(x,y)
for i=1:x
tempData(i,:)=data(offset+i-1:x:offset + (x*y) -1)
end
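This takes equally spaced data of length y, starting at offset, and divides it into x buckets. Is it possible to vectorize this code?
If so, I would like to extend it to the case where x, y and offset are themselves vectors of equal length, producing a different tempData for each corresponding set of x, y and offset values. I think tempData should be preallocated as zeros(A,max(X),max(Y)), where A is the common length of the vectors x, y and offset, so that it can hold the data for all ranges. However, I am not sure how to implement this logic either.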
Using implicit expansion (or broadcasting), the needed indices can be pre-calculated, and then data can be accessed with these indices in one step. For MATLAB versions before R2016b, the expansion has to be done explicitly with the bsxfun function.
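To make the index construction concrete, here is a small sketch with toy values (x = 3, y = 4 and offset = 10 are chosen only for illustration and do not come from the question; rowSteps and colSteps are hypothetical helper names). Adding a column vector of row steps to a row vector of column steps expands both into an x-by-y matrix of linear indices:

```matlab
% Toy values, assumed for illustration only
x = 3;
y = 4;
offset = 10;

rowSteps = (0:x-1).';        % x-by-1 column: 0, 1, 2
colSteps = 0:x:(x*y)-1;      % 1-by-y row:    0, 3, 6, 9

% Implicit expansion (R2016b and newer) yields an x-by-y index matrix
idx = rowSteps + offset + colSteps
% idx =
%    10   13   16   19
%    11   14   17   20
%    12   15   18   21
```

Row ii of idx holds exactly the indices offset+ii-1 : x : offset+(x*y)-1 that the loop generates, so data(idx) returns all rows at once.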
Here is one possible vectorization of the given code:
data = rand(10000, 1);
x = 8;
y = 10;
offset = 5000;
tempData = zeros(x, y);
for ii = 1:x
tempData(ii, :) = data(offset+ii-1:x:offset+(x*y)-1);
end
tempData
% MATLAB versions R2016b and newer
idx = [0:x-1].' + offset + [0:x:(x*y)-1];
% MATLAB versions before R2016b
%idx = bsxfun(@plus, bsxfun(@plus, [0:x-1].', offset), [0:x:(x*y)-1]);
tempData2 = data(idx)
disp(['Number of different array elements: ', ...
num2str(numel(find(tempData ~= tempData2)))]);
Output:
tempData =
0.8066402 0.6572843 0.9425518 0.9663419 0.9700796 0.2132531 0.0562514 0.7089385 0.1911747 0.9513211
0.3150179 0.4987158 0.5472079 0.3804589 0.6569250 0.9619353 0.1204870 0.6133104 0.7718005 0.8298695
0.5942941 0.2964820 0.5767488 0.2801063 0.4969586 0.6939726 0.6652277 0.9043894 0.8220853 0.6501431
0.5398818 0.0067256 0.5347702 0.0935663 0.9080668 0.2440419 0.5053460 0.2064903 0.9822692 0.0440910
0.2567786 0.2294226 0.8511809 0.6516491 0.1073913 0.8241950 0.9817716 0.8543800 0.3400275 0.9529938
0.5700380 0.9455092 0.5102088 0.5539329 0.0058831 0.6627464 0.3184132 0.6538248 0.5766122 0.8352150
0.4384866 0.9618210 0.6841067 0.4880946 0.3056896 0.8244916 0.6240189 0.5447771 0.0317932 0.4269364
0.0054480 0.9978763 0.7917681 0.6482806 0.5933597 0.4203822 0.3880279 0.8687756 0.7550784 0.6491559
tempData2 =
0.8066402 0.6572843 0.9425518 0.9663419 0.9700796 0.2132531 0.0562514 0.7089385 0.1911747 0.9513211
0.3150179 0.4987158 0.5472079 0.3804589 0.6569250 0.9619353 0.1204870 0.6133104 0.7718005 0.8298695
0.5942941 0.2964820 0.5767488 0.2801063 0.4969586 0.6939726 0.6652277 0.9043894 0.8220853 0.6501431
0.5398818 0.0067256 0.5347702 0.0935663 0.9080668 0.2440419 0.5053460 0.2064903 0.9822692 0.0440910
0.2567786 0.2294226 0.8511809 0.6516491 0.1073913 0.8241950 0.9817716 0.8543800 0.3400275 0.9529938
0.5700380 0.9455092 0.5102088 0.5539329 0.0058831 0.6627464 0.3184132 0.6538248 0.5766122 0.8352150
0.4384866 0.9618210 0.6841067 0.4880946 0.3056896 0.8244916 0.6240189 0.5447771 0.0317932 0.4269364
0.0054480 0.9978763 0.7917681 0.6482806 0.5933597 0.4203822 0.3880279 0.8687756 0.7550784 0.6491559
Number of different array elements: 0
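As a side note that is not part of the original answer: since idx(ii, jj) = offset + (ii-1) + x*(jj-1), the index matrix simply enumerates the contiguous range offset : offset+x*y-1 in column-major order. Under that observation, the same block could presumably also be obtained with reshape, without building an index matrix at all. A minimal sketch (tempData3 is a name introduced here for comparison):

```matlab
data = rand(10000, 1);
x = 8;
y = 10;
offset = 5000;

% Index-matrix approach from above (R2016b and newer)
idx = (0:x-1).' + offset + (0:x:(x*y)-1);
tempData2 = data(idx);

% Alternative: take the contiguous range and fold it into x rows;
% reshape fills column by column, which matches the stride of x per column
tempData3 = reshape(data(offset:offset+x*y-1), x, y);

% Both results should be identical
disp(isequal(tempData2, tempData3));
```

This relies on the step between columns being exactly x; with any other stride, the index-matrix approach is the one to use.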
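Edit: Regarding the extension to multiple parameters, it is also possible to pre-calculate the needed indices with a vectorized approach (see Divakar's answer here), but the code becomes much harder to read. So, at least in my opinion, using a for loop over the parameter sets together with the vectorized code introduced above is appropriate. I also extended your initial loop code for comparison: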
data = rand(10000, 1);
x = [6, 8];
y = [10, 9];
offset = [3000, 5000];
A = numel(x);
tempData = zeros(max(x), max(y), A);
for jj = 1:A
for ii = 1:x(jj)
X = x(jj);
Y = y(jj);
off = offset(jj);
tempData(ii, 1:Y, jj) = data(off+ii-1:X:off+(X*Y)-1);
end
end
tempData
tempData2 = zeros(max(x), max(y), A);
for jj = 1:A
X = x(jj);
Y = y(jj);
off = offset(jj);
% MATLAB versions R2016b and newer
idx = [0:X-1].' + off + [0:X:(X*Y)-1];
% MATLAB versions before R2016b
%idx = bsxfun(@plus, bsxfun(@plus, [0:X-1].', off), [0:X:(X*Y)-1]);
tempData2(1:X, 1:Y, jj) = data(idx);
end
tempData2
disp(['Number of different array elements: ', ...
num2str(numel(find(tempData ~= tempData2)))]);
Output (shortened):
tempData =
ans(:,:,1) =
0.70758 0.71552 0.54604 0.73202 0.72717 0.16028 0.63080 0.48345 0.93159 0.96625
0.00320 0.11202 0.00179 0.90887 0.21830 0.91380 0.12110 0.31074 0.72834 0.52315
[...]
0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000
0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000
ans(:,:,2) =
0.11960 0.35942 0.62390 0.45457 0.63471 0.23471 0.75660 0.34019 0.06892 0.00000
0.46443 0.21113 0.55479 0.51218 0.83697 0.30117 0.13935 0.81838 0.80042 0.00000
[...]
tempData2 =
ans(:,:,1) =
0.70758 0.71552 0.54604 0.73202 0.72717 0.16028 0.63080 0.48345 0.93159 0.96625
0.00320 0.11202 0.00179 0.90887 0.21830 0.91380 0.12110 0.31074 0.72834 0.52315
[...]
0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000
0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000
ans(:,:,2) =
0.11960 0.35942 0.62390 0.45457 0.63471 0.23471 0.75660 0.34019 0.06892 0.00000
0.46443 0.21113 0.55479 0.51218 0.83697 0.30117 0.13935 0.81838 0.80042 0.00000
[...]
Number of different array elements: 0
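For completeness, a loop-free variant for the multi-parameter case can be sketched with a logical mask. This is only one possible formulation under the assumption of R2016b or newer (implicit expansion), not the approach from the linked answer. The names rows, cols, xs, ys, offs, idx3, valid and tempData3 are introduced here for illustration:

```matlab
data = rand(10000, 1);
x = [6, 8];
y = [10, 9];
offset = [3000, 5000];
A = numel(x);

% Reference: loop over the parameter sets, vectorized inside
tempData2 = zeros(max(x), max(y), A);
for jj = 1:A
  idx = (0:x(jj)-1).' + offset(jj) + (0:x(jj):(x(jj)*y(jj))-1);
  tempData2(1:x(jj), 1:y(jj), jj) = data(idx);
end

% Loop-free variant: put the parameters along the third dimension
rows = (0:max(x)-1).';             % max(x)-by-1 row positions
cols = 0:max(y)-1;                 % 1-by-max(y) column positions
xs   = reshape(x, 1, 1, A);
ys   = reshape(y, 1, 1, A);
offs = reshape(offset, 1, 1, A);

% Candidate indices for every slot of the padded 3-D array
idx3 = offs + rows + xs .* cols;   % max(x)-by-max(y)-by-A

% Only slots inside the x(jj)-by-y(jj) block of each slice are used
valid = (rows < xs) & (cols < ys);

tempData3 = zeros(max(x), max(y), A);
tempData3(valid) = data(idx3(valid));

disp(isequal(tempData2, tempData3));
```

The mask keeps the padded entries at zero and avoids reading data at the unused candidate indices. The price is an index array of size max(x)*max(y)*A, which can waste memory when the x and y values differ widely, so the loop version may still be the better choice.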