Vectorized code slower than for loop in Matlab
I have an 8x8 matrix named gimg. I ran the code below on 5 different gimg matrices, once with nested for loops and once vectorized.
tic
dm = zeros(size(gimg));
for x = 1:size(gimg, 1)
    for y = 1:size(gimg, 2)
        dm(x, y) = (1/(1 + (x - y)^2))*gimg(x,y);
    end
end
toc

tic
[x,y] = ndgrid(1:size(gimg, 1),1:size(gimg, 2));
dm = (ones(size(gimg))./(1 + (x - y).^2)).*gimg;
toc
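Here are the results (in each pair, the first time is the for loop and the second is the vectorized version):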
Elapsed time is 0.000057 seconds.
Elapsed time is 0.000247 seconds.
Elapsed time is 0.000062 seconds.
Elapsed time is 0.000199 seconds.
Elapsed time is 0.000056 seconds.
Elapsed time is 0.000195 seconds.
Elapsed time is 0.000055 seconds.
Elapsed time is 0.000192 seconds.
Elapsed time is 0.000056 seconds.
Elapsed time is 0.000187 seconds.
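Is this because the matrix is so small?
I found that "feature accel" in Matlab significantly changes the timing of the for loop. So my question is: given these JIT compiler features, is it still worth vectorizing code?
Update:
Here is an example of my gimg matrix: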
gimg =
259 42 0 0 0 0 0 0
42 1064 41 0 0 0 0 0
0 55 3444 196 0 0 0 0
0 0 215 3581 47 0 0 0
0 0 0 100 806 3 0 0
0 0 0 0 3 2 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
Update 2: Results of @Divakar's code
>> test_vct
------------------------ With Original Loopy Approach
Elapsed time is 5.269883 seconds.
------------------------ With Original Vectorized Approach
Elapsed time is 6.314792 seconds.
------------------------ With Proposed Vectorized Approach
Elapsed time is 3.146764 seconds.
>>
So on my computer, the original vectorized approach is still slower than the loop.
My computer specs and Matlab version:
- Matlab 2015a
- Windows 8.1 x64
- Intel i7 860 2.80 GHz
- 16 GB RAM
- Nvidia GeForce GTS 250
This seems to be faster than both of those:
dm = (1./(1+bsxfun(@minus,[1:size(gimg, 1)]',1:size(gimg, 2)).^2).*gimg);
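Before comparing timings, it may be worth confirming that the three formulations really compute the same matrix. The following is a minimal sketch, not part of the original post: the random input, the variable names (dm_loop, dm_grid, dm_bsx, max_diff) and the tolerance are assumptions chosen for illustration.

```matlab
% Sanity check (illustrative): the loop, ndgrid and bsxfun versions
% should agree up to floating-point rounding.
gimg = rand(8, 8);              % hypothetical test input
[nr, nc] = size(gimg);

% Nested-loop version
dm_loop = zeros(nr, nc);
for x = 1:nr
    for y = 1:nc
        dm_loop(x, y) = (1/(1 + (x - y)^2))*gimg(x, y);
    end
end

% ndgrid version: builds two full index matrices first
[X, Y] = ndgrid(1:nr, 1:nc);
dm_grid = (1 ./ (1 + (X - Y).^2)) .* gimg;

% bsxfun version: expands a column and a row vector on the fly
dm_bsx = (1 ./ (1 + bsxfun(@minus, (1:nr)', 1:nc).^2)) .* gimg;

% Largest absolute deviation of either vectorized result from the loop
max_diff = max(abs([dm_loop(:) - dm_grid(:); dm_loop(:) - dm_bsx(:)]));
fprintf('max abs difference: %g\n', max_diff);
```

The bsxfun version avoids materializing the two index matrices that ndgrid creates, which is one plausible reason it runs faster on such a small input. In R2016b and later, implicit expansion allows (1:nr)' - (1:nc) to be written directly; that is not available in the R2015a release mentioned in the question.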
Benchmark code:
%// Random input
gimg = rand(8,8);

%// Number of trials (keep this a big number, so as to get runtimes of 1 sec+)
num_iter = 100000;

disp('------------------------ With Original Loopy Approach')
tic
for iter = 1:num_iter
    dm = zeros(size(gimg));
    for x = 1:size(gimg, 1)
        for y = 1:size(gimg, 2)
            dm(x, y) = (1/(1 + (x - y)^2))*gimg(x,y);
        end
    end
end
toc

disp('------------------------ With Original Vectorized Approach')
tic
for iter = 1:num_iter
    [x,y] = ndgrid(1:size(gimg, 1),1:size(gimg, 2));
    dm2 = (ones(size(gimg))./(1 + (x - y).^2)).*gimg;
end
toc

disp('------------------------ With Proposed Vectorized Approach')
tic
for iter = 1:num_iter
    dm3 = (1./(1+bsxfun(@minus,[1:size(gimg, 1)]',1:size(gimg, 2)).^2).*gimg);
end
toc
Results:
------------------------ With Original Loopy Approach
Elapsed time is 4.996531 seconds.
------------------------ With Original Vectorized Approach
Elapsed time is 2.684011 seconds.
------------------------ With Proposed Vectorized Approach
Elapsed time is 1.338118 seconds.
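Two further points may be worth trying, neither of which was measured in the runs above. First, the weight matrix 1./(1 + (x - y).^2) depends only on the size of gimg, so it can be computed once and reused across matrices of the same size. Second, timeit (available, as far as I know, since R2013b) handles warm-up and repetition itself, which tends to be more reliable than a single tic/toc for microsecond-scale work. The sketch below assumes both; the names w, f_full and f_cached are made up for illustration.

```matlab
% Illustrative sketch: cache the size-dependent weights and time with timeit.
gimg = rand(8, 8);              % hypothetical test input
[nr, nc] = size(gimg);

% Weights depend only on (nr, nc), so build them once up front
w = 1 ./ (1 + bsxfun(@minus, (1:nr)', 1:nc).^2);

% Handles for the two variants being compared
f_full   = @() (1 ./ (1 + bsxfun(@minus, (1:nr)', 1:nc).^2)) .* gimg;
f_cached = @() w .* gimg;

% timeit calls each handle repeatedly and returns a typical time in seconds
t_full   = timeit(f_full);
t_cached = timeit(f_cached);

fprintf('rebuild weights each call: %.3g s\n', t_full);
fprintf('reuse cached weights:      %.3g s\n', t_cached);
```

Whether caching pays off depends on how many matrices of the same size are processed. For a single 8x8 input the difference is likely to be negligible in absolute terms.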