Vectorized code slower than for loop in Matlab

I have an 8x8 matrix named gimg. I ran the code below on 5 different gimg matrices, computing the same result two ways: first with a for loop, then vectorized.

tic
dm = zeros(size(gimg));

for x = 1:size(gimg, 1)
    for y = 1:size(gimg, 2)
        dm(x, y) = (1/(1 + (x - y)^2))*gimg(x,y);
    end
end
toc

tic
[x,y] = ndgrid(1:size(gimg, 1),1:size(gimg, 2));  

dm = (ones(size(gimg))./(1 + (x - y).^2)).*gimg;
toc

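Here are the results. In each pair, the first time is the for loop and the second is the vectorized version: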

Elapsed time is 0.000057 seconds.
Elapsed time is 0.000247 seconds.

Elapsed time is 0.000062 seconds.
Elapsed time is 0.000199 seconds.

Elapsed time is 0.000056 seconds.
Elapsed time is 0.000195 seconds.

Elapsed time is 0.000055 seconds.
Elapsed time is 0.000192 seconds.

Elapsed time is 0.000056 seconds.
Elapsed time is 0.000187 seconds.

Is this because the matrix has only single-digit dimensions?

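I found that the "feature accel" setting in Matlab significantly changes the for-loop timings. So my question is: with these JIT compiler features, is it still worth vectorizing code nowadays?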

Update: here is an example of one of my gimg matrices

gimg =

         259          42           0           0           0           0           0           0
          42        1064          41           0           0           0           0           0
           0          55        3444         196           0           0           0           0
           0           0         215        3581          47           0           0           0
           0           0           0         100         806           3           0           0
           0           0           0           0           3           2           0           0
           0           0           0           0           0           0           0           0
           0           0           0           0           0           0           0           0

Update 2: results of @Divakar's code

>> test_vct
------------------------ With Original Loopy Approach
Elapsed time is 5.269883 seconds.
------------------------ With Original Vectorized Approach
Elapsed time is 6.314792 seconds.
------------------------ With Proposed Vectorized Approach
Elapsed time is 3.146764 seconds.
>> 

So on my computer, the original vectorized approach is still slower than the loop.

My computer specs and Matlab version

This seems to be faster than both of them -

dm = (1./(1+bsxfun(@minus,[1:size(gimg, 1)]',1:size(gimg, 2)).^2).*gimg);
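
One likely reason this form does less work: the weights 1/(1 + (x - y)^2) depend only on the row and column indices, so bsxfun can build them from one column vector and one row vector, without first creating the two full index matrices that ndgrid returns. The sketch below checks that the variants give the same result. It assumes MATLAB R2016b or later for the implicit-expansion line; magic(8) is only a stand-in input, and the names rows, cols, dm_bsx, dm_imp and dm_loop are illustrative.

```matlab
gimg = magic(8);                 % stand-in 8x8 input, not the data from the question
rows = (1:size(gimg, 1))';       % column vector of row indices
cols = 1:size(gimg, 2);          % row vector of column indices

% bsxfun expands the column and row vectors into an 8x8 matrix of differences
dm_bsx = (1 ./ (1 + bsxfun(@minus, rows, cols).^2)) .* gimg;

% Implicit expansion (assumes R2016b or later) gives the same without bsxfun
dm_imp = gimg ./ (1 + (rows - cols).^2);

% Reference: the original double loop
dm_loop = zeros(size(gimg));
for x = 1:size(gimg, 1)
    for y = 1:size(gimg, 2)
        dm_loop(x, y) = (1/(1 + (x - y)^2))*gimg(x, y);
    end
end

% Largest absolute deviation of each vectorized variant from the loop result
disp(max(abs(dm_bsx(:) - dm_loop(:))))
disp(max(abs(dm_imp(:) - dm_loop(:))))
```

Both displayed deviations should be at the level of floating-point rounding.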

Benchmark code -

%// Random input
gimg = rand(8,8);

%// Number of trials (keep this a big number, so as to get runtimes of 1 sec+)
num_iter = 100000;

disp('------------------------ With Original Loopy Approach')
tic
for iter = 1:num_iter
    dm = zeros(size(gimg));     
    for x = 1:size(gimg, 1)
        for y = 1:size(gimg, 2)
            dm(x, y) = (1/(1 + (x - y)^2))*gimg(x,y);
        end
    end
end
toc

disp('------------------------ With Original Vectorized Approach')
tic
for iter = 1:num_iter
    [x,y] = ndgrid(1:size(gimg, 1),1:size(gimg, 2));
    dm2 = (ones(size(gimg))./(1 + (x - y).^2)).*gimg;
end
toc

disp('------------------------ With Proposed Vectorized Approach')
tic
for iter = 1:num_iter
    dm3 = (1./(1+bsxfun(@minus,[1:size(gimg, 1)]',1:size(gimg, 2)).^2).*gimg);
end
toc

Results -

------------------------ With Original Loopy Approach
Elapsed time is 4.996531 seconds.
------------------------ With Original Vectorized Approach
Elapsed time is 2.684011 seconds.
------------------------ With Proposed Vectorized Approach
Elapsed time is 1.338118 seconds.
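
A single tic/toc around an 8x8 computation measures tens of microseconds, so timings at that scale are easily dominated by overhead and noise. As an alternative to the manual repetition loop, here is a minimal sketch using timeit, which calls the function several times and returns the median of the measurements. It assumes MATLAB R2013b or later and that the code is saved as a function file. The names bench_dm, dm_loop, dm_ndgrid and dm_bsxfun are hypothetical helpers, and wrapping each approach in a function may change how the JIT treats it, so the numbers need not match the script timings above.

```matlab
function bench_dm()
% Hypothetical benchmark harness; all function names here are illustrative.
gimg = rand(8, 8);

t_loop = timeit(@() dm_loop(gimg));
t_grid = timeit(@() dm_ndgrid(gimg));
t_bsx  = timeit(@() dm_bsxfun(gimg));

fprintf('loop:   %.3g s\nndgrid: %.3g s\nbsxfun: %.3g s\n', t_loop, t_grid, t_bsx);
end

function dm = dm_loop(gimg)
% Original double-loop approach
dm = zeros(size(gimg));
for x = 1:size(gimg, 1)
    for y = 1:size(gimg, 2)
        dm(x, y) = (1/(1 + (x - y)^2))*gimg(x, y);
    end
end
end

function dm = dm_ndgrid(gimg)
% Original vectorized approach with full index matrices
[x, y] = ndgrid(1:size(gimg, 1), 1:size(gimg, 2));
dm = (ones(size(gimg))./(1 + (x - y).^2)).*gimg;
end

function dm = dm_bsxfun(gimg)
% Proposed bsxfun approach
dm = (1./(1 + bsxfun(@minus, (1:size(gimg, 1))', 1:size(gimg, 2)).^2)).*gimg;
end
```

Calling bench_dm prints one median time per approach.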