Performance issue with reading DICOM data into cell array

Question

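I need to read 4000 or more DICOM files. I wrote the following code to read the files and store the data in cell arrays so that I can process them later. A single DICOM file contains 128 * 931 data values. But when I execute the code, it takes more than 55 minutes to complete the iterations. Can someone point out the performance issue in the following code?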

% read the file information from the disk to memory
readFile=dir(fullfile('d:\images','*.dcm'));

for i=1:4000

   % Read the information from the dicom files into arrays

   data{i}=dicomread(readFile(i).name);
   info{i}=dicominfo(readFile(i).name);

   data_double{i}=double(data{1,i}); % convert 16 bit data into double
   first_chip{i}=data_double{1,i}(1:129,1:129); % extracting first chip data into an array

end

Answer 1

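You can run the profiler to check which part of the code takes most of the time! As far as I can see, though, the time taken looks realistic for that number of iterations. If you have a multi-core processor, you can try parallel computing (a parfor loop), which should reduce the run time significantly, depending on the number of cores you have.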
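As a rough sketch of the profiling step, the loop could be run on a small sample of files with the profiler switched on. The names folder, files and nTest are made up for this illustration, and the sample size of 50 is an arbitrary assumption.

```matlab
folder = 'd:\images';                      % folder taken from the question
files  = dir(fullfile(folder, '*.dcm'));   % list of DICOM files
nTest  = min(50, numel(files));            % assumed sample size, adjust freely

profile on                                 % start collecting timing data
for k = 1:nTest
    fname = fullfile(folder, files(k).name);
    img   = dicomread(fname);              % pixel data
    meta  = dicominfo(fname);              % header metadata
    chip  = double(img(1:129, 1:129));     % crop, then convert
end
profile off                                % stop collecting
profile viewer                             % open the report, sorted by time
```

The report shows how the time splits between dicomread, dicominfo and the conversion, which tells you whether the loop is limited by file parsing or by the array operations.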

One suggestion is to extract the 'first chip data' first and then convert it to double, because the conversion takes a lot of time.
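The two suggestions in this answer could be combined roughly as follows. This is only a sketch: it assumes the Parallel Computing Toolbox is available (without it, parfor runs as an ordinary loop), and folder, files, n and meta are names invented for the example. If the run is limited by disk speed rather than CPU, the gain from parfor may be small.

```matlab
folder = 'd:\images';                      % folder taken from the question
files  = dir(fullfile(folder, '*.dcm'));
n      = numel(files);

first_chip = cell(n, 1);                   % pre-allocate the outputs
info       = cell(n, 1);

parfor k = 1:n
    meta = dicominfo(fullfile(folder, files(k).name));
    img  = dicomread(meta);                % read pixels using the parsed header
    % Crop first, so only the 129x129 region is converted to double.
    first_chip{k} = double(img(1:129, 1:129));
    info{k}       = meta;
end
```

Each iteration writes only to its own element of first_chip and info, which is what allows MATLAB to distribute the iterations across workers.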

Answer 2

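You are reading 128*931*4000 pixels into memory (assuming 16-bit values, that is close to 1 GB), converting them to double (4 GB) and extracting a region (129*129*4000*8 = 0.5 GB). You are keeping all three copies, which is a terrible amount of data! Try not to keep all the data around: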

folder = 'd:\images';
readFile = dir(fullfile(folder,'*.dcm'));
first_chip = cell(size(readFile));
info = cell(size(readFile));
for ii = 1:numel(readFile)
   info{ii} = dicominfo(fullfile(folder,readFile(ii).name));
   data = dicomread(info{ii});
   data = data(1:129,1:129); % extracting first chip data
   first_chip{ii} = double(data); % convert 16 bit data into double
end

Here, I have pre-allocated the first_chip and info arrays. If you don't do this, the arrays will be re-allocated every time you add an element, causing expensive copies. I have also extracted the ROI first and then converted it to double, so only the small region is converted. Finally, I am re-using the DICOM info structure to read the file. I don't know if this makes a big difference in speed, but it saves the dicomread function some effort.
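The memory estimate at the start of this answer can be checked with a few lines of arithmetic. The sketch below assumes 2 bytes per uint16 pixel and 8 bytes per double, ignores the per-cell overhead of cell arrays, and uses variable names chosen only for this example.

```matlab
n_files = 4000;                            % number of DICOM files

raw_bytes  = 128 * 931 * n_files * 2;      % original 16-bit pixel data
dbl_bytes  = 128 * 931 * n_files * 8;      % full images converted to double
chip_bytes = 129 * 129 * n_files * 8;      % cropped regions stored as double

fprintf('raw:   %.2f GB\n', raw_bytes  / 1e9);
fprintf('dbl:   %.2f GB\n', dbl_bytes  / 1e9);
fprintf('chip:  %.2f GB\n', chip_bytes / 1e9);
fprintf('total: %.2f GB\n', (raw_bytes + dbl_bytes + chip_bytes) / 1e9);
```

This gives roughly 0.95 GB, 3.81 GB and 0.53 GB, about 5.3 GB in total for the original loop. The revised loop keeps only the cropped regions (plus the info structures), so about 0.5 GB of pixel data.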

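Note, however, that this process will still take quite a long time. Reading DICOM files is complex and takes time. I suggest you read them all once and then save the first_chip and info cell arrays to a MAT file, which will be much faster to read later.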
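A minimal sketch of that caching step is shown below. The stand-in data and the cache file name are assumptions made so the example runs on its own; in practice first_chip and info would come from the reading loop.

```matlab
% Stand-ins so the sketch runs on its own; in practice these are the
% first_chip and info cell arrays produced by the reading loop.
first_chip = {rand(129), rand(129)};
info = {struct('Filename', 'a.dcm'), struct('Filename', 'b.dcm')};

% Hypothetical cache location; any writable path will do.
cacheFile = fullfile(tempdir, 'dicom_cache.mat');

% One-time cost: write both cell arrays to a MAT-file.
save(cacheFile, 'first_chip', 'info');

% Later sessions: load the cached arrays instead of re-reading the DICOM files.
cached = load(cacheFile, 'first_chip', 'info');
first_chip_cached = cached.first_chip;
info_cached = cached.info;
```

With 4000 cropped regions the first_chip variable is around 0.5 GB, which fits the default MAT-file format. A variable larger than 2 GB would need the '-v7.3' option of save.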

