How do I load in the MNIST digits and label data in MATLAB?

Question

I am trying to run the code given at the link

https://github.com/bd622/DiscretHashing

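Discrete Hashing is a dimensionality reduction method used for approximate nearest neighbor search. I want to load in the MNIST database, available at http://yann.lecun.com/exdb/mnist/, for use with this implementation. I have extracted the files from the compressed gz format.

Problem 1:

Using the solution provided in "Reading MNIST Image Database binary file in MATLAB" to read the MNIST database, I get the following error: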

Error using fread
Invalid file identifier.  Use fopen to generate a valid file identifier.

Error in Reading (line 7)
A = fread(fid, 1, 'uint32');

The code is as follows:

clear all;
close all;

%//Open file
fid = fopen('t10k-images-idx3-ubyte', 'r');

A = fread(fid, 1, 'uint32');
magicNumber = swapbytes(uint32(A));

%//For each image, store into an individual cell
imageCellArray = cell(1, totalImages);
for k = 1 : totalImages
    %//Read in numRows*numCols pixels at a time
    A = fread(fid, numRows*numCols, 'uint8');
    %//Reshape so that it becomes a matrix
    %//We are actually reading this in column major format
    %//so we need to transpose this at the end
    imageCellArray{k} = reshape(uint8(A), numCols, numRows)';
end

%//Close the file
fclose(fid);

Update: Problem 1 is solved. The revised code is

clear all;
close all;

%//Open file
fid = fopen('t10k-images.idx3-ubyte', 'r');

A = fread(fid, 1, 'uint32');
magicNumber = swapbytes(uint32(A));

%//Read in total number of images
%//A = fread(fid, 4, 'uint8');
%//totalImages = sum(bitshift(A', [24 16 8 0]));

%//OR
A = fread(fid, 1, 'uint32');
totalImages = swapbytes(uint32(A));

%//Read in number of rows
%//A = fread(fid, 4, 'uint8');
%//numRows = sum(bitshift(A', [24 16 8 0]));

%//OR
A = fread(fid, 1, 'uint32');
numRows = swapbytes(uint32(A));

%//Read in number of columns
%//A = fread(fid, 4, 'uint8');
%//numCols = sum(bitshift(A', [24 16 8 0]));

%// OR
A = fread(fid, 1, 'uint32');
numCols = swapbytes(uint32(A));

for k = 1 : totalImages
    %//Read in numRows*numCols pixels at a time
    A = fread(fid, numRows*numCols, 'uint8');
    %//Reshape so that it becomes a matrix
    %//We are actually reading this in column major format
    %//so we need to transpose this at the end
    imageCellArray{k} = reshape(uint8(A), numCols, numRows)';
end

%//Close the file
fclose(fid);

Problem 2:

I cannot understand how to apply the 4 MNIST files in the code. The code contains the variables

traindata = double(traindata);
testdata = double(testdata);

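How do I prepare the MNIST database so that I can apply it to the implementation?

Update: I implemented the solution, but I keep getting this error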

Error using fread
Invalid file identifier.  Use fopen to generate a valid file identifier.

Error in mnist_parse (line 11)
A = fread(fid1, 1, 'uint32');

These are the files

demo.m % This is the main file that calls the function to read in the MNIST data

clear all
clc
[Trainimages, Trainlabels] = mnist_parse('C:\Users\Desktop\MNIST\train-images-idx3-ubyte', 'C:\Users\Desktop\MNIST\train-labels-idx1-ubyte');

[Testimages, Testlabels] = mnist_parse('t10k-images-idx3-ubyte', 't10k-labels-idx1-ubyte');

k=5;
digit = images(:,:,k);
lbl = label(k);

 function [images, labels] = mnist_parse(path_to_digits, path_to_labels)

% Open files
fid1 = fopen(path_to_digits, 'r');

% The labels file
fid2 = fopen(path_to_labels, 'r');

% Read in magic numbers for both files
A = fread(fid1, 1, 'uint32');
magicNumber1 = swapbytes(uint32(A)); % Should be 2051
fprintf('Magic Number - Images: %d\n', magicNumber1);

A = fread(fid2, 1, 'uint32');
magicNumber2 = swapbytes(uint32(A)); % Should be 2049
fprintf('Magic Number - Labels: %d\n', magicNumber2);

% Read in total number of images
% Ensure that this number matches with the labels file
A = fread(fid1, 1, 'uint32');
totalImages = swapbytes(uint32(A));
A = fread(fid2, 1, 'uint32');
if totalImages ~= swapbytes(uint32(A))
    error('Total number of images read from images and labels files are not the same');
end
fprintf('Total number of images: %d\n', totalImages);

% Read in number of rows
A = fread(fid1, 1, 'uint32');
numRows = swapbytes(uint32(A));

% Read in number of columns
A = fread(fid1, 1, 'uint32');
numCols = swapbytes(uint32(A));

fprintf('Dimensions of each digit: %d x %d\n', numRows, numCols);

% For each image, store into an individual slice
images = zeros(numRows, numCols, totalImages, 'uint8');
for k = 1 : totalImages
    % Read in numRows*numCols pixels at a time
    A = fread(fid1, numRows*numCols, 'uint8');

    % Reshape so that it becomes a matrix
    % We are actually reading this in column major format
    % so we need to transpose this at the end
    images(:,:,k) = reshape(uint8(A), numCols, numRows).';
end

% Read in the labels
labels = fread(fid2, totalImages, 'uint8');

% Close the files
fclose(fid1);
fclose(fid2);

end

Answer 1

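I am the original author of approach #1 that you mention. The process of reading in the training and test data is quite simple. As far as reading in the images goes, the code you showed above reads the pixel data correctly and stores it in a cell array. However, you are missing the reads for the number of images, rows and columns stored in the file. Note that the MNIST format for this file is laid out as follows. The left column is the offset in bytes, relative to the beginning of the file: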

[offset] [type]          [value]          [description]
0000     32 bit integer  0x00000803(2051) magic number
0004     32 bit integer  60000            number of images
0008     32 bit integer  28               number of rows
0012     32 bit integer  28               number of columns
0016     unsigned byte   ??               pixel
0017     unsigned byte   ??               pixel
........
xxxx     unsigned byte   ??               pixel

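The first four bytes are a magic number, 2051, which lets you check that you are reading the file properly. The next four bytes give the total number of images, the four bytes after that give the number of rows, and the last four header bytes give the number of columns. There should be 60000 images of size 28 rows x 28 columns. After the header, the pixels are stored in row-major order, so you have to loop over each series of 28 x 28 pixels and store them. In this case, I stored them in a cell array, and each element of the cell array is a single digit. The same format applies to the test data, but there are 10000 images instead.

As for the actual labels, the format is roughly the same, with a slight difference: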
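To make the row-major point concrete, here is a minimal sketch on a made-up stream of 12 "pixels" (two 2 x 3 images). The sizes and values are invented for illustration and are not MNIST data. It shows why the reshape uses numCols first and is followed by a transpose, and that the same idea extends to all images at once with permute:

```matlab
% Hypothetical toy stream: two images of 2 rows x 3 columns, stored row by row.
numRows = 2; numCols = 3; totalImages = 2;
stream = uint8(1:12).';

% MATLAB fills matrices column by column, so reshape to numCols x numRows
% first and transpose to recover the row-major image.
img1 = reshape(stream(1:numRows*numCols), numCols, numRows).';

% Same idea for every image in one step: reshape to a 3D array, then swap
% the first two dimensions.
stack = permute(reshape(stream, numCols, numRows, totalImages), [2 1 3]);

disp(img1);          % first image:  [1 2 3; 4 5 6]
disp(stack(:,:,2));  % second image: [7 8 9; 10 11 12]
```

For the real files, numRows and numCols would both be 28 and the stream would come from fread.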

[offset] [type]          [value]          [description]
0000     32 bit integer  0x00000801(2049) magic number (MSB first)
0004     32 bit integer  60000            number of items
0008     unsigned byte   ??               label
0009     unsigned byte   ??               label
........
xxxx     unsigned byte   ??               label

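The first four bytes are a magic number, 2049. The second set of four bytes tells you how many labels there are, and after that there is exactly 1 byte for each corresponding digit in the dataset. The test data uses the same format, but there are 10000 labels. Therefore, once you have read in the header of the label file, a single fread call, with the data read as unsigned 8-bit integers, is enough to read in the rest of the labels.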
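As a small illustration of the label layout, the sketch below writes a tiny synthetic label file (3 made-up labels, not real MNIST data) into a temporary file and reads it back. It opens the file with the 'ieee-be' machine format so that the 32-bit header fields come back in the right byte order directly; the file name and label values are assumptions made only for this example:

```matlab
% Build a tiny synthetic IDX1 label file: magic number, item count, labels.
tmp = [tempname '.idx1-ubyte'];
fid = fopen(tmp, 'w', 'ieee-be');      % big-endian, MSB first
fwrite(fid, [2049 3], 'uint32');       % header: magic number, number of items
fwrite(fid, [5 0 4], 'uint8');         % one byte per label
fclose(fid);

% Read it back: two 32-bit header fields, then one fread for all labels.
fid = fopen(tmp, 'r', 'ieee-be');
magic    = fread(fid, 1, 'uint32');
numItems = fread(fid, 1, 'uint32');
labels   = fread(fid, numItems, 'uint8');
fclose(fid);
delete(tmp);

fprintf('magic = %d, items = %d\n', magic, numItems);
disp(labels.');
```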

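The reason you have to use swapbytes is that the header integers are stored MSB first (big-endian), whereas fread uses the machine's native byte order by default, which is little-endian on typical x86 hardware. In little-endian order the least significant byte of a group of bytes is assumed to come first, so the value comes out scrambled. Calling swapbytes afterwards reverses the byte order and recovers the intended value.

Therefore, I have modified this code for you so that it is an actual function that takes two strings: the full path to the digit image file and the full path to the label file. I have also changed the code so that the images are stored in a 3D numeric matrix rather than a cell array, for faster processing. Note that when you start reading in the actual image data, each pixel is an unsigned 8-bit integer, so no byte swapping is needed. Swapping is only required when a single value spans multiple bytes, as the 32-bit header fields do: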
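The byte-order issue can be checked on the four magic-number bytes alone. The sketch below writes the bytes 00 00 08 03 to a temporary file and reads them back in two ways: the little-endian read followed by swapbytes (the approach used in this answer), and, as an alternative that is not part of the original code, a read with the 'ieee-be' machine format passed to fopen. The little-endian format is forced here only so that the demonstration behaves the same on any machine:

```matlab
% The magic number of an MNIST image file, byte by byte (MSB first).
tmp = [tempname '.bin'];
fid = fopen(tmp, 'w');
fwrite(fid, [0 0 8 3], 'uint8');
fclose(fid);

% Little-endian read, then swapbytes.
fid = fopen(tmp, 'r', 'ieee-le');
raw = fread(fid, 1, 'uint32');          % 0x03080000 = 50855936
a = swapbytes(uint32(raw));             % 0x00000803 = 2051
fclose(fid);

% Alternative: let fopen handle the byte order.
fid = fopen(tmp, 'r', 'ieee-be');
b = fread(fid, 1, 'uint32');
fclose(fid);
delete(tmp);

fprintf('raw = %d, swapped = %d, big-endian read = %d\n', raw, a, b);
```

Both routes should give 2051. The big-endian variant has the side benefit of not depending on the byte order of the machine running the code.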

function [images, labels] = mnist_parse(path_to_digits, path_to_labels)

% Open files
fid1 = fopen(path_to_digits, 'r');

% The labels file
fid2 = fopen(path_to_labels, 'r');

% Read in magic numbers for both files
A = fread(fid1, 1, 'uint32');
magicNumber1 = swapbytes(uint32(A)); % Should be 2051
fprintf('Magic Number - Images: %d\n', magicNumber1);

A = fread(fid2, 1, 'uint32');
magicNumber2 = swapbytes(uint32(A)); % Should be 2049
fprintf('Magic Number - Labels: %d\n', magicNumber2);

% Read in total number of images
% Ensure that this number matches with the labels file
A = fread(fid1, 1, 'uint32');
totalImages = swapbytes(uint32(A));
A = fread(fid2, 1, 'uint32');
if totalImages ~= swapbytes(uint32(A))
    error('Total number of images read from images and labels files are not the same');
end
fprintf('Total number of images: %d\n', totalImages);

% Read in number of rows
A = fread(fid1, 1, 'uint32');
numRows = swapbytes(uint32(A));

% Read in number of columns
A = fread(fid1, 1, 'uint32');
numCols = swapbytes(uint32(A));

fprintf('Dimensions of each digit: %d x %d\n', numRows, numCols);

% For each image, store into an individual slice
images = zeros(numRows, numCols, totalImages, 'uint8');
for k = 1 : totalImages
    % Read in numRows*numCols pixels at a time
    A = fread(fid1, numRows*numCols, 'uint8');

    % Reshape so that it becomes a matrix
    % We are actually reading this in column major format
    % so we need to transpose this at the end
    images(:,:,k) = reshape(uint8(A), numCols, numRows).';
end

% Read in the labels
labels = fread(fid2, totalImages, 'uint8');

% Close the files
fclose(fid1);
fclose(fid2);

end

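To call this function, simply specify the paths to the image file and the label file. Assuming you are running this in the same directory where the files are located, you would do the following for the training images: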

[images, labels] = mnist_parse('train-images-idx3-ubyte', 'train-labels-idx1-ubyte');

Likewise, you would do the following for the test images:

[images, labels] = mnist_parse('t10k-images-idx3-ubyte', 't10k-labels-idx1-ubyte');
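The "Invalid file identifier" error quoted in the question means that fopen returned -1, so fread never received a valid file. That usually points to a wrong path or file name. Note that the revised code in the question opens 't10k-images.idx3-ubyte' (with a dot), so the extracted files may be named differently from the names used here. A minimal sketch of a check to run before parsing, where the file name is only a placeholder to adjust:

```matlab
% Placeholder name: change it to match the file as it exists on disk.
fname = 't10k-images-idx3-ubyte';

% The second output of fopen carries the system error message on failure.
[fid, msg] = fopen(fname, 'r');
if fid == -1
    fprintf('Could not open "%s": %s\n', fname, msg);
    fprintf('Current folder: %s\n', pwd);
    % Show similarly named files in the current folder, if any.
    listing = dir('*ubyte*');
    disp({listing.name});
else
    fprintf('"%s" opened successfully.\n', fname);
    fclose(fid);
end
```

Adding the same fid == -1 test right after each fopen call inside mnist_parse would turn the confusing fread error into a clear message about the path.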

To access the kth digit, you simply do:

digit = images(:,:,k);

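The corresponding label for the kth digit is:

lbl = labels(k);

To finally get this data into a format that the code I saw on GitHub accepts: it assumes that the rows correspond to training examples and the columns correspond to features. If you want this format, simply reshape the data so that the pixels of each image are spread across the columns.

Therefore, just do this: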

[traindata, traingnd] = mnist_parse('train-images-idx3-ubyte', 'train-labels-idx1-ubyte');
traindata = double(reshape(traindata, size(traindata,1)*size(traindata,2), []).');
traingnd = double(traingnd);

[testdata, testgnd] = mnist_parse('t10k-images-idx3-ubyte', 't10k-labels-idx1-ubyte');
testdata = double(reshape(testdata, size(testdata,1)*size(testdata,2), []).');
testgnd = double(testgnd);

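The above uses the same variables as in the script, so you should be able to plug it in and it should work. The second line reshapes the matrix so that each digit is in a column, and then transposes it so that each digit ends up in a row. We also need to cast to double, as that is what the GitHub code does. The same logic applies to the test data. Also note that I have explicitly cast the training and test labels to double, to ensure maximum compatibility with whatever algorithm you decide to use on this data.

Happy digit hacking!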
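To see what the reshape and transpose do, here is a minimal sketch on a made-up 2 x 3 x 2 array standing in for the output of mnist_parse. The values are invented for illustration. Each image becomes one row of the result, with its pixels unrolled column by column:

```matlab
% Hypothetical stand-in for the 3D output of mnist_parse: two 2 x 3 "images".
imgs = cat(3, uint8([1 2 3; 4 5 6]), uint8([7 8 9; 10 11 12]));

% Unroll each image into a column, then transpose so each image is a row.
X = double(reshape(imgs, size(imgs,1)*size(imgs,2), []).');

disp(size(X));   % one row per image, one column per pixel
disp(X);
```

With the real training set, the same two steps should give a 60000 x 784 matrix of doubles.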
