How to get predictions for one image from MNIST with TensorFlow?

Question

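I followed this tutorial https://www.tensorflow.org/tutorials/layers and trained a model to recognize handwritten digits from the MNIST set.

The following code works as expected and prints the probabilities and class for every image in the set: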

import tensorflow as tf

# cnn_model_fn is the model function defined in the tutorial linked above
mnist = tf.contrib.learn.datasets.load_dataset("mnist")
train_data = mnist.train.images  # Returns np.array
tf.reset_default_graph()
with tf.Session() as sess:

  mnist_classifier = tf.estimator.Estimator(model_fn=cnn_model_fn, model_dir="model/")

  pred = mnist_classifier.predict(input_fn=tf.estimator.inputs.numpy_input_fn(
      x={"x": train_data},
      shuffle=False))

  # predict() returns a generator; iterate to get one result per image
  for p in pred:
    print(p)

However, when I try to predict for only one image using

mnist_classifier.predict(input_fn=tf.estimator.inputs.numpy_input_fn(
          x={"x": train_data[0]},
          shuffle=False))

my program fails and TensorFlow reports

InvalidArgumentError: Input to reshape is a tensor with 128 values,
but the requested shape requires a multiple of 784

This confuses me, because when I print the length of the first image in the set, it reports 784:

print("length of input: {}".format(len(train_data[0])))

How can I get a prediction for a single image?

Answer 1

This is most likely because you drop the batch dimension when creating the single-item dataset. What I mean is that you should use

mnist_classifier.predict(input_fn=tf.estimator.inputs.numpy_input_fn(
      x={"x": np.array([train_data[0]])},
      shuffle=False))

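instead. Note the additional list around train_data[0]. This takes an array of shape [1, 784] and creates a dataset with a single element, which is in turn a vector of 784 elements. As your code stands now, you are essentially creating a dataset of 784 elements, each of which is a single number. That causes the shape mismatch.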
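The shape difference can be checked with NumPy alone, without running the model. The sketch below is only an illustration: the zero-filled train_data is a stand-in for the real MNIST array, and first_batch is a hypothetical helper that mimics, very roughly, how an input function slices examples along the first axis. It assumes the default batch_size of 128 for numpy_input_fn, which would explain the "128 values" in the error, and it assumes the tutorial's model reshapes its input to [-1, 28, 28, 1].

```python
import numpy as np

# Stand-in for the real train_data (assumption: shape [num_images, 784]).
train_data = np.zeros((5, 784), dtype=np.float32)

single = train_data[0]               # shape (784,): the batch axis is gone
batched = np.array([train_data[0]])  # shape (1, 784): one example with 784 values
also_batched = train_data[0:1]       # slicing keeps the leading axis as well


def first_batch(x, batch_size=128):
    """Hypothetical helper: take the first batch of examples along axis 0."""
    return x[:batch_size]


# With the batch axis missing, the 784 pixels are treated as 784 examples,
# so the first batch holds 128 scalars, which cannot be reshaped to 28x28 images.
print(single.shape, first_batch(single).shape)
# With the batch axis present, the first batch is one full image.
print(batched.shape, first_batch(batched).shape)
```

Either np.array([train_data[0]]) or the slice train_data[0:1] should work as the value for "x", since both keep the leading batch axis.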

Answer 2

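You can also use tf.expand_dims. The documentation says:

This operation is useful if you want to add a batch dimension to a single element. For example, if you have a single image of shape [height, width, channels], you can make it a batch of 1 image with expand_dims(image, 0), which will make the shape [1, height, width, channels].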
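As a minimal sketch of the same idea on the NumPy side: numpy_input_fn takes NumPy arrays, so the example below uses np.expand_dims, the NumPy counterpart of tf.expand_dims, rather than the TensorFlow op itself. The zero-filled arrays are placeholders for real image data.

```python
import numpy as np

# Placeholder image of shape [height, width, channels].
image = np.zeros((28, 28, 1), dtype=np.float32)
batch = np.expand_dims(image, 0)           # shape (1, 28, 28, 1)

# Placeholder flattened MNIST image, as stored in train_data.
flat = np.zeros(784, dtype=np.float32)
flat_batch = np.expand_dims(flat, axis=0)  # shape (1, 784)

# Indexing with np.newaxis is an equivalent spelling.
same = flat[np.newaxis, :]

print(batch.shape, flat_batch.shape, same.shape)
```

Under these assumptions, passing x={"x": np.expand_dims(train_data[0], 0)} to numpy_input_fn would give the model a batch containing one image.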

python

mnist

tensorflow