Attach a queue to a numpy array in tensorflow for data fetch instead of files?

Question

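I have read the CNN Tutorial on TensorFlow and I am trying to use the same model for my project. The problem now is data reading. I have around 25000 images for training and around 5000 for testing and validation. The files are in png format, and I can read them and convert them into numpy.ndarray.

The CNN example in the tutorial uses a queue to fetch records from the provided file list. I tried to create my own binary file by reshaping my images into a 1-D array and attaching a label value in front of it. So my data looks like this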

[[1,12,34,24,53,...,105,234,102],
 [12,112,43,24,52,...,115,244,98],
....
]

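Each row of the above array has length 22501, where the first element is the label.

I dumped the array to a file using pickle and tried to read from the file using tf.FixedLengthRecordReader, as demonstrated in the example.

I am doing the same things as given in cifar10_input.py to read the binary file and put the records into the record object.

Now when I read from the file, the labels and the image values are different from what I wrote. I can understand the reason: pickle also dumps extra information (braces and brackets) into the binary file, and that changes the fixed-length record size.

The above example takes the filenames and passes them to a queue to fetch the files, and the reader then reads a single record at a time from a file.

I want to know if I can pass the numpy array as defined above instead of the filenames to some reader and it can fetch records one by one from that array instead of the files.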

Answer 1

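Probably the easiest way to make your data work with the CNN example code is to make a modified version of read_cifar10() and use it instead:

Write out a binary file containing the contents of your numpy array.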
```
import numpy as np
images_and_labels_array = np.array([[...], ...],  # [[1,12,34,24,53,...,102],
                                                  #  [12,112,43,24,52,...,98],
                                                  #  ...]
                                   dtype=np.uint8)

images_and_labels_array.tofile("/tmp/images.bin")
```
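This file is similar to the format used in the CIFAR10 data files. You might want to generate multiple files in order to get read parallelism. Note that ndarray.tofile() writes binary data in row-major order with no other metadata; pickling the array will add Python-specific metadata that TensorFlow's parsing routines do not understand.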
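
The size difference between the two formats is easy to check outside TensorFlow. The following is a minimal sketch that assumes random stand-in data with the 22501-byte record size from the question; the temporary file path and the variable names are illustrative only.

```python
import os
import pickle
import tempfile

import numpy as np

# Hypothetical stand-in data: 4 records of 1 label byte + 22500 image bytes.
num_records, record_bytes = 4, 22501
rng = np.random.default_rng(0)
data = rng.integers(0, 256, size=(num_records, record_bytes), dtype=np.uint8)

# tofile() writes only the raw row-major bytes, with no header.
path = os.path.join(tempfile.mkdtemp(), "images.bin")
data.tofile(path)
raw_size = os.path.getsize(path)

# Pickling the same array adds Python-specific metadata around the bytes.
pickled_size = len(pickle.dumps(data))

# The raw file can be read back and split into fixed-length rows again.
restored = np.fromfile(path, dtype=np.uint8).reshape(-1, record_bytes)

print("raw file is an exact multiple of the record size:",
      raw_size == num_records * record_bytes)
print("pickle output is larger than the raw bytes:", pickled_size > raw_size)
print("round trip preserved the data:", np.array_equal(restored, data))
```

Because the raw file is an exact multiple of the record size, a fixed-length reader can step through it record by record, which is not true of the pickled bytes.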

Write a modified version of read_cifar10() that handles your record format.

def read_my_data(filename_queue):

  class ImageRecord(object):
    pass
  result = ImageRecord()

  # Dimensions of the images in the dataset.
  label_bytes = 1
  # Set the following constants as appropriate.
  result.height = IMAGE_HEIGHT
  result.width = IMAGE_WIDTH
  result.depth = IMAGE_DEPTH
  image_bytes = result.height * result.width * result.depth
  # Every record consists of a label followed by the image, with a
  # fixed number of bytes for each.
  record_bytes = label_bytes + image_bytes

  assert record_bytes == 22501  # Based on your question.

  # Read a record, getting filenames from the filename_queue.  No
  # header or footer in the binary, so we leave header_bytes
  # and footer_bytes at their default of 0.
  reader = tf.FixedLengthRecordReader(record_bytes=record_bytes)
  result.key, value = reader.read(filename_queue)

  # Convert from a string to a vector of uint8 that is record_bytes long.
  record_bytes = tf.decode_raw(value, tf.uint8)

  # The first bytes represent the label, which we convert from uint8->int32.
  result.label = tf.cast(
      tf.slice(record_bytes, [0], [label_bytes]), tf.int32)

  # The remaining bytes after the label represent the image, which we reshape
  # from [depth * height * width] to [depth, height, width].
  depth_major = tf.reshape(tf.slice(record_bytes, [label_bytes], [image_bytes]),
                           [result.depth, result.height, result.width])
  # Convert from [depth, height, width] to [height, width, depth].
  result.uint8image = tf.transpose(depth_major, [1, 2, 0])

  return result
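
The slicing and reshaping in read_my_data() can be mirrored in plain NumPy to check which memory layout your records actually use. This is a sketch only: parse_record is a hypothetical helper, and the dimensions 75 x 100 x 3 are an assumption chosen because they multiply to 22500. If the images were flattened directly from [height, width, depth] arrays, the depth-major reshape copied from the CIFAR-10 reader would scramble the pixels, and a direct reshape would be needed instead.

```python
import numpy as np


def parse_record(record, height, width, depth, depth_major=True):
    """Split one uint8 record into (label, image[height, width, depth])."""
    label = int(record[0])
    pixels = record[1:]
    if depth_major:
        # Layout the CIFAR-10 reader assumes: [depth, height, width].
        image = pixels.reshape(depth, height, width).transpose(1, 2, 0)
    else:
        # Layout produced by flattening a [height, width, depth] array directly.
        image = pixels.reshape(height, width, depth)
    return label, image


# Hypothetical dimensions; 75 * 100 * 3 = 22500 image bytes per record.
height, width, depth = 75, 100, 3
original = (np.arange(height * width * depth) % 251).astype(np.uint8)
original = original.reshape(height, width, depth)

# The same image stored in the two possible layouts, each prefixed by label 3.
record_hwc = np.concatenate(([3], original.ravel())).astype(np.uint8)
record_chw = np.concatenate(([3], original.transpose(2, 0, 1).ravel())).astype(np.uint8)

label_a, image_a = parse_record(record_hwc, height, width, depth, depth_major=False)
label_b, image_b = parse_record(record_chw, height, width, depth, depth_major=True)

print(label_a, image_a.shape, np.array_equal(image_a, original))
print(label_b, image_b.shape, np.array_equal(image_b, original))
```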

Modify distorted_inputs() to use your new dataset:

def distorted_inputs(data_dir, batch_size):
  """[...]"""
  filenames = ["/tmp/images.bin"]  # Or a list of filenames if you
                                   # generated multiple files in step 1.
  for f in filenames:
    if not gfile.Exists(f):
      raise ValueError('Failed to find file: ' + f)

  # Create a queue that produces the filenames to read.
  filename_queue = tf.train.string_input_producer(filenames)

  # Read examples from files in the filename queue.
  read_input = read_my_data(filename_queue)
  reshaped_image = tf.cast(read_input.uint8image, tf.float32)

  # [...] (Maybe modify other parameters in here depending on your problem.)

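This is intended to be a minimal set of steps, given your starting point. It may be more efficient to do the PNG decoding using TensorFlow ops, but that would be a larger change.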
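
If you want several files for read parallelism, as mentioned in step 1, one possible approach is to split the array by rows and write each piece with tofile(). This is a sketch under assumptions: write_shards, the images_%d.bin naming scheme, and the zero-filled stand-in data are all hypothetical.

```python
import os
import tempfile

import numpy as np


def write_shards(array, out_dir, num_shards):
    """Write the rows of `array` into `num_shards` raw binary files."""
    filenames = []
    for i, shard in enumerate(np.array_split(array, num_shards, axis=0)):
        path = os.path.join(out_dir, "images_%d.bin" % i)
        shard.tofile(path)  # whole rows only, so records never straddle files
        filenames.append(path)
    return filenames


record_bytes = 22501
data = np.zeros((10, record_bytes), dtype=np.uint8)
data[:, 0] = np.arange(10)  # stand-in labels 0..9

filenames = write_shards(data, tempfile.mkdtemp(), num_shards=3)
sizes = [os.path.getsize(f) for f in filenames]
print(filenames)
print(sizes)
```

The resulting list can be used as the filenames value in distorted_inputs(). Each file holds a whole number of records, so the fixed-length reader stays aligned in every file.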

Answer 2

In your question, you specifically asked:

I want to know if I can pass the numpy array as defined above instead of the filenames to some reader and it can fetch records one by one from that array instead of the files.

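You can feed the numpy array to a queue directly, but this will be a more invasive change to the cifar10_input.py code than what I suggested in my other answer.

As before, let's assume you have the following array from your question: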

import numpy as np
images_and_labels_array = np.array([[...], ...],  # [[1,12,34,24,53,...,102],
                                                  #  [12,112,43,24,52,...,98],
                                                  #  ...]
                                   dtype=np.uint8)

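You can then define a queue that contains the entire data as follows:

# FIFOQueue requires a capacity; it must hold every row for enqueue_many
# to complete without blocking.
q = tf.FIFOQueue(capacity=images_and_labels_array.shape[0],
                 dtypes=[tf.uint8, tf.uint8], shapes=[[], [22500]])
enqueue_op = q.enqueue_many([images_and_labels_array[:, 0], images_and_labels_array[:, 1:]])

...then call sess.run(enqueue_op) to populate the queue.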

Another, more efficient approach would be to feed records into the queue, which you could do from a parallel thread:

# [With q as defined above.]
label_input = tf.placeholder(tf.uint8, shape=[])
image_input = tf.placeholder(tf.uint8, shape=[22500])

enqueue_single_from_feed_op = q.enqueue([label_input, image_input])

# Then, to enqueue a single example `i` from the array.
sess.run(enqueue_single_from_feed_op,
         feed_dict={label_input: images_and_labels_array[i, 0],
                    image_input: images_and_labels_array[i, 1:]})

Alternatively, enqueue a batch at a time, which will be more efficient:

label_batch_input = tf.placeholder(tf.uint8, shape=[None])
image_batch_input = tf.placeholder(tf.uint8, shape=[None, 22500])

# enqueue_many splits the batch along its first dimension into single elements.
enqueue_batch_from_feed_op = q.enqueue_many([label_batch_input, image_batch_input])

# Then, to enqueue a batch of examples `i` through `j-1` from the array.
sess.run(enqueue_batch_from_feed_op,
         feed_dict={label_batch_input: images_and_labels_array[i:j, 0],
                    image_batch_input: images_and_labels_array[i:j, 1:]})
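
The shape of the feeding loop, with a parallel thread pushing batches while the main thread consumes them, can be sketched without TensorFlow. In the sketch below the standard library's queue.Queue is a stand-in for tf.FIFOQueue, and the put() call stands in for sess.run() on the batch enqueue op; feed_batches, the batch size of 4, and the None sentinel are all assumptions made for illustration.

```python
import queue
import threading

import numpy as np


def feed_batches(array, batch_size, out_queue):
    """Push (labels, images) batches into out_queue, then a None sentinel."""
    for start in range(0, array.shape[0], batch_size):
        rows = array[start:start + batch_size]
        # In the TensorFlow version this would be a sess.run() call on the
        # batch enqueue op, with these two slices passed through feed_dict.
        out_queue.put((rows[:, 0], rows[:, 1:]))
    out_queue.put(None)  # tell the consumer there is no more data


# Stand-in data: 10 records of 1 label byte + 22500 image bytes.
data = np.zeros((10, 22501), dtype=np.uint8)
data[:, 0] = np.arange(10)

# Bounded queue: the producer blocks whenever the consumer falls behind.
q = queue.Queue(maxsize=2)
producer = threading.Thread(target=feed_batches, args=(data, 4, q))
producer.start()

batches = []
while True:
    item = q.get()
    if item is None:
        break
    batches.append(item)
producer.join()

for labels, images in batches:
    print(labels.shape, images.shape)
```

The last batch is smaller than the others when the number of rows is not a multiple of the batch size, which is why the batch placeholders above leave their first dimension as None.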

Answer 3

I want to know if I can pass the numpy array as defined above instead of the filenames to some reader and it can fetch records one by one from that array instead of the files.

tf.py_func, which wraps a Python function and uses it as a TensorFlow operator, might help.

However, since you mentioned that your images are stored in png files, I think the simplest solution would be to replace this:

reader = tf.FixedLengthRecordReader(record_bytes=record_bytes)
result.key, value = reader.read(filename_queue)

with this:

# filename_queue must now yield the paths of the PNG files themselves.
result.key, value = tf.WholeFileReader().read(filename_queue)
value = tf.image.decode_png(value)


python

machine-learning

tensorflow