TensorFlow: Why is there a need to reshape non-sparse elements once when parsing a TF-example from TFRecord files?

Question

In the TensorFlow documentation on GitHub, there is the following code:

# Reshape non-sparse elements just once:
for k in self._keys_to_features:
  v = self._keys_to_features[k]
  if isinstance(v, parsing_ops.FixedLenFeature):
    example[k] = array_ops.reshape(example[k], v.shape)

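I would like to know why FixedLenFeature tensors need to be reshaped after they are parsed from a TFRecord file.

More generally, what is the difference between FixedLenFeature and VarLenFeature, and how do they relate to a Tensor? In my case I am loading images, so why are they all classified as FixedLenFeature? What would be an example of a VarLenFeature?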

Answer 1

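Tensors are stored on disk without shape information in the Example protocol buffer format (a TFRecord file is a collection of Examples). The documentation in the .proto file describes this fairly well, but the basic point is that tensor entries are stored in row-major order with no shape information, so the shape must be supplied when the tensor is read. Note that the situation is similar for a Tensor held in memory: the shape information is kept separately, and reshaping a Tensor alone only changes metadata (operations such as transpose, on the other hand, can be expensive).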
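To make the row-major point concrete, here is a minimal sketch that uses NumPy as a stand-in for the flat value list stored in an Example. NumPy is an assumption made for illustration only; it is not what the TensorFlow parser uses internally, and the 2x3 array is made up.

```python
import numpy as np

# A made-up 2x3 "image", flattened in row-major (C) order, the same
# layout the answer describes for values stored in an Example.
original = np.arange(6, dtype=np.float32).reshape(2, 3)
flat = original.ravel()

# The flat values alone are ambiguous: both shapes below fit six elements,
# so the reader has to be told which one was intended.
restored = flat.reshape(2, 3)
other = flat.reshape(3, 2)

# Reshape returns a view on the same buffer; only shape metadata changes.
reshape_shares_buffer = np.shares_memory(flat, restored)

# A materialized transpose has to move the data into a new buffer.
transposed = np.ascontiguousarray(restored.T)
transpose_shares_buffer = np.shares_memory(restored, transposed)
```

Here `restored` equals `original` only because the correct shape was supplied; `other` holds the same six values laid out as three rows of two.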

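VarLenFeatures are sequences such as sentences, which are hard to batch together like regular tensors because the resulting shape would be ragged. The parse_example documentation has some good examples. Images are fixed-length in the sense that, if you load a batch of them, they all have the same shape (for example, if they are all 32x32 pixels, a batch of 10 can have shape 10x32x32).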
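The difference between the two feature types can be sketched with the TensorFlow 2 `tf.io` parsing API. This is an illustrative example, not the code from the question: the feature names `image` and `tokens` and the tiny 2x3 "image" are hypothetical, and it assumes eager execution.

```python
import tensorflow as tf

# Build one Example in memory. The feature names "image" and "tokens"
# are hypothetical and chosen only for this illustration.
example = tf.train.Example(features=tf.train.Features(feature={
    # Six floats stored flat; the 2x3 shape is NOT recorded here.
    "image": tf.train.Feature(
        float_list=tf.train.FloatList(value=[0.0, 1.0, 2.0, 3.0, 4.0, 5.0])),
    # A variable-length sequence, e.g. token ids of a sentence.
    "tokens": tf.train.Feature(
        int64_list=tf.train.Int64List(value=[7, 8, 9])),
}))
serialized = example.SerializeToString()

# The shape of the fixed-length feature is supplied at parse time.
feature_spec = {
    "image": tf.io.FixedLenFeature([2, 3], tf.float32),
    "tokens": tf.io.VarLenFeature(tf.int64),
}
parsed = tf.io.parse_single_example(serialized, feature_spec)

image = parsed["image"]      # dense tf.Tensor with shape (2, 3)
tokens = parsed["tokens"]    # tf.sparse.SparseTensor, length varies per Example
dense_tokens = tf.sparse.to_dense(tokens)
```

The fixed-length feature comes back as a dense tensor with the declared shape, while the variable-length feature comes back as a `SparseTensor`, which is why only the non-sparse elements are candidates for a reshape.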


python

machine-learning

computer-vision

deep-learning

tensorflow