No gradients provided when training word embeddings from dataset

I'm trying to train custom word embeddings from a TF2 dataset. My text is already encoded as integers, and the model trains fine on example datasets (i.e. ones loaded from the tfds catalog). But when I feed in tensors from my own (batched) dataset, the model fails to start training with ValueError: No gradients provided for any variable: ['embed/embeddings:0', 'relu/kernel:0', 'relu/bias:0', 'out/kernel:0', 'out/bias:0'].

I don't understand why this happens. A similar code example that produces the same error:

import tensorflow as tf
from tensorflow import keras 
from tensorflow.keras import layers

# build example dataset
tensor = tf.constant([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
dataset = tf.data.Dataset.from_tensor_slices(tensor)

model = keras.Sequential([
  layers.Embedding(100, 18, name='embed'),
  layers.GlobalAveragePooling1D(),
  layers.Dense(16, activation='relu', name='relu'),
  layers.Dense(1, name='out')
], name="embedder")

model.summary() # shows 2,121 trainable params
model.compile(optimizer='adam',
              loss=tf.keras.losses.BinaryCrossentropy(),
              metrics=['accuracy'])

model.fit( # breaks here
    dataset,
    epochs=10,
    validation_data=dataset)

The documentation for the x argument of tf.keras.Sequential.fit says (original documentation; since a dataset is being used, the tf.data dataset bullet appears to be the relevant one):

Input data. It could be:

  • A Numpy array (or array-like), or a list of arrays (in case the model has multiple inputs).
  • A TensorFlow tensor, or a list of tensors (in case the model has multiple inputs).
  • A dict mapping input names to the corresponding array/tensors, if the model has named inputs.
  • A tf.data dataset. Should return a tuple of either (inputs, targets) or (inputs, targets, sample_weights).
  • A generator or keras.utils.Sequence returning (inputs, targets) or (inputs, targets, sample_weights). A more detailed description of unpacking behavior for iterator types (Dataset, generator, Sequence) is given below.

However, the dataset seems to be set up to yield only a single tensor per element (rather than a tuple of tensors), so fit has no targets to compute a loss against:

tensor = tf.constant([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
dataset = tf.data.Dataset.from_tensor_slices(tensor)
print(dataset.element_spec)
>>> TensorSpec(shape=(3,), dtype=tf.int32, name=None)

Building the dataset with a list of targets gives the model what it needs:

tensor = tf.constant([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
dataset = tf.data.Dataset.from_tensor_slices((
  tensor,
  [[0.0], [1.0], [1.0]]  # Arbitrary targets.
))
print(dataset.element_spec)
>>> (TensorSpec(shape=(3,), dtype=tf.int32, name=None), TensorSpec(shape=(1,), dtype=tf.float32, name=None))
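
For reference, here is a minimal end-to-end sketch (not from the original post) that attaches the targets and batches the dataset, since Keras treats each element of a tf.data dataset passed to fit as a batch; from_logits=True is assumed here because the out layer has no activation:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Encoded sequences paired with arbitrary binary targets, then batched
# so each dataset element is an (inputs, targets) batch.
tensor = tf.constant([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
targets = tf.constant([[0.0], [1.0], [1.0]])
dataset = tf.data.Dataset.from_tensor_slices((tensor, targets)).batch(2)

model = keras.Sequential([
  layers.Embedding(100, 18, name='embed'),
  layers.GlobalAveragePooling1D(),
  layers.Dense(16, activation='relu', name='relu'),
  layers.Dense(1, name='out')
], name="embedder")

model.compile(optimizer='adam',
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.fit(dataset, epochs=10, validation_data=dataset)  # no ValueError

With the (inputs, targets) structure in place, fit can compute the loss against the targets, and gradients flow to all of the variables listed in the original error.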