No gradients provided when training word embeddings from dataset
I'm trying to train custom word embeddings from a TF2 dataset. My text is already encoded as integers, and the model trains fine on an example dataset (i.e. one loaded from the tfds catalog). But when I feed it tensors from my own (batched) dataset, training fails with ValueError: No gradients provided for any variable: ['embed/embeddings:0', 'relu/kernel:0', 'relu/bias:0', 'out/kernel:0', 'out/bias:0'].
I don't understand why this happens. A minimal example that produces the same error:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# build example dataset
tensor = tf.constant([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
dataset = tf.data.Dataset.from_tensor_slices(tensor)

model = keras.Sequential([
    layers.Embedding(100, 18, name='embed'),
    layers.GlobalAveragePooling1D(),
    layers.Dense(16, activation='relu', name='relu'),
    layers.Dense(1, name='out')
], name="embedder")
model.summary()  # shows 2,121 trainable params
model.compile(optimizer='adam',
              loss=tf.keras.losses.BinaryCrossentropy(),
              metrics=['accuracy'])
model.fit(  # breaks here
    dataset,
    epochs=10,
    validation_data=dataset)
The documentation for the x argument of tf.keras.Sequential.fit says (original documentation; since a dataset is being used, I've bolded the part that seems relevant):
Input data. It could be:
- A Numpy array (or array-like), or a list of arrays (in case the model has multiple inputs).
- A TensorFlow tensor, or a list of tensors (in case the model has multiple inputs).
- A dict mapping input names to the corresponding array/tensors, if the model has named inputs.
- **A tf.data dataset. Should return a tuple of either (inputs, targets) or (inputs, targets, sample_weights).**
- A generator or keras.utils.Sequence returning (inputs, targets) or (inputs, targets, sample_weights). A more detailed description of unpacking behavior for iterator types (Dataset, generator, Sequence) is given below.
However, the dataset was set up to yield only a single tensor (rather than an (inputs, targets) tuple):
tensor = tf.constant([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
dataset = tf.data.Dataset.from_tensor_slices(tensor)
print(dataset.element_spec)
>>> TensorSpec(shape=(3,), dtype=tf.int32, name=None)
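This is exactly why there are no gradients: with a lone tensor element, Keras unpacks it as inputs only, with no targets, so no loss can be computed against the trainable variables. In recent TF2 versions you can see this unpacking directly with the public helper tf.keras.utils.unpack_x_y_sample_weight (a sketch; the helper may not exist in very old TF2 releases):

```python
import tensorflow as tf

# A bare tensor element is treated as (x, y=None, sample_weight=None):
# there are no targets, hence no loss, hence "No gradients provided".
x, y, sw = tf.keras.utils.unpack_x_y_sample_weight(tf.constant([1, 2, 3]))
print(y is None, sw is None)  # both None without an (inputs, targets) tuple
```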
Building the dataset with a list of targets gives the model what it needs:
tensor = tf.constant([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
dataset = tf.data.Dataset.from_tensor_slices((
tensor,
[[0.0], [1.0], [1.0]] # Arbitrary targets.
))
print(dataset.element_spec)
>>> (TensorSpec(shape=(3,), dtype=tf.int32, name=None), TensorSpec(shape=(1,), dtype=tf.float32, name=None))
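Putting it together, here is a minimal sketch of the fixed training run (the binary targets are arbitrary, and from_logits=True is an assumption I've added because the final Dense layer has no sigmoid; note also that a tf.data dataset passed to fit() must already be batched):

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Features: integer-encoded "sentences"; targets: arbitrary binary labels.
features = tf.constant([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
targets = tf.constant([[0.0], [1.0], [1.0]])

# Yield (inputs, targets) tuples, then batch -- Keras expects datasets
# passed to fit() to be batched already.
dataset = tf.data.Dataset.from_tensor_slices((features, targets)).batch(3)

model = keras.Sequential([
    layers.Embedding(100, 18, name='embed'),
    layers.GlobalAveragePooling1D(),
    layers.Dense(16, activation='relu', name='relu'),
    layers.Dense(1, name='out')
], name="embedder")
model.compile(optimizer='adam',
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              metrics=['accuracy'])

# Trains without the "No gradients provided" ValueError.
history = model.fit(dataset, epochs=2, verbose=0)
print(sorted(history.history.keys()))
```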