Fine-tune Universal Sentence Encoder with Keras

Question

I am trying to fine-tune the Universal Sentence Encoder and then use the new encoder layer for something else.

import tensorflow as tf
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.layers import Dense, Dropout
import tensorflow_hub as hub

# Hub module handle, as given in the post
module_url = "universal-sentence-encoder"

# X (input sentences) and y (labels) are defined elsewhere
model = Sequential([
    # Encoder layer: takes raw strings; trainable=True enables fine-tuning
    hub.KerasLayer(module_url, input_shape=[], dtype=tf.string, trainable=True, name="use"),
    Dropout(0.5, name="dropout"),
    Dense(256, activation="relu", name="dense"),
    Dense(len(y), activation="sigmoid", name="activation")
])

model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(X, y, batch_size=256, epochs=30, validation_split=0.25)

This worked. The loss went down and the accuracy was decent. Now I want to extract just the Universal Sentence Encoder layer. However, this is what I get.

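Do you know how I can solve this nan problem? I expect to see numeric encodings.
Is saving the tuned_use layer as a model, as has been suggested, the only option? Ideally, I would like to save the tuned_use layer the same way the Universal Sentence Encoder itself is saved, so that I can load it with hub.KerasLayer(tuned_use_location, input_shape=[], dtype=tf.string).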
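
One way to narrow down where the nan values come from is to check whether the fine-tuned weights themselves contain NaN. The following is a minimal sketch, not part of the original post: the helper name nan_report and the demo arrays are hypothetical, and it only assumes the weights can be obtained as NumPy arrays.

```python
import numpy as np


def nan_report(named_arrays):
    """Return the names of arrays that contain at least one NaN value."""
    return [
        name
        for name, arr in named_arrays.items()
        if np.isnan(np.asarray(arr, dtype=float)).any()
    ]


# Hypothetical stand-ins for encoder weights, used only for illustration.
demo_weights = {
    "embedding": np.array([[0.1, 0.2], [0.3, 0.4]]),
    "projection": np.array([[np.nan, 1.0], [0.5, 0.6]]),
}

bad = nan_report(demo_weights)
print(bad)
```

With the model from the question, the dictionary could presumably be built as {w.name: w.numpy() for w in model.get_layer("use").weights}, assuming TensorFlow 2 eager execution. If any weight is reported, the NaN values were probably introduced during training rather than during extraction of the layer.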

Answer 1

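I hope this helps someone. I ended up solving this by using universal-sentence-encoder-4 instead of universal-sentence-encoder-large-5. I spent a lot of time troubleshooting, but it was difficult because there was nothing wrong with the input data and the model trained successfully. The cause may be an exploding gradient problem, but I could not add gradient clipping or Leaky ReLU to the original architecture.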
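
If exploding gradients are the suspected cause, one option that leaves the layer stack unchanged is to clip gradients at the optimizer, since Keras optimizers accept clipnorm and clipvalue arguments. The sketch below is an illustration under assumptions, not the answer's method: a small Dense model on random data stands in for the encoder so that it runs without downloading a Hub module, and the clipping threshold and learning rate are untuned placeholder values. Whether this removes the NaN values with universal-sentence-encoder-large-5 is not confirmed by the post.

```python
import numpy as np
import tensorflow as tf

# Stand-in data (hypothetical): random features replace sentence inputs.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(64, 8)).astype("float32")
labels = rng.integers(0, 3, size=64)
y_demo = tf.keras.utils.to_categorical(labels, num_classes=3)

# Stand-in model (hypothetical): replaces the hub.KerasLayer-based stack.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax"),
])

# clipnorm rescales each gradient tensor so its L2 norm is at most 1.0.
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3, clipnorm=1.0)
model.compile(optimizer=optimizer, loss="categorical_crossentropy", metrics=["accuracy"])
history = model.fit(X_demo, y_demo, batch_size=16, epochs=2, verbose=0)

# Verify that no weight became NaN or infinite during training.
weights_finite = all(np.isfinite(w).all() for w in model.get_weights())
print("all weights finite:", weights_finite)
```

In the question's setup, the same optimizer object would be passed to model.compile in place of the string "adam". A smaller learning rate is another common adjustment when fine-tuning a large pretrained encoder.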
