TensorFlow text generation RNN example failing on TF 2.6, tf.sparse.to_dense(), Invalid argument: indices[1] = [0] is repeated

I am trying to run through the TensorFlow text generation RNN example,

https://github.com/tensorflow/text/blob/master/docs/tutorials/text_generation.ipynb

running on a local Windows computer with TensorFlow 2.6 installed.

I was able to run through it and train the RNN model successfully. I was getting a "'Tensor' object has no attribute 'numpy'" error, but I added,

tf.compat.v1.enable_eager_execution()

and that resolved it.

But now, when I try to test the model with some text, I get the error,

Invalid argument: indices[1] = [0] is repeated

This happens in the tf.sparse.to_dense call inside the __init__ method of the OneStep class.

class OneStep(tf.keras.Model):
  def __init__(self, model, chars_from_ids, ids_from_chars, temperature=1.0):
    super().__init__()
    self.temperature = temperature
    self.model = model
    self.chars_from_ids = chars_from_ids
    self.ids_from_chars = ids_from_chars

    print(len(ids_from_chars.get_vocabulary()))
    # Create a mask to prevent "[UNK]" from being generated.
    skip_ids = self.ids_from_chars(['[UNK]'])[:, None]
    sparse_mask = tf.SparseTensor(
        # Put a -inf at each bad index.
        values=[-float('inf')]*len(skip_ids),
        indices=skip_ids,
        # Match the shape to the vocabulary
        dense_shape=[len(ids_from_chars.get_vocabulary())])
    print(sparse_mask)
    self.prediction_mask = tf.sparse.to_dense(sparse_mask)

I added some debug output to print the vocabulary size of ids_from_chars and the sparse mask:

76
SparseTensor(indices=tf.Tensor(
[[0]
[0]], shape=(2, 1), dtype=int64), values=tf.Tensor([-inf -inf], shape=(2,), dtype=float32), dense_shape=tf.Tensor([76], shape=(1,), dtype=int64))
2021-08-25 15:28:23.935194: W tensorflow/core/framework/op_kernel.cc:1692] OP_REQUIRES failed at sparse_to_dense_op.cc:162 : Invalid argument: indices[1] = [0] is repeated
Traceback (most recent call last):
File "app.py", line 1041, in test_nlp_text_generation
result = text_generation.predictionfunction(text, analytic_id)
File "D:\Projects\python-run-2\text_generation.py", line 238, in predictionfunction
one_step_model = OneStep(model, chars_from_ids, ids_from_chars)
File "D:\Projects\python-run-2\text_generation.py", line 166, in __init__
self.prediction_mask = tf.sparse.to_dense(sparse_mask)
File "D:\Users\james\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\sparse_ops.py", line 1721, in sparse_tensor_to_dense
name=name)
File "D:\Users\james\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\gen_sparse_ops.py", line 3161, in sparse_to_dense
_ops.raise_from_not_ok_status(e, name)
File "D:\Users\james\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\framework\ops.py", line 6941, in raise_from_not_ok_status
six.raise_from(core._status_to_exception(e.code, message), None)
File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[1] = [0] is repeated [Op:SparseToDense]

Also, I previously had this example running fine on my computer. I just reinstalled TensorFlow and was trying the demo again from scratch.

Any idea what is causing this error, or how to fix it?

I reproduced this error with the code below:

import tensorflow as tf

# The indices tensor contains the same value twice.
skip_ids = tf.constant([[0], [0]], dtype=tf.int64)

sparse_mask = tf.SparseTensor(
    # Put a -inf at each bad index.
    values=[-float('inf')] * len(skip_ids),
    indices=skip_ids,
    # Match the shape to the vocabulary
    dense_shape=[76])
print(sparse_mask)

prediction_mask = tf.sparse.to_dense(sparse_mask)
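As a quick check before calling tf.sparse.to_dense, the repeated indices can be detected by counting how often each index occurs. This is only a minimal sketch: `has_duplicate_indices` is a hypothetical helper name, not part of TensorFlow or the tutorial, and it assumes eager execution and an index tensor of shape (N, 1).

```python
import tensorflow as tf


def has_duplicate_indices(indices):
    """Return True if any value in an (N, 1) integer index tensor repeats.

    Hypothetical helper for illustration; assumes eager execution.
    """
    # Flatten (N, 1) to (N,), because tf.unique_with_counts needs a 1-D input.
    flat = tf.reshape(indices, [-1])
    _, _, counts = tf.unique_with_counts(flat)
    # Any count above 1 means the same position would be written twice.
    return bool(tf.reduce_any(counts > 1).numpy())


skip_ids = tf.constant([[0], [0]], dtype=tf.int64)
print(has_duplicate_indices(skip_ids))
```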

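Your indices contain the same value more than once, and assigning a value to the same position twice is not allowed. Take the unique values of the indices tensor first: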

import tensorflow as tf

skip_ids = tf.constant([[0], [0]], dtype=tf.int64)

# get unique indices
tmp1 = tf.reshape(skip_ids, shape=(-1,))
uniques, idx, counts = tf.unique_with_counts(tmp1)
uniques_ids = tf.expand_dims(uniques, axis=1)


sparse_mask = tf.SparseTensor(
    # Put a -inf at each bad index.
    values=[-float('inf')] * len(uniques_ids),
    indices=uniques_ids,
    # Match the shape to the vocabulary
    dense_shape=[76])
print(sparse_mask)

prediction_mask = tf.sparse.to_dense(sparse_mask)

print(prediction_mask)

My TensorFlow version is 2.1.0.
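For the OneStep class from the question, the same deduplication could be wrapped in a small function. The following is a sketch under assumptions, not the tutorial's own code: `build_prediction_mask` is a hypothetical name, and the extra tf.sort is added because tf.sparse.to_dense also validates index ordering by default, while tf.unique returns values in first-occurrence order.

```python
import tensorflow as tf


def build_prediction_mask(skip_ids, vocab_size):
    """Dense mask with -inf at every id in skip_ids and 0 elsewhere.

    Hypothetical helper. skip_ids is an int64 tensor of shape (N, 1)
    and may contain repeated ids.
    """
    # Drop repeated ids, since tf.sparse.to_dense rejects repeated indices.
    unique_ids, _ = tf.unique(tf.reshape(skip_ids, [-1]))
    # Sort so the sparse indices are in ascending order.
    unique_ids = tf.sort(unique_ids)
    sparse_mask = tf.SparseTensor(
        # One -inf per unique bad index.
        values=tf.fill(tf.shape(unique_ids), -float('inf')),
        indices=unique_ids[:, None],
        # Match the shape to the vocabulary.
        dense_shape=[vocab_size])
    return tf.sparse.to_dense(sparse_mask)


mask = build_prediction_mask(tf.constant([[0], [0]], dtype=tf.int64), 76)
print(mask[:3])
```

Inside OneStep.__init__, the mask could then be assigned with `self.prediction_mask = build_prediction_mask(skip_ids, len(ids_from_chars.get_vocabulary()))`. Note that this only works around the duplicate; it does not explain why `ids_from_chars(['[UNK]'])` returned two rows in the question's setup.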