Computing gradients of a multi-output model in Keras giving conversion to Tensorflow DType error

I have a multi-output model in Keras (18 outputs, to be exact), each output with its own loss function. I am trying to mimic the region proposal network of Faster R-CNN. Before training, I want to make sure that the gradients of my model are in order, and I have the following snippet:

with tf.GradientTape() as tape:
    loss = RegionProposalNetwork.evaluate(first_batch)[0]
    t = tape.watched_variables()
grads = tape.gradient(loss, RegionProposalNetwork.trainable_variables)
print(grads)
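
For context, a minimal sketch of the kind of setup I am describing; the layer sizes, head names, losses, and data below are placeholders, not my actual RPN (which has 18 outputs):

import numpy as np
import tensorflow as tf

# Hypothetical stand-in: one shared backbone with two output heads,
# each compiled with its own loss function.
inputs = tf.keras.Input(shape=(32,))
shared = tf.keras.layers.Dense(64, activation="relu")(inputs)
head_a = tf.keras.layers.Dense(1, name="head_a")(shared)
head_b = tf.keras.layers.Dense(4, name="head_b")(shared)

RegionProposalNetwork = tf.keras.Model(inputs, [head_a, head_b])
RegionProposalNetwork.compile(
    optimizer="adam",
    loss={"head_a": "mse", "head_b": "mse"},
)

# Hypothetical data pipeline; the real one yields (inputs, targets) batches.
x = np.random.rand(64, 32).astype("float32")
y_a = np.random.rand(64, 1).astype("float32")
y_b = np.random.rand(64, 4).astype("float32")
dataset = tf.data.Dataset.from_tensor_slices((x, (y_a, y_b))).batch(8)

# take(1) returns a tf.data.Dataset containing a single batch,
# which evaluate() accepts directly.
first_batch = dataset.take(1)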

The variable first_batch is obtained from a tf.data object using the take() function (roughly as in the sketch above). The return value loss is an array of size 19, where loss[0] is the sum of all the loss functions, a.k.a. the overall loss. Before the gradient array can be printed, I get the following error message/trace:

Traceback (most recent call last):
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2020.2\plugins\python-ce\helpers\pydev\pydevd.py", line 1448, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2020.2\plugins\python-ce\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:/Users/James/PycharmProjects/Masters/models/MoreTesting.py", line 469, in <module>
    grads = tape.gradient(loss, RegionProposalNetwork.trainable_variables)
  File "C:\Users\James\Anaconda3\envs\masters\lib\site-packages\tensorflow\python\eager\backprop.py", line 1034, in gradient
    if not backprop_util.IsTrainable(t):
  File "C:\Users\James\Anaconda3\envs\masters\lib\site-packages\tensorflow\python\eager\backprop_util.py", line 30, in IsTrainable
    dtype = dtypes.as_dtype(dtype)
  File "C:\Users\James\Anaconda3\envs\masters\lib\site-packages\tensorflow\python\framework\dtypes.py", line 650, in as_dtype
    (type_value,))
TypeError: Cannot convert value 29.614826202392578 to a TensorFlow DType.

where the float 29.614826202392578 is the overall loss from this call to the model's evaluate function. I am not sure what this error means. For reference, all input data types and intermediate layer results are tensors of tf.float32 values. Any insight is appreciated.
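
To make sure I understood the error, here is a stripped-down snippet that seems to reproduce the same failure; the variable is a toy, not my model:

import tensorflow as tf

w = tf.Variable(1.0)
with tf.GradientTape() as tape:
    y = w * 2.0

# Passing a plain Python float as the target, which is what evaluate()
# hands back, appears to trigger the same error:
# TypeError: Cannot convert value 0.5 to a TensorFlow DType.
grads = tape.gradient(0.5, [w])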

Edit: if I try converting the loss to a tensor using tf.convert_to_tensor, I no longer get the error, but the returned gradients are all None. I have already tested that my model's weights do update when fit() is called, so something is going wrong.
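
My reading of why the gradients come back as None, illustrated with a toy variable rather than my actual model: the tensor produced by tf.convert_to_tensor is a fresh constant that was never recorded on the tape, so there is no path from it back to the trainable variables.

import tensorflow as tf

w = tf.Variable(1.0)
with tf.GradientTape() as tape:
    y = w * 2.0           # recorded on the tape
    loss_value = 3.0      # plain Python float, e.g. what evaluate() returns

# The converted tensor is a brand-new constant with no recorded ops behind it,
# so the tape has nothing to differentiate through and returns None.
loss_tensor = tf.convert_to_tensor(loss_value)
print(tape.gradient(loss_tensor, [w]))   # [None]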

The issue I was having is that the return value of evaluate(), described here:

Return Scalar test loss (if the model has a single output and no metrics) or list of scalars (if the model has multiple outputs and/or metrics). The attribute model.metrics_names will give you the display labels for the scalar outputs.

is not a tensor. Similarly, model.predict() will not work, because the result is a numpy array, which breaks the gradient computation. To compute the gradients, the loop works if I simply call the model on the test input data and then compute the loss function against the ground-truth values, i.e.

with tf.GradientTape() as tape:
    model_output = model(input)
    loss = loss_fn(output, model_output)
gradients = tape.gradient(loss, model.trainable_variables)

# And if you are using a generator, the batch can be fetched first like this:
batch = data_iterator.get_next()
input = batch[0]   # model inputs
output = batch[1]  # ground-truth targets
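
Putting the pieces together, a sketch of how the training-style loop can look for a multi-output model; the optimizer, the loss_fns list, and the dataset structure are assumptions, not my exact setup:

import tensorflow as tf

optimizer = tf.keras.optimizers.Adam()

# Assumed to exist from before: `model`, a `dataset` yielding (inputs, targets)
# with one target per output head, and `loss_fns`, a matching list of Keras
# loss objects.
for inputs, targets in dataset:
    with tf.GradientTape() as tape:
        outputs = model(inputs, training=True)
        # Sum the per-head losses into a single scalar tensor, mirroring the
        # overall loss that evaluate() reports as loss[0].
        loss = tf.add_n([
            fn(y_true, y_pred)
            for fn, y_true, y_pred in zip(loss_fns, targets, outputs)
        ])
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))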