Limit the digits after the decimal point (rounding) without breaking the gradient in TensorFlow

Question

I am training a TensorFlow model and need high-precision outputs. My outputs have the form:

U = X1.Y1Y2Y3Y4Y5Y6
V = X1.Y1Y2Y3Y4Y5Y6

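where X1 is the digit before the decimal point and Y1, ..., Y6 are the digits after it. Obviously, a round operation cannot be used because it breaks the gradient. I came up with the following idea: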

U = tf.cast(tf.cast(U,'float16'),'float32')
W = U+1e-4*V

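This way, different digits are controlled by different TensorFlow variables, and training should be more efficient. I expected to get an output like: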

U= X1.Y1Y2Y3000

with Y4 = Y5 = Y6 = 0. However, the digits Y4, Y5, and Y6 take seemingly random values.

My questions:

Is this behavior caused by the up-conversion from float16 to float32?
Can I modify the behavior of tf.cast?

Python code:

import tensorflow as tf

x = tf.constant(1.222222222222222222222,'float32')
print(x.numpy())
x_ = tf.cast(tf.cast(x,'float16'),'float32')
print(x_.numpy())

Output:

1.2222222
1.2226562

Answer 1

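Converting to a lower bit depth, such as float32 to float16, is effectively rounding in base 2: the low-order bits are replaced with zeros. This is not the same as rounding in base 10; it does not necessarily replace the low-order decimal digits with zeros.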
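To make the base-2 point concrete, here is a small sketch using NumPy (my choice for illustration; the question itself uses tf.cast). It round-trips the value from the question through float16 and checks that the result is an exact multiple of 2**-10, which is the float16 spacing for values in [1, 2).

```python
import numpy as np

# Same value as in the question, round-tripped through float16.
x = np.float32(1.2222222)
x16 = np.float32(np.float16(x))

# For values in [1, 2), float16 keeps 10 fraction bits,
# so representable values are multiples of 2**-10.
step = 2.0 ** -10
steps = float(x16) / step

print(x16)    # the rounded value, as in the question's output
print(steps)  # a whole number: the value sits on the base-2 grid
```

So the "random" decimal digits are just the decimal expansion of a number that is round in binary, not in decimal.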

Assuming that "base-2 rounding" is good enough, TensorFlow's "fake_quant" ops can serve this purpose, for example tf.quantization.fake_quant_with_min_max_args. They simulate the effect of converting to a lower bit depth, yet are differentiable. The Post-training quantization guide may also be helpful.
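As a rough sketch of how a fake_quant op might be used, the snippet below quantizes a few values and checks that a gradient still flows. The range [0, 3], the 8-bit width, and the sample values are assumptions picked for illustration, not something from the question.

```python
import tensorflow as tf

# Sample inputs (assumed values, all inside the quantization range).
x = tf.Variable([0.1234567, 1.2222222, 2.5], dtype=tf.float32)

with tf.GradientTape() as tape:
    # Simulate 8-bit quantization over the assumed range [0, 3].
    y = tf.quantization.fake_quant_with_min_max_args(
        x, min=0.0, max=3.0, num_bits=8)
    loss = tf.reduce_sum(y)

# Inputs inside [min, max] receive a pass-through gradient.
grad = tape.gradient(loss, x)

print(y.numpy())     # values snapped to a grid of width 3/255
print(grad.numpy())  # non-zero, so training can proceed
```

Note that the grid here is uniform in steps of (max - min) / (2**num_bits - 1), so it is still not a decimal grid unless you choose the range and bit width accordingly.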

Another thought: if you need to hack the gradient of something, check out the utilities tf.custom_gradient and tf.stop_gradient.
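One common way to combine these utilities is a straight-through estimator: round in the forward pass, pretend the op is the identity in the backward pass. This is a sketch of that general technique, not something the answer spells out; the function names and the choice of 3 decimal digits (scale = 1000) are hypothetical.

```python
import tensorflow as tf

@tf.custom_gradient
def round_decimals_ste(x):
    # Forward: round to 3 decimal digits (assumed scale of 1000).
    scale = 1000.0
    y = tf.round(x * scale) / scale

    def grad(upstream):
        # Backward: treat rounding as the identity function.
        return upstream

    return y, grad

def round_decimals_sg(x, scale=1000.0):
    # Same idea with tf.stop_gradient: the correction term is
    # hidden from autodiff, so d(output)/dx is 1.
    return x + tf.stop_gradient(tf.round(x * scale) / scale - x)

x = tf.constant(1.2222222, dtype=tf.float32)
with tf.GradientTape(persistent=True) as tape:
    tape.watch(x)
    y1 = round_decimals_ste(x)
    y2 = round_decimals_sg(x)

g1 = tape.gradient(y1, x)
g2 = tape.gradient(y2, x)
print(y1.numpy(), y2.numpy())  # both close to 1.222
print(g1.numpy(), g2.numpy())  # both gradients are 1
```

Keep in mind that float32 cannot represent most decimal fractions exactly, so the result is the nearest float32 to the rounded value, and the identity gradient is an approximation that ignores the rounding step.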
