Purpose of additional parameters in Quantization Nodes of TensorFlow Quantization Aware Training
Currently, I am trying to understand quantization aware training in TensorFlow. I know that fake quantization nodes are required to collect dynamic range information that serves as calibration for the quantize operations. When I compare the same model once as a "plain" Keras model and once as a quantization aware model, the latter has more parameters, which makes sense since we need to store the minimum and maximum values of the activations during quantization aware training.
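To make the role of a fake quantization node concrete, here is a minimal sketch (not part of the original question, and the range [-1, 1] is an assumption) using TensorFlow's built-in fake-quant op: the float values are snapped to an 8-bit grid and mapped back to float, so the forward pass sees the rounding and clipping error while everything stays in floating point.

import tensorflow as tf

# Fake-quantize a tensor over an assumed calibration range of [-1, 1] with 8 bits:
# values are rounded to the nearest of the representable levels, out-of-range
# values are clipped, and the result is immediately dequantized back to float.
x = tf.constant([-1.2, -0.33, 0.0, 0.41, 0.99], dtype=tf.float32)
x_fq = tf.quantization.fake_quant_with_min_max_args(x, min=-1.0, max=1.0, num_bits=8)
print(x_fq.numpy())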
Consider the following example:
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.models import Model


def get_model(in_shape):
    inpt = layers.Input(shape=in_shape)
    dense1 = layers.Dense(256, activation="relu")(inpt)
    dense2 = layers.Dense(128, activation="relu")(dense1)
    out = layers.Dense(10, activation="softmax")(dense2)
    model = Model(inpt, out)
    return model
The model has the following summary:
Model: "model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_2 (InputLayer) [(None, 784)] 0
_________________________________________________________________
dense_3 (Dense) (None, 256) 200960
_________________________________________________________________
dense_4 (Dense) (None, 128) 32896
_________________________________________________________________
dense_5 (Dense) (None, 10) 1290
=================================================================
Total params: 235,146
Trainable params: 235,146
Non-trainable params: 0
_________________________________________________________________
However, if I make my model quantization aware, it prints the following summary:
import tensorflow_model_optimization as tfmot

quantize_model = tfmot.quantization.keras.quantize_model

standard = get_model((784,))

# q_aware stands for quantization aware.
q_aware_model = quantize_model(standard)
q_aware_model.summary()
Model: "model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_2 (InputLayer) [(None, 784)] 0
_________________________________________________________________
quantize_layer (QuantizeLaye (None, 784) 3
_________________________________________________________________
quant_dense_3 (QuantizeWrapp (None, 256) 200965
_________________________________________________________________
quant_dense_4 (QuantizeWrapp (None, 128) 32901
_________________________________________________________________
quant_dense_5 (QuantizeWrapp (None, 10) 1295
=================================================================
Total params: 235,164
Trainable params: 235,146
Non-trainable params: 18
_________________________________________________________________
I have two questions:
- What is the purpose of the quantize_layer after the input layer, and why does it have 3 parameters?
- Why do we get 5 additional non-trainable parameters per layer, and what exactly are they used for?
I appreciate any hints or further material that helps me (and others stumbling upon this question) understand quantization aware training.
The quantize layer converts the float inputs to int8. Its quantization parameters are used for the output min/max and the zero-point calculation.
The quantized dense layers need some additional parameters: min/max for the kernel and min/max/zero-point for the output activations.
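To see where these extra parameters live, a minimal sketch (assuming the q_aware_model built above) that lists each layer's non-trainable variables; together they account for the 18 non-trainable parameters in the summary (3 from the quantize layer plus 3 × 5 from the three wrapped Dense layers).

# Print the quantization bookkeeping variables that tfmot adds per layer
# (a sketch, assuming q_aware_model from the question above): the QuantizeLayer
# after the input contributes 3 scalars, and each QuantizeWrapper around a
# Dense layer contributes 5 more, giving 3 + 3 * 5 = 18 in total.
for layer in q_aware_model.layers:
    for var in layer.non_trainable_weights:
        print(layer.name, var.name, var.shape)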