Keras model summary incorrect

I am using ImageDataGenerator for data augmentation:
data_gen=image.ImageDataGenerator(rotation_range=20,width_shift_range=0.2,height_shift_range=0.2,
                                  zoom_range=0.15,horizontal_flip=False)

iter=data_gen.flow(X_train,Y_train,batch_size=64)

data_gen.flow() requires a rank-4 data array, so X_train has shape (60000, 28, 28, 1). We need to pass the same shape, i.e. (60000, 28, 28, 1), when defining the model architecture, as follows:

model=Sequential()
model.add(Dense(units=64,activation='relu',kernel_initializer='he_normal',input_shape=(28,28,1)))
model.add(Flatten())    
model.add(Dense(units=10,activation='relu',kernel_initializer='he_normal'))
model.summary()

model.add(Flatten()) is used to deal with the rank-2 problem. Now the problem is with model.summary(). It gives incorrect output, as shown below:

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_1 (Dense)              (None, 28, 28, 64)        128       
_________________________________________________________________
flatten_1 (Flatten)          (None, 50176)             0         
_________________________________________________________________
dense_2 (Dense)              (None, 10)                501770    
=================================================================
Total params: 501,898
Trainable params: 501,898
Non-trainable params: 0

For dense_1 (Dense), Output Shape should be (None, 64) and Param # should be (28*28*64)+64, i.e. 50240. For dense_2 (Dense), Output Shape is correct, but Param # should be (64*10)+10, i.e. 650.

Why is this happening, and how can it be fixed?

The summary is not wrong. A Keras Dense layer always operates on the last dimension of its input.

Reference: https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dense

Input shape:

N-D tensor with shape: (batch_size, ..., input_dim). The most common situation would be a 2D input with shape (batch_size, input_dim).

Output shape:

N-D tensor with shape: (batch_size, ..., units). For instance, for a 2D input with shape (batch_size, input_dim), the output would have shape (batch_size, units).

Before a Dense layer, you need to apply Flatten() yourself to make sure you are passing it 2-D data.
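The numbers in the question's summary can be checked by hand. The sketch below is plain Python, not Keras code; `dense_layer` and `flatten_layer` are hypothetical helpers written only for this illustration. They follow the rule quoted above: a Dense layer replaces the last axis with `units` and has `input_dim * units + units` parameters.

```python
from math import prod

def dense_layer(input_shape, units, use_bias=True):
    """Return (output_shape, param_count) for a Dense layer.

    input_shape excludes the batch dimension. Only the last axis is
    treated as input_dim; every other axis is carried through unchanged.
    """
    input_dim = input_shape[-1]
    params = input_dim * units + (units if use_bias else 0)
    return input_shape[:-1] + (units,), params

def flatten_layer(input_shape):
    """Return (output_shape, param_count) for Flatten, which has no weights."""
    return (prod(input_shape),), 0

# The model from the question: Dense(64) -> Flatten -> Dense(10)
shape = (28, 28, 1)
shape, p1 = dense_layer(shape, 64)   # (28, 28, 64), 1*64 + 64 = 128
shape, _ = flatten_layer(shape)      # (50176,)
shape, p2 = dense_layer(shape, 10)   # (10,), 50176*10 + 10 = 501770
print(p1, p2, p1 + p2)               # 128 501770 501898
```

The same helper gives 50240 for `dense_layer((784,), 64)` and 650 for `dense_layer((64,), 10)`. Those are the figures the question expected, and they only appear once the input is flattened before the first Dense layer.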

One workaround that gives the output_shape you want is:

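import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten

model=Sequential()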
model.add(Dense(units=1,activation='linear', use_bias = False, trainable = False, kernel_initializer=tf.keras.initializers.Ones(),input_shape=(28,28,1)))
model.add(Flatten())
model.add(Dense(units=64,activation='relu'))    
model.add(Dense(units=10,activation='relu',kernel_initializer='he_normal'))
model.summary()

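The first layer has only one unit, is initialized with ones, and has no bias, so it just multiplies the input by 1 and passes it on to the next layer to be flattened. This removes the unnecessary parameters from the model.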

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 28, 28, 1)         2         
_________________________________________________________________
flatten (Flatten)            (None, 784)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 64)                50240     
_________________________________________________________________
dense_2 (Dense)              (None, 10)                650       
=================================================================
Total params: 50,892
Trainable params: 50,892
Non-trainable params: 0
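If the goal is simply a 784 -> 64 -> 10 network, a more direct option may be to put Flatten() first and leave out the identity Dense layer. The following is a minimal sketch, not part of the answer above, and assumes TensorFlow 2.x with tf.keras. The layer arguments are copied from the question, including the 'relu' activation on the last layer.

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Flatten, Input

model = Sequential([
    # Same per-sample shape that data_gen.flow() yields: (28, 28, 1)
    Input(shape=(28, 28, 1)),
    # Flatten first, so the Dense layers see a 2-D (batch_size, 784) input
    Flatten(),
    # 784*64 + 64 = 50240 parameters
    Dense(64, activation='relu', kernel_initializer='he_normal'),
    # 64*10 + 10 = 650 parameters ('relu' kept as in the question)
    Dense(10, activation='relu', kernel_initializer='he_normal'),
])
model.summary()
```

With this layout the two Dense layers have the parameter counts expected in the question, 50240 and 650, and the model still accepts the rank-4 batches produced by the generator.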