Output shape of Sequential Network is wrong in Keras
I have a sequential network that takes in vectorized sentences of length 20 words and is meant to classify the sentences by label. Each word has 300 dimensions, so each sentence has shape (20, 300). The dataset currently has 11 samples, so the full x_train has shape (11, 20, 300).
Below is the code for my network:
nnmodel = keras.Sequential()
nnmodel.add(keras.layers.InputLayer(input_shape=(20, 300)))
nnmodel.add(keras.layers.Dense(units=300, activation="relu"))
nnmodel.add(keras.layers.Dense(units=20, activation="relu"))
nnmodel.add(keras.layers.Dense(units=1, activation="sigmoid"))
nnmodel.compile(optimizer='adam',
                loss='SparseCategoricalCrossentropy',
                metrics=['accuracy'])
nnmodel.fit(x_train, y_train, epochs=10, batch_size=1)
for layer in nnmodel.layers:
    print(layer.output_shape)
This gives:
Epoch 1/10
11/11 [==============================] - 0s 1ms/step - loss: 2.9727 - accuracy: 0.0455
Epoch 2/10
11/11 [==============================] - 0s 1ms/step - loss: 2.7716 - accuracy: 0.0682
Epoch 3/10
11/11 [==============================] - 0s 1ms/step - loss: 2.6279 - accuracy: 0.0682
Epoch 4/10
11/11 [==============================] - 0s 1ms/step - loss: 2.4878 - accuracy: 0.0682
Epoch 5/10
11/11 [==============================] - 0s 1ms/step - loss: 2.3145 - accuracy: 0.0545
Epoch 6/10
11/11 [==============================] - 0s 1ms/step - loss: 2.0505 - accuracy: 0.0545
Epoch 7/10
11/11 [==============================] - 0s 1ms/step - loss: 1.7010 - accuracy: 0.0545
Epoch 8/10
11/11 [==============================] - 0s 992us/step - loss: 1.2874 - accuracy: 0.0545
Epoch 9/10
11/11 [==============================] - 0s 891us/step - loss: 0.9628 - accuracy: 0.0545
Epoch 10/10
11/11 [==============================] - 0s 794us/step - loss: 0.7960 - accuracy: 0.0545
(None, 20, 300)
(None, 20, 20)
(None, 20, 1)
Why is my output layer returning (20, 1)? Its shape should be (1), since each of my labels is just a single integer. I'm confused, and not sure how it is even computing the loss if the shape is wrong.
Any help would be appreciated.
Thanks
With the current code, this is the expected output. Applying a plain Dense layer to multi-dimensional input only changes the size of the last dimension. You may have noticed that in CNNs, for the same reason, a Flatten layer is usually added after the convolutional layers. The Flatten layer essentially reshapes the input array to remove the extra dimensions, so that each sample becomes one-dimensional. The updated code should be:
nnmodel = keras.Sequential()
nnmodel.add(keras.layers.InputLayer(input_shape=(20, 300)))
nnmodel.add(keras.layers.Flatten())  # this is the code change
nnmodel.add(keras.layers.Dense(units=300, activation="relu"))
nnmodel.add(keras.layers.Dense(units=20, activation="relu"))
nnmodel.add(keras.layers.Dense(units=1, activation="sigmoid"))
nnmodel.compile(optimizer='adam',
                loss='SparseCategoricalCrossentropy',
                metrics=['accuracy'])
nnmodel.fit(x_train, y_train, epochs=10, batch_size=1)
for layer in nnmodel.layers:
    print(layer.output_shape)
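To see why Dense leaves the word axis intact, here is a minimal NumPy sketch of the shape arithmetic (ignoring bias and activation, and using random values in place of real word vectors): a Dense(units=1) layer is just a matrix multiplication against a (in_features, 1) kernel applied along the last axis.

```python
import numpy as np

x = np.random.rand(20, 300)   # one sentence: 20 words x 300 dims
w = np.random.rand(300, 1)    # kernel of Dense(units=1) without Flatten

# Dense acts on the last axis only, so the word axis survives:
print((x @ w).shape)          # (20, 1)

# With Flatten first, the sample becomes one 6000-dim vector, and the
# Dense kernel has shape (6000, 1), producing a single score per sentence:
w_flat = np.random.rand(20 * 300, 1)
print((x.reshape(-1) @ w_flat).shape)  # (1,)
```

This mirrors the layer shapes printed above: (None, 20, 1) without Flatten versus (None, 1) with it.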