迁移学习神经网络不是在学习

Transfer Learning Neural Network is not learning

首先,我知道这里有一个类似的线程: https://stats.stackexchange.com/questions/352036/what-should-i-do-when-my-neural-network-doesnt-learn

但不幸的是,这并没有帮助。我的代码中可能有一个我找不到的错误。我想做的是对一些 WAV 文件进行分类。但是模型不学习。

起初,我正在收集文件并将它们保存在一个数组中。

其次,创建新目录,一个用于训练数据,一个用于验证数据。 接下来,我正在读取 WAV 文件,创建频谱图,并将它们全部保存到 train 目录中。 之后,我将 20% 的数据从 train 目录移动到 val 目录。

注意:在创建频谱图时,我正在检查 WAV 的长度。如果太短(少于 2 秒),我会加倍。在这个频谱图中,我随机剪切了一个块并只保存了这个。因此,所有图像确实具有相同的高度和宽度。

下一步,我将加载火车和验证图像。而这里我也在做归一化。

IMG_WIDTH=300
IMG_HEIGHT=300
IMG_DIM = (IMG_WIDTH, IMG_HEIGHT, 3)

train_files = glob.glob(DBMEL_PATH + "*",recursive=True)
train_imgs = [img_to_array(load_img(img, target_size=IMG_DIM)) for img in train_files]
train_imgs = np.array(train_imgs) / 255                  # normalizing Data
train_labels = [fn.split('\')[-1].split('.')[1].strip() for fn in train_files]

validation_files = glob.glob(DBMEL_VAL_PATH + "*",recursive=True)
validation_imgs = [img_to_array(load_img(img, target_size=IMG_DIM)) for img in validation_files]
validation_imgs = np.array(validation_imgs) / 255         # normalizing Data
validation_labels = [fn.split('\')[-1].split('.')[1].strip() for fn in validation_files]

我检查了变量并打印了它们。我想这工作得很好。数组分别包含总数据的 80% 和 20%。

#Train dataset shape: (3756, 300, 300, 3) 
#Validation dataset shape: (939, 300, 300, 3)

接下来,我还实现了一个One-Hot-Encoder。 到目前为止,一切都很好。在下一步中,我创建空的 DataGenerators,因此没有任何数据扩充。调用数据生成器时,一次用于训练数据,一次用于验证数据,我将传递图像数组 (train_imgs、validation_imgs) 和一次热编码标签 ( train_labels_enc, validation_labels_enc).

好的。现在到了棘手的部分。 首先,create/load一个预训练网络

from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.models import Model
import tensorflow.keras

input_shape=(IMG_HEIGHT,IMG_WIDTH,3)
restnet = ResNet50(include_top=False, weights='imagenet', input_shape=(IMG_HEIGHT,IMG_WIDTH,3))

output = restnet.layers[-1].output
output = tensorflow.keras.layers.Flatten()(output)

restnet = Model(restnet.input, output)

for layer in restnet.layers:
    layer.trainable = False

现在终于创建了模型本身。在创建模型时,我使用预训练网络进行迁移学习。我想一定是哪里出了问题。

from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout, InputLayer
from tensorflow.keras.models import Sequential
from tensorflow.keras import optimizers


model = Sequential()
model.add(restnet)   #  <-- transfer learning
model.add(Dense(512, activation='relu', input_dim=input_shape))# 512 (num_classes)
model.add(Dropout(0.3))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(7, activation='softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
model.summary()

并且模型 运行 具有此

history = model.fit_generator(train_generator, 
                              steps_per_epoch=100, 
                              epochs=100,
                              validation_data=val_generator, 
                              validation_steps=10, 
                              verbose=1
                             )

但即使在 50 个 epoch 之后,准确度仍停留在 0.15 左右

Epoch 1/100
100/100 [==============================] - 711s 7s/step - loss: 10.6419 - accuracy: 0.1530 - val_loss: 1.9416 - val_accuracy: 0.1467
Epoch 2/100
100/100 [==============================] - 733s 7s/step - loss: 1.9595 - accuracy: 0.1550 - val_loss: 1.9372 - val_accuracy: 0.1267
Epoch 3/100
100/100 [==============================] - 731s 7s/step - loss: 1.9940 - accuracy: 0.1444 - val_loss: 1.9388 - val_accuracy: 0.1400
Epoch 4/100
100/100 [==============================] - 735s 7s/step - loss: 1.9416 - accuracy: 0.1535 - val_loss: 1.9380 - val_accuracy: 0.1733
Epoch 5/100
100/100 [==============================] - 737s 7s/step - loss: 1.9394 - accuracy: 0.1656 - val_loss: 1.9345 - val_accuracy: 0.1533
Epoch 6/100
100/100 [==============================] - 741s 7s/step - loss: 1.9364 - accuracy: 0.1667 - val_loss: 1.9286 - val_accuracy: 0.1767
Epoch 7/100
100/100 [==============================] - 740s 7s/step - loss: 1.9389 - accuracy: 0.1523 - val_loss: 1.9305 - val_accuracy: 0.1400
Epoch 8/100
100/100 [==============================] - 737s 7s/step - loss: 1.9394 - accuracy: 0.1623 - val_loss: 1.9441 - val_accuracy: 0.1667
Epoch 9/100
100/100 [==============================] - 735s 7s/step - loss: 1.9391 - accuracy: 0.1582 - val_loss: 1.9458 - val_accuracy: 0.1333
Epoch 10/100
100/100 [==============================] - 734s 7s/step - loss: 1.9381 - accuracy: 0.1602 - val_loss: 1.9372 - val_accuracy: 0.1700
Epoch 11/100
100/100 [==============================] - 739s 7s/step - loss: 1.9392 - accuracy: 0.1623 - val_loss: 1.9302 - val_accuracy: 0.2167
Epoch 12/100
100/100 [==============================] - 741s 7s/step - loss: 1.9368 - accuracy: 0.1627 - val_loss: 1.9326 - val_accuracy: 0.1467
Epoch 13/100
100/100 [==============================] - 740s 7s/step - loss: 1.9381 - accuracy: 0.1513 - val_loss: 1.9312 - val_accuracy: 0.1733
Epoch 14/100
100/100 [==============================] - 736s 7s/step - loss: 1.9396 - accuracy: 0.1542 - val_loss: 1.9407 - val_accuracy: 0.1367
Epoch 15/100
100/100 [==============================] - 741s 7s/step - loss: 1.9393 - accuracy: 0.1597 - val_loss: 1.9336 - val_accuracy: 0.1333
Epoch 16/100
100/100 [==============================] - 737s 7s/step - loss: 1.9375 - accuracy: 0.1659 - val_loss: 1.9354 - val_accuracy: 0.1267
Epoch 17/100
100/100 [==============================] - 741s 7s/step - loss: 1.9422 - accuracy: 0.1487 - val_loss: 1.9307 - val_accuracy: 0.1567
Epoch 18/100
100/100 [==============================] - 738s 7s/step - loss: 1.9399 - accuracy: 0.1680 - val_loss: 1.9408 - val_accuracy: 0.1567
Epoch 19/100
100/100 [==============================] - 743s 7s/step - loss: 1.9405 - accuracy: 0.1610 - val_loss: 1.9335 - val_accuracy: 0.1533
Epoch 20/100
100/100 [==============================] - 738s 7s/step - loss: 1.9410 - accuracy: 0.1575 - val_loss: 1.9331 - val_accuracy: 0.1533
Epoch 21/100
100/100 [==============================] - 746s 7s/step - loss: 1.9395 - accuracy: 0.1639 - val_loss: 1.9344 - val_accuracy: 0.1733
Epoch 22/100
100/100 [==============================] - 746s 7s/step - loss: 1.9393 - accuracy: 0.1585 - val_loss: 1.9354 - val_accuracy: 0.1667
Epoch 23/100
100/100 [==============================] - 746s 7s/step - loss: 1.9398 - accuracy: 0.1599 - val_loss: 1.9352 - val_accuracy: 0.1500
Epoch 24/100
100/100 [==============================] - 746s 7s/step - loss: 1.9392 - accuracy: 0.1585 - val_loss: 1.9449 - val_accuracy: 0.1667
Epoch 25/100
100/100 [==============================] - 746s 7s/step - loss: 1.9399 - accuracy: 0.1495 - val_loss: 1.9352 - val_accuracy: 0.1600

谁能帮忙找出问题所在?

我自己解决了这个问题。

我换了这个

model = Sequential()
model.add(restnet)   #  <-- transfer learning
model.add(Dense(512, activation='relu', input_dim=input_shape))# 512 (num_classes)
model.add(Dropout(0.3))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(7, activation='softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
model.summary()

有了这个:

base_model = tf.keras.applications.MobileNetV2(input_shape = (224, 224, 3), include_top = False, weights = "imagenet")

model = Sequential()
model.add(base_model)
model.add(tf.keras.layers.GlobalAveragePooling2D())
model.add(Dropout(0.2))
model.add(Dense(number_classes, activation="softmax"))
model.compile(optimizer=tf.keras.optimizers.Adam(lr=0.00001),
              loss="categorical_crossentropy",
              metrics=['accuracy'])
model.summary()

而且我还发现了一件事。与某些教程相反,在使用频谱图时使用数据增强没有用。 在没有数据增强的情况下,我得到了 0.99 的训练精度和 0.72 的验证精度。但是通过数据增强,我在训练精度上只有 0.75,在验证精度上只有 0.16。