Convolutional Autoencoder Not Training on (62,47,1) dataset, "Expected Shape Error"

Question

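I am trying to implement a convolutional autoencoder on faces from the Labeled Faces in the Wild (LFW) dataset, which consists of images of shape 62x47x3.

However, the Keras convolutional autoencoder example written for the MNIST dataset does not work on the new dataset I am training on.

It throws this error: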

Error when checking target: expected conv2d_102 to have shape (60, 44, 3) but got array with shape (62, 47, 3)

The error is about a layer receiving input of the wrong shape, even though I included padding='same', which should make the input and output shapes equal.

I also tried feeding only grayscale images into the network, but that made no difference.

Here is the main code I am using:


import numpy as np
from keras.models import Model
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D

from sklearn.datasets import fetch_lfw_people
from sklearn.model_selection import train_test_split

#importing the dataset in color cause that's dope
lfw_data = fetch_lfw_people(color=True)

#putting the data of images into a variable
x = lfw_data.images

#making a train and validation set
(x_train,x_test) = train_test_split(x, test_size=0.25)

#normalizing the pixel values
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.

print(x_train.shape)

x_train = x_train.reshape(len(x_train), 62,47,3)
x_test = x_test.reshape(len(x_test), 62,47,3)

input_img = Input(shape=(62, 47, 3))  # adapt this if using `channels_first` image data format

x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)

# at this point the representation is (8, 6, 8), i.e. 384-dimensional

x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)

autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')

autoencoder.summary()

The model summary output is:

Model: "model_14"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_18 (InputLayer)        (None, 62, 47, 3)         0         
_________________________________________________________________
conv2d_103 (Conv2D)          (None, 62, 47, 16)        448       
_________________________________________________________________
max_pooling2d_45 (MaxPooling (None, 31, 24, 16)        0         
_________________________________________________________________
conv2d_104 (Conv2D)          (None, 31, 24, 8)         1160      
_________________________________________________________________
max_pooling2d_46 (MaxPooling (None, 16, 12, 8)         0         
_________________________________________________________________
conv2d_105 (Conv2D)          (None, 16, 12, 8)         584       
_________________________________________________________________
max_pooling2d_47 (MaxPooling (None, 8, 6, 8)           0         
_________________________________________________________________
conv2d_106 (Conv2D)          (None, 8, 6, 8)           584       
_________________________________________________________________
up_sampling2d_42 (UpSampling (None, 16, 12, 8)         0         
_________________________________________________________________
conv2d_107 (Conv2D)          (None, 16, 12, 8)         584       
_________________________________________________________________
up_sampling2d_43 (UpSampling (None, 32, 24, 8)         0         
_________________________________________________________________
conv2d_108 (Conv2D)          (None, 30, 22, 16)        1168      
_________________________________________________________________
up_sampling2d_44 (UpSampling (None, 60, 44, 16)        0         
_________________________________________________________________
conv2d_109 (Conv2D)          (None, 60, 44, 1)         145       
=================================================================
Total params: 4,673
Trainable params: 4,673
Non-trainable params: 0
____________________________

When I try to train:

#train for 100 epochs
history = autoencoder.fit(x_train, x_train,epochs=100,batch_size=256, shuffle=True, validation_data=(x_test, x_test))

I get this error message:

Error when checking target: expected conv2d_102 to have shape (60, 44, 3) but got array with shape (62, 47, 3)

Any help or explanation of why it throws this error would be great!

Answer 1

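This happens because the pooling and the padding do not match up. Your data has shape (62,47), but your model outputs (60,44). You need to adjust either the model or the data so that the two agree.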
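To see where (60,44) comes from, the spatial size can be followed layer by layer. The snippet below is only a rough sketch in plain Python, not Keras code: the function name `trace_shapes` and the layer tags are made up for this illustration, and it assumes 3x3 convolutions, 2x2 pooling with padding='same' (which rounds up), and 2x2 upsampling, as in the question.

```python
import math


def trace_shapes(height, width, layers):
    """Return the list of (height, width) after each layer.

    Layer tags (hypothetical names used only in this sketch):
      'conv_same'  - 3x3 convolution, padding='same'  (size unchanged)
      'conv_valid' - 3x3 convolution, padding='valid' (size shrinks by 2)
      'pool'       - 2x2 max pooling, padding='same'  (ceil(size / 2))
      'up'         - 2x2 upsampling                   (size doubles)
    """
    shapes = [(height, width)]
    for layer in layers:
        if layer == 'conv_valid':
            height, width = height - 2, width - 2
        elif layer == 'pool':
            height, width = math.ceil(height / 2), math.ceil(width / 2)
        elif layer == 'up':
            height, width = height * 2, width * 2
        elif layer != 'conv_same':
            raise ValueError(f"unknown layer tag: {layer}")
        shapes.append((height, width))
    return shapes


# Layer sequence from the question. The third decoder convolution has no
# padding argument, so it uses the Keras default, 'valid'.
QUESTION_LAYERS = [
    'conv_same', 'pool', 'conv_same', 'pool', 'conv_same', 'pool',
    'conv_same', 'up', 'conv_same', 'up', 'conv_valid', 'up', 'conv_same',
]

if __name__ == "__main__":
    for shape in trace_shapes(62, 47, QUESTION_LAYERS):
        print(shape)
```

The trace ends at (60, 44), the same size the model summary reports. If 'conv_valid' is swapped for 'conv_same', the trace ends at (64, 48) instead, which still differs from (62, 47), so the padding argument alone does not resolve the mismatch.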

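Each pooling halves the size, and you have 3 poolings, so the poolings and upsamplings only round-trip exactly when the image height and width are multiples of 2^3 = 8. Since 64 and 48 are very close to your image size, the easiest solution seems to be adding padding to the images.

So, make your data have size (64,48). This allows up to 4 poolings without needing any custom padding in the model.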
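The amount of padding can be worked out rather than guessed. This is a small sketch with a hypothetical helper, `pad_to_multiple`, that is not part of NumPy or Keras; it splits the required padding as evenly as possible and puts any odd leftover pixel at the end.

```python
def pad_to_multiple(size, multiple):
    """Return (before, after) padding that raises `size` to the next multiple."""
    total = (-size) % multiple   # pixels missing to reach the next multiple
    before = total // 2          # the smaller half goes in front
    return before, total - before


if __name__ == "__main__":
    print(pad_to_multiple(62, 8))  # padding for the height
    print(pad_to_multiple(47, 8))  # padding for the width
```

For a height of 62 this gives (1, 1) and for a width of 47 it gives (0, 1), which produces a 64x48 image.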

x_train = np.pad(x_train, ((0,0), (1,1), (0,1), (0,0)), mode='constant')
x_test = np.pad(x_test, ((0,0), (1,1), (0,1), (0,0)), mode='constant')
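As a quick check of what those two calls do, here is a sketch on random stand-in data (the `batch` array is an assumption for the demo, not the real LFW images). It also shows one possible way to crop a reconstruction back to 62x47 afterwards; the helper `crop_to_original` is hypothetical and simply reverses the padding widths used above.

```python
import numpy as np

# Stand-in for the real data: 5 random RGB images of size 62x47.
batch = np.random.rand(5, 62, 47, 3).astype('float32')

# Same pad widths as in the answer: (batch, height, width, channels).
PAD_WIDTHS = ((0, 0), (1, 1), (0, 1), (0, 0))
padded = np.pad(batch, PAD_WIDTHS, mode='constant')
print(padded.shape)


def crop_to_original(images):
    """Drop the 1 padded row at top and bottom and the 1 padded column on the right."""
    return images[:, 1:-1, :-1, :]


restored = crop_to_original(padded)
print(restored.shape)
```

Cropping like this is only needed if you want to compare or display reconstructions at the original resolution; training itself can run entirely on the padded 64x48 images.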

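Don't forget to set padding='same' on all layers. One convolution is missing it: the Conv2D(16, (3, 3), activation='relu') in the decoder, which is why the summary shows the size dropping from (32, 24) to (30, 22).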
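Putting the pieces together, a corrected model might look like the sketch below. It assumes the data has already been padded to (64, 48, 3) and wraps the question's layers in a hypothetical `build_autoencoder` function. One change goes beyond what this answer states and is my own assumption: the final convolution uses 3 filters instead of 1, because the RGB targets have 3 channels and a 1-channel output would otherwise still fail the shape check. The imports use `tensorflow.keras`; with a standalone Keras install the same names are available under `keras`.

```python
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from tensorflow.keras.models import Model


def build_autoencoder(input_shape=(64, 48, 3)):
    """Build the question's autoencoder with padding='same' on every layer."""
    input_img = Input(shape=input_shape)

    # Encoder: three 2x2 poolings, so height and width should be multiples of 8.
    x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
    x = MaxPooling2D((2, 2), padding='same')(x)
    x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
    x = MaxPooling2D((2, 2), padding='same')(x)
    x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
    encoded = MaxPooling2D((2, 2), padding='same')(x)  # (8, 6, 8) for 64x48 input

    # Decoder: mirror of the encoder.
    x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
    x = UpSampling2D((2, 2))(x)
    x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
    x = UpSampling2D((2, 2))(x)
    # This layer had no padding argument in the question.
    x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
    x = UpSampling2D((2, 2))(x)
    # Assumption: one output filter per input channel (3 for RGB images).
    decoded = Conv2D(input_shape[-1], (3, 3), activation='sigmoid', padding='same')(x)

    autoencoder = Model(input_img, decoded)
    autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
    return autoencoder


if __name__ == "__main__":
    build_autoencoder().summary()
```

With this version the last line of the summary should report an output of (None, 64, 48, 3), matching the padded training arrays.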

Some of the other padding modes that np.pad supports may perform better than constant zeros. (For example, I would try mode='edge'.)
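The difference between the two modes is easiest to see on a tiny array. This is just an illustration on a made-up 2x2 "image", padded by one row at top and bottom and one column on the right, the same pattern used for the face images.

```python
import numpy as np

image = np.array([[1, 2],
                  [3, 4]])
pad_widths = ((1, 1), (0, 1))  # (top, bottom), (left, right)

zero_padded = np.pad(image, pad_widths, mode='constant')
edge_padded = np.pad(image, pad_widths, mode='edge')

print(zero_padded)
# [[0 0 0]
#  [1 2 0]
#  [3 4 0]
#  [0 0 0]]

print(edge_padded)
# [[1 2 2]
#  [1 2 2]
#  [3 4 4]
#  [3 4 4]]
```

With mode='edge' the border pixels are repeated, so the autoencoder does not have to learn a hard black frame around each face. Whether that actually helps reconstruction quality would need to be checked on the real data.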
