keras-tensorflow CAE dimension mismatch
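I'm basically following this guide to build a convolutional autoencoder with the TensorFlow backend. The main difference from the guide is that my data consists of 257x257 grayscale images. The following code: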
import os
import sys

import numpy as np
from scipy import misc  # misc.imread was removed in SciPy 1.2; an older SciPy is assumed
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model

TRAIN_FOLDER = 'data/OIRDS_gray/'
EPOCHS = 10
SHAPE = (257, 257, 1)
FILELIST = os.listdir(TRAIN_FOLDER)

def loadTrainData():
    train_data = []
    for fn in FILELIST:
        img = misc.imread(TRAIN_FOLDER + fn)
        # Add the trailing channel axis: (H, W) -> (H, W, 1)
        img = np.reshape(img, (img.shape[0], img.shape[1], SHAPE[2]))
        if img.shape != SHAPE:
            print("image shape mismatch!")
            print("Expected: ")
            print(SHAPE)
            print("but got:")
            print(img.shape)
            sys.exit()
        train_data.append(img)
    # Stack into one array and scale pixel values to [0, 1]
    train_data = np.array(train_data)
    train_data = train_data.astype('float32') / 255
    return train_data

def createModel():
    input_img = Input(shape=SHAPE)
    # Encoder: three conv + pool stages
    x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
    x = MaxPooling2D((2, 2), padding='same')(x)
    x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
    x = MaxPooling2D((2, 2), padding='same')(x)
    x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
    encoded = MaxPooling2D((2, 2), padding='same')(x)
    # Decoder: three conv + upsample stages
    x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
    x = UpSampling2D((2, 2))(x)
    x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
    x = UpSampling2D((2, 2))(x)
    x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
    x = UpSampling2D((2, 2))(x)
    decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
    return Model(input_img, decoded)

x_train = loadTrainData()
autoencoder = createModel()
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
print(x_train.shape)
autoencoder.summary()

# Run the network
autoencoder.fit(x_train, x_train,
                epochs=EPOCHS,
                batch_size=128,
                shuffle=True)
gives me an error:
ValueError: Error when checking target: expected conv2d_7 to have shape (None, 260, 260, 1) but got array with shape (859, 257, 257, 1)
As you can see, this is not the standard Theano/TensorFlow backend dim-ordering problem, but something else. I checked that my data has the expected shape by printing x_train.shape:
(859, 257, 257, 1)
I also ran autoencoder.summary():
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, 257, 257, 1)       0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 257, 257, 16)      160
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 129, 129, 16)      0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 129, 129, 8)       1160
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 65, 65, 8)         0
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 65, 65, 8)         584
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 33, 33, 8)         0
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 33, 33, 8)         584
_________________________________________________________________
up_sampling2d_1 (UpSampling2 (None, 66, 66, 8)         0
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 66, 66, 8)         584
_________________________________________________________________
up_sampling2d_2 (UpSampling2 (None, 132, 132, 8)       0
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 132, 132, 16)      1168
_________________________________________________________________
up_sampling2d_3 (UpSampling2 (None, 264, 264, 16)      0
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 264, 264, 1)       145
=================================================================
Total params: 4,385
Trainable params: 4,385
Non-trainable params: 0
_________________________________________________________________
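Now, I'm not sure where the problem is, but it looks like something goes wrong around conv2d_6 (the Param # is too high). I know how CAEs work in principle, but I'm not yet familiar with the exact technical details. I have mainly tried to fix this by messing with the deconvolution padding (using valid instead of same). The closest I got to matching dims was (None, 258, 258, 1). I got there by blindly trying different padding combinations on the deconvolution side, which is not a smart way to solve the problem...
At this point I'm at a loss, and any help would be appreciated.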
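Since your input and output data are the same, your final output shape should be the same as the input shape.
The last convolutional layer should have shape (None, 257, 257, 1).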
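The problem happens because your image size is an odd number (257).
When you apply MaxPooling, it has to divide the size by two, so it must round either up or down. Here it rounds up: see the 129, which comes from 257/2 = 128.5.
Later, when you apply UpSampling, the model doesn't know the current dimensions were rounded; it simply doubles the value. This happens at each of the three stages, adding 7 pixels to the final result (264 instead of 257).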
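To make the rounding visible, here is a minimal sketch in plain Python (no Keras needed) that traces one spatial dimension through the network. It assumes that pooling with padding='same' yields ceil(size / 2), which matches the summary above; `trace_sizes` is just a name picked for this illustration.

```python
import math

def trace_sizes(size, n_levels):
    """Return the spatial size after each pooling and upsampling step."""
    sizes = [size]
    # MaxPooling2D((2, 2), padding='same') rounds up: ceil(size / 2)
    for _ in range(n_levels):
        size = math.ceil(size / 2)
        sizes.append(size)
    # UpSampling2D((2, 2)) simply doubles the size
    for _ in range(n_levels):
        size *= 2
        sizes.append(size)
    return sizes

sizes = trace_sizes(257, 3)
print(sizes)                 # [257, 129, 65, 33, 66, 132, 264]
print(sizes[-1] - sizes[0])  # 7
```

The three round-ups (257 to 129, 129 to 65, 65 to 33) are what the decoder cannot undo by doubling.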
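You can try either cropping the result or padding the input.
I usually work with images of compatible sizes. If you have 3 MaxPooling layers, your size should be a multiple of 2³ = 8. For 257, the next such size is 264.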
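If it helps, the target size and the padding split can be computed rather than worked out by hand. This is only a sketch; `pad_amounts` is a hypothetical helper, and putting the extra pixel on the trailing side is an arbitrary choice.

```python
def pad_amounts(size, n_pools):
    """Return (target, before, after) with size + before + after == target."""
    factor = 2 ** n_pools
    # Round size up to the next multiple of factor (ceiling division)
    target = -(-size // factor) * factor
    total = target - size
    before = total // 2
    after = total - before
    return target, before, after

print(pad_amounts(257, 3))  # (264, 3, 4)
```

For 257 and three pooling layers this gives the (3, 4) split used in the snippets in this answer.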
Padding the input data directly:
x_train = np.pad(x_train, ((0, 0), (3, 4), (3, 4), (0, 0)), mode='constant')
This will require SHAPE = (264, 264, 1).
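Padding inside the model:
import keras.backend as K
from keras.layers import Input, Lambda
input_img = Input(shape=SHAPE)
# Zero-pad height and width by (3, 4) each: 257 -> 264
x = Lambda(lambda x: K.spatial_2d_padding(x, padding=((3, 4), (3, 4))), output_shape=(264, 264, 1))(input_img)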
Cropping the result:
This is needed in any case where you don't change the actual data (the NumPy arrays) directly.
decoded = Lambda(lambda x: x[:,3:-4,3:-4,:], output_shape=SHAPE)(x)
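As a quick sanity check of the (3, 4) padding and the matching 3:-4 crop, here is a small NumPy-only sketch. The random `batch` array is a stand-in for x_train, not real data.

```python
import numpy as np

# Dummy batch standing in for x_train: 2 images of 257x257x1
batch = np.random.rand(2, 257, 257, 1).astype('float32')

# Pad height and width by 3 pixels before and 4 after: 257 -> 264
padded = np.pad(batch, ((0, 0), (3, 4), (3, 4), (0, 0)), mode='constant')

# The same slice used in the Lambda crop: 264 -> 257
cropped = padded[:, 3:-4, 3:-4, :]

print(padded.shape)                    # (2, 264, 264, 1)
print(cropped.shape)                   # (2, 257, 257, 1)
print(np.array_equal(cropped, batch))  # True
```

The crop undoes the padding exactly, so the model output lines up pixel for pixel with the original 257x257 targets.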