How do I fix a dimension error in a simple autoencoder?

I am new to Python and to autoencoders. I am just trying to build a simple autoencoder as a starting point, but I keep getting this error:

ValueError: Error when checking target: expected conv2d_39 to have 4 dimensions, but got array with shape (32, 3)

Is there a better way to load my own data than the flow_from_directory method? I built the autoencoder like this one, but I removed some layers.

I'm not sure, but am I feeding the autoencoder the tuple produced by flow_from_directory? Is there a way to convert that tuple into a format the autoencoder accepts?

import numpy as np
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential, Model
from keras.layers import (Dropout, Flatten, Dense, Input, Conv2D,
                          UpSampling2D, MaxPooling2D)
from keras.optimizers import RMSprop

IMG_WIDTH, IMG_HEIGHT = 112, 112
input_img = Input(shape=(IMG_WIDTH, IMG_HEIGHT,3))

#encoder
def encoder(input_img):
    # 1x112x112x3
    conv1 = Conv2D(32,(3,3), activation='relu', padding='same')(input_img)
    # 32x112x112
    pool1 = MaxPooling2D(pool_size=(2,2))(conv1)
    # 32x56x56
    return pool1

#decoder
def decoder(pool1):
    # 32x56x56
    up1 = UpSampling2D((2,2))(pool1)
    # 32x112x112
    decoded = Conv2D(1,(3,3),activation='sigmoid',padding='same')(up1)
    # 1x112x112
    return decoded

autoencoder = Model(input_img, decoder(encoder(input_img)))
autoencoder.compile(loss='mean_squared_error', optimizer=RMSprop())

datagen = ImageDataGenerator(rescale=1./255)

training_set = datagen.flow_from_directory(
    r'C:\Users\user\Desktop\dataset\train',
    target_size=(112,112),
    batch_size=32,
    class_mode='categorical')

test_set = datagen.flow_from_directory(
    r'C:\Users\user\Desktop\dataset\validation',
    target_size=(112,112),
    batch_size=32,
    class_mode='categorical')

history = autoencoder.fit_generator(
    training_set,
    steps_per_epoch=2790,
    epochs=5,
    validation_data=test_set,
    validation_steps=1145)

Here is the model summary:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_14 (InputLayer)        (None, 112, 112, 3)       0         
_________________________________________________________________
conv2d_42 (Conv2D)           (None, 112, 112, 32)      896       
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 56, 56, 32)        0         
_________________________________________________________________
up_sampling2d_4 (UpSampling2 (None, 112, 112, 32)      0         
_________________________________________________________________
conv2d_43 (Conv2D)           (None, 112, 112, 1)       289       
=================================================================
Total params: 1,185
Trainable params: 1,185
Non-trainable params: 0
_________________________________________________________________

I am working with 512x496 OCT images.

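I believe you are feeding the network labels rather than images as its targets. Try explicitly setting class_mode to None when you build the data generator; it defaults to 'categorical'.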
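The shape (32, 3) in the error message is consistent with a batch of 32 one-hot labels over three class folders, which is what class_mode='categorical' would yield as the target. The sketch below imitates that behaviour with NumPy only. `simulate_batch` is a hypothetical stand-in written for illustration, not Keras code, and the three-class assumption is inferred from the error message. It follows the documented behaviour of three `class_mode` values: 'categorical' returns one-hot labels, 'input' returns the images themselves as targets, and None returns images with no targets at all.

```python
import numpy as np


def simulate_batch(images, class_indices, num_classes, class_mode):
    """Hypothetical stand-in for one generator batch (not Keras code)."""
    if class_mode is None:
        # No targets: only the image batch is produced.
        return images
    if class_mode == 'input':
        # Targets are a copy of the inputs, which is what an autoencoder needs.
        return images, images.copy()
    if class_mode == 'categorical':
        # Targets are one-hot class labels of shape (batch, num_classes).
        one_hot = np.zeros((len(class_indices), num_classes), dtype='float32')
        one_hot[np.arange(len(class_indices)), class_indices] = 1.0
        return images, one_hot
    raise ValueError('class_mode not covered by this sketch: %r' % (class_mode,))


rng = np.random.default_rng(0)
images = rng.random((32, 112, 112, 3), dtype=np.float32)  # one batch of RGB images
class_indices = rng.integers(0, 3, size=32)               # assumed: three class folders

x_cat, y_cat = simulate_batch(images, class_indices, 3, 'categorical')
x_in, y_in = simulate_batch(images, class_indices, 3, 'input')
x_only = simulate_batch(images, class_indices, 3, None)

print(y_cat.shape)  # target shape with class labels
print(y_in.shape)   # target shape when the input is reused as the target
```

Because None produces no targets, it is mainly useful for prediction; training an autoencoder from a generator needs (input, target) pairs.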

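Since you are building an autoencoder, the model's output must have the same shape as its input and is compared against the input itself. Your code therefore has two problems:

  1. You must set the generator's class_mode argument to 'input' so that the generated targets are identical to the generated inputs.

  2. The last layer must have 3 filters, because the input images have 3 channels: decoded = Conv2D(3, ...).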
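Putting both changes together, a minimal sketch could look like the following. It assumes the Keras bundled with TensorFlow 2.x (`tensorflow.keras`) rather than the standalone `keras` package used in the question. `build_autoencoder` and `make_generator` are illustrative names, and the random batch only stands in for real images so that the shapes can be checked without a dataset on disk.

```python
import numpy as np
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from tensorflow.keras.models import Model

IMG_WIDTH, IMG_HEIGHT = 112, 112


def build_autoencoder():
    """Same layers as in the question, but with a 3-channel output."""
    input_img = Input(shape=(IMG_WIDTH, IMG_HEIGHT, 3))
    x = Conv2D(32, (3, 3), activation='relu', padding='same')(input_img)  # (None, 112, 112, 32)
    x = MaxPooling2D(pool_size=(2, 2))(x)                                 # (None, 56, 56, 32)
    x = UpSampling2D((2, 2))(x)                                           # (None, 112, 112, 32)
    # Change 2: three filters so the output matches the RGB input.
    decoded = Conv2D(3, (3, 3), activation='sigmoid', padding='same')(x)  # (None, 112, 112, 3)
    model = Model(input_img, decoded)
    model.compile(loss='mean_squared_error', optimizer='rmsprop')
    return model


def make_generator(directory, batch_size=32):
    """Change 1: class_mode='input' makes each batch an (images, images) pair."""
    # Imported here so the shape check below does not depend on this module.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator
    datagen = ImageDataGenerator(rescale=1. / 255)
    return datagen.flow_from_directory(
        directory,
        target_size=(IMG_HEIGHT, IMG_WIDTH),
        batch_size=batch_size,
        class_mode='input')


autoencoder = build_autoencoder()

# Stand-in batch: train for one step with the input as its own target.
batch = np.random.rand(4, IMG_WIDTH, IMG_HEIGHT, 3).astype('float32')
autoencoder.fit(batch, batch, epochs=1, verbose=0)
reconstruction = autoencoder.predict(batch, verbose=0)
print(reconstruction.shape)
```

With real data, the object returned by `make_generator` would take the place of `training_set` in the question's training call.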