Several Image Inputs to the Same ResNet Resulting in Unmatched Inputs

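I am trying to build a network in which a ResNet performs feature detection on each of three input images separately. After feature detection, the three parallel branches are combined through dense layers. Feeding the model some inputs throws an error.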

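# Assumed imports (not shown in the original post)
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.layers import (Concatenate, Dense, Dropout, Flatten,
                                     GlobalAveragePooling2D, Input)
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import SGD

# basis model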

in1 = Input(shape=(224, 224, 3), name='base_image')
in2 = Input(shape=(224, 224, 3), name='image1')
in3 = Input(shape=(224, 224, 3), name='image2')

ResNet = ResNet50(
    include_top=False,
    weights="imagenet",
    input_shape=(224, 224, 3)
)
ResNet.trainable = False

out1 = ResNet(in1)
out2 = ResNet(in2)
out3 = ResNet(in3)

basis1 = GlobalAveragePooling2D()(out1)
basis1 = Dropout(0.7)(basis1)
basis1 = Flatten()(basis1)

basis2 = GlobalAveragePooling2D()(out2)
basis2 = Dropout(0.7)(basis2)
basis2 = Flatten()(basis2)

basis3 = GlobalAveragePooling2D()(out3)
basis3 = Dropout(0.7)(basis3)
basis3 = Flatten()(basis3)


#own model
concat = Concatenate()([basis1, basis2, basis3])
dense_1 = Dense(2048, activation='relu')(concat)
dense_2 = Dense(1024, activation='relu')(dense_1)
output = Dense(1, activation='softmax')(dense_2)

my_model = Model(inputs = [in1, in2, in3], outputs=output)

The model looks like this:

The images array (definitely) holds images of shape (224, 224, 3):

testX = [
    [images[0], images[1], images[2]],
    [images[3], images[4], images[5]],
    [images[6], images[7], images[8]]
]

testY = [
    [1.0],
    [0.0],
    [1.0]
]


my_model.compile(optimizer=SGD(learning_rate=0.001, momentum=0.9, nesterov=True), loss='binary_crossentropy', metrics=['binary_accuracy'])
my_model.fit(testX, y=testY, epochs = 5,  verbose=2)

This causes fit() to fail with the following error:

ValueError: Data cardinality is ambiguous:
  x sizes: 224, 224, 224, 224, 224, 224, 224, 224, 224
  y sizes: 1, 1, 1
Make sure all arrays contain the same number of samples.

It seems as if the first sub-array is ignored by this approach? I have been stuck on this for a long time.

I think you need

ResNet1 = ResNet50(include_top=False, weights="imagenet", input_shape=(224, 224, 3))
ResNet2 = ResNet50(include_top=False, weights="imagenet", input_shape=(224, 224, 3))
ResNet3 = ResNet50(include_top=False, weights="imagenet", input_shape=(224, 224, 3))
out1 = ResNet1(in1)
out2 = ResNet2(in2)
out3 = ResNet3(in3)
basis1 = GlobalAveragePooling2D()(out1)  # this already produces a vector, so no Flatten layer is needed
basis1 = Dropout(0.7)(basis1)
basis2 = GlobalAveragePooling2D()(out2)  # this already produces a vector, so no Flatten layer is needed
basis2 = Dropout(0.7)(basis2)
basis3 = GlobalAveragePooling2D()(out3)  # this already produces a vector, so no Flatten layer is needed
basis3 = Dropout(0.7)(basis3)
concat = Concatenate()([basis1, basis2, basis3])
dense_1 = Dense(2048, activation='relu')(concat)  # I would reduce the nodes to 256
# I would add a dropout layer here: Dropout(.3)
dense_2 = Dense(1024, activation='relu')(dense_1)  # I would reduce the nodes to 32
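# softmax over a single unit always outputs 1.0; sigmoid is the usual pairing with binary_crossentropy
output = Dense(1, activation='sigmoid')(dense_2)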

If you look at the model plot, all inputs go into a single ResNet model. Also, since you are using binary_crossentropy, I think each label must be just a 1 or a 0.
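Related to the loss and label point: the output activation may also be worth a look. The NumPy sketch below is only an illustration with made-up logit values (the helper functions are written here, not taken from Keras). It suggests why a single-unit softmax, as in Dense(1, activation='softmax') from the question, cannot produce anything other than 1.0, whereas a sigmoid gives a value between 0 and 1 per sample.

```python
import numpy as np


def softmax(z):
    # Numerically stable softmax over the last axis.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


def sigmoid(z):
    # Element-wise logistic function.
    return 1.0 / (1.0 + np.exp(-z))


# Hypothetical pre-activation values of one output unit for three samples.
logits = np.array([[-2.0], [0.0], [3.0]])  # shape (3, 1)

softmax_out = softmax(logits)  # each row is normalized over a single element
sigmoid_out = sigmoid(logits)  # each value is squashed independently

print(softmax_out.ravel())
print(sigmoid_out.ravel())
```

If that reasoning applies to your setup, swapping the last activation to 'sigmoid' would be the usual choice with binary_crossentropy and 0/1 labels.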

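In my experience, when using model.fit(), a single input works better than a list of inputs. You can then index the input tensor manually to get the individual images. In your case, the input shape would be (Batch Size, 3, 224, 224, 3).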

inputs = Input(shape=(3, 224, 224, 3), name='images')

ResNet = ResNet50(
    include_top=False,
    weights="imagenet",
    input_shape=(224, 224, 3)
)
ResNet.trainable = False

out1 = ResNet(inputs[:, 0])
out2 = ResNet(inputs[:, 1])
out3 = ResNet(inputs[:, 2])

...

my_model = Model(inputs=inputs, outputs=output)
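To make the indexing above concrete, here is a small NumPy-only sketch of what slicing along axis 1 returns. The batch size of 4 and the marker values are arbitrary assumptions for illustration; the same slicing pattern is what inputs[:, 0], inputs[:, 1] and inputs[:, 2] express on the Keras input tensor.

```python
import numpy as np

# Dummy batch: 4 samples, each holding 3 images of shape (224, 224, 3).
batch = np.zeros((4, 3, 224, 224, 3), dtype=np.float32)

# Fill each image slot with its own index so the slices can be told apart.
for slot in range(3):
    batch[:, slot] = slot

# Slicing axis 1 drops that axis and leaves one image per sample.
base_image = batch[:, 0]
image1 = batch[:, 1]
image2 = batch[:, 2]

print(base_image.shape, image1.shape, image2.shape)
```

Each slice has shape (4, 224, 224, 3), which is the shape a ResNet50 built with input_shape=(224, 224, 3) expects for a batch.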

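Additionally, it is better to build the input and output arrays with numpy rather than leaving them as Python lists, which gives you better control over the pipeline: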

testX = np.stack([
    np.stack([images[0], images[1], images[2]], axis=0),
    np.stack([images[3], images[4], images[5]], axis=0),
    np.stack([images[6], images[7], images[8]], axis=0)
], axis=0) # Shape: (3, 3, 224, 224, 3)

testY = np.stack([1.0, 0.0, 1.0], axis=0)[:, None] # Shape: (3, 1)
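As a hedged sketch of how the same data could be arranged for either model variant: the code below uses placeholder images (constant arrays, an assumption standing in for the real ones) and shows that flattening the nested list from the question gives nine arrays whose first axis is 224, which appears to be what the "x sizes" line of the error reports. It then builds both the single stacked array and a list with one array per model input.

```python
import numpy as np

# Stand-ins for the nine images of the question, assumed shape (224, 224, 3).
images = [np.full((224, 224, 3), i, dtype=np.float32) for i in range(9)]

# Nested list as in the question: 3 samples, each a list of 3 images.
nested = [images[0:3], images[3:6], images[6:9]]

# Treating every image as a separate array yields nine first-axis sizes of 224.
flat_sizes = [img.shape[0] for sample in nested for img in sample]

# Single-input layout: shape (num_samples, 3, 224, 224, 3).
stacked = np.stack([np.stack(sample, axis=0) for sample in nested], axis=0)

# Three-input layout: a list of 3 arrays, each (num_samples, 224, 224, 3).
per_input = [stacked[:, k] for k in range(stacked.shape[1])]

# Labels with one row per sample: shape (3, 1).
labels = np.array([1.0, 0.0, 1.0], dtype=np.float32)[:, None]

print(flat_sizes)
print(stacked.shape, [a.shape for a in per_input], labels.shape)
```

Under these assumptions, the three-input model from the question would receive per_input (for example my_model.fit(per_input, y=labels, ...)), while the single-input variant would receive stacked. In both layouts the first axis of every array equals the number of samples, which is what the cardinality check compares.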