How to connect an RNN to the end of a CNN to train on video frames?

Question

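I am trying to do video classification by treating it as image classification, using the individual frames as the input. But I don't know how to code it. I am using Inception ResNet as my CNN, but I don't know any RNNs or how to use them.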

Answer 1

Hi ML_machine, here is what I wanted to show you: an implementation of a CNN that classifies the MNIST data. It is not mine; it comes from here

import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K

batch_size = 128
num_classes = 10
epochs = 12

# input image dimensions
img_rows, img_cols = 28, 28

# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

To turn this CNN followed by a fully connected layer into a CNN followed by an RNN, just change the line

model.add(Dense(num_classes, activation='softmax'))

into

model.add(SimpleRNN(num_classes, activation='softmax'))

(after importing SimpleRNN from keras.layers, of course)

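You may need to change the input dimensions of the network and/or wrap the whole CNN part in TimeDistributed. I ran into problems with this in some versions of TensorFlow but not in others.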
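For video, where each sample is a sequence of frames, the TimeDistributed route could look roughly like the sketch below. It is only an illustration under stated assumptions: the clip size, layer widths and class count are made up, the data is random, the layout is channels_last, and a tiny CNN stands in for Inception ResNet. The same small CNN is applied to every frame, and an LSTM then reads the per-frame feature vectors in order.

```python
import numpy as np
from keras import Input
from keras.models import Sequential
from keras.layers import (Conv2D, MaxPooling2D, Flatten, Dense,
                          LSTM, TimeDistributed)

# Hypothetical sizes, chosen only for illustration
num_frames, height, width, channels = 8, 32, 32, 3
num_classes = 5

model = Sequential()
# One sample = one clip: (frames, height, width, channels), channels_last assumed
model.add(Input(shape=(num_frames, height, width, channels)))
# TimeDistributed applies the wrapped layer to every frame independently
model.add(TimeDistributed(Conv2D(16, (3, 3), activation='relu')))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))
# One flat feature vector per frame -> (frames, features)
model.add(TimeDistributed(Flatten()))
# The recurrent layer consumes the frame features as a sequence
model.add(LSTM(32))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])

# Random stand-in for a batch of 4 clips
clips = np.random.rand(4, num_frames, height, width, channels).astype('float32')
probs = model.predict(clips, verbose=0)
print(probs.shape)
```

A pretrained backbone can be wrapped in TimeDistributed in the same way, for example keras.applications.InceptionResNetV2(include_top=False, pooling='avg'), which yields one feature vector per frame. The input size and memory use would then need to be adjusted accordingly.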

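EDIT:

I ran into some problems myself with the code I gave you. It is harder than I thought, because of the dimensions involved when you end a CNN with a recurrent layer. Here is what I managed to get working: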

from keras.layers import Reshape, Permute, SimpleRNN

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
# no pooling here
model.add(Dropout(0.25))
# Reshape: the first argument is the number of filters in the last conv layer
model.add(Reshape((64, -1)))
# Permute swaps the two axes so the RNN receives (timesteps, features) = (positions, 64).
# Note: this Reshape/Permute pair matches a channels_first layout; with channels_last,
# Reshape((-1, 64)) alone gives the same sequence.
model.add(Permute((2, 1)))
# an LSTM can be used here instead
model.add(SimpleRNN(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
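What the Reshape and Permute layers do to a single feature map can be checked with plain NumPy, which reshapes in the same row-major order. The sketch below uses a made-up 2x3 map with 4 channels in which every value equals its channel index, so a correct timestep is easy to recognise: it should read [0, 1, 2, 3]. The sizes and variable names are assumptions for illustration only.

```python
import numpy as np

h, w, c = 2, 3, 4  # hypothetical feature-map size

# channels_last map (h, w, c): each value equals its channel index
fmap_last = np.broadcast_to(np.arange(c), (h, w, c)).copy()
# the same data stored channels_first (c, h, w)
fmap_first = np.transpose(fmap_last, (2, 0, 1)).copy()

# Reshape((c, -1)) followed by Permute((2, 1)) on channels_first data
seq_first = fmap_first.reshape(c, -1).T
# The same two steps applied to channels_last data
seq_last_permuted = fmap_last.reshape(c, -1).T
# Reshape((-1, c)) alone on channels_last data
seq_last = fmap_last.reshape(-1, c)

print(seq_first.shape, seq_last.shape)  # both are (h * w, c)
print(seq_first[0], seq_last[0], seq_last_permuted[0])
```

All three results have the same shape, so Keras will accept any of them, but only the first and the third keep one pixel's channel vector per timestep. With channels_last data, the reshape-then-permute pair mixes values from different pixels into one timestep.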

EDIT 2: a dummy example of a simple fully connected NN in Keras:

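import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Dense

trng_input = np.random.uniform(size=(1000, 4))
trng_output = np.column_stack([np.sin(trng_input).sum(axis=1), np.cos(trng_input).sum(axis=1)])

model = Sequential()
# input_shape excludes the batch dimension: each sample has 4 features
model.add(Dense(6, input_shape=trng_input.shape[1:], activation='relu'))
# linear output, since the targets are not confined to (0, 1)
model.add(Dense(2, activation='linear'))
model.compile(loss='mse', optimizer=keras.optimizers.Adam(), metrics=['mae'])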

neural-network

conv-neural-network

tensorflow

recurrent-neural-network