How can I generate validation data through a data generator when the model is trained through 'fit_generator' function?

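I am generating the training data for an image captioning model with the data generator shown below. The model is based on the one provided here. How can I generate and set up validation data in a similar way during training? I already have the features of the validation images and their captions.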

Data generator:

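# imports assumed for standalone Keras 2.x (not shown in the original post)
from numpy import array
from keras.preprocessing.sequence import pad_sequences
from keras.utils import to_categorical

def data_generator(all_train_captions, train_features, wordtoix, max_length, num_photos_per_batch, vocab_size):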
    X1, X2, y = list(), list(), list()
    n = 0
    # loop forever over images
    while True:
        for image_id, desc in all_train_captions.items():
            image_id = image_id.strip()

            n += 1
            # retrieve the photo feature
            photo = train_features[image_id]
            # encode the sequence
            seq = [wordtoix[word] for word in desc.split(' ') if word in wordtoix]
            # split one sequence into multiple X, y pairs
            for i in range(1, len(seq)):
                # split into input and output pair
                in_seq, out_seq = seq[:i], seq[i]
                # pad input sequence
                in_seq = pad_sequences([in_seq], maxlen=max_length)[0]
                # encode output sequence
                out_seq = to_categorical([out_seq], num_classes=vocab_size)[0]
                # store
                X1.append(photo)
                X2.append(in_seq)
                y.append(out_seq)
            # yield the batch data
            if n == num_photos_per_batch:
                # yield an (inputs, targets) tuple, the form the Keras docs describe
                yield ([array(X1), array(X2)], array(y))
                X1, X2, y = list(), list(), list()
                n = 0

Model training:

generator = data_generator(all_train_captions, encoding_train, wordtoix, max_caption_length,
                           number_pics_per_bath, vocab_size)
history = model.fit_generator(generator, epochs=epochs, steps_per_epoch=steps, callbacks=callbacks, verbose=1)

P.S. I am using Keras with the TensorFlow backend as the deep learning library.

You need another generator.

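One for training and one for validation. Just create two generators from the same function, one fed with the training data and the other with the validation data.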

train_generator = data_generator(all_train_captions, encoding_train, wordtoix, max_caption_length, 
                                 number_pics_per_bath, vocab_size)
val_generator = data_generator(all_val_captions, encoding_val, wordtoix, max_caption_length, 
                               number_pics_per_bath, vocab_size)
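
The two generators then have to be handed to the training call. A minimal sketch of that step follows, assuming the Keras `fit_generator` arguments `validation_data` and `validation_steps`; the helpers `steps_for` and `fit_with_validation` are hypothetical names introduced only for illustration, and the floor-division step count is an assumption that matches a generator which yields only full batches.

```python
def steps_for(captions, photos_per_batch):
    """Full batches produced by one pass over `captions` (hypothetical helper)."""
    return max(1, len(captions) // photos_per_batch)


def fit_with_validation(model, train_generator, val_generator,
                        train_steps, val_steps, epochs, callbacks=None):
    """Train on one generator and validate on the other (hypothetical wrapper)."""
    return model.fit_generator(
        train_generator,
        steps_per_epoch=train_steps,
        epochs=epochs,
        # the second generator supplies the validation batches
        validation_data=val_generator,
        # needed because the generator loops forever and never signals its end
        validation_steps=val_steps,
        callbacks=callbacks,
        verbose=1,
    )
```

With the names from the question, the call would look like `history = fit_with_validation(model, train_generator, val_generator, steps_for(all_train_captions, number_pics_per_bath), steps_for(all_val_captions, number_pics_per_bath), epochs, callbacks)`. The validation loss then appears in `history.history` under `val_loss`.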