Using multiple validation sets with keras
I am training a model with keras using the model.fit() method. I would like to use multiple validation sets that should each be validated separately after every training epoch, so that I get one loss value for each validation set. If possible, they should be displayed during training and also be returned by the keras.callbacks.History() callback.
I am thinking of something like this:
history = model.fit(train_data, train_targets,
                    epochs=epochs,
                    batch_size=batch_size,
                    validation_data=[
                        (validation_data1, validation_targets1),
                        (validation_data2, validation_targets2)],
                    shuffle=True)
I currently have no idea how to implement this. Is it achievable by writing my own Callback? Or how else would you approach this problem?
I ended up writing my own Callback, based on the History callback, to solve the problem. I am not sure whether this is the best approach, but the following Callback records the losses and metrics for the training and validation set just like the History callback does, as well as the losses and metrics for the additional validation sets passed to its constructor.
from keras.callbacks import Callback


class AdditionalValidationSets(Callback):
    def __init__(self, validation_sets, verbose=0, batch_size=None):
        """
        :param validation_sets:
            a list of 3-tuples (validation_data, validation_targets, validation_set_name)
            or 4-tuples (validation_data, validation_targets, sample_weights, validation_set_name)
        :param verbose:
            verbosity mode, 1 or 0
        :param batch_size:
            batch size to be used when evaluating on the additional datasets
        """
        super(AdditionalValidationSets, self).__init__()
        self.validation_sets = validation_sets
        for validation_set in self.validation_sets:
            if len(validation_set) not in [3, 4]:
                raise ValueError()
        self.epoch = []
        self.history = {}
        self.verbose = verbose
        self.batch_size = batch_size

    def on_train_begin(self, logs=None):
        self.epoch = []
        self.history = {}

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        self.epoch.append(epoch)

        # record the same values as History() as well
        for k, v in logs.items():
            self.history.setdefault(k, []).append(v)

        # evaluate on the additional validation sets
        for validation_set in self.validation_sets:
            if len(validation_set) == 3:
                validation_data, validation_targets, validation_set_name = validation_set
                sample_weights = None
            elif len(validation_set) == 4:
                validation_data, validation_targets, sample_weights, validation_set_name = validation_set
            else:
                raise ValueError()

            results = self.model.evaluate(x=validation_data,
                                          y=validation_targets,
                                          verbose=self.verbose,
                                          sample_weight=sample_weights,
                                          batch_size=self.batch_size)

            # evaluate() returns a scalar when the loss is the only metric
            if not isinstance(results, list):
                results = [results]

            for metric, result in zip(self.model.metrics_names, results):
                valuename = validation_set_name + '_' + metric
                self.history.setdefault(valuename, []).append(result)
which I am now using like this:
history = AdditionalValidationSets([(validation_data2, validation_targets2, 'val2')])
model.fit(train_data, train_targets,
          epochs=epochs,
          batch_size=batch_size,
          validation_data=(validation_data1, validation_targets1),
          callbacks=[history],
          shuffle=True)
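After training, the additionally recorded values live on the callback instance rather than in the History object returned by fit(). Assuming the loss is among the model's metric names, the key is the set name plus the metric name as built above, for example:

print(history.history['val2_loss'])  # one value per epoch for the second validation set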
According to the current keras docs, you can pass callbacks to evaluate and evaluate_generator. Hence you could call evaluate multiple times with different datasets.
I have not tested this, so I would be happy if you comment your experience with it below.
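As a rough, untested sketch of that suggestion (variable names such as extra_sets are purely illustrative), one could simply loop over the datasets and call evaluate once per set, either after training or from a callback at the end of each epoch as in the other answers:

extra_sets = {
    'val1': (validation_data1, validation_targets1),
    'val2': (validation_data2, validation_targets2),
}
for name, (x, y) in extra_sets.items():
    # evaluate() also accepts a callbacks= argument per the docs mentioned above,
    # but a plain call already returns one set of results per dataset
    results = model.evaluate(x, y, verbose=0)
    print(name, results)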
I tested this on TensorFlow 2 and it works. You can evaluate as many validation sets as you like at the end of each epoch:
import tensorflow as tf


class MyCustomCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        res_eval_1 = self.model.evaluate(X_test_1, y_test_1, verbose=0)
        res_eval_2 = self.model.evaluate(X_test_2, y_test_2, verbose=0)
        print(res_eval_1)
        print(res_eval_2)
And later:
my_val_callback = MyCustomCallback()
# Your model creation code
model.fit(..., callbacks=[my_val_callback])
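If, as asked in the question, the extra losses should also be returned by the History callback, one possible variant is to write the results into logs instead of printing them. This is only a sketch, built on the assumption that tf.keras runs its built-in History callback after user callbacks (so keys added to logs here would be recorded) and that evaluate's return_dict argument is available in your TF 2 release; verify both for your version. The class name and the extra_sets structure are illustrative.

import tensorflow as tf


class MultiValidationLogger(tf.keras.callbacks.Callback):
    def __init__(self, extra_sets):
        # extra_sets: list of (name, x, y) tuples -- an illustrative structure
        super().__init__()
        self.extra_sets = extra_sets

    def on_epoch_end(self, epoch, logs=None):
        logs = logs if logs is not None else {}
        for name, x, y in self.extra_sets:
            results = self.model.evaluate(x, y, verbose=0, return_dict=True)
            # store e.g. 'val2_loss', 'val2_accuracy' alongside the usual logs
            for metric, value in results.items():
                logs[name + '_' + metric] = value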