Unable to train on TPU in Google Colab: InternalError: 9 root error(s) found
BATCH_SIZE = 64
HEIGHT, WIDTH = 124, 124
Training set: 14906 images, 6 classes.
Validation set: 3726 images, 6 classes.
with strategy.scope():
    model = create_model()
    model = complile_model(model, lr=0.0001)
callbacks = create_callbacks()
epochs = 5
steps_per_epoch = 14906 // BATCH_SIZE
validation_steps = 3726 // BATCH_SIZE
history = model.fit(train_dataset,
                    epochs=epochs,
                    steps_per_epoch=steps_per_epoch,
                    validation_data=validation_dataset,
                    validation_steps=validation_steps)
I am trying to train this model on the TPU provided by Google Colab, but I am unable to do so. Please help me with this. A screenshot is attached.
ImageDataGenerator also uses a PyFunction under the hood, so it is not compatible with TPUs. Instead, you have to load the images with the tf.data API. This tutorial shows how to do that.
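To make the tf.data suggestion concrete, here is a minimal sketch of an image pipeline built only from TensorFlow ops, so no Python function runs inside the input pipeline. The names load_image and make_image_dataset are hypothetical, and the sketch assumes JPEG files plus plain Python lists of paths and integer labels, none of which comes from the original post. Depending on the Colab runtime, the files may also need to live somewhere the TPU workers can read, such as a Google Cloud Storage bucket.

```python
import tensorflow as tf

AUTOTUNE = tf.data.AUTOTUNE
HEIGHT, WIDTH = 124, 124  # image size from the question


def load_image(path, label):
    """Read, decode and resize one image using TensorFlow ops only."""
    image = tf.io.read_file(path)
    image = tf.io.decode_jpeg(image, channels=3)  # assumes JPEG input
    image = tf.image.resize(image, [HEIGHT, WIDTH])  # returns float32
    return image / 255.0, label


def make_image_dataset(paths, labels, batch_size, training=True):
    """Hypothetical helper: build a batched dataset from paths and labels."""
    ds = tf.data.Dataset.from_tensor_slices((paths, labels))
    if training:
        # Shuffle the (cheap) path/label pairs, then repeat indefinitely so
        # that steps_per_epoch decides where each epoch ends.
        ds = ds.shuffle(len(paths)).repeat()
    ds = ds.map(load_image, num_parallel_calls=AUTOTUNE)
    # Dropping the last partial batch keeps every batch the same shape.
    ds = ds.batch(batch_size, drop_remainder=True)
    return ds.prefetch(AUTOTUNE)
```

With something like this in place, train_dataset = make_image_dataset(train_paths, train_labels, BATCH_SIZE) could be passed to model.fit in place of the generator, where train_paths and train_labels are again placeholder names.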
The dataset must also call repeat():
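import tensorflow as tf

AUTOTUNE = tf.data.AUTOTUNE

# parse_tfrecord_fn and prepare_sample are assumed to be defined elsewhere.
def get_dataset(filenames, batch_size):
    dataset = (
        tf.data.TFRecordDataset(filenames, num_parallel_reads=AUTOTUNE)
        .map(parse_tfrecord_fn, num_parallel_calls=AUTOTUNE)
        .map(prepare_sample, num_parallel_calls=AUTOTUNE)
        .repeat()  # repeat indefinitely; steps_per_epoch ends each epoch
        .shuffle(batch_size * 10)
        .batch(batch_size)
        .prefetch(AUTOTUNE)
    )
    return dataset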
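A quick back-of-the-envelope check shows why repeat() matters when steps_per_epoch is set. The sketch below uses the numbers from the question; the helper names are made up for illustration. It compares how many batches a single pass over the training data can supply with how many batches five epochs will ask for.

```python
import math


def batches_available(num_examples, batch_size):
    """Batches produced by one pass over the data (last one may be partial)."""
    return math.ceil(num_examples / batch_size)


def batches_needed(num_examples, batch_size, epochs):
    """Batches that fit() requests when steps_per_epoch = examples // batch."""
    steps_per_epoch = num_examples // batch_size
    return steps_per_epoch * epochs


train_examples, batch_size, epochs = 14906, 64, 5
available = batches_available(train_examples, batch_size)
needed = batches_needed(train_examples, batch_size, epochs)
print(available, needed)  # 233 1160
```

One pass yields 233 batches, while training asks for 1160, so a dataset that is not repeated would be expected to run out of data early in the second epoch.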