TypeError: float() argument must be a string or a number, not 'BatchDataset' when data augmenting using fit_generator()
I'm running into a problem applying data augmentation while training my model. Specifically, it concerns the use of the fit_generator() method.
I originally ran my model successfully without augmentation using the fit() method, but following others' advice I switched to fit_generator(). Both methods seem to expect the same inputs as far as images and labels go, but when I run the code below I get the following ERROR:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/tmp/ipykernel_35/139227558.py in <module>
105
106 # train the network
--> 107 model.fit_generator(aug.flow(train_ds, batch_size=batch_size),
108 validation_data=val_ds, steps_per_epoch=len(train_ds[0]) // batch_size,
109 epochs=epochs)
/opt/conda/lib/python3.7/site-packages/keras/preprocessing/image.py in flow(self, x, y, batch_size, shuffle, sample_weight, seed, save_to_dir, save_prefix, save_format, subset)
894 save_prefix=save_prefix,
895 save_format=save_format,
--> 896 subset=subset)
897
898 def flow_from_directory(self,
/opt/conda/lib/python3.7/site-packages/keras/preprocessing/image.py in __init__(self, x, y, image_data_generator, batch_size, shuffle, sample_weight, seed, data_format, save_to_dir, save_prefix, save_format, subset, dtype)
472 save_format=save_format,
473 subset=subset,
--> 474 **kwargs)
475
476
/opt/conda/lib/python3.7/site-packages/keras_preprocessing/image/numpy_array_iterator.py in __init__(self, x, y, image_data_generator, batch_size, shuffle, sample_weight, seed, data_format, save_to_dir, save_prefix, save_format, subset, dtype)
119 y = y[split_idx:]
120
--> 121 self.x = np.asarray(x, dtype=self.dtype)
122 self.x_misc = x_misc
123 if self.x.ndim != 4:
/opt/conda/lib/python3.7/site-packages/numpy/core/_asarray.py in asarray(a, dtype, order)
81
82 """
---> 83 return array(a, dtype, copy=False, order=order)
84
85
TypeError: float() argument must be a string or a number, not 'BatchDataset'
I have googled around trying to fix the TypeError: float() argument must be a string or a number, not 'BatchDataset' error, but to no avail. Does anyone have suggestions on how to move forward?
import pathlib
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential
import matplotlib.pyplot as plt

# Set data directory
data_dir = pathlib.Path("../input/validatedweaponsv6/images/")

# Set image size
img_height = 120
img_width = 120

# Hyperparameters
batch_size = 128
epochs = 50
learning_rate = 0.001

# Create the training dataset
train_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir,
    label_mode='categorical',
    validation_split=0.2,
    subset="training",
    shuffle=True,
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)

# Create the validation dataset
val_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir,
    label_mode='categorical',
    validation_split=0.2,
    subset="validation",
    shuffle=True,
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)

# Create sequential model
model = Sequential([
    # Preprocessing
    layers.Rescaling(1./127.5, offset=-1,
                     input_shape=(img_height, img_width, 3)),
    # Encoder
    layers.Conv2D(8, 3, activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(16, 3, activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation='relu'),
    # layers.Conv2D(2, 3, activation='relu'), ???
    layers.Flatten(),
    # Decoder
    layers.Dense(64, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(2, activation='softmax')
])

# Print the model to see the different output shapes
print(model.summary())

# Compile model
model.compile(loss='categorical_crossentropy',
              optimizer=keras.optimizers.SGD(learning_rate=learning_rate),
              metrics=['accuracy'])

# construct the training image generator for data augmentation
aug = tf.keras.preprocessing.image.ImageDataGenerator(
    rotation_range=20, zoom_range=0.15,
    width_shift_range=0.2, height_shift_range=0.2, shear_range=0.15,
    horizontal_flip=True, fill_mode="nearest")

# train the network
model.fit_generator(aug.flow(train_ds, batch_size=batch_size),
                    validation_data=val_ds,
                    steps_per_epoch=len(train_ds[0]) // batch_size,
                    epochs=epochs)

# Print scores
score = model.evaluate(train_ds, verbose=0)
print('Validation loss:', score[0])
print('Validation accuracy:', score[1])

# Show loss and accuracy models
show_history(history)
Thanks for looking at my post! :)
First of all, the article you referred to is three years old and somewhat outdated. Since tensorflow 2.1.0 the .fit method also accepts generators, and by now it has fully replaced .fit_generator. I would recommend updating your tensorflow if at all possible.
Secondly, the error does not actually lie in the fit_generator method, but in the way you define your datasets. fit_generator just happens to be called first, which is why the traceback leads you there.
As for the error itself, it is the nesting of generators that I don't understand, and I think that is what causes the problem here. You are trying to pass the batched dataset you obtained from tf.keras.utils.image_dataset_from_directory into another generator, which is not possible.
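To make this concrete, here is a minimal sketch with synthetic NumPy data (the shapes mirror your 120x120 RGB setup, but the arrays are stand-ins, not your real data): ImageDataGenerator.flow expects an in-memory rank-4 array plus a label array, which np.asarray can convert and a BatchDataset cannot:

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

aug = ImageDataGenerator(rotation_range=20, horizontal_flip=True)

# flow() accepts in-memory NumPy arrays: (samples, height, width, channels)
x = np.random.rand(10, 120, 120, 3).astype("float32")
y = np.eye(2)[np.random.randint(0, 2, size=10)]

batch_x, batch_y = next(aug.flow(x, y, batch_size=4))
print(batch_x.shape, batch_y.shape)  # (4, 120, 120, 3) (4, 2)

# Internally flow() calls np.asarray(x, dtype='float32'); a tf.data
# BatchDataset cannot be converted that way, hence your TypeError.
```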
If I understand correctly, each of your images has exactly one label and the images of each class are stored in separate folders, so I would suggest using the flow_from_directory method of tf.keras.preprocessing.image.ImageDataGenerator directly. That generator reads and augments the images itself, so you can drop the tf.keras.utils.image_dataset_from_directory part entirely.
To use this generator, your images need to be laid out in the following structure:
- root_directory
  - class1 folder
  - class2 folder
  - etc.
Your code would then look something like this:
gen = tf.keras.preprocessing.image.ImageDataGenerator(# desired augmentation, ...)
train_generator = gen.flow_from_directory(directory=root_directory,
                                          target_size=(256, 256),
                                          classes=*list of class names*,
                                          class_mode='categorical',
                                          batch_size=32,
                                          shuffle=True, ...)
model.fit(train_generator, ...)
You can also pass a 'validation_split' argument to get separate datasets for training and validation. You can read more about ImageDataGenerator and the flow_from_directory method in the official documentation.
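For illustration, here is a sketch of that split (the directory tree and dummy images below are synthetic stand-ins for your real root_directory): with validation_split=0.2 and the subset argument, one generator definition yields both subsets:

```python
import pathlib
import tempfile

import numpy as np
import tensorflow as tf

# Build a throwaway class1/class2 directory tree with dummy images,
# standing in for the real root_directory.
root = pathlib.Path(tempfile.mkdtemp())
for cls in ("class1", "class2"):
    (root / cls).mkdir()
    for i in range(10):
        img = (np.random.rand(120, 120, 3) * 255).astype("uint8")
        tf.keras.preprocessing.image.save_img(root / cls / f"{i}.png", img)

gen = tf.keras.preprocessing.image.ImageDataGenerator(
    rotation_range=20, horizontal_flip=True, validation_split=0.2)

train_gen = gen.flow_from_directory(directory=str(root), target_size=(120, 120),
                                    class_mode='categorical', batch_size=8,
                                    subset="training")
val_gen = gen.flow_from_directory(directory=str(root), target_size=(120, 120),
                                  class_mode='categorical', batch_size=8,
                                  subset="validation")

print(train_gen.samples, val_gen.samples)  # 16 4
```

One caveat: with this approach the augmentation settings apply to the validation subset as well, since both subsets come from the same ImageDataGenerator.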