How to get the labels of a dataset built with ImageDataGenerator for a dual-input CNN model?
Can someone help me get the labels of the validation set when the model takes a pair of images as input and ImageDataGenerator supplies the image batches, as shown below:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

GEN = ImageDataGenerator(rescale=1./255)

def two_inputs(generator, X1, X2, batch_size, img_height, img_width):
    # One directory iterator per input branch; shuffle=False keeps file order fixed
    U = generator.flow_from_directory(X1,
                                      target_size=(img_height, img_width),
                                      batch_size=batch_size,
                                      shuffle=False,
                                      class_mode='binary',
                                      seed=1221)
    V = generator.flow_from_directory(X2,
                                      target_size=(img_height, img_width),
                                      batch_size=batch_size,
                                      shuffle=False,
                                      class_mode='binary',
                                      seed=1221)
    while True:
        X1i = next(U)
        X2i = next(V)
        yield [X1i[0], X2i[0]], X2i[1]  # Yield both images and their mutual label
With this setup I can get predictions via preds = base_model.predict_generator(val_flow), where val_flow is:
val_flow = two_inputs(generator=GEN,
                      X1=val_05_dirs,
                      X2=val_06_dirs,
                      batch_size=batch_size,
                      img_height=img_height,
                      img_width=img_width)
I need fpr and tpr from fpr, tpr, _ = metrics.roc_curve(LABELS, preds).
So I am trying to get the LABELS for the complete val_flow, which reads from the two folders val_05_dirs and val_06_dirs.
Thanks in advance.
I have created a simple code example. You can adapt it to your use case.
Code:
import tensorflow as tf

GEN = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255)
folder_path = r'C:\Users\Aniket\.keras\datasets\flower_photos'

def two_inputs(generator, X1, X2, batch_size, img_height, img_width):
    # One directory iterator per input branch; shuffle=False keeps file order fixed
    U = generator.flow_from_directory(X1,
                                      target_size=(img_height, img_width),
                                      batch_size=batch_size,
                                      shuffle=False,
                                      class_mode='binary',
                                      seed=1221)
    V = generator.flow_from_directory(X2,
                                      target_size=(img_height, img_width),
                                      batch_size=batch_size,
                                      shuffle=False,
                                      class_mode='binary',
                                      seed=1221)
    while True:
        X1i = next(U)
        X2i = next(V)
        yield [X1i[0], X2i[0]], X2i[1]  # Yield both images and their mutual label

custom_gen = two_inputs(GEN, folder_path, folder_path, 1000, 256, 256)
Here, my flower_photos directory contains 5 subdirectories, and the subdirectory names serve as the image labels.
Output:
Found 3670 images belonging to 5 classes.
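As a side note, the directory iterators returned by flow_from_directory expose a classes attribute that holds one integer label per file, in file order when shuffle=False. A minimal sketch of reading the labels that way is below. The helper name paired_labels is hypothetical, and the SimpleNamespace objects are stand-ins for the real U and V iterators so the snippet runs without a dataset; with real data you would need U and V to be accessible outside two_inputs.

```python
from types import SimpleNamespace


def paired_labels(first_iter, second_iter):
    """Return the shared per-file labels of two directory iterators.

    Assumes both objects expose a `classes` sequence (as Keras directory
    iterators do) and that the two folders list matching pairs in the same order.
    """
    first = [int(c) for c in first_iter.classes]
    second = [int(c) for c in second_iter.classes]
    if first != second:
        raise ValueError("The two inputs do not share the same label sequence")
    return second


# Stand-ins for the iterators U and V (hypothetical demo data)
u_demo = SimpleNamespace(classes=[0, 0, 1, 1])
v_demo = SimpleNamespace(classes=[0, 0, 1, 1])

labels = paired_labels(u_demo, v_demo)
print(labels)
```

The equality check is a design choice for the paired setup: it fails early if the two folders are not aligned, which would otherwise make the "mutual label" meaningless.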
Now iterate over the generator.
Code:
val_labels = []
for image, labels in custom_gen:
    val_labels += list(labels.astype('int32'))
    break  # stop after the first batch; the generator never ends on its own
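Note: without the break, the loop runs forever, because this generator keeps producing batches of images from your data indefinitely.
If you don't want that, run the loop only this many times: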
no_of_times = total_samples / batch_size
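Make sure the total number of samples is divisible by your batch size; otherwise you will append duplicate labels at the end of the list.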
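The bounded loop described above can be sketched as follows. This is only an illustration under stated assumptions: collect_labels and fake_pair_generator are hypothetical names, and the fake generator stands in for two_inputs so the snippet runs without image folders. Rounding the step count up and trimming the result is one way to avoid depending on the sample count being divisible by the batch size.

```python
import math
from itertools import islice


def collect_labels(pair_generator, total_samples, batch_size):
    """Gather labels from one pass over an endless (inputs, labels) generator."""
    # Round up so a partial last batch is still visited
    steps = math.ceil(total_samples / batch_size)
    labels = []
    for _inputs, batch_labels in islice(pair_generator, steps):
        labels.extend(int(label) for label in batch_labels)
    # Trim in case the generator wrapped around and repeated early samples
    return labels[:total_samples]


def fake_pair_generator(all_labels, batch_size):
    """Hypothetical stand-in for two_inputs: loops forever in a fixed order."""
    while True:
        for start in range(0, len(all_labels), batch_size):
            batch = all_labels[start:start + batch_size]
            yield [batch, batch], batch  # dummy "images" plus their labels


demo_labels = [0, 0, 1, 1, 1, 0, 1]
collected = collect_labels(fake_pair_generator(demo_labels, batch_size=3),
                           total_samples=len(demo_labels), batch_size=3)
print(collected)
```

If the same generator is also used for prediction, the same step count would presumably be passed as the steps argument of the prediction call, and a fresh generator created for each pass so that labels and predictions start from the first sample.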
The labels you get will be integers.
If you want the mapping from class names to integers, you can use:
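# U is the iterator created inside two_inputs; it must be accessible here
# (for example, created outside the function) for this to work
mapping = U.class_indices
mapping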
Output:
{'daisy': 0, 'dandelion': 1, 'roses': 2, 'sunflowers': 3, 'tulips': 4}
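To go from the integer labels back to class names, the mapping can be inverted. A short sketch, reusing the mapping printed above; the val_labels values here are made up for illustration and index_to_class is a hypothetical name.

```python
# Mapping as printed above (class name -> integer label)
mapping = {'daisy': 0, 'dandelion': 1, 'roses': 2, 'sunflowers': 3, 'tulips': 4}

# Invert it so integer labels can be looked up by index
index_to_class = {index: name for name, index in mapping.items()}

val_labels = [0, 4, 2]  # illustrative values, not real output
label_names = [index_to_class[label] for label in val_labels]
print(label_names)
```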