tf.data.Dataset from tf.keras.preprocessing.image.ImageDataGenerator.flow_from_directory?
How do I create a tf.data.Dataset from tf.keras.preprocessing.image.ImageDataGenerator.flow_from_directory?
I'm considering tf.data.Dataset.from_generator, but it is unclear to me how to obtain the output_types keyword argument for it, given the documented return type:
A DirectoryIterator yielding tuples of (x, y) where x is a numpy array containing a batch of images with shape (batch_size, *target_size, channels) and y is a numpy array of corresponding labels.
Both batch_x and batch_y in ImageDataGenerator are of type K.floatx(), so by default they must be tf.float32.
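Rather than hard-coding tf.float32, the dtype can be read from the Keras backend setting. This is only a minimal sketch: the names float_dtype and output_types are illustrative, and it assumes the default class_mode='categorical', where the labels are one-hot floats.

```python
import tensorflow as tf

# K.floatx() returns the backend float type as a string ('float32' unless reconfigured).
float_dtype = tf.as_dtype(tf.keras.backend.floatx())

# Assumption: images and one-hot labels share this dtype (class_mode='categorical').
output_types = (float_dtype, float_dtype)
print(output_types)
```

The resulting tuple can be passed as the output_types argument of tf.data.Dataset.from_generator.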
A similar question has already been discussed in "How to use Keras generator with tf.data API". Let me copy-paste the answer from there:
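import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

def make_generator():
    # train_dataset_folder is the path to the training images (one subfolder per class)
    train_datagen = ImageDataGenerator(rescale=1. / 255)
    train_generator = train_datagen.flow_from_directory(
        train_dataset_folder, target_size=(224, 224), class_mode='categorical', batch_size=32)
    return train_generator

train_dataset = tf.data.Dataset.from_generator(make_generator, (tf.float32, tf.float32))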
The author there ran into a separate graph-scope issue, but I guess that is unrelated to your question.
Or as a one-liner:
tf.data.Dataset.from_generator(lambda: ImageDataGenerator().flow_from_directory('folder_path'), (tf.float32, tf.float32))
Here is my solution. To show how it works, I use the cats/dogs dataset:
import matplotlib.pyplot as plt
import numpy as np
import os
import tensorflow as tf
_URL = 'https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip'
path_to_zip = tf.keras.utils.get_file('cats_and_dogs.zip', origin=_URL, extract=True)
PATH = os.path.join(os.path.dirname(path_to_zip), 'cats_and_dogs_filtered')
train_dir = os.path.join(PATH, 'train')
#'/Users/mustafamuratarat/.keras/datasets/cats_and_dogs_filtered/train'
BATCH_SIZE = 32
IMG_SIZE = (160, 160)
img_gen = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255)
gen = img_gen.flow_from_directory(train_dir, target_size=IMG_SIZE, batch_size=BATCH_SIZE)
# <tensorflow.python.keras.preprocessing.image.DirectoryIterator at 0x7fb9fde3b250>
# gen.class_indices
# {'cats': 0, 'dogs': 1}
# gen.target_size
# (160, 160)
# gen.batch_size
# 32
# gen.num_classes
# 2
dataset = tf.data.Dataset.from_generator(
    lambda: gen,
    output_types=(tf.float32, tf.float32),
    # batch dimension is None because the last batch may be smaller than 32
    output_shapes=([None, 160, 160, 3], [None, 2]),
)
#list(dataset.take(1).as_numpy_iterator())
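In TensorFlow 2.4 and later, from_generator also accepts an output_signature argument, which states dtype and shape together through tf.TensorSpec; the documentation marks output_types and output_shapes as deprecated in its favor. Below is a rough sketch of the same idea. The generator fake_batches is hypothetical: it only mimics the (x, y) batches a DirectoryIterator yields, so the snippet runs without any image files.

```python
import numpy as np
import tensorflow as tf

def fake_batches(batch_size=4, target_size=(160, 160), channels=3,
                 num_classes=2, steps=3):
    # Hypothetical stand-in for DirectoryIterator: yields (images, one-hot labels).
    for _ in range(steps):
        x = np.random.rand(batch_size, *target_size, channels).astype("float32")
        labels = np.random.randint(0, num_classes, size=batch_size)
        y = np.eye(num_classes, dtype="float32")[labels]
        yield x, y

dataset = tf.data.Dataset.from_generator(
    fake_batches,
    output_signature=(
        tf.TensorSpec(shape=(None, 160, 160, 3), dtype=tf.float32),
        tf.TensorSpec(shape=(None, 2), dtype=tf.float32),
    ),
)

# Pull one batch to check that it matches the declared signature.
x, y = next(iter(dataset))
print(x.shape, y.shape)
```

With the real iterator, the first argument would be lambda: gen and the shapes would follow gen.target_size and gen.num_classes.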
You can then feed the dataset object to any model.
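One caveat when passing such a dataset to model.fit: a DirectoryIterator loops over the data indefinitely, so the dataset built from it never ends, and Keras needs steps_per_epoch to know where an epoch stops (with the real iterator, len(gen) gives the number of batches in one pass). The sketch below illustrates this with made-up pieces: endless_batches is a hypothetical stand-in for the iterator, and the tiny model and image size are chosen only so it runs quickly.

```python
import numpy as np
import tensorflow as tf

NUM_CLASSES = 2
IMG_SHAPE = (8, 8, 3)  # tiny images, illustration only

def endless_batches(batch_size=4):
    # Hypothetical stand-in for DirectoryIterator: yields (x, y) batches forever.
    while True:
        x = np.random.rand(batch_size, *IMG_SHAPE).astype("float32")
        labels = np.random.randint(0, NUM_CLASSES, size=batch_size)
        y = np.eye(NUM_CLASSES, dtype="float32")[labels]
        yield x, y

# output_signature requires TensorFlow 2.4 or later.
dataset = tf.data.Dataset.from_generator(
    endless_batches,
    output_signature=(
        tf.TensorSpec(shape=(None, *IMG_SHAPE), dtype=tf.float32),
        tf.TensorSpec(shape=(None, NUM_CLASSES), dtype=tf.float32),
    ),
)

model = tf.keras.Sequential([
    tf.keras.Input(shape=IMG_SHAPE),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy")

# steps_per_epoch bounds each epoch; the endless dataset has no natural end.
history = model.fit(dataset, steps_per_epoch=3, epochs=2, verbose=0)
```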