How to select only one part of a TensorFlow dataset, and change the dimensions
I want to train my model on 10-frame clips from UCF101, without any labels. Currently I have this:
import tensorflow as tf
import tensorflow_datasets as tfds
x_train = tfds.load('ucf101', split='train', shuffle_files=True, batch_size = 64)
>>> print(x_train)
<_OptionsDataset shapes: {label: (None,), video: (None, None, 256, 256, 3)}, types: {label: tf.int64, video: tf.uint8}>
I want the dataset to have shape (None, 10, 256, 256, 3) and not include the labels.
Edit: I tried using a lambda expression in .map(), but this produces an error.
new_x_train = x_train.map(lambda x: tf.py_function(func=lambda y: tf.convert_to_tensor(sample(y.numpy().tolist(), 10), dtype=uint8), inp=[x['video']], Tout=tf.uint8))
NameError: name 'sample' is not defined
Please forgive my approximate answer; I'm not going to download a 6GB dataset to test it.
Why not select only the video when iterating over the dataset:
next(iter(x_train))['video']
To select along a dimension, you can use normal numpy-style indexing. Here is an example with mnist:
import tensorflow_datasets as tfds
data = tfds.load('mnist', split='train', batch_size=16)
<PrefetchDataset shapes: {image: (None, 28, 28, 1),
label: (None,)}, types: {image: tf.uint8, label: tf.int64}>
Now let's select only image, and take the first 10 observations.
dim = lambda x: x['image'][:10, ...]
next(iter(data.map(dim))).shape
TensorShape([10, 28, 28, 1])
See how simple indexing removed the None from the shape.
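Applied to the video case in the question, the same idea might look like the sketch below. I haven't run it on UCF101 either: it uses a small synthetic dataset as a stand-in (an assumption, so it runs without the download), with elements shaped like the video/label dictionary printed above. The names make_fake_dataset and first_n_frames are hypothetical. It slices the frame axis before batching and assumes every video has at least 10 frames.

```python
import numpy as np
import tensorflow as tf

NUM_FRAMES = 10

def make_fake_dataset():
    # Stand-in for the unbatched UCF101 dataset (assumption): a few
    # variable-length uint8 "videos" with dummy labels, tiny 8x8 frames.
    def gen():
        rng = np.random.default_rng(0)
        for length in (12, 15, 20, 11):
            yield {
                "video": rng.integers(0, 256, size=(length, 8, 8, 3), dtype=np.uint8),
                "label": np.int64(0),
            }
    return tf.data.Dataset.from_generator(
        gen,
        output_signature={
            "video": tf.TensorSpec(shape=(None, 8, 8, 3), dtype=tf.uint8),
            "label": tf.TensorSpec(shape=(), dtype=tf.int64),
        },
    )

def first_n_frames(example):
    # Drop the label and keep only the first NUM_FRAMES frames.
    clip = example["video"][:NUM_FRAMES]
    # Make the frame count static; this fails at runtime on shorter videos.
    return tf.ensure_shape(clip, (NUM_FRAMES, None, None, 3))

# Map first, then batch, so every element in a batch has the same length.
clips = make_fake_dataset().map(first_n_frames).batch(2)
batch = next(iter(clips))
print(batch.shape)
```

With the real dataset you would load it without batch_size, apply the same map, and batch afterwards; videos shorter than 10 frames would need to be filtered out first.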
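The solution to this problem was simply to download the dataset files separately, so that I had a list of .avi files in my directory, and then preprocess those files outside of TensorFlow. I used the cv2 library and the following code, in which two of the functions are borrowed from elsewhere: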
import glob
import math

import cv2
import numpy as np

# Utilities to open video files using CV2
def crop_center_square(frame):
    y, x = frame.shape[0:2]
    min_dim = min(y, x)
    start_x = (x // 2) - (min_dim // 2)
    start_y = (y // 2) - (min_dim // 2)
    return frame[start_y:start_y+min_dim, start_x:start_x+min_dim]

def load_video(path, max_frames=0, resize=(256, 256)):
    # Returns float frames in [0, 1] in RGB order; max_frames=0 reads the whole file.
    cap = cv2.VideoCapture(path)
    frames = []
    try:
        while True:
            ret, frame = cap.read()
            if not ret:
                break
            frame = crop_center_square(frame)
            frame = cv2.resize(frame, resize)
            frame = frame[:, :, [2, 1, 0]]  # BGR -> RGB
            frames.append(frame)
            if len(frames) == max_frames:
                break
    finally:
        cap.release()
    return np.array(frames) / 255.0

files = [f for f in glob.glob("**/*.avi", recursive=True)]
for video_path in files:
    video = load_video(video_path)
    video_name = video_path[video_path.find('/')+1:]
    num_frames = video.shape[0]
    print("Video in " + video_path + " has " + str(num_frames) + " frames.")
    # Write each full 10-frame segment to its own file; leftover frames are dropped.
    for seg_num in range(math.floor(num_frames/10)):
        result = video[seg_num*10:(seg_num+1)*10, ...]
        new_filepath = video_name[:-4] + "_" + str(seg_num).zfill(2) + ".avi"
        print(new_filepath)
        out = cv2.VideoWriter(new_filepath, 0, 25.0, (256, 256))
        for frame_n in range(0, 10):
            # load_video returns RGB floats; VideoWriter expects BGR uint8.
            frame = np.uint8(np.rint(255 * result[frame_n, ...]))
            out.write(cv2.cvtColor(frame, cv2.COLOR_RGB2BGR))
        out.release()
        del result
    del video
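If the clips only need to exist in memory rather than as separate .avi files, the segmentation step can also be done with a single reshape. This is a minimal sketch, not what I actually ran: split_into_clips is a hypothetical helper, and it assumes the video is a NumPy array with frames on the first axis, like the output of load_video.

```python
import numpy as np

def split_into_clips(video, clip_len=10):
    # Number of complete clips; trailing frames that do not fill a clip are dropped.
    n_clips = video.shape[0] // clip_len
    trimmed = video[: n_clips * clip_len]
    # (n_clips * clip_len, H, W, C) -> (n_clips, clip_len, H, W, C)
    return trimmed.reshape((n_clips, clip_len) + video.shape[1:])

# Dummy 23-frame "video" with tiny 4x4 RGB frames, just to show the shapes.
video = np.zeros((23, 4, 4, 3), dtype=np.float32)
clips = split_into_clips(video)
print(clips.shape)
```

The resulting array has the (None, 10, height, width, 3) layout asked for in the question and can be passed to tf.data.Dataset.from_tensor_slices if it fits in memory.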