Pad torch tensors of different sizes to be equal

I am looking for a way to take a batch of image/target pairs for segmentation and return a batch in which the image sizes have been changed so that they are equal across the whole batch. I have tried this with the following code:

def collate_fn_padd(batch):
    '''
    Pads a batch of variable-size samples.

    note: it converts things to tensors manually here since the ToTensor transform
    assumes it takes in images rather than arbitrary tensors.
    '''
    # separate the image and masks
    image_batch,mask_batch = zip(*batch)

    # pad the images and masks
    image_batch = torch.nn.utils.rnn.pad_sequence(image_batch, batch_first=True)
    mask_batch = torch.nn.utils.rnn.pad_sequence(mask_batch, batch_first=True)

    # rezip the batch
    batch = list(zip(image_batch, mask_batch))

    return batch

However, I get this error:

RuntimeError: The expanded size of the tensor (650) must match the existing size (439) at non-singleton dimension 2.  Target sizes: [3, 650, 650].  Tensor sizes: [3, 406, 439]

How can I efficiently pad the tensors to equal dimensions and avoid this problem?

rnn.pad_sequence only pads the sequence dimension; it requires all other dimensions to be equal. You cannot use it to pad images across two dimensions (height and width).
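
For illustration, here is a minimal sketch (with shapes made up to match the error above) of why pad_sequence handles variable-length sequences but not images that differ in height and width:

import torch
from torch.nn.utils.rnn import pad_sequence

# Works: only the first (sequence) dimension differs, trailing dims match.
seqs = [torch.randn(5, 10), torch.randn(3, 10)]
print(pad_sequence(seqs, batch_first=True).shape)  # torch.Size([2, 5, 10])

# Fails: the images differ in their trailing (height/width) dimensions.
imgs = [torch.randn(3, 406, 439), torch.randn(3, 650, 650)]
try:
    pad_sequence(imgs, batch_first=True)
except RuntimeError as e:
    print(e)  # The expanded size of the tensor ... must match the existing size ...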

The images can instead be padded with torch.nn.functional.pad, but you need to determine the required height and width padding manually.

import torch.nn.functional as F

# Determine maximum height and width
# The masks have the same height and width
# since they mask the image.
max_height = max([img.size(1) for img in image_batch])
max_width = max([img.size(2) for img in image_batch])

image_batch = [
    # The needed padding is the difference between the
    # max width/height and the image's actual width/height.
    F.pad(img, [0, max_width - img.size(2), 0, max_height - img.size(1)])
    for img in image_batch
]
mask_batch = [
    # Same as for the images, but there is no channel dimension
    # Therefore the mask's width is dimension 1 instead of 2
    F.pad(mask, [0, max_width - mask.size(1), 0, max_height - mask.size(0)])
    for mask in mask_batch
]

The padding lengths are specified in reverse order of the dimensions, with two values per dimension: one for the padding at the beginning and one for the padding at the end. For an image of size [channels, height, width] the padding is given as [width_beginning, width_end, height_beginning, height_end], which can be rephrased as [left, right, top, bottom]. The code above therefore pads the images on the right and bottom. The channels are left out because they are not padded, which also means that the same padding can be applied directly to the masks.
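
Putting it together, here is a sketch of how the collate function could look with the manual padding applied. Stacking the padded tensors with torch.stack and returning two batch tensors is my assumption about the desired output format; the original code re-zipped the batch into a list instead.

import torch
import torch.nn.functional as F

def collate_fn_padd(batch):
    # Separate the images and masks.
    image_batch, mask_batch = zip(*batch)

    # Maximum height and width across the batch.
    max_height = max(img.size(1) for img in image_batch)
    max_width = max(img.size(2) for img in image_batch)

    # Pad each image [channels, height, width] on the right and bottom.
    image_batch = torch.stack([
        F.pad(img, [0, max_width - img.size(2), 0, max_height - img.size(1)])
        for img in image_batch
    ])
    # Pad each mask [height, width] the same way (no channel dimension).
    mask_batch = torch.stack([
        F.pad(mask, [0, max_width - mask.size(1), 0, max_height - mask.size(0)])
        for mask in mask_batch
    ])
    return image_batch, mask_batch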