How to manually implement padding for PyTorch convolutions
I'm trying to port some PyTorch code to TensorFlow 2.0, but I'm having trouble figuring out how to translate the convolution functions between the two. The way the two libraries handle padding is the sticking point. Basically, I want to understand how to manually reproduce the padding that PyTorch does under the hood, so that I can translate it to TensorFlow.
The code below works if I don't do any padding, but I can't figure out how to make the two implementations match once any padding is added.
output_padding = SOME NUMBER
padding = SOME OTHER NUMBER
strides = 128
tensor = np.random.rand(2, 258, 249)
filters = np.random.rand(258, 1, 256)

out_torch = F.conv_transpose1d(
    torch.from_numpy(tensor).float(),
    torch.from_numpy(filters).float(),
    stride=strides,
    padding=padding,
    output_padding=output_padding)

def pytorch_transpose_conv1d(inputs, filters, strides, padding, output_padding):
    N, L_in = inputs.shape[0], inputs.shape[2]
    out_channels, kernel_size = filters.shape[1], filters.shape[2]
    time_out = (L_in - 1) * strides - 2 * padding + (kernel_size - 1) + output_padding + 1
    padW = (kernel_size - 1) - padding
    # HOW DO I PAD HERE TO GET THE SAME OUTPUT AS IN PYTORCH
    inputs = tf.pad(inputs, [(?, ?), (?, ?), (?, ?)])
    return tf.nn.conv1d_transpose(
        inputs,
        tf.transpose(filters, perm=(2, 1, 0)),
        output_shape=(N, out_channels, time_out),
        strides=strides,
        padding="VALID",
        data_format="NCW")

out_tf = pytorch_transpose_conv1d(tensor, filters, strides, padding, output_padding)
assert np.allclose(out_tf.numpy(), out_torch.numpy())
Padding
To translate convolution and transposed convolution functions (with padding) between PyTorch and TensorFlow, we first need to understand the F.pad() and tf.pad() functions.
torch.nn.functional.pad(input, padding_size, mode='constant', value=0):
- padding_size describes how much to pad certain dimensions of the input, starting from the last dimension and moving forward.
- To pad only the last dimension of the input tensor, the padding has the form (padding_left, padding_right).
- To pad the last 3 dimensions, the padding has the form (padding_left, padding_right, padding_top, padding_bottom, padding_front, padding_back).
tensorflow.pad(input, padding_size, mode='CONSTANT', name=None, constant_values=0):
- padding_size is an integer tensor of shape [n, 2], where n is the rank of the input tensor. For each dimension D of the input, paddings[D, 0] indicates how many values to add before the contents of the tensor in that dimension, and paddings[D, 1] indicates how many values to add after.
[Table: F.pad and tf.pad equivalents, with the resulting output tensors, for the input tensor [[[1, 1], [1, 1]]] of shape (1, 2, 2)]
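To make one such pair concrete, here is a minimal sketch on that same input tensor; the pad amounts (1, 2) are an arbitrary choice for illustration:

```python
import numpy as np
import torch
import torch.nn.functional as F
import tensorflow as tf

x = np.array([[[1., 1.], [1., 1.]]])  # shape (1, 2, 2)

# F.pad reads the pad spec from the last dimension backwards:
# [1, 2] pads only the last dim, by 1 on the left and 2 on the right
torch_out = F.pad(torch.from_numpy(x), [1, 2])

# tf.pad takes one (before, after) pair per dimension, outermost first
tf_out = tf.pad(x, [[0, 0], [0, 0], [1, 2]])

print(np.allclose(torch_out.numpy(), tf_out.numpy()))  # True
```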
Padding in convolutions
Now let's move on to PyTorch padding in convolution layers.
F.conv1d(input, ..., padding, ...):
- padding controls the amount of implicit zero padding on both sides, given as a number of points.
- padding=(size) applies F.pad(input, [size, size]), i.e. it pads the last dimension by (size, size), which is equivalent to tf.pad(input, [[0, 0], [0, 0], [size, size]]).
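That conv1d equivalence can be verified with a small sketch; the concrete shapes, pad size, and the transposes between NCW and NWC layouts are illustrative assumptions, not part of the original answer:

```python
import numpy as np
import torch
import torch.nn.functional as F
import tensorflow as tf

x = np.random.rand(2, 3, 8).astype(np.float32)   # (N, C_in, L)
w = np.random.rand(4, 3, 5).astype(np.float32)   # (C_out, C_in, kernel_size)
size = 2

# PyTorch: implicit zero padding on both sides of the last dimension
out_torch = F.conv1d(torch.from_numpy(x), torch.from_numpy(w), padding=size)

# TensorFlow: make the padding explicit, then run a VALID conv in NWC layout
x_pad = tf.pad(x, [[0, 0], [0, 0], [size, size]])
out_tf = tf.nn.conv1d(tf.transpose(x_pad, (0, 2, 1)),   # NCW -> NWC
                      tf.transpose(w, (2, 1, 0)),       # -> (kernel_size, C_in, C_out)
                      stride=1, padding='VALID')
out_tf = tf.transpose(out_tf, (0, 2, 1))                # back to NCW

print(np.allclose(out_torch.numpy(), out_tf.numpy(), atol=1e-4))
```

No kernel flip is needed: both libraries implement cross-correlation, not mathematical convolution.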
F.conv2d(input, ..., padding, ...):
- padding=(size) applies F.pad(input, [size, size, size, size]), i.e. it pads the last 2 dimensions by (size, size), which for an (N, C, H, W) input is equivalent to tf.pad(input, [[0, 0], [0, 0], [size, size], [size, size]]).
- padding=(size1, size2) applies F.pad(input, [size2, size2, size1, size1]), which is equivalent to tf.pad(input, [[0, 0], [0, 0], [size1, size1], [size2, size2]]).
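The 2-D case can be sketched the same way (again, the concrete shapes and the NCHW/NHWC transposes are illustrative assumptions):

```python
import numpy as np
import torch
import torch.nn.functional as F
import tensorflow as tf

x = np.random.rand(1, 3, 8, 8).astype(np.float32)  # (N, C_in, H, W)
w = np.random.rand(5, 3, 3, 3).astype(np.float32)  # (C_out, C_in, kH, kW)
size1, size2 = 1, 2

# PyTorch: padding=(size1, size2) pads H by size1 and W by size2 on both sides
out_torch = F.conv2d(torch.from_numpy(x), torch.from_numpy(w), padding=(size1, size2))

# TensorFlow: the same padding made explicit, then a VALID conv in NHWC layout
x_pad = tf.pad(x, [[0, 0], [0, 0], [size1, size1], [size2, size2]])
out_tf = tf.nn.conv2d(tf.transpose(x_pad, (0, 2, 3, 1)),   # NCHW -> NHWC
                      tf.transpose(w, (2, 3, 1, 0)),       # -> (kH, kW, C_in, C_out)
                      strides=1, padding='VALID')
out_tf = tf.transpose(out_tf, (0, 3, 1, 2))                # back to NCHW

print(np.allclose(out_torch.numpy(), out_tf.numpy(), atol=1e-4))
```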
Padding in transposed convolutions
PyTorch padding in transposed convolution layers:
F.conv_transpose1d(input, ..., padding, output_padding, ...):
- dilation * (kernel_size - 1) - padding zero padding will be added to both sides of each dimension in the input.
- Padding in a transposed convolution can be seen as allocating fake outputs that are then removed.
- output_padding controls the additional size added to one side of the output shape.
- Check this to understand what exactly happens during a transposed convolution in PyTorch.
- Here is the formula to compute the output size of a transposed convolution:
output_size = (input_size - 1) * stride + (kernel_size - 1) + 1 + output_padding - 2 * padding
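As a sanity check, wrapping the formula in a helper (tconv_out_size is a made-up name) reproduces the output length of the run shown in the Result section:

```python
def tconv_out_size(input_size, stride, kernel_size, output_padding=0, padding=0):
    # output_size = (input_size - 1)*stride + (kernel_size - 1) + 1 + output_padding - 2*padding
    return (input_size - 1) * stride + (kernel_size - 1) + 1 + output_padding - 2 * padding

# W=249, stride=6, kernel_size=7, output_padding=4, padding=9
print(tconv_out_size(249, 6, 7, output_padding=4, padding=9))  # 1481
```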
Code
Transposed convolution
import torch
import torch.nn as nn
import torch.nn.functional as F
import tensorflow as tf
import numpy as np

# to stop a tf check-failed error, not relevant to the actual code
import os
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

def tconv(tensor, filters, output_padding=0, padding=0, strides=1):
    '''
    tensor : input tensor of shape (batch_size, channels, W), i.e. (NCW)
    filters : input kernel of shape (in_ch, out_ch, kernel_size)
    output_padding : single number, must be smaller than either stride or dilation
    padding : single number, should be less than or equal to ((valid output size + output_padding) // 2)
    strides : single number
    '''
    bs, in_ch, W = tensor.shape
    in_ch, out_ch, k_sz = filters.shape

    out_torch = F.conv_transpose1d(torch.from_numpy(tensor).float(),
                                   torch.from_numpy(filters).float(),
                                   stride=strides, padding=padding,
                                   output_padding=output_padding)
    out_torch = out_torch.numpy()

    # output_size = (input_size - 1)*stride + (kernel_size - 1) + 1 + output_padding - 2*padding
    # valid out size -> padding=0, output_padding=0
    # -> valid_out_size = (input_size - 1)*stride + (kernel_size - 1) + 1
    out_size = (W - 1)*strides + (k_sz - 1) + 1

    # input shape -> (batch_size, W, in_ch) and filters shape -> (kernel_size, out_ch, in_ch) for tf conv
    valid_tf = tf.nn.conv1d_transpose(np.transpose(tensor, axes=(0, 2, 1)),
                                      np.transpose(filters, axes=(2, 1, 0)),
                                      output_shape=(bs, out_size, out_ch),
                                      strides=strides, padding='VALID',
                                      data_format='NWC')
    # output padding
    tf_outpad = tf.pad(valid_tf, [[0, 0], [0, output_padding], [0, 0]])
    # NWC to NCW
    tf_outpad = np.transpose(tf_outpad, (0, 2, 1))
    # tf.slice(input, begin, size) -> remove `padding` elements on both sides
    out_tf = tf.slice(tf_outpad, [0, 0, padding], [bs, out_ch, tf_outpad.shape[2]-2*padding])
    out_tf = np.array(out_tf)

    print('output size(tf, torch):', out_tf.shape, out_torch.shape)
    # print('out_torch:\n', out_torch)
    # print('out_tf:\n', out_tf)
    print('outputs are close:', np.allclose(out_tf, out_torch))

tensor = np.random.rand(2, 1, 7)
filters = np.random.rand(1, 2, 3)
tconv(tensor, filters, output_padding=2, padding=5, strides=3)
Result
>>> tensor = np.random.rand(2, 258, 249)
>>> filters = np.random.rand(258, 1, 7)
>>> tconv(tensor, filters, output_padding=4, padding=9, strides=6)
output size(tf, torch): (2, 1, 1481) (2, 1, 1481)
outputs are close: True
Some useful links: