RuntimeError: shape '[4, 98304]' is invalid for input of size 113216
I am learning to train a basic nn model for image classification, and an error occurs when I feed image data into the model. I understand that I should input image data of the correct size. My images are 128*256 with 3 channels, there are 4 classes, and the batch size is 4. What I don't understand is where the size 113216 comes from. I have checked all the relevant parameters and the image metadata, but found no clue. Here is my code:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(3*128*256, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(4, 3*128*256)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()

for epoch in range(2):  # loop over the dataset multiple times
    print('round start')
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        print(inputs.shape)
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:  # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training')
Thanks for your help!
Shapes

Conv2d changes the width and height of the image when no padding is used. As a rule of thumb, if you want to keep the image size the same with stride=1 (the default), use padding = kernel_size // 2.
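For instance, a minimal sketch of that rule (the variable names here are just for illustration):

import torch

conv = torch.nn.Conv2d(3, 6, kernel_size=5, padding=5 // 2)  # padding=2
x = torch.randn(4, 3, 128, 256)
print(conv(x).shape)  # torch.Size([4, 6, 128, 256]) - width and height unchanged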
- You are changing the number of channels (conv2 outputs 16), while your linear layer expects 3 channels at the full 128x256 resolution for some reason?
- If you want to know how your tensor data is transformed along the way, use print(x.shape) after each step!

This is also where 113216 comes from: after both convolution/pooling stages the tensor has shape (4, 16, 29, 61), which contains 4 * 16 * 29 * 61 = 113216 elements, so it cannot be viewed as [4, 98304] (98304 = 3 * 128 * 256).
Commented code

Here is the fixed code, with a comment about the shape after each step:
import torch
import torch.nn.functional as F

class Net(torch.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = torch.nn.Conv2d(3, 6, 5)
        self.pool = torch.nn.MaxPool2d(2, 2)
        self.conv2 = torch.nn.Conv2d(6, 16, 5)
        # Output shape from the convolutions is the input shape to fc1
        self.fc1 = torch.nn.Linear(16 * 29 * 61, 120)
        self.fc2 = torch.nn.Linear(120, 84)
        self.fc3 = torch.nn.Linear(84, 10)

    def forward(self, x):
        # In: (4, 3, 128, 256)
        x = F.relu(self.conv1(x))
        # (4, 6, 124, 252) because kernel_size=5 shaves 2 pixels off each border
        x = self.pool(x)
        # (4, 6, 62, 126) because pooling halves the size
        x = F.relu(self.conv2(x))
        # (4, 16, 58, 122) same reason as above
        x = self.pool(x)
        # (4, 16, 29, 61) because pooling halves the size
        # Better: use torch.flatten(x, start_dim=1) so you don't have to hard-code the size here
        x = x.view(-1, 16 * 29 * 61)  # use -1 to stay batch-size independent
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x
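A quick sanity check with a dummy batch of the question's input size confirms the shapes line up:

net = Net()
dummy = torch.randn(4, 3, 128, 256)  # batch of 4 RGB images, 128x256
print(net(dummy).shape)  # torch.Size([4, 10])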
Other things that might help

- Try torch.nn.AdaptiveMaxPool2d(1) before the ReLU; it will make your network independent of the input width and height.
- Use flatten (or a torch.nn.Flatten() layer) after this pooling.
- If you do so, set num_channels of the last convolution as the in_features for nn.Linear (see the sketch after this list).