Batch size reduces accuracy of ensemble of pretrained CNNs
I'm trying to implement basic softmax-based voting: I take several pretrained CNNs, apply softmax to their outputs, add the results together, and take the argmax as the final output.

So I loaded 4 different pretrained CNNs (vgg11, vgg13, vgg16, vgg19) from "chenyaofo/pytorch-cifar-models"; I did not train them myself.

When I iterate over the test set with a DataLoader and batch_size=128/256, I get 94% accuracy; when I iterate over it with batch_size=1, I get 69% accuracy.

How is that possible?

Here is the code:
# data.py (file name inferred from the "from data import ..." statement below)
import torch
from tqdm import tqdm
from torchvision import datasets, transforms, models
from torch.utils.data import DataLoader
import torch.nn as nn

torch.cuda.empty_cache()

model_names = [
    "cifar10_vgg11_bn",
    "cifar10_vgg13_bn",
    "cifar10_vgg16_bn",
    "cifar10_vgg19_bn",
    # "cifar10_resnet56",
]

batch_size = 2

test_transform = transforms.Compose([
    transforms.ToTensor(),
])

def load_models():
    models = []
    for model_name in model_names:
        model = torch.hub.load("chenyaofo/pytorch-cifar-models", model_name, pretrained=True)
        models.append(model)
    return models

testset = datasets.CIFAR10(root='./data', train=False,
                           download=True, transform=test_transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size,
                                         shuffle=False)
# EnsembleModule.py (file name inferred from the import in the main script)
import torch.nn as nn
import torch

class MyEnsemble(nn.Module):
    def __init__(self, modelA, modelB, modelC, modelD):
        super(MyEnsemble, self).__init__()
        self.modelA = modelA
        self.modelB = modelB
        self.modelC = modelC
        self.modelD = modelD
        # self.modelE = modelE

    def forward(self, x):
        out1 = self.modelA(x)
        out2 = self.modelB(x)
        out3 = self.modelC(x)
        out4 = self.modelD(x)
        # out5 = self.modelE(x)
        # print(out1.shape)
        out1 = torch.softmax(out1, dim=1)
        out2 = torch.softmax(out2, dim=1)
        out3 = torch.softmax(out3, dim=1)
        out4 = torch.softmax(out4, dim=1)
        out = out1 + out2 + out3 + out4
        return out
# main script
from EnsembleModule import MyEnsemble
from data import load_models, testloader
import torch
from tqdm import tqdm

device = 'cuda' if torch.cuda.is_available() else 'cpu'

models = load_models()
model = MyEnsemble(models[0], models[1], models[2], models[3])
model.to(device)

total = 0
correct = 0
with torch.no_grad():
    for images, labels in tqdm(testloader):
        images, labels = images.to(device), labels.to(device)
        outputs = model(images)
        _, predictions = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predictions == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d %%' % (
    100 * correct / total))
You forgot to call model.eval():
# ...
model.to(device)
model.eval()  # <<<<<<<<<<<<<

total = 0
correct = 0
with torch.no_grad():
    for images, labels in tqdm(testloader):
        images, labels = images.to(device), labels.to(device)
        outputs = model(images)
        # ...
Performance with batch_size=1 is especially bad because your models contain BatchNorm layers.
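The effect is easy to reproduce on a single BatchNorm layer. The sketch below (my illustration, not part of the original answer) shows that in train mode a batch of one sample is normalized by its own statistics, while eval mode uses the stored running statistics:

import torch
import torch.nn as nn

bn = nn.BatchNorm2d(3)           # fresh layer: running_mean=0, running_var=1
x = torch.randn(1, 3, 32, 32)    # a "batch" containing a single image

bn.eval()
out_eval = bn(x)                 # running statistics: essentially the identity here

bn.train()
out_train = bn(x)                # the batch's own statistics: each channel of the
                                 # lone sample is forced to zero mean / unit variance

print(out_eval.mean().item())    # close to x.mean()
print(out_train.mean().item())   # close to 0 by construction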
The preprocessing should also match the preprocessing used during training. As you can see in the repository of the author of the models, you should normalize using the following statistics:

test_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.4914, 0.4822, 0.4465), std=(0.2023, 0.1994, 0.2010))
])
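If you adopt this transform, the dataset and loader can then be rebuilt exactly as in the question; a minimal sketch reusing the names from the question's data script:

from torchvision import datasets
from torch.utils.data import DataLoader

# Same construction as in the question, now with the normalizing transform.
testset = datasets.CIFAR10(root='./data', train=False,
                           download=True, transform=test_transform)
testloader = DataLoader(testset, batch_size=batch_size, shuffle=False)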
You are using models that contain batchnorm layers (indicated by the _bn suffix in the model names). This in turn means the results depend on the statistics of the current batch, and those differ between batch_size=2 and batch_size=128. When evaluating, you should always call the nn.Module.eval function. This makes the layers use the running statistics (those learned during training) instead of the current batch's statistics. Read the documentation for more information.

Note that calling eval propagates recursively to all submodules, so a single call directly on your ensemble module is enough:
model = MyEnsemble(models[0], models[1], models[2], models[3])
model.eval()
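As a quick sanity check (my addition, not part of the original answer), you can verify that the call reached every submodule via the training flag that every nn.Module carries:

# After model.eval(), the flag should be False on the ensemble itself
# and on all four wrapped networks.
assert not model.training
assert all(not m.training for m in model.modules())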
Once you do this, the batch size should no longer affect the model's performance.

When training, you will need to switch back to training mode with nn.Module.train.
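For example, if you later fine-tune the ensemble (hypothetical here, since the question only evaluates):

model.train()   # BatchNorm layers use (and update) batch statistics again
# ... training loop ...
model.eval()    # switch back before the next evaluation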
You should also normalize your data with the dataset's statistics, which you can do in the torchvision preprocessing pipeline:
test_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.247, 0.243, 0.261))])