How to compute the gradient of the loss with respect to an arbitrary layer/weight in Torch?
I am transitioning from Theano to Torch, so please bear with me. In Theano it was straightforward to compute the gradient of the loss function with respect to even a specific weight. How do I do the same thing in Torch?
Suppose we have the following code, which generates some data/labels and defines a model:
t = require 'torch'
require 'nn'
require 'cunn'
require 'cutorch'
-- Generate random labels
function randLabels(nExamples, nClasses)
   -- nExamples: number of examples
   -- nClasses: number of classes
   local label = {}
   for i = 1, nExamples do
      label[i] = t.random(1, nClasses)
   end
   return t.FloatTensor(label)
end
inputs = t.rand(1000, 3, 32, 32) -- 1000 samples, 3 color channels
inputs = inputs:cuda()
labels = randLabels(inputs:size()[1], 10)
labels = labels:cuda()
net = nn.Sequential()
net:add(nn.SpatialConvolution(3, 6, 5, 5))
net:add(nn.ReLU())
net:add(nn.SpatialMaxPooling(2, 2, 2, 2))
net:add(nn.View(6*14*14))
net:add(nn.Linear(6*14*14, 300))
net:add(nn.ReLU())
net:add(nn.Linear(300, 10))
net = net:cuda()
-- Loss
criterion = nn.CrossEntropyCriterion()
criterion = criterion:cuda()
forwardPass = net:forward(inputs)
net:zeroGradParameters()
dEd_WeightsOfLayer1 -- How to compute this?
forwardPass = nil
net = nil
criterion = nil
inputs = nil
labels = nil
collectgarbage()
How do I compute the gradient with respect to the weights of the convolutional layer?
OK, I found the answer (thanks to alban desmaison on the Torch7 Google group).
The code in the question had a bug and did not work, so I rewrote it. Here is how to get the gradient for each node/parameter:
t = require 'torch'
require 'cunn'
require 'nn'
require 'cutorch'
-- A function to generate some random labels
function randLabels(nExamples, nClasses)
   -- nExamples: number of examples
   -- nClasses: number of classes
   local label = {}
   for i = 1, nExamples do
      label[i] = t.random(1, nClasses)
   end
   return t.FloatTensor(label)
end
-- Declare some variables
nClass = 10
kernelSize = 5
stride = 2
poolKernelSize = 2
nData = 100
nChannel = 3
imageSize = 32
-- Generate some [random] data
data = t.rand(nData, nChannel, imageSize, imageSize) -- 100 Random images with 3 channels
data = data:cuda() -- Transfer to the GPU (remove this line if you're not using GPU)
label = randLabels(data:size()[1], nClass)
label = label:cuda() -- Transfer to the GPU (remove this line if you're not using GPU)
-- Define model
net = nn.Sequential()
net:add(nn.SpatialConvolution(nChannel, 6, kernelSize, kernelSize))
net:add(nn.ReLU())
net:add(nn.SpatialMaxPooling(poolKernelSize, poolKernelSize, stride, stride))
net:add(nn.View(6*14*14))
net:add(nn.Linear(6*14*14, 350))
net:add(nn.ReLU())
net:add(nn.Linear(350, 10))
net = net:cuda() -- Transfer to the GPU (remove this line if you're not using GPU)
criterion = nn.CrossEntropyCriterion()
criterion = criterion:cuda() -- Transfer to the GPU (remove this line if you're not using GPU)
-- Do forward pass and get the gradient for each node/parameter:
net:forward(data) -- Do the forward propagation
criterion:forward(net.output, label) -- Compute the overall negative log-likelihood error
criterion:backward(net.output, label); -- the ';' suppresses printing the return value in the interactive th interpreter
net:backward(data, criterion.gradInput); -- same here: without ';' the th REPL prints the result
-- Now you can access the gradient values
layer1InputGrad = net:get(1).gradInput
layer1WeightGrads = net:get(1).gradWeight
net = nil
data = nil
label = nil
criterion = nil
Copy and paste the code and it works perfectly :)
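As an aside, if you want all parameters and their gradients as single flat tensors (e.g. to drive an optim-style training loop), nn modules also expose getParameters. A hedged sketch, reusing the net, data, label, and criterion defined above:

```lua
-- Flatten all learnable parameters of the model into two 1-D tensors.
-- getParameters() should be called only once per network: the returned
-- tensors share storage with the per-module weight/gradWeight tensors,
-- so gradParams is filled in place by net:backward().
params, gradParams = net:getParameters()

net:zeroGradParameters()                 -- clears gradParams as well
net:forward(data)
criterion:forward(net.output, label)
criterion:backward(net.output, label);
net:backward(data, criterion.gradInput);

-- gradParams now holds dLoss/dParam for every weight and bias in one
-- vector; the per-layer view is still available, e.g. net:get(1).gradWeight.
print(gradParams:size())
```

This is the pattern most Torch7 training loops use, since optim.sgd and friends expect the flat params/gradParams pair rather than per-layer tensors.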