Why does Torch's Inception net use clone() on ReLU?

Question

I am reading https://github.com/Element-Research/dpnn/blob/master/Inception.lua

This source contains many clone() calls, such as

mlp:add(self.transfer:clone())

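self.transfer is nothing more than nn.ReLU().

So:

Why does this code call clone() on the activation function? Is this only about memory?
I thought clone shares parameters. Is that correct? If it is, then all the activations in this Inception module share parameters, which looks like nonsense. Am I misunderstanding Inception-Net?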

Answer 1

If you do not clone the module self.transfer, then every transfer module in your network mlp will have the same state variables, output and gradInput.

For example, look at this toy code:

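require 'nn'

-- A single ReLU instance, added to the network twice (no clone)
module = nn.ReLU()

net = nn.Sequential():add(nn.Linear(2,2)):add(module):add(nn.Linear(2,1)):add(module)

input = torch.Tensor(2,2):random()
net:forward(input)

-- Positions 2 and 4 hold the same object, so both lines print the same tensor
print(net:get(2).output)
print(net:get(4).output)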

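Both print statements will print the same value, because positions 2 and 4 of the network hold the same module object. Modifying one of these outputs modifies the other. Since we do not want this behavior, we have to clone the module. (In your case, though, cloning a simple nn.ReLU() is not that useful.)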
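For comparison, here is a minimal sketch of the same toy network with each activation cloned, in the style of the mlp:add(self.transfer:clone()) call from the question. The variable name transfer is chosen only for this illustration; the layer sizes are those of the toy code above.

```lua
require 'nn'

-- One ReLU instance used as a template; every layer gets its own clone.
local transfer = nn.ReLU()

local net = nn.Sequential()
net:add(nn.Linear(2, 2))
net:add(transfer:clone())
net:add(nn.Linear(2, 1))
net:add(transfer:clone())

local input = torch.Tensor(2, 2):random()
net:forward(input)

-- The two activations are now distinct objects, each with its own
-- output tensor, so the first output is no longer overwritten.
print(net:get(2) == net:get(4))   -- false: two separate modules
print(net:get(2).output:size())   -- shape of the first Linear's output
print(net:get(4).output:size())   -- shape of the second Linear's output
```

With separate clones, each module keeps the output it produced during the forward pass, which is what the backward pass relies on.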

The documentation says:

If arguments are provided to the clone(...) function it also calls share(...) with those arguments on the cloned module after creating it, hence making a deep copy of this module with some shared parameters.

So if you do not pass any arguments, no parameters are shared.
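The difference between the two forms of clone() can be sketched as follows. Since nn.ReLU has no learnable parameters, this illustration uses nn.Linear instead; the variable names independent and tied are made up for the example.

```lua
require 'nn'

local linear = nn.Linear(2, 2)

-- No arguments: a deep copy that owns its own parameter storage.
local independent = linear:clone()

-- With arguments: clone() followed by share() on the named fields,
-- so the clone's weight and bias refer to the original's storage.
local tied = linear:clone('weight', 'bias')

-- Overwrite the original weights and see which clone follows.
linear.weight:fill(1)

print(independent.weight)   -- keeps the values copied at clone time
print(tied.weight)          -- all ones, because the storage is shared
```

Only the clone created with explicit field names tracks later changes to the original module's parameters.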
