PyTorch optim.SGD with momentum: how to check the "velocity"?

Question

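I have a newbie question: the SGD doc gives the update equation for SGD with momentum, which indicates that besides the current gradient weight.grad we also need to keep the velocity from the previous step (something like weight.prev_v?). I know that nn.Parameter objects have .data and .grad attributes, but do they also store a .prev_v? Do you know how PyTorch handles this? Thanks!

Edit: basically, I want to know where PyTorch stores the velocity from the previous step.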

Answer 1

These are stored in the optimizer's state attribute. In the case of torch.optim.SGD, the momentum values are stored in a dictionary under the 'momentum_buffer' key, as you can see in the source code.

Here is a minimal example:

>>> import torch
>>> from torch import nn
>>> m = nn.Linear(10,10)
>>> optim = torch.optim.SGD(m.parameters(), lr=1.e-3, momentum=.9)
>>> m(torch.rand(1, 10)).mean().backward()
>>> optim.step()

>>> optim.state
defaultdict(dict, {0: {}, Parameter containing: ...})

>>> list(optim.state.values())[0]
{'momentum_buffer': tensor([...])}
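
To tie each buffer back to the parameter it belongs to, the same lookup can be written as a standalone script. This is only a sketch: the names model, optimizer and velocities are my own, and the final check assumes the default dampening=0 and weight_decay=0, in which case the buffer after the very first step is just a copy of the gradient.

```python
import torch
from torch import nn

torch.manual_seed(0)  # only to make the illustration reproducible

model = nn.Linear(10, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

# One forward/backward pass followed by a single optimizer step.
model(torch.rand(1, 10)).mean().backward()
optimizer.step()

# optimizer.state is keyed by the parameter tensors themselves, so pair
# each named parameter with the "velocity" stored in its state dict.
velocities = {
    name: optimizer.state[param]["momentum_buffer"]
    for name, param in model.named_parameters()
}

for name, buf in velocities.items():
    print(name, tuple(buf.shape))

# Assumption: default dampening=0 and weight_decay=0. Under those settings
# the buffer is initialised from the gradient on the first step.
first_step_matches_grad = all(
    torch.equal(velocities[name], param.grad)
    for name, param in model.named_parameters()
)
print(first_step_matches_grad)
```

Note that the buffers only exist after the first call to step(); before that, the state for each parameter is empty and the 'momentum_buffer' lookup would raise a KeyError.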
