How can I change the NN weights without affecting the gradients?

Suppose I have a simple neural network:

import torch
import torch.nn as nn
import torch.optim as optim
from torch.nn.utils import parameters_to_vector

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.fc1 = nn.Linear(1, 2)
        self.fc2 = nn.Linear(2, 3)
        self.fc3 = nn.Linear(3, 1)

    def forward(self, x):
        x = self.fc1(x)
        x = torch.relu(x)        
        x = torch.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Model()

opt = optim.Adam(net.parameters())

And some input features:

features = torch.rand((3,1))

I can train it normally:

for i in range(10):
    opt.zero_grad()
    out = net(features)
    loss = torch.mean(torch.square(torch.tensor(5) - torch.sum(out)))
    loss.backward()
    opt.step()

However, I'm interested in updating the weights of each layer after each example in the batch; that is, changing the actual weight values by a different amount for each layer. A minimal sketch of the kind of loop I have in mind follows.
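
Here adjust_weights is a hypothetical placeholder for the per-layer update I am asking about; the rest mirrors the training loop above:

def adjust_weights(model):
    # hypothetical placeholder: change each layer's weights by a different amount
    pass

for x in features:                 # one example at a time
    opt.zero_grad()
    adjust_weights(net)            # the per-layer update I am asking about
    out = net(x.unsqueeze(0))      # forward pass on a single example
    loss = torch.mean(torch.square(torch.tensor(5) - torch.sum(out)))
    loss.backward()
    opt.step()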

I can print the parameters of each layer:

for i in range(1):
    opt.zero_grad()
    out = net(features)
    print(parameters_to_vector(net.fc1.parameters()))
    print(parameters_to_vector(net.fc2.parameters()))
    print(parameters_to_vector(net.fc3.parameters()))
    loss = torch.mean(torch.square(torch.tensor(5) - torch.sum(out)))
    loss.backward()
    opt.step()

How can I change the weight values before backpropagation without affecting the gradients?

Say I want the layer weights to be updated according to the following functions:

def first_layer_update(weight):
    return weight + 1e-3*weight

def second_layer_update(weight):
    return 1e-2*weight

def third_layer_update(weight):
    return weight - 1e-1*weight

From the pytorch docs, you're basically on the right track. You can loop over all of a layer's parameters and add to them directly:

with torch.no_grad():
    for param in layer.parameters():  # e.g. layer = net.fc1
        param += 1e-3  # or whatever

- Using the torch.no_grad context manager.

This allows you to perform operations on tensors (in place or out of place) without Autograd tracking those changes. As @user3474165 explained:

def first_layer_update(weight):
    with torch.no_grad():
        return weight + 1e-3*weight

def second_layer_update(weight):
    with torch.no_grad():
        return 1e-2*weight

def third_layer_update(weight):
    with torch.no_grad():
        return weight - 1e-1*weight

Or differently, without altering the functions, by using the context manager where you call them:

with torch.no_grad():
    first_layer_update(net.fc1.weight)
    second_layer_update(net.fc2.weight)
    third_layer_update(net.fc3.weight)
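
Either way, a quick sanity check (a minimal sketch reusing the net and features defined above) confirms that a weight modified under torch.no_grad still takes part in training as usual:

w = net.fc1.weight
with torch.no_grad():
    w += 1e-3 * w                  # in-place update, not recorded by Autograd

out = net(features)
loss = torch.mean(torch.square(torch.tensor(5) - torch.sum(out)))
loss.backward()
print(w.requires_grad)             # True: fc1.weight is still a trainable parameter
print(w.grad is not None)          # True: backward populated its gradient as usual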

- Using the @torch.no_grad decorator.

A variant is to use the @torch.no_grad decorator:

@torch.no_grad()
def first_layer_update(weight):
    return weight + 1e-3*weight

@torch.no_grad()
def second_layer_update(weight):
    return 1e-2*weight

@torch.no_grad()
def third_layer_update(weight):
    return weight - 1e-1*weight

And call them with first_layer_update(net.fc1.weight), second_layer_update(net.fc2.weight), etc. A quick check that the returned tensors are detached from Autograd is sketched below.
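
Because the decorated functions run entirely under no_grad, the tensors they return carry no Autograd history. A minimal check, assuming the decorated versions above:

updated = first_layer_update(net.fc1.weight)
print(updated.requires_grad)   # False: computed under no_grad
print(updated.grad_fn)         # None: no backward graph was recorded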


- Mutating torch.Tensor.data.

An alternative to wrapping the operations in a torch.no_grad context is to mutate the weights through their data attribute, which Autograd does not track. That means calling your functions with:

>>> first_layer_update(net.fc1.weight.data)
>>> second_layer_update(net.fc2.weight.data)
>>> third_layer_update(net.fc3.weight.data)

This computes the updates from the weights (not the biases) of the three layers according to their respective policies. Note that the functions as written are out of place, so to actually apply an update you assign the result back, e.g. net.fc1.weight.data = first_layer_update(net.fc1.weight.data). In-place variants are sketched below.
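
If you would rather have the functions mutate the weights directly, here is a sketch under that assumption (the helpers are hypothetical; the trailing underscore follows PyTorch's naming convention for in-place operations):

def first_layer_update_(weight):
    weight.data.add_(1e-3 * weight.data)   # w <- w + 1e-3*w, in place

def second_layer_update_(weight):
    weight.data.mul_(1e-2)                 # w <- 1e-2*w, in place

def third_layer_update_(weight):
    weight.data.sub_(1e-1 * weight.data)   # w <- w - 1e-1*w, in place

first_layer_update_(net.fc1.weight)
second_layer_update_(net.fc2.weight)
third_layer_update_(net.fc3.weight)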


In short, if you want to change all the parameters of an nn.Module at once, you can do either:

>>> with torch.no_grad():
...     update_policy(parameters_to_vector(net.layer.parameters()))

>>> update_policy(parameters_to_vector(net.layer.parameters()).data)
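
One caveat: parameters_to_vector concatenates the parameters into a new tensor, so updating that vector does not by itself touch the module. A minimal round-trip sketch (using net.fc1 as a concrete stand-in for net.layer) that writes the updated values back with vector_to_parameters:

from torch.nn.utils import parameters_to_vector, vector_to_parameters

with torch.no_grad():
    vec = parameters_to_vector(net.fc1.parameters())   # flatten weight and bias
    vec = first_layer_update(vec)                      # apply the update policy
    vector_to_parameters(vec, net.fc1.parameters())    # copy values back into fc1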