Custom Activation with custom gradient does not work

I am trying to write code that trains a simple neural network. The goal is to define a custom activation function and, instead of letting Keras automatically take its derivative for backpropagation, have Keras use my custom gradient function for the custom activation:

import numpy as np
import tensorflow as tf
import math
import keras
from keras.models import Model, Sequential
from keras.layers import Input, Dense, Activation
from keras import regularizers
from keras import backend as K
from keras.backend import tf
from keras import initializers
from keras.layers import Lambda

@tf.custom_gradient
def custom_activation(x):

    def grad(dy):
        # Custom gradient: deliberately zero, so no gradient should
        # flow backward through this activation.
        return dy * 0

    result = K.sigmoid(x) * 2 - 1
    return result, grad

x_train = np.array([[1, 2], [3, 4], [3, 4]])

inputs = Input(shape=(2,))
output_1 = Dense(20, kernel_initializer='glorot_normal')(inputs)
layer = Lambda(lambda x: custom_activation)(output_1)  # <-- the line that fails
output_2 = Dense(2, activation='linear', kernel_initializer='glorot_normal')(layer)
model2 = Model(inputs=inputs, outputs=output_2)

model2.compile(optimizer='adam', loss='mean_squared_error')
model2.fit(x_train, x_train, epochs=20, validation_split=0.1, shuffle=False)

Since the gradient is defined to be zero, I expect the loss to stay unchanged across all epochs. Here is the traceback of the error I get:

Using TensorFlow backend.
WARNING:tensorflow:From C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
Traceback (most recent call last):
  File "C:/p/CE/mytest.py", line 43, in <module>
    layer = Lambda(lambda x: custom_activation)(output_1)
  File "C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\base_layer.py", line 474, in __call__
    output_shape = self.compute_output_shape(input_shape)
  File "C:\ProgramData\Anaconda3\lib\site-packages\keras\layers\core.py", line 656, in compute_output_shape
    return K.int_shape(x)
  File "C:\ProgramData\Anaconda3\lib\site-packages\keras\backend\tensorflow_backend.py", line 593, in int_shape
    return tuple(x.get_shape().as_list())
AttributeError: 'function' object has no attribute 'get_shape'

Update: I used Manoj Mohan's answer and the code now works. I expected to see the loss unchanged between epochs, since the gradient is defined to be zero. However, it does change. Why? Am I missing something?

Example:

Epoch 1/20
2019-10-03 10:31:34.193232: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2/2 [==============================] - 0s 68ms/step - loss: 8.3184 - val_loss: 13.7232
Epoch 2/20
2/2 [==============================] - 0s 496us/step - loss: 8.2783 - val_loss: 13.6368

Replace

layer = Lambda(lambda x: custom_activation)(output_1)

with

layer = Lambda(custom_activation)(output_1)

Lambda expects a callable to apply to the incoming tensor. The original lambda returns the custom_activation function object itself instead of calling it, which is why Keras fails with "'function' object has no attribute 'get_shape'".

I expect to see unchanged loss among epochs since the gradient is defined to be zero. But, it does change. Why?

The gradient update at the intermediate layer is zero, so no gradient flows backward from there. From the output down to the intermediate layer, however, gradients do flow and those weights do get updated. The modified architecture below, which makes the custom activation the last layer, outputs a constant loss across epochs (the weight-comparison sketch after the code illustrates this).

inputs = Input(shape=(2,))
output_1 = Dense(20, kernel_initializer='glorot_normal')(inputs)
output_2 = Dense(2, activation='linear', kernel_initializer='glorot_normal')(output_1)
layer = Lambda(custom_activation)(output_2)  # should be the last layer
model2 = Model(inputs=inputs, outputs=layer)
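
To see which weights actually move, a quick check (not part of the original answer; it assumes the model2 and x_train defined above) is to snapshot the weights before a short training run and compare afterwards. With the zero-gradient activation as the last layer, every weight should come out unchanged, which is why the loss stays constant:

import numpy as np

# Snapshot all weights, train briefly, then compare. With the zero custom
# gradient sitting at the output, no layer receives a gradient, so every
# weight should be reported as unchanged.
before = [w.copy() for w in model2.get_weights()]
model2.compile(optimizer='adam', loss='mean_squared_error')
model2.fit(x_train, x_train, epochs=5, verbose=0)

for i, (b, a) in enumerate(zip(before, model2.get_weights())):
    print('weights[%d]:' % i, 'unchanged' if np.allclose(b, a) else 'changed')

Running the same check on the original architecture (activation in the middle) reports the last Dense layer's weights as changed, which is exactly why the loss keeps moving there.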

Here is another way to do it, with the idea taken from:
import numpy as np
import tensorflow as tf
from keras import backend as K


@tf.custom_gradient
def custom_activation(x):
    result = K.sigmoid(x) * 2 - 1

    def grad(dy):
        # Zero gradient on purpose: nothing flows backward through here.
        return dy * 0

    return result, grad



class CustomLayer(tf.keras.layers.Layer):
    """Wraps custom_activation so it can be used as a layer in a tf.keras model."""

    def __init__(self):
        super(CustomLayer, self).__init__()

    def call(self, x):
        return custom_activation(x)


x_train = np.array([[1, 2], [3, 4], [3, 4]])


inputs = tf.keras.layers.Input(shape=(2,))
output_1 = tf.keras.layers.Dense(20, kernel_initializer='glorot_normal')(inputs)
layer = CustomLayer()(output_1)
output_2 = tf.keras.layers.Dense(2, activation='linear', kernel_initializer='glorot_normal')(layer)
model2 = tf.keras.models.Model(inputs=inputs, outputs=output_2)

model2.compile(optimizer='adam', loss='mean_squared_error')
model2.fit(x_train, x_train, epochs=10, validation_split=0.1, shuffle=False)
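
As a sanity check on the custom gradient itself, you can differentiate it directly with tf.GradientTape. This is a minimal sketch, not from the original answer; it assumes TF 2.x (or TF 1.x with eager execution enabled), and custom_activation_check is a self-contained copy that uses tf.sigmoid so it runs eagerly:

import tensorflow as tf

@tf.custom_gradient
def custom_activation_check(x):
    result = tf.sigmoid(x) * 2 - 1
    def grad(dy):
        return dy * 0  # same zero gradient as above
    return result, grad

x = tf.constant([[1.0, 2.0]])
with tf.GradientTape() as tape:
    tape.watch(x)
    y = custom_activation_check(x)

print(tape.gradient(y, x))  # expected: all zeros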