Difficulty in Connecting Layers with Keras: Graph Disconnected

I am trying to build a NN model for my policy gradient (deep reinforcement learning) agent with the Keras functional API. What I intend to do is mask out invalid actions by zeroing their probabilities in the output layer:

def __build_policy_network(self):
    inputs = keras.layers.Input(shape=(self.input_dim,))
    advantages = keras.layers.Input(shape=(1,))
    valid_actions = keras.layers.Input(shape=(3,))
    dense_1 = keras.layers.Dense(units=self.fc1_size, activation="relu", kernel_initializer="he_uniform")(inputs)
    dense_2 = keras.layers.Dense(units=self.fc2_size, activation="relu", kernel_initializer="he_uniform")(dense_1)
    probs_logits = keras.layers.Dense(units=self.nb_actions, activation="softmax")(dense_2)
    masked_probs = keras.layers.Multiply()([probs_logits, valid_actions])
    # keepdims=True so the division broadcasts per row
    probs = keras.layers.Lambda(lambda x: x / keras.backend.sum(x, axis=1, keepdims=True))(masked_probs)

    def custom_loss(y_true, y_pred):
        out = keras.backend.clip(y_pred, 1e-8, 1 - 1e-8)
        log_lik = y_true * keras.backend.log(out)
        return keras.backend.sum(-log_lik * advantages)

    policy = keras.models.Model([inputs, advantages], [probs])
    policy.compile(optimizer=keras.optimizers.Adam(learning_rate=self.alpha), loss=custom_loss)
    predict = keras.models.Model([inputs, valid_actions], [probs])
    return policy, predict

However, I run into the notorious error ValueError: Graph disconnected: cannot obtain value for tensor Tensor("input_3:0", shape=(None, 3), dtype=float32) at layer "multiply". When I comment out the advantages and valid_actions input layers (removing their corresponding lines, of course), the code runs successfully. I should mention that the valid_actions input layer is only there to mask out invalid probabilities; it is not needed for the loss calculation.

I would appreciate any help.

Thanks in advance for your time.

Your loss also involves advantages, so you need to pass it to the loss as well. You can do that with .add_loss.

The policy model also needs valid_actions as an input in order to produce probs.

The predict model looks fine and can be used at inference time.
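The disconnect is easy to reproduce in isolation. A stripped-down sketch (hypothetical layer sizes, not the asker's exact model) shows that Keras raises a ValueError as soon as a requested output depends on an Input tensor that is not listed among the model's inputs:

```python
from tensorflow import keras

# Minimal reproduction: any Input that feeds a layer on the path to the
# requested output must also appear in the model's input list.
inputs = keras.layers.Input(shape=(4,))
mask = keras.layers.Input(shape=(2,))
dense = keras.layers.Dense(2, activation="softmax")(inputs)
masked = keras.layers.Multiply()([dense, mask])

try:
    keras.models.Model([inputs], [masked])  # `mask` missing from the inputs
    disconnected = False
except ValueError:
    disconnected = True  # "Graph disconnected: cannot obtain value for tensor ..."

# Listing both inputs builds the model without complaint.
model = keras.models.Model([inputs, mask], [masked])
```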

Here is a full example with .add_loss:

from tensorflow import keras

inputs = keras.layers.Input(shape=(30,))
advantages = keras.layers.Input(shape=(1,))
valid_actions = keras.layers.Input(shape=(3,))
true = keras.layers.Input(shape=(3,))
dense_1 = keras.layers.Dense(units=64, activation="relu", kernel_initializer="he_uniform")(inputs)
dense_2 = keras.layers.Dense(units=32, activation="relu", kernel_initializer="he_uniform")(dense_1)
probs_logits = keras.layers.Dense(units=3, activation="softmax")(dense_2)
masked_probs = keras.layers.Multiply()([probs_logits, valid_actions])
# keepdims=True so the division broadcasts per row
probs = keras.layers.Lambda(lambda x: x / keras.backend.sum(x, axis=1, keepdims=True))(masked_probs)

def custom_loss(y_true, y_pred, advantages):
    out = keras.backend.clip(y_pred, 1e-8, 1 - 1e-8)
    log_lik = y_true * keras.backend.log(out)
    return keras.backend.sum(-log_lik * advantages)

policy = keras.models.Model([inputs, advantages, valid_actions, true], [probs])
policy.add_loss(custom_loss(true, probs, advantages))
policy.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001), loss=None)
predict = keras.models.Model([inputs, valid_actions], [probs])
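As a quick smoke test, the inference path alone can be rebuilt and run on random (hypothetical) data with untrained weights; the masked action gets probability exactly zero and the remaining ones renormalize to sum to 1. Note the explicit keepdims=True in the normalization, needed so the division broadcasts per row, and a small shim because Keras 2 exposes the sum op under keras.backend while Keras 3 moved it to keras.ops:

```python
import numpy as np
from tensorflow import keras

# Keras 2 has keras.backend.sum; Keras 3 moved it to keras.ops.sum.
ksum = keras.backend.sum if hasattr(keras.backend, "sum") else keras.ops.sum

# Rebuild just the inference path from the example above.
inputs = keras.layers.Input(shape=(30,))
valid_actions = keras.layers.Input(shape=(3,))
dense_1 = keras.layers.Dense(units=64, activation="relu")(inputs)
dense_2 = keras.layers.Dense(units=32, activation="relu")(dense_1)
probs_logits = keras.layers.Dense(units=3, activation="softmax")(dense_2)
masked_probs = keras.layers.Multiply()([probs_logits, valid_actions])
probs = keras.layers.Lambda(
    lambda x: x / ksum(x, axis=1, keepdims=True))(masked_probs)
predict = keras.models.Model([inputs, valid_actions], [probs])

x = np.random.rand(4, 30).astype("float32")
mask = np.tile([[1.0, 0.0, 1.0]], (4, 1)).astype("float32")  # action 1 invalid
p = predict.predict([x, mask], verbose=0)
# p[:, 1] is exactly zero; each row of p sums to 1
```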

Many thanks to MarcoCerliani for his time and follow-ups; I finally managed to find a working solution to my problem. My mistake was that the probs output layer, which is required for the loss calculation, was modified by the valid_actions input layer, which is in fact only needed for the predict model. As MarcoCerliani stated:

Keras cannot just ignore an input layer as the output depends on it.

All I needed to do was pass probs_logits (the output layer left untouched by valid_actions) to the policy model for the loss calculation, and pass the probs output (the one manipulated by the valid_actions layer) to the predict model:

def __build_policy_network(self):

    # previous lines of code left unchanged

    policy = keras.models.Model([inputs, advantages], [probs_logits])
    policy.compile(optimizer=keras.optimizers.Adam(learning_rate=self.alpha), loss=custom_loss)
    predict = keras.models.Model([inputs, valid_actions], [probs])
    return policy, predict
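As a sanity check on the masking arithmetic itself, independent of Keras, multiplying a (hypothetical) softmax output by a 0/1 validity mask and renormalizing zeroes the invalid action and rescales the rest:

```python
import numpy as np

# Hypothetical softmax output over 3 actions; action 1 is marked invalid.
probs = np.array([0.5, 0.3, 0.2])
valid = np.array([1.0, 0.0, 1.0])

masked = probs * valid           # [0.5, 0.0, 0.2]
renorm = masked / masked.sum()   # rescale so the valid actions sum to 1
print(renorm)                    # [0.71428571 0.         0.28571429]
```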