Difficulty in Connecting Layers with Keras: Graph Disconnected
I am trying to build an NN model for my policy gradient (deep reinforcement learning) agent using the Keras Functional API. What I intend to do is mask out invalid actions by pushing their probabilities down to zero in the logits layer:
def __build_policy_network(self):
    inputs = keras.layers.Input(shape=(self.input_dim,))
    advantages = keras.layers.Input(shape=(1,))
    valid_actions = keras.layers.Input(shape=(3,))

    dense_1 = keras.layers.Dense(units=self.fc1_size, activation="relu", kernel_initializer="he_uniform")(inputs)
    dense_2 = keras.layers.Dense(units=self.fc2_size, activation="relu", kernel_initializer="he_uniform")(dense_1)
    probs_logits = keras.layers.Dense(units=self.nb_actions, activation='softmax')(dense_2)

    masked_probs = keras.layers.Multiply()([probs_logits, valid_actions])
    probs = keras.layers.Lambda(lambda x: x / keras.backend.sum(x, axis=1))(masked_probs)

    def custom_loss(y_true, y_pred):
        out = keras.backend.clip(y_pred, 1e-8, 1 - 1e-8)
        log_lik = y_true * keras.backend.log(out)
        return keras.backend.sum(-log_lik * advantages)

    policy = keras.models.Model([inputs, advantages], [probs])
    policy.compile(optimizer=keras.optimizers.Adam(lr=self.alpha), loss=custom_loss)

    predict = keras.models.Model([inputs, valid_actions], [probs])

    return policy, predict
However, I run into the infamous error:

ValueError: Graph disconnected: cannot obtain value for tensor Tensor("input_3:0", shape=(None, 3), dtype=float32) at layer "multiply".

When I comment out either the advantages or the valid_actions input layer (removing their corresponding lines, of course), the code runs successfully. I should mention that the valid_actions input layer is only used to mask out invalid probabilities; it is not required for the loss calculation.

I would appreciate it if anyone could help me with this. Thanks in advance for your time.
Your loss also involves advantages, so you need to pass it to the loss. You can do that with .add_loss.

The policy model also needs valid_actions as an input in order to produce probs.

The predict model seems fine and can be used at inference time.

Here is the full example with .add_loss:
import keras  # or: from tensorflow import keras

inputs = keras.layers.Input(shape=(30,))
advantages = keras.layers.Input(shape=(1,))
valid_actions = keras.layers.Input(shape=(3,))
true = keras.layers.Input(shape=(3,))

dense_1 = keras.layers.Dense(units=64, activation="relu", kernel_initializer="he_uniform")(inputs)
dense_2 = keras.layers.Dense(units=32, activation="relu", kernel_initializer="he_uniform")(dense_1)
probs_logits = keras.layers.Dense(units=3, activation='softmax')(dense_2)

masked_probs = keras.layers.Multiply()([probs_logits, valid_actions])
# keepdims=True keeps the sum at shape (batch, 1) so it broadcasts against (batch, 3)
probs = keras.layers.Lambda(lambda x: x / keras.backend.sum(x, axis=1, keepdims=True))(masked_probs)

def custom_loss(y_true, y_pred, advantages):
    out = keras.backend.clip(y_pred, 1e-8, 1 - 1e-8)
    log_lik = y_true * keras.backend.log(out)
    return keras.backend.sum(-log_lik * advantages)

policy = keras.models.Model([inputs, advantages, valid_actions, true], [probs])
policy.add_loss(custom_loss(true, probs, advantages))
policy.compile(optimizer=keras.optimizers.Adam(lr=0.001), loss=None)

predict = keras.models.Model([inputs, valid_actions], [probs])
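For completeness, here is a minimal training sketch for this add_loss variant, using dummy NumPy data (shapes match the example above; the batch size of 8 and the random values are purely illustrative). Since the model was compiled with loss=None, everything the loss needs, including the labels, is fed in as an input:

import numpy as np

X = np.random.rand(8, 30).astype("float32")                   # states
adv = np.random.rand(8, 1).astype("float32")                  # advantages
mask = np.ones((8, 3), dtype="float32")                       # all actions valid here
y = np.eye(3, dtype="float32")[np.random.randint(3, size=8)]  # one-hot taken actions

# train: labels and advantages are passed as inputs, not as targets
policy.fit([X, adv, mask, y], epochs=1, verbose=0)

# inference: only the state and the validity mask are required
action_probs = predict.predict([X, mask])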
Many thanks to MarcoCerliani for his time and follow-ups; I finally managed to find a working solution to my problem. My mistake was that I had modified the probs output layer, which is required for the loss calculation, with the valid_actions input layer, which is in fact only needed for the predict model. As was pointed out:

Keras cannot just ignore an input layer as the output depends on it.

All I had to do was pass probs_logits (the output layer untouched by valid_actions) to the policy model for the loss calculation, and pass the probs output layer (the one manipulated by valid_actions) to the predict model:
def __build_policy_network(self):
    # previous lines of code left unchanged
    policy = keras.models.Model([inputs, advantages], [probs_logits])
    policy.compile(optimizer=keras.optimizers.Adam(lr=self.alpha), loss=custom_loss)

    predict = keras.models.Model([inputs, valid_actions], [probs])

    return policy, predict
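To round this off, a sketch of how the two models might then be used; the variable names (states, adv, actions_one_hot, state, valid_mask) are hypothetical placeholders, and the sampling step assumes the renormalizing Lambda yields a proper probability vector:

import numpy as np

# training step: advantages ride along as an extra input consumed by custom_loss,
# while the loss itself is computed on the unmasked probs_logits output
policy.fit([states, adv], actions_one_hot, verbose=0)

# action selection: the validity mask is applied (and renormalized) only here
probs = predict.predict([state[np.newaxis, :], valid_mask[np.newaxis, :]])[0]
action = np.random.choice(len(probs), p=probs)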