Writing a basic XOR neural network program
I am trying to write a neural network from scratch that learns the XOR function. The full code is here (in Python 3).
I am currently getting the error:
ValueError: No gradients provided for any variable, check your graph for ops that do not support gradients
I am new to TensorFlow and don't understand why this is happening. Can anyone help me correct my code? Thanks in advance.
P.S. If more details are needed in the question, please let me know before downvoting. Thanks again!
Edit: the relevant part of the code:
import tensorflow as tf
from tensorflow.python.framework import ops

def initialize_parameters():
    # Create Weights and Biases for Hidden Layer and Output Layer
    W1 = tf.get_variable("W1", [2, 2], initializer=tf.contrib.layers.xavier_initializer())
    b1 = tf.get_variable("b1", [2, 1], initializer=tf.zeros_initializer())
    W2 = tf.get_variable("W2", [1, 2], initializer=tf.contrib.layers.xavier_initializer())
    b2 = tf.get_variable("b2", [1, 1], initializer=tf.zeros_initializer())
    parameters = {
        "W1": W1,
        "b1": b1,
        "W2": W2,
        "b2": b2
    }
    return parameters
def forward_propogation(X, parameters):
    threshold = tf.constant(0.5, name="threshold")
    W1, b1 = parameters["W1"], parameters["b1"]
    W2, b2 = parameters["W2"], parameters["b2"]
    Z1 = tf.add(tf.matmul(W1, X), b1)
    A1 = tf.nn.relu(Z1)
    tf.squeeze(A1)
    Z2 = tf.add(tf.matmul(W2, A1), b2)
    A2 = tf.round(tf.sigmoid(Z2))
    print(A2.shape)
    tf.squeeze(A2)
    A2 = tf.reshape(A2, [1, 1])
    print(A2.shape)
    return A2
def compute_cost(A, Y):
    logits = tf.transpose(A)
    labels = tf.transpose(Y)
    cost = tf.nn.sigmoid_cross_entropy_with_logits(logits=logits, labels=labels)
    return cost
def model(X_train, Y_train, X_test, Y_test, learning_rate=0.0001, num_epochs=1500):
    ops.reset_default_graph()
    (n_x, m) = X_train.shape
    n_y = Y_train.shape[0]
    costs = []
    # create_placeholders is defined in the full (linked) code
    X, Y = create_placeholders(n_x, n_y)
    parameters = initialize_parameters()
    A2 = forward_propogation(X, parameters)
    cost = compute_cost(A2, Y)
    optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
    init = tf.global_variables_initializer()
    with tf.Session() as session:
        session.run(init)
        for epoch in range(num_epochs):
            epoch_cost = 0
            _, epoch_cost = session.run([optimizer, cost], feed_dict={X: X_train, Y: Y_train})
        parameters = session.run(parameters)
        correct_prediction = tf.equal(tf.argmax(A2), tf.argmax(Y))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
        print("Training Accuracy is {0} %...".format(accuracy.eval({X: X_train, Y: Y_train})))
        print("Test Accuracy is {0} %...".format(accuracy.eval({X: X_test, Y: Y_test})))
    return parameters
The error happens because you use tf.round when defining A2 (a known issue, by the way).
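You can see this for yourself with a minimal TF 1.x sketch (mine, not from the original post): tf.round is registered with a None gradient, so tf.gradients returns None for any variable behind it, which is exactly what the optimizer then complains about.

import tensorflow as tf

x = tf.Variable(0.3)
y = tf.round(tf.sigmoid(x))   # round blocks gradient flow back to x
print(tf.gradients(y, [x]))   # [None] -> minimize() raises "No gradients provided ..."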
In this particular task, the solution is simply not to use tf.round at all. Remember that the output of tf.sigmoid is a value between 0 and 1, which can be interpreted as the probability of the result being 1. The cross-entropy loss function measures the distance to the target, 0 or 1, and computes the needed weight updates based on this distance. Calling tf.round before the cross-entropy squeezes the probability to either 0 or 1, which makes the cross-entropy pretty much meaningless.
By the way, tf.losses.softmax_cross_entropy should work better, since you've already applied the sigmoid yourself in the second layer.