TensorFlow XOR implementation fails to achieve 100% accuracy
I am new to machine learning and TensorFlow. I am trying to implement an XOR gate in TensorFlow, and this is the code I came up with.
import numpy as np
import tensorflow as tf

tf.reset_default_graph()

learning_rate = 0.01
n_epochs = 1000
n_inputs = 2
n_hidden1 = 2
n_outputs = 2

# XOR truth table: inputs and target labels
arr1, target = [[0, 0], [0, 1], [1, 0], [1, 1]], [0, 1, 1, 0]
X_data = np.array(arr1).astype(np.float32)
y_data = np.array(target).astype(np.int64)

X = tf.placeholder(tf.float32, shape=(None, n_inputs), name="X")
y = tf.placeholder(tf.int64, shape=(None), name="y")

with tf.name_scope("dnn_tf"):
    hidden1 = tf.layers.dense(X, n_hidden1, name="hidden1", activation=tf.nn.relu)
    logits = tf.layers.dense(hidden1, n_outputs, name="outputs")

with tf.name_scope("loss"):
    xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits)
    loss = tf.reduce_mean(xentropy, name="loss")

with tf.name_scope("train"):
    optimizer = tf.train.MomentumOptimizer(learning_rate, momentum=0.9)
    training_op = optimizer.minimize(loss)

with tf.name_scope("eval"):
    correct = tf.nn.in_top_k(logits, y, 1)
    accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))

init = tf.global_variables_initializer()

with tf.Session() as sess:
    init.run()
    for epoch in range(n_epochs):
        sess.run(training_op, feed_dict={X: X_data, y: y_data})
        acc_train = accuracy.eval(feed_dict={X: X_data, y: y_data})
        if epoch % 100 == 0:
            print("Epoch: ", epoch, " Train Accuracy: ", acc_train)
The code runs fine, but I get a different result on each run.

Run 1:
Epoch: 0 Train Accuracy: 0.75
Epoch: 100 Train Accuracy: 1.0
Epoch: 200 Train Accuracy: 1.0
Epoch: 300 Train Accuracy: 1.0
Epoch: 400 Train Accuracy: 1.0
Epoch: 500 Train Accuracy: 1.0
Epoch: 600 Train Accuracy: 1.0
Epoch: 700 Train Accuracy: 1.0
Epoch: 800 Train Accuracy: 1.0
Epoch: 900 Train Accuracy: 1.0
Run 2:
Epoch: 0 Train Accuracy: 1.0
Epoch: 100 Train Accuracy: 0.75
Epoch: 200 Train Accuracy: 0.75
Epoch: 300 Train Accuracy: 0.75
Epoch: 400 Train Accuracy: 0.75
Epoch: 500 Train Accuracy: 0.75
Epoch: 600 Train Accuracy: 0.75
Epoch: 700 Train Accuracy: 0.75
Epoch: 800 Train Accuracy: 0.75
Epoch: 900 Train Accuracy: 0.75
Run 3:
Epoch: 0 Train Accuracy: 1.0
Epoch: 100 Train Accuracy: 0.5
Epoch: 200 Train Accuracy: 0.5
Epoch: 300 Train Accuracy: 0.5
Epoch: 400 Train Accuracy: 0.5
Epoch: 500 Train Accuracy: 0.5
Epoch: 600 Train Accuracy: 0.5
Epoch: 700 Train Accuracy: 0.5
Epoch: 800 Train Accuracy: 0.5
Epoch: 900 Train Accuracy: 0.5
I can't figure out what I am doing wrong here, or why my solution does not converge.
In theory, XOR can be solved with a single hidden layer of two ReLU units, exactly as in your code. However, there is always a crucial difference between a network being able to represent a solution and being able to learn it. My assumption is that, because the network is so small, you are running into the "dead ReLU" problem: due to an unlucky random initialization, one (or both) of your hidden units never activates for any input. Unfortunately, when a ReLU's activation is zero its gradient is also zero, so a unit that never activates cannot learn anything either.
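One way to check whether this is what is happening (a minimal sketch, assuming it is run inside the same with tf.Session() block as your training loop, after some training steps) is to look at the hidden-layer activations for all four inputs:

        # Evaluate the hidden activations on the whole XOR dataset; shape is (4, n_hidden1)
        hidden_out = hidden1.eval(feed_dict={X: X_data})
        print(hidden_out)
        # A column of all zeros means that unit never fires for any input,
        # i.e. it is "dead" and its incoming weights no longer receive useful gradients
        print("dead units:", np.all(hidden_out == 0, axis=0))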
Increasing the number of hidden units makes this scenario less likely (i.e., you could have three dead units and the remaining two would still be enough to solve the problem), which could explain why you have more success with five hidden units; see the sketch below.
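For instance (just a sketch of that suggestion; everything else in your code stays the same), you could widen the hidden layer:

n_hidden1 = 5  # instead of 2; with more units it is unlikely that all of them die at once

with tf.name_scope("dnn_tf"):
    hidden1 = tf.layers.dense(X, n_hidden1, name="hidden1", activation=tf.nn.relu)
    logits = tf.layers.dense(hidden1, n_outputs, name="outputs")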
You might also want to take a look at the interactive TensorFlow Playground. It has an XOR dataset available, and you can experiment with the number and size of hidden layers, the activation function, and so on, while visualizing the decision boundary the classifier learns over the epochs.