TensorFlow's perceptron gives unexplainable output

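I'm new to TF. I took the perceptron code from a tutorial on MNIST (you don't actually need to follow this link): https://github.com/aymericdamien/TensorFlow-Examples/blob/master/examples/3_NeuralNetworks/multilayer_perceptron.py

I want to rework this perceptron into one with a single hidden layer and a linear activation function, so that it takes its simplest form: output = w2(w1*x + b1) + b2. But this is what I get: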

Data:

X_train: array([[ 10.], [ 10.], [ 11.], [6.], [8.], [9.], [ 22.], [ 14.], [6.], [8.], [ 11.], [9.], [ 13.], [7.], [ 13.], [7.], [ 13.], [11.]])

y_train: array([[ 44.5825], [ 53.99 ], [52.4475], [ 37.6 ], [ 38.6125], [ 39.5875], [ 43.07 ], [74.8575], [ 34.185 ], [ 38.61 ], [ 34.8175], [ 36.61 ], [34.0675], [ 37.67 ], [ 49.725 ], [79.4775], [ 50.41 ], [ 51.26 ]])

X_test: array([[ 6.], [ 14.], [ 14.], [ 12.], [ 13.], [13.]])

y_test: array([[ 55.75 ], [ 33.035 ], [ 38.3275], [ 39.2825], [50.7325], [45.2575]])

Parameters:

learning_rate = 1
training_epochs = 1
display_step = 1 #maintaining variable
x = tf.placeholder("float", [None, 1])
y = tf.placeholder("float", [None, 1])

Perceptron model:

def multilayer_perceptron(x, weights, biases, output_0):
    layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
    out_layer = tf.add(tf.matmul(layer_1, weights['out']), biases['out'])
    output_o = out_layer #This variable is just needed to print result in session 
    return out_layer

output_0 = tf.Variable(tf.random_normal([1, n_classes]))
weights = {
    'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1])),
    'out': tf.Variable(tf.random_normal([n_hidden_2, n_classes]))}
biases = {
    'b1': tf.Variable(tf.random_normal([n_hidden_1])),
    'out': tf.Variable(tf.random_normal([n_classes]))}

Let's build the graph:

prediction = multilayer_perceptron(x, weights, biases, output)

cost = tf.reduce_mean(tf.square(prediction-y)) #MSE
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost) #Gives the smallest cost

init = tf.initialize_all_variables()

Finally, let's run the session:

with tf.Session() as Sess:
    Sess.run(init)
    for epoch in range(training_epochs):
        avg_cost = 0.
        number_of_bathces = len(X_train)/batch_size    
        _, c = Sess.run([optimizer, cost], feed_dict = {x: X_train, y: y_train})
        avg_cost += c/len(X_train)
        print(Sess.run(output_0))
        if epoch % display_step ==0:
            print("Epoch:", '%02d' % (epoch+1), "cost =", "{:.9f}".format(avg_cost))
    print("Optimization finished")
    correct_prediction = tf.equal(tf.arg_max(prediction,1), tf.arg_max(y,1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    print("Accuracy:", accuracy.eval({x:X_test, y:y_test}))

Now we get the output:

[[ 0.77995574]]
Epoch: 01 cost = 262.544189453
Optimization finished
Accuracy: 1.0

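The most confusing part is the output (the first number)! It should be in [30; 50]! Please explain what I did wrong.

Your code was quite messy, so I removed a lot of the redundant parts: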

from __future__ import print_function
import numpy as np
import tensorflow as tf

X_train =  np.array([[ 10.], [ 10.], [ 11.], [ 6.], [ 8.], [ 9.], [ 22.], [ 14.], [ 6.], [ 8.], [ 11.], [ 9.], [ 13.], [ 7.], [ 13.], [ 7.], [ 13.], [ 11.]])

y_train =  np.array([[ 44.5825], [ 53.99 ], [ 52.4475], [ 37.6 ], [ 38.6125], [ 39.5875], [ 43.07 ], [ 74.8575], [ 34.185 ], [ 38.61 ], [ 34.8175], [ 36.61 ], [ 34.0675], [ 37.67 ], [ 49.725 ], [ 79.4775], [ 50.41 ], [ 51.26 ]])

X_test = np.array([[ 6.], [ 14.], [ 14.], [ 12.], [ 13.], [ 13.]])

y_test =  np.array([[ 55.75 ], [ 33.035 ], [ 38.3275], [ 39.2825], [ 50.7325], [ 45.2575]])

learning_rate = 0.05
training_epochs = 10

n_classes = 1
n_hidden_1 = 5
n_hidden_2 = 5
n_input = 1
x = tf.placeholder(tf.float32, [None, 1])
y = tf.placeholder(tf.float32, [None, 1])

def multilayer_perceptron(x, weights, biases):
    layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
    out_layer = tf.add(tf.matmul(layer_1, weights['out']), biases['out'])
    return out_layer

weights = {
    'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1])),
    'out': tf.Variable(tf.random_normal([n_hidden_1, n_classes]))}  # input width must match layer_1
biases = {
    'b1': tf.Variable(tf.random_normal([n_hidden_1])),
    'out': tf.Variable(tf.random_normal([n_classes]))}

prediction = multilayer_perceptron(x, weights, biases)

cost = tf.reduce_mean(tf.square(prediction - y)) #MSE
optimizer = tf.train.AdamOptimizer(learning_rate = learning_rate).minimize(cost) #Gives the smallest cost

init = tf.initialize_all_variables()

with tf.Session() as sess:
    sess.run(init)
    for epoch in range(training_epochs):
        _, c = sess.run([optimizer, cost], feed_dict = {x: X_train, y: y_train})
        print("Epoch:", '%02d' % (epoch+1), "cost =", "{:.9f}".format(c))
    print("Optimization finished")

    print(sess.run(prediction, feed_dict = {x: X_test, y: y_test} ))

Now it seems to work. I got the following results:

Epoch: 01 cost = 1323.519653320
Epoch: 02 cost = 926.386840820
Epoch: 03 cost = 628.072326660
Epoch: 04 cost = 431.689270020
Epoch: 05 cost = 343.259063721
Epoch: 06 cost = 355.978668213
Epoch: 07 cost = 430.280548096
Epoch: 08 cost = 501.149414062
Epoch: 09 cost = 527.575683594
Epoch: 10 cost = 507.708007812
Optimization finished
[[ 30.79703712]
 [ 69.70319366]
 [ 69.70319366]
 [ 59.97665405]
 [ 64.83992004]
 [ 64.83992004]]

Because the weights are initialized randomly, your results may differ.
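About two of the lines dropped from the question's code: `output_0` there is a separate, randomly initialized variable that the model never writes to (the function assigns to a local name `output_o`), so printing it shows a random number rather than a prediction. The accuracy lines are also misleading for regression, because argmax along axis 1 of a single-column array is always 0. A minimal NumPy sketch of that second point, using the test targets and rounded predictions from above (NumPy stands in for the TensorFlow ops here, which is an assumption for illustration):

```python
import numpy as np

# Rounded predictions from the run above and the test targets from the question.
prediction = np.array([[30.797], [69.703], [69.703], [59.977], [64.840], [64.840]])
y_test = np.array([[55.75], [33.035], [38.3275], [39.2825], [50.7325], [45.2575]])

# With only one column, argmax along axis 1 can only ever return index 0.
pred_idx = np.argmax(prediction, axis=1)
target_idx = np.argmax(y_test, axis=1)

# Both index arrays are all zeros, so this "accuracy" is 1.0 for any model.
accuracy = np.mean(pred_idx == target_idx)

# Mean squared error is the quantity that actually measures the fit.
mse = np.mean((prediction - y_test) ** 2)

print("accuracy:", accuracy)
print("MSE:", mse)
```

So for a regression task like this one, the cost (MSE) on the test set is the number worth looking at, not the argmax-based accuracy.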

A few tips:

  1. Use a smaller learning rate
  2. Train for several epochs to see the dynamics
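To see why both tips matter, here is a rough sketch that fits a single linear unit y = w*x + b to the training data with plain gradient descent in NumPy. This is a deliberate simplification: it is not Adam and not the network above, and the function name `train` and the two learning rates are chosen only for illustration. Adam rescales its steps, so the numbers will not match a TensorFlow run, but the sensitivity to the step size is the same kind of effect.

```python
import numpy as np

# Training data from the question, flattened to 1-D.
X = np.array([10., 10., 11., 6., 8., 9., 22., 14., 6.,
              8., 11., 9., 13., 7., 13., 7., 13., 11.])
Y = np.array([44.5825, 53.99, 52.4475, 37.6, 38.6125, 39.5875,
              43.07, 74.8575, 34.185, 38.61, 34.8175, 36.61,
              34.0675, 37.67, 49.725, 79.4775, 50.41, 51.26])


def train(learning_rate, epochs=10):
    """Plain gradient descent on y = w*x + b; returns the MSE seen at each epoch."""
    w, b = 0.0, 0.0
    costs = []
    for _ in range(epochs):
        error = w * X + b - Y
        costs.append(np.mean(error ** 2))  # cost before this epoch's update
        # Gradients of the mean squared error with respect to w and b.
        w -= learning_rate * 2.0 * np.mean(error * X)
        b -= learning_rate * 2.0 * np.mean(error)
    return costs


for lr in (0.05, 0.001):
    costs = train(lr)
    print("lr = %g: first cost = %.3g, last cost = %.3g" % (lr, costs[0], costs[-1]))
```

With the larger step the cost grows from epoch to epoch, while the smaller step makes it shrink steadily. Printing the cost over several epochs, as the answer's loop does, is what makes this kind of behavior visible.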