Neural network performs badly at simple linear interpolation task
FYI: I have uploaded everything you need to test this yourself (data + a simplified script).
Here is my problem:
I am trying to train a very simple model that uses the four input values
x(0), x(1), x(2), x(3)
to predict the value x(4), i.e. y = x(4).
However, I modified the data so that y = x(4) is a perfect linear extrapolation:
y = x(3) + (x(3) - x(2)) = 2*x(3) - x(2)
The model I am using is a single dense layer taking the four inputs. The weights [0, 0, -1, 2] would be a perfect solution (a loss of 0).
However, I cannot get the model to reach these values.
Can you help, or tell me why?
The files are here: https://ufile.io/5d2t4
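A quick numerical check (my own addition, not part of the original post) that the target really is an exact linear function of the inputs, so the kernel [0, 0, -1, 2] with zero bias has zero loss:

import numpy as np

# y = x(3) + (x(3) - x(2)) = 2*x(3) - x(2) is linear in the inputs,
# so the dot product with [0, 0, -1, 2] reproduces it exactly.
x = np.random.random((1000, 4))
y = x[:, 3] + (x[:, 3] - x[:, 2])
w = np.array([0.0, 0.0, -1.0, 2.0])
print(np.allclose(x @ w, y))  # True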
Main script (with artificial data):
import numpy as np
from keras.models import Sequential
from keras.layers import Flatten, Dense
from keras.optimizers import Adam
import keras.backend as K

def root_mean_squared_error(y_true, y_pred):
    return K.sqrt(K.mean(K.square(y_pred - y_true)))

# 240,000 random samples of four values each
X_train = np.random.random(240000 * 4)
X_train = np.reshape(X_train, (240000, 1, 4))

# predict the gradient of the last step: y = x(3) - x(2)
y_train = X_train[:, 0, 3] - X_train[:, 0, 2]

inputShape = (X_train.shape[1], X_train.shape[2])

# create model
model = Sequential()
model.add(Flatten(input_shape=inputShape))
model.add(Dense(1))
model.compile(loss=root_mean_squared_error, optimizer=Adam(decay=0.1))

# train model
batchSize = 8
model.fit(X_train, y_train, epochs=10, batch_size=batchSize, shuffle=True)

y_train_predicted = model.predict(X_train)
y_train_predicted = np.asarray(y_train_predicted).ravel()
y_train_predicted_rmse = np.sqrt(np.mean(np.square(y_train_predicted - y_train)))
print("y_train RMSE = " + str(y_train_predicted_rmse))
当我的 "obvious" 模型不收敛时,我首先问自己的是 hyper-params 是否合适。
我调整了你的代码来修复学习率。我删除了衰减并添加了 0.01 的学习率,而不是默认的 0.001(参见 https://keras.io/optimizers/)。一个 epoch 后的权重为
[[ 9.3402149e-04],
 [ 5.8139337e-04],
 [-9.9929601e-01],
 [ 1.0009530e+00]]
which is approximately the solution we set up in the data:
[0, 0, -1, 1]
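Why the decay was the problem: as far as I recall, Keras' legacy optimizers scale the step size per update as lr_t = lr / (1 + decay * iterations). With 240,000 samples and batch_size=8 that is 30,000 updates per epoch, so with decay=0.1 the effective learning rate collapses almost immediately:

# effective learning rate under the (assumed) Keras legacy decay schedule
lr, decay = 0.001, 0.1
for it in (1, 10, 100, 30000):
    print(it, lr / (1.0 + decay * it))
# 1      ~9.1e-04
# 10     ~5.0e-04
# 100    ~9.1e-05
# 30000  ~3.3e-07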
It also works fine if you just keep the default learning rate (0.001) without decay.
Find the working code below.
import numpy as np
from keras.models import Sequential
from keras.layers import Flatten, Dense
from keras.optimizers import Adam
import keras.backend as K

def root_mean_squared_error(y_true, y_pred):
    return K.sqrt(K.mean(K.square(y_pred - y_true)))

X_train = np.random.random(240000 * 4)
X_train = np.reshape(X_train, (240000, 1, 4))

# predict the gradient of the last step: y = x(3) - x(2)
y_train = X_train[:, 0, 3] - X_train[:, 0, 2]

inputShape = (X_train.shape[1], X_train.shape[2])

# create model
model = Sequential()
model.add(Flatten(input_shape=inputShape))
model.add(Dense(1))
# fixed learning rate, no decay
model.compile(loss=root_mean_squared_error, optimizer=Adam(lr=0.01))

# train model
batchSize = 8
model.fit(X_train, y_train, epochs=1, batch_size=batchSize, shuffle=True)

y_train_predicted = model.predict(X_train)
y_train_predicted = np.asarray(y_train_predicted).ravel()
y_train_predicted_rmse = np.sqrt(np.mean(np.square(y_train_predicted - y_train)))
print("y_train RMSE = " + str(y_train_predicted_rmse))

# inspect the fitted parameters of the Dense layer
print(model.layers[1].get_weights())
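For reference, get_weights() on the Dense layer returns a [kernel, bias] pair; after one epoch the kernel should be close to [[0], [0], [-1], [1]] with a near-zero bias, matching the printout above.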