Very different results from same Keras model, built with Sequential or functional style

I'm trying to implement a Keras regression model that learns to set some parameters: it takes a few parameters as input and produces a set of uncorrelated outputs that are consistent with the inputs (i.e. similar inputs give similar outputs in the training set, and there is a partly linear relationship between some inputs and some outputs). Inputs and outputs are normalized, since the parameters have different units.
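
The normalization is roughly like this (a minimal sketch; StandardScaler and the random stand-in data are assumptions, the real preprocessing may differ):

import numpy as np
from sklearn.preprocessing import StandardScaler

# Stand-in data with the shapes used below: X (2011, 3), y (2011, 5).
X = np.random.rand(2011, 3)
y = np.random.rand(2011, 5)

# Scale inputs and outputs independently, since the parameters have
# different units; keep y_scaler around to invert predictions later.
x_scaler = StandardScaler()
y_scaler = StandardScaler()
X = x_scaler.fit_transform(X)
y = y_scaler.fit_transform(y)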

During training the mse is about 0.48, and the predictions are quite reasonable.

Here is the model:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
model.add(Dense(78, activation='relu', input_shape=(3,)))  # input_shape must be a tuple
model.add(Dense(54, activation='relu'))
model.add(Dense(54, activation='relu'))
model.add(Dense(5))

The summary:

X:  (2011, 3) y:  (2011, 5)
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 78)                312       
_________________________________________________________________
dense_1 (Dense)              (None, 54)                4266      
_________________________________________________________________
dense_2 (Dense)              (None, 54)                2970      
_________________________________________________________________
dense_3 (Dense)              (None, 5)                 275       
=================================================================
Total params: 7,823
Trainable params: 7,823
Non-trainable params: 0

Then I build exactly the same model in functional style:

from tensorflow import keras

inputs = keras.layers.Input(shape=(3,))  # i.e. (X.shape[1],)
out = keras.layers.Dense(78, activation='relu')(inputs)
out = keras.layers.Dense(54, activation='relu')(out)
out = keras.layers.Dense(54, activation='relu')(out)
out = keras.layers.Dense(5, activation='relu')(out)
func_model = keras.Model(inputs=inputs, outputs=out, name="func_model")


X:  (2011, 3) y:  (2011, 5)
Model: "func_model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         [(None, 3)]               0         
_________________________________________________________________
dense (Dense)                (None, 78)                312       
_________________________________________________________________
dense_1 (Dense)              (None, 54)                4266      
_________________________________________________________________
dense_2 (Dense)              (None, 54)                2970      
_________________________________________________________________
dense_3 (Dense)              (None, 5)                 275       
=================================================================
Total params: 7,823
Trainable params: 7,823
Non-trainable params: 0

The summaries are exactly the same, except that the functional model adds an explicit input layer. But the documentation says:

When a popular kwarg input_shape is passed, then keras will create an input layer 
to insert before the current layer. This can be treated equivalent to explicitly
defining an InputLayer.

https://keras.io/api/layers/core_layers/dense/

which is exactly what I did in the first model, so the two models should be identical. But they are not: the mse during training is clearly higher, around 0.7, and unlike with the other model the predictions are "flattened": the set of outputs barely responds to the input parameters.

Any ideas?

The difference is your output layer's activation. In the functional model you use relu:

out = keras.layers.Dense(5, activation='relu')(out)

In the Sequential model you use linear (the default activation):

model.add(Dense(5))

Which output activation is right depends on the data you are modeling, but this difference is what is giving you the confusing results.
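
A quick way to catch this kind of mismatch is to print each layer's configured activation for both models and compare them side by side (a sketch; model and func_model are assumed to be your Sequential and functional models):

# Print every layer's configured activation; InputLayer has no
# 'activation' key, so .get() returns None for it.
for m in (model, func_model):
    print(m.name)
    for layer in m.layers:
        print(' ', layer.name, layer.get_config().get('activation'))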

Edit: looking at your question again, it seems your functional model should change its last line to

out = keras.layers.Dense(5, activation='linear')(out)

or simply

out = keras.layers.Dense(5)(out)
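
Putting it all together, a minimal sketch of the corrected functional model; the compile settings are placeholders, since the question doesn't show that step:

from tensorflow import keras

inputs = keras.layers.Input(shape=(3,))
out = keras.layers.Dense(78, activation='relu')(inputs)
out = keras.layers.Dense(54, activation='relu')(out)
out = keras.layers.Dense(54, activation='relu')(out)
out = keras.layers.Dense(5)(out)  # linear output, matching the Sequential model
func_model = keras.Model(inputs=inputs, outputs=out, name="func_model")
func_model.compile(optimizer='adam', loss='mse')
func_model.summary()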