How to reshape X_train and y_train for LSTM keras
I have the following:
X_train.shape
(2730, 10)
y_train.shape
(2730)
I want to train an LSTM model with keras, but I'm not sure how to reshape the input.
I have added this LSTM layer:
time_steps = 30
input_dim = 10 # number of features
...
self.model.add(LSTM(self.hidden_dim, input_shape=(time_steps, self.input_dim), return_sequences=True))
...
The input_shape does not match my input. How should I reshape my X_train? Do I also have to reshape the y_train?
How should I reshape my X_train?
The simplest option is to add a timesteps dimension to your data to make it compatible with an LSTM:
import tensorflow as tf
samples = 5
features = 10
data = tf.random.normal((samples, features))
time_series_data = tf.expand_dims(data, axis=1) # add timesteps dimension
tf.print('Data -->', tf.shape(data), 'Time series data', tf.shape(time_series_data))
# Data --> [5 10] Time series data [5 1 10]
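However, you want 30 timesteps for each feature, which leads to the shape (samples, 30, 10). For that you can use the RepeatVector layer as part of your model, or tf.repeat. Note that both simply copy the same feature vector 30 times. Here is an example with the RepeatVector layer: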
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(10, input_shape=(features,)))
model.add(tf.keras.layers.RepeatVector(30))
model.add(tf.keras.layers.LSTM(32))
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
model.build((1, 10))
tf.print(model.summary())
Model: "sequential_01"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_24 (Dense) (None, 10) 110
repeat_vector_1 (RepeatVect (None, 30, 10) 0
or)
lstm_3 (LSTM) (None, 32) 5504
dense_25 (Dense) (None, 1) 33
=================================================================
Total params: 5,647
Trainable params: 5,647
Non-trainable params: 0
_________________________________________________________________
None
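The tf.repeat route mentioned above can be sketched like this, as a rough illustration only. The variable names `expanded` and `repeated` are made up for the example; the data is repeated before it reaches the model instead of inside it:

```python
import tensorflow as tf

samples = 5
features = 10
timesteps = 30

data = tf.random.normal((samples, features))

# Add a timesteps axis of length 1, then copy it 30 times along that axis.
expanded = tf.expand_dims(data, axis=1)                     # (5, 1, 10)
repeated = tf.repeat(expanded, repeats=timesteps, axis=1)   # (5, 30, 10)

print(repeated.shape)
# (5, 30, 10)
```

Every timestep of a given sample holds the same 10 values here, so the LSTM sees no real temporal variation; this only makes the shapes line up.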
You could also first map the 10 features to a 300-dimensional output and then reshape that output to fit into an LSTM:
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(300, input_shape=(features,)))
model.add(tf.keras.layers.Reshape((30, 10)))
model.add(tf.keras.layers.LSTM(32))
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
Model: "sequential_02"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_26 (Dense) (None, 300) 3300
reshape (Reshape) (None, 30, 10) 0
lstm_4 (LSTM) (None, 32) 5504
dense_27 (Dense) (None, 1) 33
=================================================================
Total params: 8,837
Trainable params: 8,837
Non-trainable params: 0
_________________________________________________________________
None
Regarding the question:
Do I also have to reshape the y_train?
It depends on what you want. If you just have a simple classification task, as I assumed in the examples, then you do not need to change y_train.
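One case where the shape of y_train does matter is the `return_sequences=True` setting used in the question's LSTM layer. A minimal sketch, assuming a dummy batch of 4 sequences and two illustrative models (`per_sample` and `per_timestep` are names invented here), shows how the prediction shape changes:

```python
import tensorflow as tf

timesteps, features = 30, 10
batch = tf.zeros((4, timesteps, features))  # dummy batch of 4 sequences

# The LSTM returns only its last output: one prediction per sample.
per_sample = tf.keras.Sequential([
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])

# The LSTM returns its output at every timestep: one prediction per timestep.
per_timestep = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, return_sequences=True),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])

print(per_sample(batch).shape)    # (4, 1)
print(per_timestep(batch).shape)  # (4, 30, 1)
```

In the first case one label per sample is enough. In the second case the model emits a value for each of the 30 timesteps, so the targets would need one label per timestep as well.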
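Update 1:
You can also reshape your data as shown below. This produces a tensor with 91 samples, where each sample has 30 timesteps and each timestep is associated with 10 features. This only makes sense if the 2730 rows are consecutive observations in time.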
import tensorflow as tf
timesteps = 2730
features = 10
data = tf.random.normal((timesteps, features))
data = tf.reshape(data, (91, 30, features))
print(data.shape)
# (91, 30, 10)
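If the data is split into 91 windows like this, the 2730 labels no longer line up with the 91 samples. The following is only a sketch of one way to handle that, using NumPy and placeholder arrays in place of the real X_train and y_train. It assumes that the label of the last row in each window is the label for the whole window, which may or may not suit your task:

```python
import numpy as np

timesteps = 30
features = 10

# Placeholder arrays standing in for the real X_train and y_train.
X_train = np.random.rand(2730, features)
y_train = np.random.randint(0, 2, size=2730)

n_windows = X_train.shape[0] // timesteps  # 91
usable = n_windows * timesteps             # drops leftover rows, if any

# Group consecutive rows into non-overlapping windows of 30 timesteps.
X_windows = X_train[:usable].reshape(n_windows, timesteps, features)

# Assumption: the last label inside each window labels the whole window.
y_windows = y_train[:usable].reshape(n_windows, timesteps)[:, -1]

print(X_windows.shape, y_windows.shape)
# (91, 30, 10) (91,)
```

Other choices, such as a majority vote over the 30 labels or keeping all 30 labels together with return_sequences=True, are just as possible; which one is right depends on what the labels mean.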