Can't get Keras TimeseriesGenerator to train LSTM but can train DNN

I'm working on a larger project, but was able to reproduce the problem in a small Colab notebook that I hope someone can take a look at. I can successfully train a dense network, but I can't train an LSTM using TimeseriesGenerator. See the Google Colab below.

I know I'm using a lookback length of 1 (which doesn't make complete sense for an LSTM in this example), but once I figure out how to make this work I plan to expand it to more n_features and a larger lookback.

In this example I create a simple DataFrame with one input variable named input, which predicts pos_points and neg_points (in the Colab notebook I outline how these are computed; it's simple).

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import *
from tensorflow.keras.models import *
tf.__version__

import pandas as pd
df = pd.DataFrame()
df['input'] = np.random.uniform(-10.0, 10.0, 50000)


# Simple rules:
#   If input is positive:
#     pos_points = input * input
#     neg_points = -0.5 * input
#
#   If input is negative:
#     pos_points = -0.5 * input
#     neg_points = -input * input
df['pos_points'] = df['input'].apply(lambda x: x*x if x > 0 else x * -0.5)
df['neg_points'] = df['input'].apply(lambda x: x*x*-1 if x < 0 else -x * 0.5)

target = pd.concat([df.pop(x) for x in ['pos_points', 'neg_points']], axis=1)

I can train this successfully with:

# Build a simple model to go from input to the two outputs
from tensorflow.keras import regularizers
def get_df_model():
  model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, input_shape=[1,], activation='relu', kernel_regularizer=regularizers.l1_l2(l1=1e-5, l2=1e-4)),
    tf.keras.layers.Dense(10, activation='relu', kernel_regularizer=regularizers.l1_l2(l1=1e-5, l2=1e-4)),
    tf.keras.layers.Dense(2)
  ])

  model.compile(optimizer='adam', loss=tf.keras.losses.MeanSquaredError())
  return model


model = get_df_model()
model.fit(df, target, epochs=10)

This produces:

Epoch 1/10
1563/1563 [==============================] - 2s 1ms/step - loss: 194.1551
Epoch 2/10
1563/1563 [==============================] - 2s 1ms/step - loss: 26.1025
Epoch 3/10
1563/1563 [==============================] - 2s 1ms/step - loss: 7.3179
Epoch 4/10
1563/1563 [==============================] - 2s 1ms/step - loss: 1.1513
Epoch 5/10
1563/1563 [==============================] - 2s 1ms/step - loss: 0.4611
Epoch 6/10
1563/1563 [==============================] - 2s 1ms/step - loss: 0.3274
Epoch 7/10
1563/1563 [==============================] - 2s 1ms/step - loss: 0.2420
Epoch 8/10
1563/1563 [==============================] - 2s 1ms/step - loss: 0.1833
Epoch 9/10
1563/1563 [==============================] - 2s 1ms/step - loss: 0.1411
Epoch 10/10
1563/1563 [==============================] - 2s 1ms/step - loss: 0.1110

However, when I try to use the TimeseriesGenerator, I can't get the LSTM to fit the inputs. Note that I'm using a lookback of 1, so it should behave essentially like the dense case:

# Create timeseries generator
from tensorflow.keras.preprocessing.sequence import TimeseriesGenerator
lookback = 1
n_features = 1 # If you look at df, there is just the input column, a single number
train_generator = TimeseriesGenerator(df.astype(np.float32).to_numpy(), target.astype(np.float32).to_numpy(), length=lookback, batch_size=16)

print(train_generator[0][0].shape) # input shape, should print out (16, 1, 1) = batchsize, lookback length, input_size
print(train_generator[0][1].shape) # target shape, should print out (16, 2) = batchsize, # of outputs in final dense layer

# Build a simple model to go from input to the two outputs
def get_lstm_model():
  model = tf.keras.Sequential([
    tf.keras.layers.LSTM(10, input_shape=[lookback, n_features,], activation='relu', kernel_regularizer=regularizers.l1_l2(l1=1e-5, l2=1e-4)),
    tf.keras.layers.Dense(10, activation='relu', kernel_regularizer=regularizers.l1_l2(l1=1e-5, l2=1e-4)),
    tf.keras.layers.Dense(2)
  ])

  model.compile(optimizer='adam', loss=tf.keras.losses.MeanSquaredError())
  return model


model = get_lstm_model()
model.fit(train_generator, epochs=10)

This produces:

(16, 1, 1)
(16, 2)
WARNING:tensorflow:Layer lstm_33 will not use cuDNN kernels since it doesn't meet the criteria. It will use a generic GPU kernel as fallback when running on GPU.
Epoch 1/10
3125/3125 [==============================] - 9s 3ms/step - loss: 722.0089
Epoch 2/10
3125/3125 [==============================] - 10s 3ms/step - loss: 686.9944
Epoch 3/10
3125/3125 [==============================] - 10s 3ms/step - loss: 687.0886
Epoch 4/10
3125/3125 [==============================] - 10s 3ms/step - loss: 687.0521
Epoch 5/10
3125/3125 [==============================] - 10s 3ms/step - loss: 687.0247
Epoch 6/10
3125/3125 [==============================] - 10s 3ms/step - loss: 686.9836
Epoch 7/10
3125/3125 [==============================] - 10s 3ms/step - loss: 686.9711
Epoch 8/10
3125/3125 [==============================] - 10s 3ms/step - loss: 686.9208
Epoch 9/10
3125/3125 [==============================] - 9s 3ms/step - loss: 686.9716
Epoch 10/10
3125/3125 [==============================] - 9s 3ms/step - loss: 686.9753
<keras.callbacks.History at 0x7f3d09185eb8>

OK, found the answer; it has to do with the Keras TimeseriesGenerator.

I organize my data in a table with input and output columns. The generator always maps the outputs one ROW ahead of the inputs (because it expects the traditional time-series format, where the target is the value that comes after the input window).
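Here is a minimal sketch of that offset (toy arrays, not the data above): with length=1, sample i pairs data[i] with targets[i+1], never targets[i]:

import numpy as np
from tensorflow.keras.preprocessing.sequence import TimeseriesGenerator

data = np.array([[0.0], [1.0], [2.0], [3.0]])
targets = np.array([[10.0], [11.0], [12.0], [13.0]])
gen = TimeseriesGenerator(data, targets, length=1, batch_size=1)
for i in range(len(gen)):
    x, y = gen[i]
    print(x.ravel(), '=>', y.ravel())
# [0.] => [11.]   <- data row 0 is paired with target row 1, not row 0
# [1.] => [12.]
# [2.] => [13.]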

To work around this, I prepend a row of NaNs to the target DataFrame and drop the last row (with length=1 the generator never emits row 0 as a target, so the NaN never reaches training). When I call the generator, I can see the results map correctly:

# Prepend one NaN row to shift the targets down by one, then drop the last row
# (pd.concat replaces DataFrame.append, which was removed in pandas 2.0)
nan_row = pd.DataFrame([[np.nan] * len(target.columns)], columns=target.columns)
adjusted_target = pd.concat([nan_row, target], ignore_index=True)[:-1]
train_generator = TimeseriesGenerator(df.astype(np.float32).to_numpy(), adjusted_target.astype(np.float32).to_numpy(), length=1, batch_size=1)

You can verify that the inputs/outputs now map correctly with the following code:

for i in range(len(train_generator)):
    x, y = train_generator[i]
    # with the shift, each x is printed alongside the targets computed from that same x
    print('%s => %s' % (x, y))
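With the alignment fixed, the same LSTM can be trained on a corrected generator (a sketch reusing get_lstm_model, df, adjusted_target, and lookback from above; the batch size of 16 matches the original attempt, and I haven't reproduced loss values here):

# Rebuild the generator with the aligned targets and the original batch size
train_generator = TimeseriesGenerator(df.astype(np.float32).to_numpy(),
                                      adjusted_target.astype(np.float32).to_numpy(),
                                      length=lookback, batch_size=16)

model = get_lstm_model()
model.fit(train_generator, epochs=10)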