If I want to predict the next element in a sequence of numbers, what do I need to pass as the second argument to Keras' fit method?

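I'm trying to write a simple example to understand how LSTMs work. I want to take a simple series of integers, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, and predict the next number. I have some code, but I don't know what the second argument of the fit method needs to be.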

import pandas as pd
from sklearn.preprocessing import MinMaxScaler
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM

df = pd.DataFrame(columns = ['Serie'])
for i in range(0, 10):
    df.loc[i, 'Serie'] = i
    
sc = MinMaxScaler(feature_range = (0, 1))
train_set = sc.fit_transform(df.iloc[:, [True]])

xTrain = []

for i in range(0, len(train_set) - 3):
    xTrain.append(train_set[i:i + 3, 0])

xTrain = np.array(xTrain)
xTrain = np.reshape(xTrain, (xTrain.shape[0], xTrain.shape[1], 1))

regresor = Sequential()
regresor.add(LSTM(units = 1, input_shape = (3, 1)))
regresor.compile(optimizer = 'rmsprop', loss = 'mse')
regresor.fit(xTrain, ???, batch_size = 1)

Can someone give me a very simple example?

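You need to set your problem up as a supervised one. Each sample consists of an independent variable x and a dependent variable y. Based on your question, each x sample has 3 time steps and 1 feature, and y is the number that follows that sample, which is what goes in as the second argument to fit. Start with the necessary imports: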

import pandas as pd
from sklearn.preprocessing import MinMaxScaler
import numpy as np
import tensorflow as tf

Let's define some constants:

points = 30 # number of data points to generate
timesteps = 3 # number of time steps per sample as LSTM layers need input shape (samples, time steps, features)
features = 1 # number of features per time step as LSTM layers need input shape (samples, time steps, features)

Generate the sequence from 0 to 30:

x = np.arange(points + 1) # array([ 0,  1, ..., 29, 30])

This is where we start setting the problem up as a supervised one, with x as the sequence of numbers and y as the sequence of next numbers:

y = x[1:] # [ 1,  2, ..., 29, 30 ]
x = x[:30] # [ 0,  1, ..., 28, 29 ]
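For the overlapping windows built in the question's code, the same pairing can be sketched as follows. This is a minimal illustration rather than the exact fix: it skips the scaling step for readability, and the names `series`, `window`, `x_windows` and `y_next` are made up for this example.

```python
import numpy as np

series = np.arange(10, dtype=float)  # 0..9, like the loop in the question
window = 3                           # time steps per sample (assumed, as in the question)

# Each sample is `window` consecutive values...
x_windows = np.array([series[i:i + window] for i in range(len(series) - window)])
# ...and its target is the value that comes right after the window.
y_next = series[window:]

# LSTM layers expect input of shape (samples, time steps, features).
x_windows = x_windows.reshape((-1, window, 1))

print(x_windows.shape, y_next.shape)  # (7, 3, 1) (7,)
```

With this layout, `y_next` plays the role of the `???` in the question: one target per window.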

Scale x and y together:

dataset = np.hstack((x.reshape((points, 1)),y.reshape((points, 1))))
scaler = MinMaxScaler((0, 1))
scaled = scaler.fit_transform(dataset)

Let's define the model's inputs and outputs:

x_train = scaled[:,0] # first column
x_train = x_train.reshape((points // timesteps, timesteps, features)) # as stated above, LSTM layers need input shape (samples, time steps, features)

y_train = scaled[:,1] # second column
y_train = y_train[2::3] # start at the third element in steps of 3, for a total of 10
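To see why the targets start at the third element and advance in steps of 3, here is a small check on unscaled values. It is only a sketch: the variable names `x_windows` and `y_targets` are illustrative, and scaling is left out so the numbers are easy to read.

```python
import numpy as np

points, timesteps, features = 30, 3, 1

x = np.arange(points)         # 0..29
y = np.arange(1, points + 1)  # 1..30, the "next number" for each x

# Non-overlapping windows of 3 consecutive values.
x_windows = x.reshape((points // timesteps, timesteps, features))
# Keep only the target that follows the last value of each window (same as y[2::3]).
y_targets = y[timesteps - 1::timesteps]

print(x_windows.shape, y_targets.shape)  # (10, 3, 1) (10,)
for window, target in zip(x_windows[:3], y_targets[:3]):
    print(window.ravel(), "->", target)
# [0 1 2] -> 3
# [3 4 5] -> 6
# [6 7 8] -> 9
```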

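Model definition and compilation. I decided to make the model architecture a bit more robust to get "better" performance (see the results below):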

regresor = tf.keras.models.Sequential()
regresor.add(tf.keras.layers.LSTM(units = 4, return_sequences = True))
regresor.add(tf.keras.layers.LSTM(units = 2))
regresor.add(tf.keras.layers.Dense(units = 1))
regresor.compile(optimizer = 'rmsprop', loss = 'mse')

Train the model:

regresor.fit(x_train, y_train, batch_size = 2, epochs = 500, verbose = 1)

Make some predictions:

y_hats = regresor.predict(x_train)

Results:

    real y      predicted y
    0.068966    0.086510
    0.172414    0.162209
    0.275862    0.252749
    0.379310    0.356117
    0.482759    0.467885
    0.586207    0.582081
    0.689655    0.692756
    0.793103    0.795362
    0.896552    0.887317
    1.000000    0.967796

As you can see, the predicted values are quite close to the real values.

[Figure: plot of the results]

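Note that, for simplicity, I made the predictions on the training dataset; you should evaluate on test data instead. To do that, you will have to generate more points and split them accordingly (70% train, 30% test). Also, you can get the values back in their original range by calling the scaler's inverse_transform method.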
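A rough sketch of both suggestions, under a few assumptions: the split is chronological (no shuffling), the true targets stand in for `regresor.predict(x_test)` so that the snippet runs without TensorFlow, and the names `x_all`, `y_all`, `padded` and `y_original` are made up for illustration. Because the scaler was fitted on two columns (x and y), inverse_transform expects two columns as well, so the predictions are padded with a dummy column first.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

points, timesteps, features = 30, 3, 1

x = np.arange(points, dtype=float)  # 0..29
y = x + 1                           # 1..30
scaler = MinMaxScaler((0, 1))
scaled = scaler.fit_transform(np.column_stack((x, y)))

x_all = scaled[:, 0].reshape((points // timesteps, timesteps, features))
y_all = scaled[:, 1][timesteps - 1::timesteps]

# Chronological 70/30 split: the test windows come after the training windows.
split = len(x_all) * 7 // 10
x_train, x_test = x_all[:split], x_all[split:]
y_train, y_test = y_all[:split], y_all[split:]

# Stand-in for regresor.predict(x_test), which returns shape (samples, 1).
y_hats = y_test.reshape((-1, 1))

# Pad with a dummy first column, invert the scaling, keep only the y column.
padded = np.column_stack((np.zeros(len(y_hats)), y_hats[:, 0]))
y_original = scaler.inverse_transform(padded)[:, 1]

print(x_train.shape, x_test.shape)  # (7, 3, 1) (3, 3, 1)
print(y_original)                   # approximately [24. 27. 30.]
```

With only 10 windows the test set has just 3 samples, which is why generating more points is worthwhile before drawing conclusions.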