How to feed multi-array data to a model?

Question

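As you can see below, the object n3w_coin has a method called forecast_coin() that returns a data frame with 5 columns once date_time is dropped. I split the data with train_test_split and then normalized it with sc. After converting the 2D array to a 3D array, I want to pass it to the model for training, but I am having trouble figuring out how to feed normalized_x_train to the model.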

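My goal is to feed every sub-array in normalized_x_train to the model.

I get the following error: IndexError: tuple index out of range

Please explain why this happens and what is wrong with my approach.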

df = pd.DataFrame(n3w_coin.forecast_coin())

x_sth = np.array(df.drop(['date_time'],1))
y_sth = np.array(df.drop(['date_time'],1))



sc = MinMaxScaler(feature_range=(0,1))


X_train, X_test, y_train, y_test = train_test_split(x_sth,y_sth, test_size=0.2, shuffle=False)

print (X_train)
normalized_x_train = sc.fit_transform(X_train)
normalized_y_train = sc.fit_transform(y_train)

print (normalized_x_train)

### converting to a 3D array to feed the model 

normalized_x_train = np.reshape(normalized_x_train, (400 , 5 ,1 ))

print (normalized_x_train.shape)

print (normalized_x_train)

model = Sequential()
model.add(LSTM(units = 100, return_sequences = True, input_shape=(normalized_x_train.shape[5],1)))
 
model.compile(optimizer='adam', loss='mean_squared_error', metrics=['accuracy'])
model.fit(normalized_x_train, normalized_y_train, epochs=100, batch_size=400 )

Answer 1

Some observations about your code:

Besides normalizing the data, you also need to prepare the time series data. Below is a function that pre-processes the data so that it can be fed to an LSTM model.

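import numpy as np

def multivariate_data(dataset, target, start_index, end_index, history_size,
                      target_size, step, single_step=False):
  data = []
  labels = []

  # The first usable position is the one with a full history window before it.
  start_index = start_index + history_size
  if end_index is None:
    # Leave room at the end for the target value(s).
    end_index = len(dataset) - target_size

  for i in range(start_index, end_index):
    # Rows i-history_size .. i-1, sampled every `step` rows, form one input window.
    indices = range(i-history_size, i, step)
    data.append(dataset[indices])

    if single_step:
      # Single label: the target value at position i + target_size.
      labels.append(target[i+target_size])
    else:
      # Sequence of labels: the target values at positions i .. i+target_size-1.
      labels.append(target[i:i+target_size])

  return np.array(data), np.array(labels)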

The parameters history_size and target_size are important. history_size is the number of past values of the time series used to predict the target value. target_size specifies how far into the future the value to be predicted lies.
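To make the effect of history_size and target_size concrete, here is a minimal sketch that runs the function on synthetic data. The array contents, the choice of column 0 as the target, and the values history_size=10 and target_size=1 are assumptions for illustration only; the function body is repeated so the snippet runs on its own.

```python
import numpy as np

def multivariate_data(dataset, target, start_index, end_index, history_size,
                      target_size, step, single_step=False):
    data, labels = [], []
    start_index = start_index + history_size
    if end_index is None:
        end_index = len(dataset) - target_size
    for i in range(start_index, end_index):
        indices = range(i - history_size, i, step)
        data.append(dataset[indices])
        if single_step:
            labels.append(target[i + target_size])
        else:
            labels.append(target[i:i + target_size])
    return np.array(data), np.array(labels)

# Synthetic stand-in for the normalized data: 400 rows, 5 features (assumption).
dataset = np.arange(400 * 5, dtype="float64").reshape(400, 5)
# Assumption: the first column is the value to predict.
target = dataset[:, 0]

x, y = multivariate_data(dataset, target, start_index=0, end_index=None,
                         history_size=10, target_size=1, step=1,
                         single_step=True)

print(x.shape)  # (samples, history_size, features)
print(y.shape)  # (samples,)
```

The result is already the 3D array an LSTM expects (samples, time steps, features), so no manual np.reshape is needed.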

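Your network has only one LSTM layer, yet you set return_sequences = True. That parameter should be True only when another LSTM layer follows this one.

Since you want to predict a numeric value, the network should end with a Dense layer with 1 neuron/unit/node and activation = 'linear'.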
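The two points above could be sketched roughly as follows. This is not a tuned model: the random training arrays, the 32 LSTM units, and the batch size are placeholders chosen for illustration. The metrics=['accuracy'] argument is left out because accuracy is not informative for a regression loss.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for windowed training data:
# 32 samples, 10 time steps, 5 features.
x_train = np.random.rand(32, 10, 5).astype("float32")
y_train = np.random.rand(32).astype("float32")

model = tf.keras.Sequential([
    # Input shape is (time steps, features), taken from the data itself.
    tf.keras.Input(shape=x_train.shape[1:]),
    # Single LSTM layer, so return_sequences keeps its default of False.
    tf.keras.layers.LSTM(32),
    # One linear unit to output a single numeric prediction per sample.
    tf.keras.layers.Dense(1, activation="linear"),
])
model.compile(optimizer="adam", loss="mean_squared_error")

# One short epoch, only to show that the shapes fit together.
model.fit(x_train, y_train, epochs=1, batch_size=16, verbose=0)
predictions = model.predict(x_train, verbose=0)
print(predictions.shape)
```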

Please refer to this TensorFlow tutorial on time series analysis, which contains complete code for your problem.
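As for the IndexError itself: after the reshape, normalized_x_train.shape is a tuple with three entries, so the only valid indices are 0, 1 and 2, and shape[5] is out of range. The sketch below reproduces this with a zero-filled array of the same shape as in the question; the zeros are a placeholder for the real data.

```python
import numpy as np

# Placeholder array with the same shape as in the question.
normalized_x_train = np.zeros((400, 5)).reshape(400, 5, 1)
shape = normalized_x_train.shape
print(shape)  # three entries: samples, time steps, features

try:
    shape[5]  # same indexing as in the question's input_shape argument
    error_message = None
except IndexError as exc:
    error_message = str(exc)
print(error_message)

# What input_shape needs instead: (time steps, features).
input_shape = (shape[1], shape[2])
print(input_shape)
```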

python

tensorflow2.0