Understanding Keras prediction output of an RNN model in R

Question

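I am trying out the Keras package in R by following a tutorial about forecasting temperature. However, the tutorial does not explain how to predict with the trained RNN model, and I would like to know how to do this. To train the model I used the following code, copied from the tutorial: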

dir.create("~/Downloads/jena_climate", recursive = TRUE)
download.file(
  "https://s3.amazonaws.com/keras-datasets/jena_climate_2009_2016.csv.zip",
  "~/Downloads/jena_climate/jena_climate_2009_2016.csv.zip"
)
unzip(
  "~/Downloads/jena_climate/jena_climate_2009_2016.csv.zip",
  exdir = "~/Downloads/jena_climate"
)

library(readr)
data_dir <- "~/Downloads/jena_climate"
fname <- file.path(data_dir, "jena_climate_2009_2016.csv")
data <- read_csv(fname)

data <- data.matrix(data[,-1])

train_data <- data[1:200000,]
mean <- apply(train_data, 2, mean)
std <- apply(train_data, 2, sd)
data <- scale(data, center = mean, scale = std)

generator <- function(data, lookback, delay, min_index, max_index,
                      shuffle = FALSE, batch_size = 128, step = 6) {
  if (is.null(max_index))
    max_index <- nrow(data) - delay - 1
  i <- min_index + lookback
  function() {
    if (shuffle) {
      rows <- sample(c((min_index+lookback):max_index), size = batch_size)
    } else {
      if (i + batch_size >= max_index)
        i <<- min_index + lookback
      rows <- c(i:min(i+batch_size, max_index))
      i <<- i + length(rows)
    }

    samples <- array(0, dim = c(length(rows), 
                                lookback / step,
                                dim(data)[[-1]]))
    targets <- array(0, dim = c(length(rows)))

    for (j in 1:length(rows)) {
      indices <- seq(rows[[j]] - lookback, rows[[j]], 
                     length.out = dim(samples)[[2]])
      samples[j,,] <- data[indices,]
      targets[[j]] <- data[rows[[j]] + delay,2]
    }            

    list(samples, targets)
  }
}

lookback <- 1440
step <- 6
delay <- 144
batch_size <- 128

train_gen <- generator(
  data,
  lookback = lookback,
  delay = delay,
  min_index = 1,
  max_index = 200000,
  shuffle = TRUE,
  step = step, 
  batch_size = batch_size
)

val_gen <- generator(
  data,
  lookback = lookback,
  delay = delay,
  min_index = 200001,
  max_index = 300000,
  step = step,
  batch_size = batch_size
)

test_gen <- generator(
  data,
  lookback = lookback,
  delay = delay,
  min_index = 300001,
  max_index = NULL,
  step = step,
  batch_size = batch_size
)

# How many steps to draw from val_gen in order to see the entire validation set
val_steps <- (300000 - 200001 - lookback) / batch_size

# How many steps to draw from test_gen in order to see the entire test set
test_steps <- (nrow(data) - 300001 - lookback) / batch_size

library(keras)

model <- keras_model_sequential() %>% 
  layer_flatten(input_shape = c(lookback / step, dim(data)[-1])) %>% 
  layer_dense(units = 32, activation = "relu") %>% 
  layer_dense(units = 1)

model %>% compile(
  optimizer = optimizer_rmsprop(),
  loss = "mae"
)

history <- model %>% fit_generator(
  train_gen,
  steps_per_epoch = 500,
  epochs = 20,
  validation_data = val_gen,
  validation_steps = val_steps
)

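I tried to predict the temperature with the code below. If I am correct, this should give me the normalized predicted temperature for every batch. So when I denormalize the values and average them, I get the predicted temperature. Is this correct? If so, for which time is the prediction made (latest observation time + delay)?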

prediction.set <- test_gen()[[1]]
prediction <- predict(model, prediction.set)

Also, what is the correct way to use keras::predict_generator() and the test_gen() function? If I use the following code:

model %>% predict_generator(generator = test_gen,
                            steps = test_steps)

it gives this error:

error in py_call_impl(callable, dots$args, dots$keywords) : 
 ValueError: Error when checking model input: the list of Numpy
 arrays that you are passing to your model is not the size the model expected. 
 Expected to see 1 array(s), but instead got the following list of 2 arrays: 
 [array([[[ 0.50394005,  0.6441838 ,  0.5990761 , ...,  0.22060473,
          0.2018686 , -1.7336458 ],
        [ 0.5475698 ,  0.63853574,  0.5890239 , ..., -0.45618412,
         -0.45030192, -1.724062...

Answer 1

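Note: I have very little knowledge of R syntax, so unfortunately I cannot give you an answer in R. Instead, I use Python in my answer. I hope you can easily translate it, or at least my explanation, back to R.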

... If I am correct, this should give me the normalized predicted temperature for every batch.

Yes, that's right. The predictions will be normalized, since you trained the model with normalized labels:

data <- scale(data, center = mean, scale = std)

Therefore, you need to denormalize the values using the computed mean and standard deviation to find the real predictions:

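pred = model.predict(test_data)
# mean and std hold one value per column, so use the target (temperature) column only:
# column 2 in the R code above, i.e. index 1 in Python
denorm_pred = pred * std[1] + mean[1]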
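
To make the denormalization step concrete, here is a minimal, self-contained sketch in plain Python (no Keras needed). The numbers are made up for illustration and `denormalize` is a hypothetical helper, not a Keras function. The point is that only the mean and standard deviation of the target column (temperature) are applied to the predictions.

```python
def denormalize(preds, target_mean, target_std):
    """Map standardized predictions back to the original units."""
    return [p * target_std + target_mean for p in preds]

# Illustrative values only (assumptions, not taken from the Jena dataset).
target_mean = 9.0  # assumed mean of the temperature column
target_std = 8.0   # assumed standard deviation of the temperature column

normalized_preds = [-1.0, 0.0, 0.5]
denorm_preds = denormalize(normalized_preds, target_mean, target_std)
print(denorm_preds)  # [1.0, 9.0, 13.0]
```

A normalized prediction of 0 maps back to the training mean, and each unit of normalized output corresponds to one standard deviation of temperature.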

... for which time is then predicted (latest observation time + delay?)

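That's right. Specifically, since in this particular dataset a new observation is recorded every ten minutes and you have set delay=144, the predicted value is the temperature 24 hours ahead of the last given observation (i.e. 144 * 10 = 1440 minutes = 24 hours).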
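
The arithmetic can be checked with a few lines of Python. This is only a sketch: `steps_to_hours` is a hypothetical helper, and the ten-minute interval is the one stated above for this dataset. The same conversion also shows how much history `lookback` covers.

```python
MINUTES_PER_OBSERVATION = 10  # one observation every ten minutes in this dataset

def steps_to_hours(n_steps, minutes_per_observation=MINUTES_PER_OBSERVATION):
    """Convert a number of observation steps into hours."""
    return n_steps * minutes_per_observation / 60

delay = 144      # value from the question
lookback = 1440  # value from the question

horizon_hours = steps_to_hours(delay)
history_days = steps_to_hours(lookback) / 24
print(horizon_hours, history_days)  # 24.0 10.0
```

So each prediction refers to the time of the last observation in the sample plus 24 hours, and each sample looks back over 10 days of data.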

Also, what is the correct way to use keras::predict_generator() and the test_gen() function?

predict_generator takes a generator that outputs only test samples and not the labels (we don't need labels when performing prediction; the labels are needed when training, i.e. fit_generator(), and when evaluating the model, i.e. evaluate_generator()). That is why the error says that it expected one array but got two. So you need to define a generator that yields only test samples or, as an alternative in Python, wrap your existing generator in another function that yields only the input samples (I don't know whether you can do this in R):

def pred_generator(gen):
    for data, labels in gen:
        yield data  # discards labels

preds = model.predict_generator(pred_generator(test_generator), number_of_steps)
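
The wrapper can be tried without Keras. In the sketch below, `fake_test_generator` is a hypothetical stand-in for a real data generator (it just yields small lists forever), so the only thing it demonstrates is that the wrapper passes the samples through and drops the labels.

```python
import itertools

def fake_test_generator():
    """Hypothetical stand-in for a data generator: yields (samples, labels) forever."""
    batch = 0
    while True:
        samples = [[batch, batch + 1], [batch + 2, batch + 3]]
        labels = [0.1 * batch, 0.2 * batch]
        yield samples, labels
        batch += 1

def pred_generator(gen):
    """Yield only the samples from a generator of (samples, labels) pairs."""
    for samples, _labels in gen:
        yield samples

# Take two batches from the wrapped generator; each one contains samples only.
first_two = list(itertools.islice(pred_generator(fake_test_generator()), 2))
print(first_two)  # [[[0, 1], [2, 3]], [[1, 2], [3, 4]]]
```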

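You need to provide one other argument, which is the number of generator steps needed to cover all the samples in the test data. We have num_steps = total_number_of_samples / batch_size. For example, if you have 1000 samples and the generator produces 10 samples each time, you need to run the generator for 1000 / 10 = 100 steps.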
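
As a small sketch of that calculation: the helper `num_steps` below is hypothetical, and rounding up when the sample count is not a multiple of the batch size is my assumption (the formula above uses plain division). With a generator that cycles back to the start, rounding up may revisit a few samples in the last step.

```python
import math

def num_steps(total_samples, batch_size):
    """Number of generator calls needed to see every sample at least once."""
    return math.ceil(total_samples / batch_size)

print(num_steps(1000, 10))   # 100
print(num_steps(1000, 128))  # 8
```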

Bonus: To see how well your model performs, you can use evaluate_generator with the existing test generator (i.e. test_gen):

loss = model.evaluate_generator(test_gen, number_of_steps)

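The given loss is also normalized. To denormalize it (to get a better sense of the prediction error), you just need to multiply it by the std of the target column (you don't need to add the mean, since you are using mae, i.e. mean absolute error, as the loss function):

denorm_loss = loss * std[1]  # std of the temperature column (column 2 in the R code)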

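This tells you how far off your predictions are on average. For example, if you are predicting temperature, a denorm_loss of 5 means that the predictions are on average 5 degrees off (i.e. either below or above the actual value).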
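
To see why only the standard deviation matters for the loss, here is a minimal check in plain Python. The values are illustrative and `mae` is a hand-written helper, not the Keras loss. The mean cancels out in the difference between a prediction and its target, so the denormalized MAE equals the normalized MAE times the target's standard deviation.

```python
def mae(y_true, y_pred):
    """Mean absolute error of two equally long sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Illustrative statistics of the target column (assumptions, not real data).
target_mean, target_std = 9.0, 8.0

norm_true = [0.0, 0.5, -1.0]
norm_pred = [0.25, 0.5, -0.5]
norm_loss = mae(norm_true, norm_pred)

# Denormalize both series, then compute the error in the original units.
denorm_true = [t * target_std + target_mean for t in norm_true]
denorm_pred = [p * target_std + target_mean for p in norm_pred]
denorm_loss = mae(denorm_true, denorm_pred)

print(norm_loss, denorm_loss, norm_loss * target_std)  # 0.25 2.0 2.0
```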

Update: For prediction, you can define a new generator in R from the existing generator like this:

pred_generator <- function(gen) {
  function() { # wrap it in a function to make it callable
    gen()[1]  # call the given generator and get the first element (i.e. samples)
  }
}

preds <- model %>% 
  predict_generator(
    generator = pred_generator(test_gen), # pass test_gen directly to pred_generator without calling it
    steps = test_steps
  )

evaluate_generator(model, test_gen, test_steps)

Tags: r, machine-learning, lstm, keras, recurrent-neural-network