How to fit a sequence to sequence model with the R package keras from RStudio?

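This question is about the keras package for R from RStudio (https://github.com/rstudio/keras).

I am trying to train a model that tags certain parts of a sequence. I want the model to do something like this: [64,34,77,33,88] -> [0,0,1,1,0]. As input I have a matrix of sequences (one sequence per row) produced by pad_sequences, which looks like this: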

int [1:21885, 1:30] 21 21 1506 28 102 21 61 224 15 15 ...

The output is also a matrix of sequences produced by pad_sequences:

int [1:21885, 1:30] 0 0 1 0 0 0 1 1 0 0 ...

Here is code that simulates the shapes of the inputs and outputs I am using:

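input_length <- 30
n_sample <- 5
vocab_size <- 100
# Random token ids in 1..(vocab_size - 1), one sequence per row
quest_train <- matrix(floor(runif(input_length * n_sample, 1, vocab_size)), ncol = input_length)
# Random 0/1 tag for every position of every sequence
tag_train <- matrix(sample(c(0, 1), size = input_length * n_sample, replace = TRUE), ncol = input_length)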

This is the model I am trying to fit:

library(keras)

input_dim <- vocab_size
embed_dim <- 50

model <- keras_model_sequential()
model %>%
  layer_embedding(input_dim = input_dim,
                  output_dim = embed_dim) %>%
  layer_dropout(rate = 0.2) %>%
  layer_lstm(units = 128, return_sequences = TRUE) %>%
  layer_dropout(rate = 0.5) %>%
  # one softmax over the 2 tag classes at every time step
  time_distributed(layer_dense(units = 2, activation = 'softmax'))

model %>%
  compile(loss = 'categorical_crossentropy', 
          optimizer = 'adam', 
          metrics = c('accuracy'))

model %>% fit(quest_train,  
              tag_train, 
              batch_size = 16 ,
              epochs = 10, 
              shuffle = TRUE)

But when I try to run it, I get this error:

Error in py_call_impl(callable, dots$args, dots$keywords) : 
  ValueError: Error when checking model target: expected time_distributed_23 to have 3 dimensions, but got array with shape (5, 30)

So I tried converting the output into a list of 2D matrices with to_categorical, like this:

tags_train_cat <- lapply(1:nrow(tag_train), function(x) (to_categorical(tag_train[x,])))

My new target then looks like this:

List of 5
 $ : num [1:30, 1:2] 1 1 1 1 1 0 1 1 1 1 ...
 $ : num [1:30, 1:2] 1 1 1 1 1 0 0 1 1 1 ...
 $ : num [1:30, 1:2] 0 1 1 1 1 1 1 1 1 1 ...
 $ : num [1:30, 1:2] 1 1 1 1 1 1 0 0 0 0 ...
...

But now I get this error:

ValueError: Error when checking model target: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 1 arrays but instead got the following list of X arrays

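So, my question is: what am I doing wrong?

Solved:

Even though the input (X) must be a 2D array (because of the embedding layer), the output (Y) must be a single 3D array with dimensions (samples, sequence_length, features), not a list of matrices. Here features = 1.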
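
To make the shape requirement concrete, here is a minimal base R sketch of two ways to turn the 2D tag matrix into a single 3D array. It rebuilds the simulated tag_train from the question so that it runs on its own. The names y_3d, y_onehot and n_classes are made up for illustration and do not come from the original post.

```r
# Simulated targets with the same shape as in the question:
# 5 sequences of length 30, each position tagged 0 or 1.
set.seed(1)
input_length <- 30
n_sample <- 5
tag_train <- matrix(sample(c(0, 1), size = input_length * n_sample, replace = TRUE),
                    ncol = input_length)

# Option 1: add a trailing axis of length 1 -> (samples, sequence_length, 1).
# R fills arrays column-major, so y_3d[i, j, 1] equals tag_train[i, j].
y_3d <- array(tag_train, dim = c(dim(tag_train), 1))

# Option 2: one-hot encode into (samples, sequence_length, 2) with base R only.
# Slice k holds a 1 wherever the tag equals class k - 1.
n_classes <- 2  # hypothetical name; two tag classes, 0 and 1
y_onehot <- array(0, dim = c(dim(tag_train), n_classes))
for (k in seq_len(n_classes)) {
  y_onehot[, , k] <- (tag_train == (k - 1)) * 1
}

dim(y_3d)      # 5 30 1
dim(y_onehot)  # 5 30 2
```

Which form fits probably depends on the loss. As far as I can tell, the one-hot array matches categorical_crossentropy with a 2-unit softmax output as in the model above, while the single-feature array is the form usually paired with sparse_categorical_crossentropy, or with a 1-unit sigmoid output and binary_crossentropy. That pairing is an assumption on my part; the note above only states the shape requirement.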