Deep Learning how to split 5 dimensions timeseries and pass some dimensions through embedding layer

I have an input that is a 5-dimensional time series:

a = [[8,3],[2], [4,5],[1], [9,1],[2], ...]  # total 100 timestamps. For each element, dims 0,1 are numerical data and dim 2 is a numerical encoding of a category. This is per sample; there are 3200 samples.

The category has 3 possible values (0, 1, 2).

I want to build an NN such that the last dimension (the category) passes through an embedding layer with output size 8, and is then concatenated back onto the first two dimensions (the numerical data).

So it would be something like this:

input1 = keras.layers.Input(shape=(2,)) #the numerical features
input2 = keras.layers.Input(shape=(1,)) #the encoding of the categories. this part will be embedded to 8 dims
x2 = Embedding(input_dim=1, output_dim = 8)(input2) #apply it to every timestamp and take only dim 3, so [2],[1], [2] 
x = concatenate([input1,x2]) #will get 10 dims at each timepoint, still 100 timepoints
x = LSTM(units=24)(x) #the input has 10 dims/features at each timepoint, total 100 timepoints per sample
x = Dense(1, activation='sigmoid')(x)
model = Model(inputs=[input1, input2] , outputs=[x]) #input1 is 1D vec of the width 2 , input2 is 1D vec with the width 1 and it is going through the embedding
model.compile(
        loss='binary_crossentropy',
        optimizer='adam',
        metrics=['acc']
    )

How can I do this (preferably in Keras)? My problem is how to apply the embedding to every timepoint. That is, if I have 1000 timepoints with 3 dims each, I need to convert them into 1000 timepoints with 8 dims each (the embedding layer should convert input2 from (1000X1) to (1000X8)).

You have a couple of issues here. First, let me give you a working example and explain how to solve your problems.
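To the core question first: Keras' Embedding layer already works per timestep, embedding every position of an integer sequence independently. A minimal standalone sketch (the 1000-timepoint size is just the one from your example):

import numpy as np
import tensorflow as tf

emb = tf.keras.layers.Embedding(input_dim=3, output_dim=8)  # 3 category values (0,1,2)
codes = np.random.randint(0, 3, size=(1, 1000))  # 1 sample, 1000 timepoints
print(emb(codes).shape)  # (1, 1000, 8)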

Imports and data generation

import tensorflow as tf
import numpy as np

from tensorflow.keras import layers
from tensorflow.keras.models import Model

num_timesteps = 100
max_features_values = [100, 100, 3]
num_observations = 2

input_list = [[[np.random.randint(0, v) for _ in range(num_timesteps)]
               for v in max_features_values]
              for _ in range(num_observations)]

input_arr = np.array(input_list)  # shape (2, 3, 100)

To use an embedding, we need the voc_size as the input dimension, as described in the Embedding documentation.

Embedding and concatenation

voc_size = len(np.unique(input_arr[:, 2, :])) + 1  # 4; Embedding's input_dim must be greater than the largest index

Now we need to create the inputs. The inputs should be of sizes [None, 2, num_timesteps] and [None, 1, num_timesteps], where the first dimension is flexible and will be filled with the number of observations we pass in. Let's use the voc_size we just calculated.

inp1 = layers.Input(shape=(2, num_timesteps))  # TensorShape([None, 2, 100])
inp2 = layers.Input(shape=(1, num_timesteps))  # TensorShape([None, 1, 100])
x2 = layers.Embedding(input_dim=voc_size, output_dim=8)(inp2)  # TensorShape([None, 1, 100, 8])
x2_reshaped = tf.transpose(tf.squeeze(x2, axis=1), [0, 2, 1])  # TensorShape([None, 8, 100])

This cannot be concatenated as is, because all dimensions except the one along the concatenation axis have to match, and unfortunately the shapes don't. So we reshape x2: we squeeze out its singleton axis and then transpose.
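If you want to verify the shapes yourself, here is a quick check on a dummy tensor (a sketch with a batch of 4, sizes as above):

demo = tf.zeros([4, 1, num_timesteps, 8])                 # same shape as x2
demo = tf.transpose(tf.squeeze(demo, axis=1), [0, 2, 1])  # -> (4, 8, 100)
print(demo.shape)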

Now we can concatenate without any problem, and the rest is straightforward:

x = layers.concatenate([inp1, x2_reshaped], axis=1)
x = layers.LSTM(32)(x)
x = layers.Dense(1, activation='sigmoid')(x)
model = Model(inputs=[inp1, inp2], outputs=[x])

Check with a dummy example

inp1_np = input_arr[:, :2, :]
inp2_np = input_arr[:, 2:, :]
model.predict([inp1_np, inp2_np])

# Output
# array([[0.544262 ],
#       [0.6157502]], dtype=float32)

# This outputs values between 0 and 1, just as expected.
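Note that the model above was never compiled; predict() tolerates that, but training won't. A minimal training sketch, where the binary labels are made up purely for illustration:

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['acc'])
labels = np.random.randint(0, 2, size=(num_observations, 1))  # hypothetical labels
model.fit([inp1_np, inp2_np], labels, epochs=2, verbose=0)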

In case you are not looking for Embeddings the way they are usually used in Keras (mapping positive integers to dense vectors), you might be looking for some kind of unprojection or basis expansion, in which 3 dimensions get mapped (embedded) into 8 dimensions and the result is concatenated. This can be done with the kernel trick or other methods, but it also happens implicitly in neural networks with nonlinearities.
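For intuition, here is a sketch of such a fixed (non-learned) basis expansion built from random Fourier-style features; every value in it is an arbitrary illustration, and the learned Dense(8) layer in the model below plays the analogous role:

import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 8))            # fixed random projection
b = rng.uniform(0, 2 * np.pi, size=8)  # fixed random phases

def expand(x):
    # maps (..., 3) -> (..., 8)
    return np.cos(x @ W + b)

print(expand(np.ones((100, 3))).shape)  # (100, 8)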

So you could do something like the following, in a similar format to pythonic833's answer (which is nice), but with the timesteps in the middle, because the Keras LSTM documentation requires inputs of shape [batch, timesteps, feature]:

Input generation

import tensorflow as tf
import numpy as np

from tensorflow.keras import layers
from tensorflow.keras.models import Model

num_timesteps = 100
num_features = 5
num_observations = 2

input_list = [[[np.random.randint(1, 100) for _ in range(num_features)]
               for _ in range(num_timesteps)]
              for _ in range(num_observations)]

input_arr = np.array(input_list)  # shape (2, 100, 5)

Model building

Then you can process the inputs:

input1 = layers.Input(shape=(num_timesteps, 2,))
input2 = layers.Input(shape=(num_timesteps, 3))
x2 = layers.Dense(8, activation='relu')(input2)
x = layers.concatenate([input1,x2], axis=2) # This produces tensors of shape (None, 100, 10)
x = layers.LSTM(units=24)(x)
x = layers.Dense(1, activation='sigmoid')(x)
model = Model(inputs=[input1, input2] , outputs=[x])
model.compile(
    loss='binary_crossentropy',
    optimizer='adam',
    metrics=['acc']
)
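As an optional sanity check (a sketch reusing the names defined above): Dense acts only on the last axis, so each timestep's 3 features are mapped to 8 features independently, which is exactly the per-timepoint behavior asked for.

probe = tf.zeros([4, num_timesteps, 3])  # dummy batch of 4
print(layers.Dense(8)(probe).shape)      # (4, 100, 8)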

Results

inp1_np = input_arr[:, :, :2]
inp2_np = input_arr[:, :, 2:]
model.predict([inp1_np, inp2_np])

which produces

array([[0.44117224],
       [0.23611131]], dtype=float32)

For further explanations of basis expansion, see:

  1. https://stats.stackexchange.com/questions/527258/embedding-data-into-a-larger-dimension-space
  2. https://www.reddit.com/r/MachineLearning/comments/2ffejw/why_dont_researchers_use_the_kernel_method_in/