How to get the right cardinality of a dataset for an LSTM TensorFlow network?

Question

I'm trying to do some classification on my data. I have X and y, which look like this:

This is the info for X.

y is a single-column pandas DataFrame whose values can be 1, 0, or -1. The columns of X can also hold 1, 0, or -1.

So I tried to preprocess the data like this:

X_train, y_train = X.iloc[0:107452], y.iloc[0:107452]
X_test, y_test = X.iloc[107452:len(X)], y.iloc[107452:len(y)]

y_train = tf.keras.utils.to_categorical(y_train)

y_test = tf.keras.utils.to_categorical(y_test)

X_train= np.asarray(X_train).astype('float32')
X_train=X_train.reshape(-1, 107452, 9)
y_train= np.asarray(y_train).astype('float32')
y_train=y_train.reshape(-1, 107452, 1)
X_test = np.asarray(X_test).astype('float32')

X_test=X_test.reshape(-1, 46050, 9)

y_test= np.asarray(y_test).astype('float32')
y_test=y_test.reshape(-1, 46050, 1)

to make y categorical. The resulting shapes of X and y (train and test) are:

X_train shape: (1, 107452, 9), y_train shape: (2, 107452, 1)
X_test shape: (1, 46050, 9), y_test shape: (2, 46050, 1)

For the model I use:

model = keras.Sequential()
model.add(keras.layers.LSTM(250, input_shape=(X_train.shape[1], X_train.shape[2])))
model.add(keras.layers.Dropout(0.2))
model.add(keras.layers.Dense(3,activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.summary()

history = model.fit(
    X_train, y_train,
    epochs=32,
    batch_size=32,
    shuffle=False
)

But I get the following error:

ValueError: Data cardinality is ambiguous:
  x sizes: 1
  y sizes: 2
Make sure all arrays contain the same number of samples.

I hope you can help me figure out how to make this work. Thanks in advance.

Answer 1

After these lines:

X_train, y_train = X.iloc[0:107452], y.iloc[0:107452]
X_train= np.asarray(X_train).astype('float32')

try running:

y_train = tf.keras.utils.to_categorical(y_train, 3) # 3 classes
X_train = tf.expand_dims(X_train, axis=-1)

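It should work. Keras treats the first axis of every array as the sample axis, so X and y must have the same size there. With these two calls X_train has shape (107452, 9, 1) and y_train has shape (107452, 3), so both contain 107452 samples. The same applies to your test data.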
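To see where the mismatch comes from, here is a minimal NumPy-only sketch on made-up data (1000 rows instead of 107452; the variable names and the random data are hypothetical, not from the question). It assumes, as the reported shapes suggest, that the one-hot y ended up with 2 columns before the reshape. Shifting the labels by +1 before one-hot encoding is my own assumption for making the class indices explicit; the answer above passes the raw labels to `to_categorical(y_train, 3)` instead.

```python
import numpy as np

# Hypothetical stand-in data: 1000 rows, 9 feature columns, labels in {-1, 0, 1}.
rng = np.random.default_rng(0)
n_rows, n_features, n_classes = 1000, 9, 3
X = rng.integers(-1, 2, size=(n_rows, n_features)).astype("float32")
y = rng.integers(-1, 2, size=(n_rows, 1))

# The problem: reshape(-1, n_rows, ...) lets NumPy pick the first (sample) axis.
# A one-hot y with 2 columns holds 2 * n_rows values, so it turns into
# 2 "samples", while X turns into 1 "sample".
y_two_cols = np.zeros((n_rows, 2), dtype="float32")  # placeholder for a 2-column one-hot
X_bad = X.reshape(-1, n_rows, n_features)
y_bad = y_two_cols.reshape(-1, n_rows, 1)
print(X_bad.shape, y_bad.shape)

# The fix: keep each row as one sample.
labels = y.ravel() + 1                                 # assumption: map {-1, 0, 1} to {0, 1, 2}
y_onehot = np.eye(n_classes, dtype="float32")[labels]  # (samples, classes)
X_seq = X[..., np.newaxis]                             # (samples, timesteps, features)
print(X_seq.shape, y_onehot.shape)
```

With these shapes, `input_shape=(X_train.shape[1], X_train.shape[2])` in the model becomes `(9, 1)`, meaning each row is read as a sequence of 9 timesteps with 1 feature each.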

