Keras, stateless LSTM
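This is a very simple example of an LSTM in stateless mode, which we train on the very simple sequences [0->1] and [0->2].
Any idea why it won't converge in stateless mode?
We have a batch of size 2 with 2 samples, and it is supposed to keep the state within the batch. When predicting, we would like to receive 1 and 2 in succession.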
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
import numpy
# define sequences
seq = [0, 1, 0, 2]
# convert sequence into required data format.
# We are going to extract 2 samples [0->1] and [0->2] and convert them into one-hot vectors
seqX=numpy.array([[( 1. , 0. , 0.)], [( 1. , 0. , 0.)]])
seqY=numpy.array([( 0. , 1. , 0.) , ( 0. , 0. , 1.)])
# define LSTM configuration
n_unique = len(set(seq))
n_neurons = 20
n_batch = 2
n_features = n_unique #which is =3
# create LSTM
model = Sequential()
model.add(LSTM(n_neurons, input_shape=( 1, n_features) ))
model.add(Dense(n_unique, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='Adam')
# train LSTM
model.fit(seqX, seqY, epochs=300, batch_size=n_batch, verbose=2, shuffle=False)
# evaluate LSTM
print('Sequence')
result = model.predict_classes(seqX, batch_size=n_batch, verbose=0)
for i in range(2):
    print('X=%.1f y=%.1f, yhat=%.1f' % (0, i+1, result[i]))
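Example 2
Here I want to clarify what result I am after.
The same code example, but in stateful mode (stateful=True). It works fine. We feed the network a zero twice and get 1, then 2. But I want to get the same result in stateless mode, because it is supposed to keep the state within the batch.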
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
import numpy
# define sequences
seq = [0, 1, 0, 2]
# convert sequences into required data format
seqX=numpy.array([[( 1. , 0. , 0.)], [( 1. , 0. , 0.)]])
seqY=numpy.array([( 0. , 1. , 0.) , ( 0. , 0. , 1.)])
# define LSTM configuration
n_unique = len(set(seq))
n_neurons = 20
n_batch = 1
n_features = n_unique
# create LSTM
model = Sequential()
model.add(LSTM(n_neurons, batch_input_shape=(n_batch, 1, n_features), stateful=True ))
model.add(Dense(n_unique, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='Adam')
# train LSTM
for epoch in range(300):
    model.fit(seqX, seqY, epochs=1, batch_size=n_batch, verbose=2, shuffle=False)
    model.reset_states()
# evaluate LSTM
print('Sequence')
result = model.predict_classes(seqX, batch_size=1, verbose=0)
for i in range(2):
    print('X=%.1f y=%.1f, yhat=%.1f' % (0, i+1, result[i]))
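To illustrate what the stateful version relies on, here is a minimal NumPy sketch. It assumes a toy tanh recurrent cell with random weights, not Keras's actual LSTM; `ToyStatefulCell` and its methods are hypothetical names chosen to mirror `predict` and `reset_states`. The point is only that the same input gives a different output on the second call because the state is carried over, and that resetting the state restores the first result.

```python
import numpy as np

class ToyStatefulCell:
    """Toy tanh recurrent cell that keeps its state between calls (illustrative only)."""

    def __init__(self, n_features, n_units, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(size=(n_features, n_units))  # input weights (random, untrained)
        self.U = rng.normal(size=(n_units, n_units))     # recurrent weights (random, untrained)
        self.h = np.zeros(n_units)                       # state carried across calls

    def predict(self, x):
        # The new state depends on the input and on the previous state.
        self.h = np.tanh(x @ self.W + self.h @ self.U)
        return self.h

    def reset_states(self):
        self.h = np.zeros_like(self.h)

cell = ToyStatefulCell(n_features=3, n_units=4)
x0 = np.array([1.0, 0.0, 0.0])   # one-hot encoding of 0

first = cell.predict(x0)
second = cell.predict(x0)        # same input, but the state has changed
cell.reset_states()
after_reset = cell.predict(x0)   # starts from a zero state again

print(np.allclose(first, second), np.allclose(first, after_reset))
```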
As the correct result, we should get:
Sequence
X=0.0 y=1.0, yhat=1.0
X=0.0 y=2.0, yhat=2.0
You must feed one sequence with two steps instead of two sequences with one step:
- One sequence, two steps:
seqX.shape = (1,2,3)
- Two sequences, one step:
seqX.shape = (2,1,3)
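The difference between the two layouts can be sketched in NumPy. This is a toy tanh recurrent cell with random weights, assumed only for illustration (it is not Keras's LSTM, and `step`, `W`, and `U` are hypothetical names). With two one-step sequences, each sample starts from a zero state, so identical inputs give identical outputs and cannot be fitted to two different targets. With one two-step sequence, the second step sees the state left by the first.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_units = 3, 4
W = rng.normal(size=(n_features, n_units))  # input weights (random, untrained)
U = rng.normal(size=(n_units, n_units))     # recurrent weights (random, untrained)

def step(x, h):
    """One step of a toy tanh recurrent cell (stand-in for an LSTM cell)."""
    return np.tanh(x @ W + h @ U)

x0 = np.array([1.0, 0.0, 0.0])  # one-hot encoding of 0

# Two sequences, one step each: every sample starts from a zero state.
out_a = step(x0, np.zeros(n_units))
out_b = step(x0, np.zeros(n_units))
print(np.allclose(out_a, out_b))  # True: identical inputs, identical outputs

# One sequence, two steps: the second step receives the state from the first.
h1 = step(x0, np.zeros(n_units))
h2 = step(x0, h1)
print(np.allclose(h1, h2))
```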
Input shapes are (numberOfSequences, stepsPerSequence, featuresPerStep):
seqX = [[[1,0,0],[1,0,0]]]
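As a quick check of the two layouts, the following sketch reshapes the question's array from two one-step sequences into one two-step sequence. The variable names are made up for the example.

```python
import numpy as np

# Layout from the question: 2 sequences, 1 step each, 3 features per step.
seqX_two_sequences = np.array([[[1., 0., 0.]], [[1., 0., 0.]]])
print(seqX_two_sequences.shape)  # (2, 1, 3)

# Same rows reinterpreted as 1 sequence with 2 steps.
seqX_one_sequence = seqX_two_sequences.reshape(1, 2, 3)
print(seqX_one_sequence.shape)   # (1, 2, 3)
```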
If you want both steps of y as output, you must use return_sequences=True:
LSTM(n_neurons, input_shape=(2, n_features), return_sequences=True)
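What the flag changes is the output shape. The sketch below mimics this with a toy tanh recurrent layer in NumPy. It is an assumption-laden illustration rather than Keras's implementation, and `run_recurrent` is a hypothetical helper. Without `return_sequences` only the last step's output is returned; with it, one output per step is returned.

```python
import numpy as np

rng = np.random.default_rng(1)
n_features, n_units = 3, 20
W = rng.normal(size=(n_features, n_units))  # input weights (random, untrained)
U = rng.normal(size=(n_units, n_units))     # recurrent weights (random, untrained)

def run_recurrent(batch, return_sequences=False):
    """Toy recurrent layer; batch has shape (sequences, steps, features)."""
    n_sequences, n_steps, _ = batch.shape
    h = np.zeros((n_sequences, n_units))    # fresh state for every sequence
    outputs = []
    for t in range(n_steps):
        h = np.tanh(batch[:, t, :] @ W + h @ U)
        outputs.append(h)
    if return_sequences:
        return np.stack(outputs, axis=1)    # (sequences, steps, units)
    return h                                # (sequences, units)

seqX = np.array([[[1., 0., 0.], [1., 0., 0.]]])
last_only = run_recurrent(seqX)
all_steps = run_recurrent(seqX, return_sequences=True)
print(last_only.shape)  # (1, 20)
print(all_steps.shape)  # (1, 2, 20)
```

The last slice along the step axis of `all_steps` equals `last_only`, which is why a target with two steps needs the full sequence output.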
The complete working code:
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
import numpy
# define sequences
seq = [0, 1, 0, 2]
# convert sequence into required data format.
# We are going to extract 2 samples [0->1] and [0->2] and convert them into one-hot vectors
seqX=numpy.array([[[ 1. , 0. , 0.], [ 1. , 0. , 0.]]])
seqY=numpy.array([[[0. , 1. , 0.] , [ 0. , 0. , 1.]]])
#shapes are (1,2,3) - 1 sequence, 2 steps, 3 features
# define LSTM configuration
n_unique = len(set(seq))
n_neurons = 20
n_features = n_unique #which is =3
#no need for batch size
# create LSTM
model = Sequential()
model.add(LSTM(n_neurons, input_shape=( 2, n_features),return_sequences=True))
#the input shape must have two steps
model.add(Dense(n_unique, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='Adam')
# train LSTM
model.fit(seqX, seqY, epochs=300, verbose=2)
#no shuffling and no batch size needed.
# evaluate LSTM
print('Sequence')
result = model.predict_classes(seqX, verbose=0)
print(seqX)
print(result) #all steps are predicted in a single array (with return_sequences=True)
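Newer Keras/TensorFlow releases no longer provide `predict_classes` on `Sequential`. A common replacement is to take the argmax over the last axis of the `model.predict` output. The sketch below uses a made-up probability array as a stand-in for `model.predict(seqX)`; the values are assumptions for illustration, not real model output.

```python
import numpy as np

# Hypothetical stand-in for model.predict(seqX): shape (1, 2, 3),
# i.e. 1 sequence, 2 steps, 3 class scores per step.
probabilities = np.array([[[0.05, 0.90, 0.05],
                           [0.10, 0.15, 0.75]]])

# Class index with the highest score at every step.
predicted = np.argmax(probabilities, axis=-1)
print(predicted)  # [[1 2]]
```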