How to combine FCNN and RNN in Tensorflow?
I want to build a neural network that has recurrent (for example, LSTM) connections in some layers and normal fully connected (FC) connections in the others.
I cannot find a way to do this in Tensorflow.
It works if I have only FC layers, but I don't see how to properly add just one recurrent layer.
I create the network in the following way:
with tf.variable_scope("autoencoder_variables", reuse=None) as scope:
    for i in xrange(self.__num_hidden_layers + 1):
        # Train weights
        name_w = self._weights_str.format(i + 1)
        w_shape = (self.__shape[i], self.__shape[i + 1])
        a = tf.multiply(4.0, tf.sqrt(6.0 / (w_shape[0] + w_shape[1])))
        w_init = tf.random_uniform(w_shape, -1 * a, a)
        self[name_w] = tf.Variable(w_init,
                                   name=name_w,
                                   trainable=True)
        # Train biases
        name_b = self._biases_str.format(i + 1)
        b_shape = (self.__shape[i + 1],)
        b_init = tf.zeros(b_shape)
        self[name_b] = tf.Variable(b_init, trainable=True, name=name_b)

        if i + 1 == self.__recurrent_layer:
            # Create an LSTM cell
            lstm_size = self.__shape[self.__recurrent_layer]
            self['lstm'] = tf.contrib.rnn.BasicLSTMCell(lstm_size)
It is supposed to process the batches sequentially. I have a function that handles just one time step, which will later be called by a function that processes the whole sequence:
def single_run(self, input_pl, state, just_middle=False):
    """Get the output of the autoencoder for a single batch

    Args:
      input_pl:    tf placeholder for ae input data of size [batch_size, DoF]
      state:       current state of LSTM memory units
      just_middle: will indicate if we want to extract only the middle layer of the network
    Returns:
      Tensor of output
    """
    last_output = input_pl

    # Pass through the network
    for i in xrange(self.num_hidden_layers + 1):
        if i != self.__recurrent_layer:
            w = self._w(i + 1)
            b = self._b(i + 1)
            last_output = self._activate(last_output, w, b)
        else:
            last_output, state = self['lstm'](last_output, state)

    return last_output
The following function is supposed to take a sequence of batches as input and produce a sequence of batches as output:
def process_sequences(self, input_seq_pl, dropout, just_middle=False):
    """Get the output of the autoencoder

    Args:
      input_seq_pl: input data of size [batch_size, sequence_length, DoF]
      dropout:      dropout rate
      just_middle:  indicate if we want to extract only the middle layer of the network
    Returns:
      Tensor of output
    """
    if not just_middle:  # if not middle layer
        numb_layers = self.__num_hidden_layers + 1
    else:
        numb_layers = FLAGS.middle_layer

    with tf.variable_scope("process_sequence", reuse=None) as scope:
        # Initial state of the LSTM memory.
        state = initial_state = self['lstm'].zero_state(FLAGS.batch_size, tf.float32)

        tf.get_variable_scope().reuse_variables()  # THIS IS THE IMPORTANT LINE

        # First - apply dropout
        the_whole_sequences = tf.nn.dropout(input_seq_pl, dropout)

        # Take the batch for every time step, run it through the network
        # and stack all the outputs
        with tf.control_dependencies([tf.convert_to_tensor(state, name='state')]):  # do not let the loop be parallelized
            stacked_outputs = tf.stack(
                [self.single_run(the_whole_sequences[:, time_st, :], state, just_middle)
                 for time_st in range(self.sequence_length)])

        # Transpose the output from shape [sequence_length, batch_size, DoF]
        # into [batch_size, sequence_length, DoF]
        output = tf.transpose(stacked_outputs, perm=[1, 0, 2])

    return output
The problem is with the variable scope and its property "reuse".

If I run this code as it is, I get the following error:
'Variable Train/process_sequence/basic_lstm_cell/weights does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=None in VarScope?'

If I comment out the line that tells it to reuse variables (tf.get_variable_scope().reuse_variables()), I get the following error:
'Variable Train/process_sequence/basic_lstm_cell/weights already exists, disallowed. Did you mean to set reuse=True in VarScope?'

It seems that we need "reuse=None" to initialize the weights of the LSTM cell, and "reuse=True" to invoke the LSTM cell.

Please help me figure out the proper way to do this.
I think the problem is that you are creating the variables with tf.Variable. Use tf.get_variable instead -- does this solve your issue?
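For illustration, a minimal sketch of what that suggestion could look like for the weight-creation loop from the question (it keeps the question's attribute names and the uniform initializer, and is assumed to run inside the same class method; whether switching to tf.get_variable by itself resolves the reuse errors around the LSTM is not shown here):

    with tf.variable_scope("autoencoder_variables") as scope:
        for i in xrange(self.__num_hidden_layers + 1):
            name_w = self._weights_str.format(i + 1)
            w_shape = (self.__shape[i], self.__shape[i + 1])
            a = 4.0 * (6.0 / (w_shape[0] + w_shape[1])) ** 0.5

            # get_variable registers the variable in the current variable scope,
            # so the same variable can be found again when the scope is reused
            self[name_w] = tf.get_variable(
                name_w, shape=w_shape,
                initializer=tf.random_uniform_initializer(-a, a))

            name_b = self._biases_str.format(i + 1)
            self[name_b] = tf.get_variable(
                name_b, shape=(self.__shape[i + 1],),
                initializer=tf.zeros_initializer())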
It seems that I have solved this issue using the hack from the official Tensorflow RNN example (https://www.tensorflow.org/tutorials/recurrent), with the following code
with tf.variable_scope("RNN"):
for time_step in range(num_steps):
if time_step > 0: tf.get_variable_scope().reuse_variables()
(cell_output, state) = cell(inputs[:, time_step, :], state)
outputs.append(cell_output)
The hack is that when we run the LSTM for the first time, tf.get_variable_scope().reuse is set to False, so that a new LSTM cell is created. On every following run we set tf.get_variable_scope().reuse to True, so that we use the LSTM that has already been created.
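For reference, a minimal sketch (keeping the question's names such as single_run, FLAGS.batch_size and self.sequence_length, and the question's single_run signature) of how the same pattern can be applied inside process_sequences, so that reuse is switched on only after the first time step instead of before the first call:

    with tf.variable_scope("process_sequence") as scope:
        # Initial state of the LSTM memory.
        state = self['lstm'].zero_state(FLAGS.batch_size, tf.float32)
        the_whole_sequences = tf.nn.dropout(input_seq_pl, dropout)

        outputs = []
        for time_st in range(self.sequence_length):
            if time_st > 0:
                # from the second time step on, reuse the LSTM weights
                # created by the first call instead of creating new ones
                tf.get_variable_scope().reuse_variables()
            last_output = self.single_run(the_whole_sequences[:, time_st, :],
                                          state, just_middle)
            outputs.append(last_output)

        # [sequence_length, batch_size, DoF] -> [batch_size, sequence_length, DoF]
        output = tf.transpose(tf.stack(outputs), perm=[1, 0, 2])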