TensorFlow tfprof LSTMCell

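I am using tfprof to get the number of flops needed for the forward pass of my model. My model is a 3-layer LSTM followed by a fully connected layer. I observe that the flop count of the fully connected layer grows linearly with the number of time steps, while the flop count of the LSTM layers does not change. How is this possible?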

tfprof report for a forward pass with 1 time step:

==================Model Analysis Report======================
_TFProfRoot (0/2.71m flops)
  rnn/while/multi_rnn_cell/cell_1/lstm_cell/lstm_cell_1/MatMul (1.05m/1.05m flops)
  rnn/while/multi_rnn_cell/cell_2/lstm_cell/lstm_cell_1/MatMul (1.05m/1.05m flops)
  rnn/while/multi_rnn_cell/cell_0/lstm_cell/lstm_cell_1/MatMul (606.21k/606.21k flops)
  fc_layer/MatMul (1.54k/1.54k flops)
  rnn/while/multi_rnn_cell/cell_0/lstm_cell/lstm_cell_1/BiasAdd (1.02k/1.02k flops)
  rnn/while/multi_rnn_cell/cell_1/lstm_cell/lstm_cell_1/BiasAdd (1.02k/1.02k flops)
  rnn/while/multi_rnn_cell/cell_2/lstm_cell/lstm_cell_1/BiasAdd (1.02k/1.02k flops)
  fc_layer/BiasAdd (3/3 flops)

tfprof report for a forward pass with 2 time steps:

==================Model Analysis Report======================
_TFProfRoot (0/2.71m flops)
  rnn/while/multi_rnn_cell/cell_1/lstm_cell/lstm_cell_1/MatMul (1.05m/1.05m flops)
  rnn/while/multi_rnn_cell/cell_2/lstm_cell/lstm_cell_1/MatMul (1.05m/1.05m flops)
  rnn/while/multi_rnn_cell/cell_0/lstm_cell/lstm_cell_1/MatMul (606.21k/606.21k flops)
  fc_layer/MatMul (3.07k/3.07k flops)
  rnn/while/multi_rnn_cell/cell_0/lstm_cell/lstm_cell_1/BiasAdd (1.02k/1.02k flops)
  rnn/while/multi_rnn_cell/cell_1/lstm_cell/lstm_cell_1/BiasAdd (1.02k/1.02k flops)
  rnn/while/multi_rnn_cell/cell_2/lstm_cell/lstm_cell_1/BiasAdd (1.02k/1.02k flops)
  fc_layer/BiasAdd (6/6 flops)

tfprof performs a static analysis of your graph and calculates the float operations for each graph node.

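I assume you are using dynamic_rnn or something similar that is built on tf.while_loop. In that case, a graph node appears in the graph only once but is actually executed multiple times at run time.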

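In this case, tfprof has no way to statically figure out how many steps (time stamps, in your words) will be run. Hence, it counts the float operations of each node inside the loop only once. The fully connected layer sits outside the loop and processes all time steps at once, which is why its count scales with the number of time steps.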

A workaround for now is to multiply the flop counts of the nodes inside the loop by the number of time steps yourself.
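The multiplication can be sketched in plain Python. This is only an illustration, not part of tfprof: the helper name `corrected_flops` and the `/while/` name check are assumptions based on the node names in the reports above, and the flop values are the rounded figures from the 2-time-step report, typed in by hand.

```python
def corrected_flops(node_flops, time_steps, loop_marker="/while/"):
    """Scale per-node flop counts for nodes that live inside a while loop.

    node_flops: dict mapping node name -> flops as reported by tfprof.
    time_steps: number of loop iterations actually executed.
    loop_marker: substring assumed to identify nodes inside tf.while_loop
                 (hypothetical heuristic taken from the node names above).
    """
    total = 0
    for name, flops in node_flops.items():
        if loop_marker in name:
            # Counted once by the static analysis, executed time_steps times.
            total += flops * time_steps
        else:
            # Outside the loop: the reported count already covers all steps.
            total += flops
    return total


# Rounded values copied from the 2-time-step report (approximate).
report = {
    "rnn/while/multi_rnn_cell/cell_0/lstm_cell/lstm_cell_1/MatMul": 606_210,
    "rnn/while/multi_rnn_cell/cell_1/lstm_cell/lstm_cell_1/MatMul": 1_050_000,
    "rnn/while/multi_rnn_cell/cell_2/lstm_cell/lstm_cell_1/MatMul": 1_050_000,
    "rnn/while/multi_rnn_cell/cell_0/lstm_cell/lstm_cell_1/BiasAdd": 1_020,
    "rnn/while/multi_rnn_cell/cell_1/lstm_cell/lstm_cell_1/BiasAdd": 1_020,
    "rnn/while/multi_rnn_cell/cell_2/lstm_cell/lstm_cell_1/BiasAdd": 1_020,
    "fc_layer/MatMul": 3_070,
    "fc_layer/BiasAdd": 6,
}

if __name__ == "__main__":
    print(corrected_flops(report, time_steps=2))
```

Only the nodes under `rnn/while/` are scaled; the `fc_layer` nodes are left alone because their reported counts already grow with the number of time steps.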