Logging and deque operation problems in TensorFlow Android Speech Recognition Sample

Question

I am studying the TensorFlow speech commands example. The Android code I use is the same as the TensorFlow GitHub Android sample, and I mainly focus on SpeechActivity.java and RecognizeCommands.java. I have not changed anything except adding log messages.

As far as I understand:

(1) SpeechActivity.java passes the model inference results (outputScores) and currentTime to recognizeCommands.processLatestResults for posterior smoothing.

// Run the model.
inferenceInterface.feed(SAMPLE_RATE_NAME, sampleRateList);
inferenceInterface.feed(INPUT_DATA_NAME, floatInputBuffer, RECORDING_LENGTH, 1);
inferenceInterface.run(outputScoresNames);
inferenceInterface.fetch(OUTPUT_SCORES_NAME, outputScores);

// Use the smoother to figure out if we've had a real recognition event.
long currentTime = System.currentTimeMillis();
final RecognizeCommands.RecognitionResult result =
recognizeCommands.processLatestResults(outputScores, currentTime);

(2) In processLatestResults(), previousResults stores the output scores of the inferences from the last 500 ms (averageWindowDurationMs == 500), and averageScores holds the final scores that we want to use later.

// Add the latest results to the head of the queue.
previousResults.addLast(new Pair<Long, float[]>(currentTimeMS, currentResults));

// Prune any earlier results that are too old for the averaging window.
final long timeLimit = currentTimeMS - averageWindowDurationMs;
while (previousResults.getFirst().first < timeLimit) {
  previousResults.removeFirst();
}

...

// Calculate the average score across all the results in the window.
float[] averageScores = new float[labelsCount];
for (Pair<Long, float[]> previousResult : previousResults) {
  final float[] scoresTensor = previousResult.second;
  int i = 0;
  while (i < scoresTensor.length) {
    averageScores[i] += scoresTensor[i] / howManyResults;
    ++i;
  }
}

My problems/questions are:

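(1) When the for loop computes the average, the value read from previousResult.second is the same for every item. That should be impossible. Am I missing something in my logging so that it prints the wrong previousResult.second values, or are those score arrays really identical? This is how I log them: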

Log.d("tmp", "start average");
// Calculate the average score across all the results in the window.
float[] averageScores = new float[labelsCount];
for (Pair<Long, float[]> previousResult : previousResults) {
  final float[] scoresTensor = previousResult.second;
  Log.d("tmp", "previousResult("+previousResult.first+"): ["+Arrays.toString(previousResult.second)+"]" );
  int i = 0;
  while (i < scoresTensor.length) {
    averageScores[i] += scoresTensor[i] / howManyResults;
    ++i;
  }
}

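Here are the log messages from two passes through the averaging loop. In the first pass, every score array in previousResults is the same, [0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225], which should be impossible.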

start average
previousResult(1520998400247): [[0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225]]
previousResult(1520998400301): [[0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225]]
previousResult(1520998400354): [[0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225]]
previousResult(1520998400408): [[0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225]]
previousResult(1520998400466): [[0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225]]
previousResult(1520998400520): [[0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225]]
previousResult(1520998400574): [[0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225]]
previousResult(1520998400629): [[0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225]]
previousResult(1520998400683): [[0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225]]
previousResult(1520998400737): [[0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225]]
....
....
start average
previousResult(1520998400301): [[0.14775836, 0.18298364, 0.026629224, 0.12195902, 0.111195154, 0.058891248, 0.07295453, 0.05453651, 0.04063993, 0.06559348, 0.032576166, 0.084282786]]
previousResult(1520998400354): [[0.14775836, 0.18298364, 0.026629224, 0.12195902, 0.111195154, 0.058891248, 0.07295453, 0.05453651, 0.04063993, 0.06559348, 0.032576166, 0.084282786]]
previousResult(1520998400408): [[0.14775836, 0.18298364, 0.026629224, 0.12195902, 0.111195154, 0.058891248, 0.07295453, 0.05453651, 0.04063993, 0.06559348, 0.032576166, 0.084282786]]
previousResult(1520998400466): [[0.14775836, 0.18298364, 0.026629224, 0.12195902, 0.111195154, 0.058891248, 0.07295453, 0.05453651, 0.04063993, 0.06559348, 0.032576166, 0.084282786]]
previousResult(1520998400520): [[0.14775836, 0.18298364, 0.026629224, 0.12195902, 0.111195154, 0.058891248, 0.07295453, 0.05453651, 0.04063993, 0.06559348, 0.032576166, 0.084282786]]
previousResult(1520998400574): [[0.14775836, 0.18298364, 0.026629224, 0.12195902, 0.111195154, 0.058891248, 0.07295453, 0.05453651, 0.04063993, 0.06559348, 0.032576166, 0.084282786]]
previousResult(1520998400629): [[0.14775836, 0.18298364, 0.026629224, 0.12195902, 0.111195154, 0.058891248, 0.07295453, 0.05453651, 0.04063993, 0.06559348, 0.032576166, 0.084282786]]
previousResult(1520998400683): [[0.14775836, 0.18298364, 0.026629224, 0.12195902, 0.111195154, 0.058891248, 0.07295453, 0.05453651, 0.04063993, 0.06559348, 0.032576166, 0.084282786]]
previousResult(1520998400737): [[0.14775836, 0.18298364, 0.026629224, 0.12195902, 0.111195154, 0.058891248, 0.07295453, 0.05453651, 0.04063993, 0.06559348, 0.032576166, 0.084282786]]
previousResult(1520998400791): [[0.14775836, 0.18298364, 0.026629224, 0.12195902, 0.111195154, 0.058891248, 0.07295453, 0.05453651, 0.04063993, 0.06559348, 0.032576166, 0.084282786]]
...
...

(2) According to the log, in the first pass the scores array for timestamp 1520998400301 is [0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225], but in the next pass the scores array for that same timestamp has become [0.14775836, 0.18298364, 0.026629224, 0.12195902, 0.111195154, 0.058891248, 0.07295453, 0.05453651, 0.04063993, 0.06559348, 0.032576166, 0.084282786].

My second question is how this can happen. My code is identical to RecognizeCommands.java. Any clues or suggestions would be very helpful, thanks.

Answer 1

Try changing this code:

  // Add the latest results to the head of the queue.
    previousResults.addLast(new Pair<Long, float[]>(currentTimeMS, currentResults));

to this:

  // Add the latest results to the head of the queue.
    previousResults.addLast(new Pair<Long, float[]>(currentTimeMS, 
                            Arrays.copyOf(currentResults, currentResults.length)));
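
A likely reason the copy helps, stated as an assumption rather than something confirmed in the question: if outputScores is allocated once and inferenceInterface.fetch overwrites it in place on every inference, then every Pair in previousResults holds a reference to the same float[]. All entries would then show the latest scores, and an older timestamp would appear to "change" between passes, which matches both symptoms in the logs. The sketch below reproduces that effect in plain Java. SharedBufferDemo, fakeFetch and collect are hypothetical names made up for illustration, and fakeFetch is only a stand-in for the real fetch call.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

class SharedBufferDemo {
    // Hypothetical stand-in for inferenceInterface.fetch():
    // it overwrites the caller's buffer in place instead of returning a new array.
    static void fakeFetch(float[] buffer, float value) {
        Arrays.fill(buffer, value);
    }

    // Enqueue three "inference results", either by reference or as defensive copies.
    static Deque<float[]> collect(boolean copy) {
        Deque<float[]> queue = new ArrayDeque<>();
        float[] outputScores = new float[2]; // allocated once, reused on every iteration
        for (int step = 1; step <= 3; step++) {
            fakeFetch(outputScores, step);
            queue.addLast(copy
                    ? Arrays.copyOf(outputScores, outputScores.length) // independent snapshot
                    : outputScores);                                   // same object every time
        }
        return queue;
    }

    public static void main(String[] args) {
        // Prints "shared: [3.0, 3.0]" three times: every entry is the same array.
        for (float[] scores : collect(false)) {
            System.out.println("shared: " + Arrays.toString(scores));
        }
        // Prints [1.0, 1.0], then [2.0, 2.0], then [3.0, 3.0]: each entry kept its own values.
        for (float[] scores : collect(true)) {
            System.out.println("copied: " + Arrays.toString(scores));
        }
    }
}
```

If this is what happens in the sample, the logging in the question is correct and the arrays really are identical, because they are one array. Copying at the point of insertion keeps each queue entry independent of later fetches.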

android

speech-recognition

tensorflow