Logging and deque operation problems in the TensorFlow Android Speech Recognition Sample

I am working on the TensorFlow speech commands example. The Android code I use is the same as the TensorFlow GitHub Android sample, and I mainly focus on SpeechActivity.java and RecognizeCommands.java. I have changed nothing except adding log messages.

As far as I understand:

(1) SpeechActivity.java passes the model inference results (outputScores) and currentTime to recognizeCommands.processLatestResults for posterior smoothing.

// Run the model.
inferenceInterface.feed(SAMPLE_RATE_NAME, sampleRateList);
inferenceInterface.feed(INPUT_DATA_NAME, floatInputBuffer, RECORDING_LENGTH, 1);
inferenceInterface.run(outputScoresNames);
inferenceInterface.fetch(OUTPUT_SCORES_NAME, outputScores);

// Use the smoother to figure out if we've had a real recognition event.
long currentTime = System.currentTimeMillis();
final RecognizeCommands.RecognitionResult result =
    recognizeCommands.processLatestResults(outputScores, currentTime);

(2) In processLatestResults(), previousResults stores the output scores of the inferences from the most recent 500 ms (averageWindowDurationMs == 500), and averageScores holds the final scores that are used afterwards.

// Add the latest results to the head of the queue.
previousResults.addLast(new Pair<Long, float[]>(currentTimeMS, currentResults));

// Prune any earlier results that are too old for the averaging window.
final long timeLimit = currentTimeMS - averageWindowDurationMs;
while (previousResults.getFirst().first < timeLimit) {
  previousResults.removeFirst();
}

...

// Calculate the average score across all the results in the window.
float[] averageScores = new float[labelsCount];
for (Pair<Long, float[]> previousResult : previousResults) {
  final float[] scoresTensor = previousResult.second;
  int i = 0;
  while (i < scoresTensor.length) {
    averageScores[i] += scoresTensor[i] / howManyResults;
    ++i;
  }
}
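To make the windowing behaviour easier to experiment with outside Android, here is a minimal, self-contained sketch of the same prune-then-average steps. The class name WindowAverageSketch and the TimedScores type (a stand-in for android.util.Pair<Long, float[]>) are hypothetical, and howManyResults is assumed to be the queue size after pruning; the checks that the sample performs in the elided part are left out.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

public class WindowAverageSketch {
    /** Hypothetical stand-in for android.util.Pair<Long, float[]>. */
    static final class TimedScores {
        final long timeMs;
        final float[] scores;

        TimedScores(long timeMs, float[] scores) {
            this.timeMs = timeMs;
            this.scores = scores;
        }
    }

    private final Deque<TimedScores> previousResults = new ArrayDeque<>();
    private final long averageWindowDurationMs;

    public WindowAverageSketch(long averageWindowDurationMs) {
        this.averageWindowDurationMs = averageWindowDurationMs;
    }

    /** Appends the newest scores, prunes old entries, and returns the per-label average. */
    public float[] addAndAverage(long currentTimeMs, float[] currentResults) {
        // Append the newest result at the tail of the queue.
        previousResults.addLast(new TimedScores(currentTimeMs, currentResults));

        // Drop entries from the front that fall outside the averaging window.
        final long timeLimit = currentTimeMs - averageWindowDurationMs;
        while (previousResults.getFirst().timeMs < timeLimit) {
            previousResults.removeFirst();
        }

        // Assumption: howManyResults is the number of entries left in the window.
        final int howManyResults = previousResults.size();
        float[] averageScores = new float[currentResults.length];
        for (TimedScores previousResult : previousResults) {
            for (int i = 0; i < averageScores.length; ++i) {
                averageScores[i] += previousResult.scores[i] / howManyResults;
            }
        }
        return averageScores;
    }

    public static void main(String[] args) {
        WindowAverageSketch smoother = new WindowAverageSketch(500);
        smoother.addAndAverage(0, new float[] {1.0f, 0.0f});
        smoother.addAndAverage(300, new float[] {0.0f, 1.0f});
        // At t=600 the entry from t=0 is older than 600 - 500 = 100 and is dropped.
        float[] average = smoother.addAndAverage(600, new float[] {0.0f, 1.0f});
        System.out.println(Arrays.toString(average));
    }
}
```

Note that main passes a freshly allocated array on every call, so each queue entry keeps its own values.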

My problems/questions are:

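(1) When the for loop computes the average, previousResult.second holds the same values for every item in the queue. That should not be possible. Is there something wrong with my logging that makes it print incorrect previousResult.second values, or are those score arrays really identical? This is how I log: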

Log.d("tmp", "start average");
// Calculate the average score across all the results in the window.
float[] averageScores = new float[labelsCount];
for (Pair<Long, float[]> previousResult : previousResults) {
  final float[] scoresTensor = previousResult.second;
  Log.d("tmp", "previousResult("+previousResult.first+"): ["+Arrays.toString(previousResult.second)+"]" );
  int i = 0;
  while (i < scoresTensor.length) {
    averageScores[i] += scoresTensor[i] / howManyResults;
    ++i;
  }
}

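Here are the log messages from two consecutive averaging passes. In the first pass, every score array in previousResults is the same, [0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225], which should not be possible.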

start average
previousResult(1520998400247): [[0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225]]
previousResult(1520998400301): [[0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225]]
previousResult(1520998400354): [[0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225]]
previousResult(1520998400408): [[0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225]]
previousResult(1520998400466): [[0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225]]
previousResult(1520998400520): [[0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225]]
previousResult(1520998400574): [[0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225]]
previousResult(1520998400629): [[0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225]]
previousResult(1520998400683): [[0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225]]
previousResult(1520998400737): [[0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225]]
....
....
start average
previousResult(1520998400301): [[0.14775836, 0.18298364, 0.026629224, 0.12195902, 0.111195154, 0.058891248, 0.07295453, 0.05453651, 0.04063993, 0.06559348, 0.032576166, 0.084282786]]
previousResult(1520998400354): [[0.14775836, 0.18298364, 0.026629224, 0.12195902, 0.111195154, 0.058891248, 0.07295453, 0.05453651, 0.04063993, 0.06559348, 0.032576166, 0.084282786]]
previousResult(1520998400408): [[0.14775836, 0.18298364, 0.026629224, 0.12195902, 0.111195154, 0.058891248, 0.07295453, 0.05453651, 0.04063993, 0.06559348, 0.032576166, 0.084282786]]
previousResult(1520998400466): [[0.14775836, 0.18298364, 0.026629224, 0.12195902, 0.111195154, 0.058891248, 0.07295453, 0.05453651, 0.04063993, 0.06559348, 0.032576166, 0.084282786]]
previousResult(1520998400520): [[0.14775836, 0.18298364, 0.026629224, 0.12195902, 0.111195154, 0.058891248, 0.07295453, 0.05453651, 0.04063993, 0.06559348, 0.032576166, 0.084282786]]
previousResult(1520998400574): [[0.14775836, 0.18298364, 0.026629224, 0.12195902, 0.111195154, 0.058891248, 0.07295453, 0.05453651, 0.04063993, 0.06559348, 0.032576166, 0.084282786]]
previousResult(1520998400629): [[0.14775836, 0.18298364, 0.026629224, 0.12195902, 0.111195154, 0.058891248, 0.07295453, 0.05453651, 0.04063993, 0.06559348, 0.032576166, 0.084282786]]
previousResult(1520998400683): [[0.14775836, 0.18298364, 0.026629224, 0.12195902, 0.111195154, 0.058891248, 0.07295453, 0.05453651, 0.04063993, 0.06559348, 0.032576166, 0.084282786]]
previousResult(1520998400737): [[0.14775836, 0.18298364, 0.026629224, 0.12195902, 0.111195154, 0.058891248, 0.07295453, 0.05453651, 0.04063993, 0.06559348, 0.032576166, 0.084282786]]
previousResult(1520998400791): [[0.14775836, 0.18298364, 0.026629224, 0.12195902, 0.111195154, 0.058891248, 0.07295453, 0.05453651, 0.04063993, 0.06559348, 0.032576166, 0.084282786]]
...
...

(2) According to the log, in the first pass the scores array for timestamp 1520998400301 is [0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225], but in the next pass the scores array for the same timestamp has become [0.14775836, 0.18298364, 0.026629224, 0.12195902, 0.111195154, 0.058891248, 0.07295453, 0.05453651, 0.04063993, 0.06559348, 0.032576166, 0.084282786].

My second question is that I do not understand how this can happen. My code is the same as RecognizeCommands.java. Any clue or suggestion would be helpful, thanks.

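The log suggests that every entry in previousResults refers to the same float[] object, which can happen when the caller reuses a single outputScores buffer for every inference: each new inference then overwrites the values that all queued entries point to. Try changing this code: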

// Add the latest results to the head of the queue.
previousResults.addLast(new Pair<Long, float[]>(currentTimeMS, currentResults));

to the following, which stores a copy of the array rather than a reference to it:

// Add the latest results to the head of the queue.
previousResults.addLast(new Pair<Long, float[]>(currentTimeMS,
    Arrays.copyOf(currentResults, currentResults.length)));
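The effect can be reproduced in plain Java. The sketch below assumes that a single scores buffer is reused across inferences, which the identical rows in the log suggest but which should be confirmed in the calling code. The class name SharedBufferDemo and the fill method are hypothetical, and the loop that writes into the buffer only stands in for inferenceInterface.fetch.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

public class SharedBufferDemo {
    /** Fills a queue from one reused buffer, optionally copying it before storing. */
    public static Deque<float[]> fill(boolean copyBeforeStoring) {
        Deque<float[]> queue = new ArrayDeque<>();
        // One buffer reused for every simulated inference (the assumption under test).
        float[] outputScores = new float[2];
        for (int step = 1; step <= 3; ++step) {
            // Hypothetical stand-in for inferenceInterface.fetch(...), which writes into the buffer.
            outputScores[0] = step;
            outputScores[1] = 10 * step;
            queue.addLast(copyBeforeStoring
                ? Arrays.copyOf(outputScores, outputScores.length)
                : outputScores);
        }
        return queue;
    }

    public static void main(String[] args) {
        // Without copying, all three entries are the same object and show the last values.
        for (float[] scores : fill(false)) {
            System.out.println("shared: " + Arrays.toString(scores));
        }
        // With copying, each entry keeps the values it had when it was stored.
        for (float[] scores : fill(true)) {
            System.out.println("copied: " + Arrays.toString(scores));
        }
    }
}
```

A quick way to check the same thing in the app is to compare two queue entries with ==, for example previousResults.getFirst().second == previousResults.getLast().second; if that is true, the entries share one array and the logging itself is not at fault.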