Verify JMH measurements of simple for/lambda comparisons

Question

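I want to do some performance measurements and comparisons of simple for loops and equivalent stream implementations. I believe streams will be somewhat slower than equivalent non-stream code, but I want to be sure I'm measuring the right things.

I'm including my entire JMH class here.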

import java.util.ArrayList;
import java.util.List;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
public class MyBenchmark {
    List<String>    shortLengthListConstantSize     = null;
    List<String>    mediumLengthListConstantSize    = null;
    List<String>    longerLengthListConstantSize    = null;
    List<String>    longLengthListConstantSize      = null;

    @Setup
    public void setup() {
        shortLengthListConstantSize     = populateList(2);
        mediumLengthListConstantSize    = populateList(12);
        longerLengthListConstantSize    = populateList(300);
        longLengthListConstantSize      = populateList(300000);
    }

    private List<String> populateList(int size) {
        List<String> list   = new ArrayList<>();
        for (int ctr = 0; ctr < size; ++ ctr) {
            list.add("xxx");
        }
        return list;
    }

    @Benchmark
    public long shortLengthConstantSizeFor() {
        long count   = 0;
        for (String val : shortLengthListConstantSize) {
            if (val.length() == 3) { ++ count; }
        }
        return count;
    }

    @Benchmark
    public long shortLengthConstantSizeForEach() {
        IntHolder   intHolder   = new IntHolder();
        shortLengthListConstantSize.forEach(s -> { if (s.length() == 3) ++ intHolder.value; } );
        return intHolder.value;
    }

    @Benchmark
    public long shortLengthConstantSizeLambda() {
        return shortLengthListConstantSize.stream().filter(s -> s.length() == 3).count();
    }

    @Benchmark
    public long shortLengthConstantSizeLambdaParallel() {
        return shortLengthListConstantSize.stream().parallel().filter(s -> s.length() == 3).count();
    }

    @Benchmark
    public long mediumLengthConstantSizeFor() {
        long count   = 0;
        for (String val : mediumLengthListConstantSize) {
            if (val.length() == 3) { ++ count; }
        }
        return count;
    }

    @Benchmark
    public long mediumLengthConstantSizeForEach() {
        IntHolder   intHolder   = new IntHolder();
        mediumLengthListConstantSize.forEach(s -> { if (s.length() == 3) ++ intHolder.value; } );
        return intHolder.value;
    }

    @Benchmark
    public long mediumLengthConstantSizeLambda() {
        return mediumLengthListConstantSize.stream().filter(s -> s.length() == 3).count();
    }

    @Benchmark
    public long mediumLengthConstantSizeLambdaParallel() {
        return mediumLengthListConstantSize.stream().parallel().filter(s -> s.length() == 3).count();
    }

    @Benchmark
    public long longerLengthConstantSizeFor() {
        long count   = 0;
        for (String val : longerLengthListConstantSize) {
            if (val.length() == 3) { ++ count; }
        }
        return count;
    }

    @Benchmark
    public long longerLengthConstantSizeForEach() {
        IntHolder   intHolder   = new IntHolder();
        longerLengthListConstantSize.forEach(s -> { if (s.length() == 3) ++ intHolder.value; } );
        return intHolder.value;
    }

    @Benchmark
    public long longerLengthConstantSizeLambda() {
        return longerLengthListConstantSize.stream().filter(s -> s.length() == 3).count();
    }

    @Benchmark
    public long longerLengthConstantSizeLambdaParallel() {
        return longerLengthListConstantSize.stream().parallel().filter(s -> s.length() == 3).count();
    }

    @Benchmark
    public long longLengthConstantSizeFor() {
        long count   = 0;
        for (String val : longLengthListConstantSize) {
            if (val.length() == 3) { ++ count; }
        }
        return count;
    }

    @Benchmark
    public long longLengthConstantSizeForEach() {
        IntHolder   intHolder   = new IntHolder();
        longLengthListConstantSize.forEach(s -> { if (s.length() == 3) ++ intHolder.value; } );
        return intHolder.value;
    }

    @Benchmark
    public long longLengthConstantSizeLambda() {
        return longLengthListConstantSize.stream().filter(s -> s.length() == 3).count();
    }

    @Benchmark
    public long longLengthConstantSizeLambdaParallel() {
        return longLengthListConstantSize.stream().parallel().filter(s -> s.length() == 3).count();
    }

    public static class IntHolder {
        public int value    = 0;
    }
}

I'm running these on a Win7 laptop. I don't care about the absolute measurements, only the relative ones. Here are the latest results:

Benchmark                                            Mode  Cnt          Score         Error  Units
MyBenchmark.longLengthConstantSizeFor               thrpt  200       2984.554 ±      57.557  ops/s
MyBenchmark.longLengthConstantSizeForEach           thrpt  200       2971.701 ±     110.414  ops/s
MyBenchmark.longLengthConstantSizeLambda            thrpt  200        331.741 ±       2.196  ops/s
MyBenchmark.longLengthConstantSizeLambdaParallel    thrpt  200       2827.695 ±     682.662  ops/s
MyBenchmark.longerLengthConstantSizeFor             thrpt  200    3551842.518 ±   42612.744  ops/s
MyBenchmark.longerLengthConstantSizeForEach         thrpt  200    3616285.629 ±   16335.379  ops/s
MyBenchmark.longerLengthConstantSizeLambda          thrpt  200    2791292.093 ±   12207.302  ops/s
MyBenchmark.longerLengthConstantSizeLambdaParallel  thrpt  200      50278.869 ±    1977.648  ops/s
MyBenchmark.mediumLengthConstantSizeFor             thrpt  200   55447999.297 ±  277442.812  ops/s
MyBenchmark.mediumLengthConstantSizeForEach         thrpt  200   57381287.954 ±  362751.975  ops/s
MyBenchmark.mediumLengthConstantSizeLambda          thrpt  200   15925281.039 ±   65707.093  ops/s
MyBenchmark.mediumLengthConstantSizeLambdaParallel  thrpt  200      60082.495 ±     581.405  ops/s
MyBenchmark.shortLengthConstantSizeFor              thrpt  200  132278188.475 ± 1132184.820  ops/s
MyBenchmark.shortLengthConstantSizeForEach          thrpt  200  124158664.044 ± 1112991.883  ops/s
MyBenchmark.shortLengthConstantSizeLambda           thrpt  200   18750818.019 ±  171239.562  ops/s
MyBenchmark.shortLengthConstantSizeLambdaParallel   thrpt  200     474054.951 ±    1344.705  ops/s

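In a previous question, I confirmed that these benchmarks appear to be "functionally equivalent" (I was just looking for more eyes on them). Do these numbers look consistent with independent runs of these benchmarks?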
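As a rough sanity check of the "functionally equivalent" claim, the four variants can be run outside JMH and their results compared. This is only a sketch: the class name EquivalenceCheck and its method names are made up for illustration, and it says nothing about performance, only that each variant returns the same count for the list sizes used above.

```java
import java.util.ArrayList;
import java.util.List;

public class EquivalenceCheck {
    // Same mutable holder idea as in the benchmark class.
    static class IntHolder {
        int value = 0;
    }

    static List<String> populateList(int size) {
        List<String> list = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            list.add("xxx");
        }
        return list;
    }

    // Plain enhanced for loop.
    static long countFor(List<String> list) {
        long count = 0;
        for (String val : list) {
            if (val.length() == 3) { ++count; }
        }
        return count;
    }

    // Iterable.forEach with a lambda that mutates a holder.
    static long countForEach(List<String> list) {
        IntHolder holder = new IntHolder();
        list.forEach(s -> { if (s.length() == 3) ++holder.value; });
        return holder.value;
    }

    // Sequential stream.
    static long countStream(List<String> list) {
        return list.stream().filter(s -> s.length() == 3).count();
    }

    // Parallel stream.
    static long countParallelStream(List<String> list) {
        return list.stream().parallel().filter(s -> s.length() == 3).count();
    }

    public static void main(String[] args) {
        for (int size : new int[] {2, 12, 300, 300000}) {
            List<String> list = populateList(size);
            long expected = countFor(list);
            boolean allEqual = expected == countForEach(list)
                    && expected == countStream(list)
                    && expected == countParallelStream(list);
            System.out.println("size=" + size + " count=" + expected + " allEqual=" + allEqual);
        }
    }
}
```

If any variant disagreed here, the throughput comparison would not be comparing like with like.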

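Another thing I've never been sure about in JMH output is exactly what the throughput numbers mean. For instance, what does the "200" in the "Cnt" column exactly represent? The throughput units are in "operations per second", so what exactly does the "operation" represent, is that the execution of one call to the benchmark method? For instance, in the last row, that would mean the benchmark method is executed 474k times per second.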

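Update:

I notice that when I compare "for" with "lambda", going from the "short" list to the longer lists, the gap between them is quite large at first and then shrinks, until the "long" list, where the gap is even larger than for the "short" list (lambda throughput is 14%, 29%, 78%, and 11% of the for throughput, respectively). I find this surprising. I would have expected the relative overhead of streams to decrease as the work in the actual business logic increases. Does anyone have any thoughts on this?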
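For reference, those percentages can be recomputed from the Score column of the table above. This is a small sketch; the class name ThroughputRatios is made up, and the scores are copied from the "For" and "Lambda" rows.

```java
public class ThroughputRatios {
    // Stream ("Lambda") throughput as a fraction of plain for-loop throughput.
    static double ratio(double lambdaOpsPerSec, double forOpsPerSec) {
        return lambdaOpsPerSec / forOpsPerSec;
    }

    public static void main(String[] args) {
        String[] names = {"short", "medium", "longer", "long"};
        // Scores in ops/s, taken from the results table.
        double[] forScores = {132278188.475, 55447999.297, 3551842.518, 2984.554};
        double[] lambdaScores = {18750818.019, 15925281.039, 2791292.093, 331.741};

        for (int i = 0; i < names.length; i++) {
            double percent = 100.0 * ratio(lambdaScores[i], forScores[i]);
            System.out.printf("%-6s lambda/for = %.1f%%%n", names[i], percent);
        }
    }
}
```

The printed values should land close to the figures quoted above, allowing for rounding versus truncation.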

Answer 1

For instance, what does the "200" in the "Cnt" column exactly represent?

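The Cnt column is the number of iterations, i.e. how many times the measurement is repeated. You can control that value using the following annotations:

For the actual measurement: @Measurement(iterations = 10, time = 50, timeUnit = TimeUnit.MILLISECONDS)
For the warmup phase: @Warmup(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS)

Here iterations is Cnt, time is the duration of a single iteration, and timeUnit is the unit in which time is expressed.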
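To make the arithmetic concrete, here is a small sketch of how a Cnt of 200 could arise and how much time the example annotations above would spend per phase. It assumes that Cnt is summed over all forks and that the run used 10 forks of 20 measurement iterations each, which I believe were the defaults in older JMH releases; neither number is stated in the question. The class IterationCount and its methods are made up for illustration and are not part of JMH.

```java
import java.util.concurrent.TimeUnit;

public class IterationCount {
    // Total measurement iterations reported, assuming they are summed over all forks.
    static int totalMeasurementIterations(int forks, int iterationsPerFork) {
        return forks * iterationsPerFork;
    }

    // Wall-clock time spent in one phase of one fork, in milliseconds.
    static long phaseMillisPerFork(int iterations, long time, TimeUnit unit) {
        return iterations * unit.toMillis(time);
    }

    public static void main(String[] args) {
        // Assumed settings: 10 forks, 20 measurement iterations per fork.
        System.out.println("Cnt = " + totalMeasurementIterations(10, 20));

        // Values from the @Measurement and @Warmup examples above.
        System.out.println("measurement ms per fork = "
                + phaseMillisPerFork(10, 50, TimeUnit.MILLISECONDS));
        System.out.println("warmup ms per fork = "
                + phaseMillisPerFork(10, 1, TimeUnit.SECONDS));
    }
}
```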

The throughput units are in "operations per second"

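You can control the output in several ways. For example, you can change the time unit with @OutputTimeUnit(TimeUnit.XXXX), so you can get ops/us or ops/ms.

You can also change the mode: instead of measuring ops/time you can measure "average time", "sample time", etc. You control this with the @BenchmarkMode({Mode.AverageTime}) annotation.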

so what exactly does the "operation" represent, is that the execution of one call to the benchmark method

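So let's say one iteration is 1 second long and you get 1000 ops/sec. This means that the benchmark method was executed 1000 times.

In other words, one operation is one execution of the benchmark method, unless you have the @OperationsPerInvocation(XXX) annotation, which means that each invocation of the method counts as XXX operations.

The error is calculated across all iterations.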
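Since one operation here is one full pass over the list, it can help to convert the scores into time per operation and time per list element before comparing sizes. The sketch below does that for two rows of the question's table; the class name PerElementCost is made up. Dividing by the list size is similar in spirit to what @OperationsPerInvocation would do to the reported number.

```java
public class PerElementCost {
    // Nanoseconds taken by one invocation of the benchmark method.
    static double nanosPerOp(double opsPerSec) {
        return 1_000_000_000.0 / opsPerSec;
    }

    // Nanoseconds per list element, when one invocation walks the whole list.
    static double nanosPerElement(double opsPerSec, int listSize) {
        return nanosPerOp(opsPerSec) / listSize;
    }

    public static void main(String[] args) {
        // Scores in ops/s from the results table; sizes from setup().
        System.out.printf("shortLengthConstantSizeFor: %.2f ns/op, %.2f ns/element%n",
                nanosPerOp(132278188.475), nanosPerElement(132278188.475, 2));
        System.out.printf("longLengthConstantSizeFor:  %.2f ns/op, %.2f ns/element%n",
                nanosPerOp(2984.554), nanosPerElement(2984.554, 300000));
    }
}
```

Looking at per-element cost separates the fixed per-call overhead, which dominates for the 2-element list, from the cost of the loop body itself.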

One more tip: instead of hard-coding every possible size, you can write a parameterized benchmark:

@Param({"3", "12", "300", "3000"})
private int length;

Then you can use that parameter in your setup:

 @Setup(Level.Iteration)
 public void setUp() {
     // Assumes the state class has a List<String> field named "list" (hypothetical name).
     list = populateList(length);
 }
