为什么 instanceof 和迭代单个列表比几个专门的列表更快?
Why instanceof and iterating single list is faster than several specialized lists?
我认为将实现不同接口的对象分成几个列表并在之后迭代这些列表会比将所有对象转储到一个列表中然后通过 instanceof
切换更快。例如。这个:
ArrayList<Visible> visibles = new ArrayList<>();
ArrayList<Highlightable> highlightables = new ArrayList<>();
ArrayList<Selectable> selectables = new ArrayList<>();
// populate the lists
// Visible is an interface, Highlightable is also interface (extends Visible),
// Selectable extends Highlightable
// All interfaces have 3 concrete subclasses each,
// to test situations when JVM is not able to optimize too much due to small number of classes
for (Visible e : visibles) {
vsum += e.visibleValue();
}
for (Highlightable e : highlightables) {
vsum += e.visibleValue();
hsum += e.highlightableValue();
}
for (Selectable e : selectables) {
vsum += e.visibleValue();
hsum += e.highlightableValue();
ssum += e.selectableValue();
}
应该比
快
ArrayList<Visible> visibles = new ArrayList<>();
// populate the list
for (Visible e : visibles) {
if (e instanceof Selectable) {
vsum += e.visibleValue();
hsum += ((Selectable) e).highlightableValue();
ssum += ((Selectable) e).selectableValue();
} else if (e instanceof Highlightable) {
vsum += e.visibleValue();
hsum += ((Highlightable) e).highlightableValue();
} else {
vsum += e.visibleValue();
}
}
不过好像不是这样的:
Main.separateLists thrpt 30 1546.898 ± 32.312 ops/s
Main.singleListAndInstanceof thrpt 30 1673.733 ± 29.804 ops/s
我在下面添加了基准测试的完整源代码。
instanceof
版本更快的原因可能是什么?即使我们假设 isntanceof
指令是免费的,那么两个版本至少在性能上应该是相等的(因为它们仍然向列表添加元素并迭代它们)。
package test;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Warmup;
import java.util.ArrayList;
import java.util.Random;
public class Main {
public static void main(String[] args) throws Exception {
org.openjdk.jmh.Main.main(args);
}
@Benchmark
@Warmup(iterations = 5, time = 1)
@Measurement(iterations = 15, time = 1)
@Fork(value = 2)
public static long separateLists() {
ArrayList<Visible> visibles = new ArrayList<>(3_500);
ArrayList<Highlightable> highlightables = new ArrayList<>(3_500);
ArrayList<Selectable> selectables = new ArrayList<>(3_500);
Random random = new Random();
for (int i = 0; i < 10_000; i++) {
switch (random.nextInt(9)) {
case 0:
visibles.add(new Visible1(i));
break;
case 1:
highlightables.add(new Highlightable1(i));
break;
case 2:
selectables.add(new Selectable1(i));
break;
case 3:
visibles.add(new Visible2(i));
break;
case 4:
highlightables.add(new Highlightable2(i));
break;
case 5:
selectables.add(new Selectable2(i));
break;
case 6:
visibles.add(new Visible3(i));
break;
case 7:
highlightables.add(new Highlightable3(i));
break;
case 8:
selectables.add(new Selectable3(i));
break;
}
}
long listSize = visibles.size() + highlightables.size() + selectables.size();
long vsum = 0;
long hsum = 0;
long ssum = 0;
for (Visible e : visibles) {
vsum += e.visibleValue();
}
for (Highlightable e : highlightables) {
vsum += e.visibleValue();
hsum += e.highlightableValue();
}
for (Selectable e : selectables) {
vsum += e.visibleValue();
hsum += e.highlightableValue();
ssum += e.selectableValue();
}
return listSize + vsum * hsum * ssum;
}
@Benchmark
@Warmup(iterations = 5, time = 1)
@Measurement(iterations = 15, time = 1)
@Fork(value = 2)
public static long singleListAndInstanceof() {
ArrayList<Visible> visibles = new ArrayList<>(10_000);
Random random = new Random();
for (int i = 0; i < 10_000; i++) {
switch (random.nextInt(9)) {
case 0:
visibles.add(new Visible1(i));
break;
case 1:
visibles.add(new Highlightable1(i));
break;
case 2:
visibles.add(new Selectable1(i));
break;
case 3:
visibles.add(new Visible2(i));
break;
case 4:
visibles.add(new Highlightable2(i));
break;
case 5:
visibles.add(new Selectable2(i));
break;
case 6:
visibles.add(new Visible3(i));
break;
case 7:
visibles.add(new Highlightable3(i));
break;
case 8:
visibles.add(new Selectable3(i));
break;
}
}
long listSize = visibles.size();
long vsum = 0;
long hsum = 0;
long ssum = 0;
for (Visible e : visibles) {
if (e instanceof Selectable) {
vsum += e.visibleValue();
hsum += ((Selectable) e).highlightableValue();
ssum += ((Selectable) e).selectableValue();
} else if (e instanceof Highlightable) {
vsum += e.visibleValue();
hsum += ((Highlightable) e).highlightableValue();
} else {
vsum += e.visibleValue();
}
}
return listSize + vsum * hsum * ssum;
}
}
abstract class Visible {
abstract int visibleValue();
}
abstract class Highlightable extends Visible {
abstract int highlightableValue();
}
abstract class Selectable extends Highlightable {
abstract int selectableValue();
}
class Visible1 extends Visible {
private int v;
Visible1(int v) {
this.v = v;
}
@Override int visibleValue() { return v; }
}
class Highlightable1 extends Highlightable {
private int v;
Highlightable1(int v) {
this.v = v;
}
@Override int visibleValue() { return v*2; }
@Override int highlightableValue() { return v*3; }
}
class Selectable1 extends Selectable {
private int v;
Selectable1(int v) {
this.v = v;
}
@Override int visibleValue() { return v*4; }
@Override int highlightableValue() { return v*5; }
@Override int selectableValue() { return v*6; }
}
class Visible2 extends Visible {
private int v;
Visible2(int v) {
this.v = v;
}
@Override int visibleValue() { return v*7; }
}
class Highlightable2 extends Highlightable {
private int v;
Highlightable2(int v) {
this.v = v;
}
@Override int visibleValue() { return v*8; }
@Override int highlightableValue() { return v*9; }
}
class Selectable2 extends Selectable {
private int v;
Selectable2(int v) {
this.v = v;
}
@Override int visibleValue() { return v*10; }
@Override int highlightableValue() { return v*11; }
@Override int selectableValue() { return v*12; }
}
class Visible3 extends Visible {
private int v;
Visible3(int v) {
this.v = v;
}
@Override int visibleValue() { return v*13; }
}
class Highlightable3 extends Highlightable {
private int v;
Highlightable3(int v) {
this.v = v;
}
@Override int visibleValue() { return v*14; }
@Override int highlightableValue() { return v*15; }
}
class Selectable3 extends Selectable {
private int v;
Selectable3(int v) {
this.v = v;
}
@Override int visibleValue() { return v*16; }
@Override int highlightableValue() { return v*17; }
@Override int selectableValue() { return v*18; }
}
基准输出:
Main.separateLists thrpt 600 1690.522 ± 6.570 ops/s
Main.singleListAndInstanceof thrpt 600 1751.375 ± 4.368 ops/s
Main.separateLists:L1-dcache-load-misses:u thrpt 2 2298.258 #/op
Main.singleListAndInstanceof:L1-dcache-load-misses:u thrpt 2 627.451 #/op
Main.separateLists:L1-dcache-loads:u thrpt 2 1217756.290 #/op
Main.singleListAndInstanceof:L1-dcache-loads:u thrpt 2 1135982.650 #/op
Main.separateLists:L1-icache-load-misses:u thrpt 2 113.599 #/op
Main.singleListAndInstanceof:L1-icache-load-misses:u thrpt 2 99.896 #/op
Main.separateLists:L1-icache-loads:u thrpt 2 656048.382 #/op
Main.singleListAndInstanceof:L1-icache-loads:u thrpt 2 694074.004 #/op
Main.separateLists:LLC-load-misses:u thrpt 2 872.681 #/op
Main.singleListAndInstanceof:LLC-load-misses:u thrpt 2 355.666 #/op
Main.separateLists:LLC-loads:u thrpt 2 12036.496 #/op
Main.singleListAndInstanceof:LLC-loads:u thrpt 2 7445.434 #/op
Main.separateLists:LLC-stores:u thrpt 2 15277.223 #/op
Main.singleListAndInstanceof:LLC-stores:u thrpt 2 10649.517 #/op
Main.separateLists:branch-misses:u thrpt 2 22463.763 #/op
Main.singleListAndInstanceof:branch-misses:u thrpt 2 29940.958 #/op
Main.separateLists:branches:u thrpt 2 254018.586 #/op
Main.singleListAndInstanceof:branches:u thrpt 2 275450.951 #/op
Main.separateLists:cycles:u thrpt 2 1988517.839 #/op
Main.singleListAndInstanceof:cycles:u thrpt 2 1921584.057 #/op
Main.separateLists:dTLB-load-misses:u thrpt 2 66.212 #/op
Main.singleListAndInstanceof:dTLB-load-misses:u thrpt 2 64.442 #/op
Main.separateLists:dTLB-loads:u thrpt 2 1217929.340 #/op
Main.singleListAndInstanceof:dTLB-loads:u thrpt 2 1135799.981 #/op
Main.separateLists:iTLB-load-misses:u thrpt 2 4.179 #/op
Main.singleListAndInstanceof:iTLB-load-misses:u thrpt 2 3.876 #/op
Main.separateLists:iTLB-loads:u thrpt 2 656595.175 #/op
Main.singleListAndInstanceof:iTLB-loads:u thrpt 2 693913.010 #/op
Main.separateLists:instructions:u thrpt 2 2273646.245 #/op
Main.singleListAndInstanceof:instructions:u thrpt 2 2045332.939 #/op
Main.separateLists:stalled-cycles-backend:u thrpt 2 773671.154 #/op
Main.singleListAndInstanceof:stalled-cycles-backend:u thrpt 2 619477.824 #/op
Main.separateLists:stalled-cycles-frontend:u thrpt 2 184604.485 #/op
Main.singleListAndInstanceof:stalled-cycles-frontend:u thrpt 2 271938.450 #/op
Main.separateLists:·gc.alloc.rate thrpt 600 217.266 ± 0.846 MB/sec
Main.singleListAndInstanceof:·gc.alloc.rate thrpt 600 222.747 ± 0.556 MB/sec
Main.separateLists:·gc.alloc.rate.norm thrpt 600 202181.035 ± 2.986 B/op
Main.singleListAndInstanceof:·gc.alloc.rate.norm thrpt 600 200083.395 ± 4.720 B/op
Main.separateLists:·gc.churn.PS_Eden_Space thrpt 600 217.792 ± 3.841 MB/sec
Main.singleListAndInstanceof:·gc.churn.PS_Eden_Space thrpt 600 223.528 ± 4.973 MB/sec
Main.separateLists:·gc.churn.PS_Eden_Space.norm thrpt 600 202704.197 ± 3508.997 B/op
Main.singleListAndInstanceof:·gc.churn.PS_Eden_Space.norm thrpt 600 200804.794 ± 4414.457 B/op
Main.separateLists:·gc.churn.PS_Survivor_Space thrpt 600 0.095 ± 0.008 MB/sec
Main.singleListAndInstanceof:·gc.churn.PS_Survivor_Space thrpt 600 0.091 ± 0.008 MB/sec
Main.separateLists:·gc.churn.PS_Survivor_Space.norm thrpt 600 88.896 ± 7.778 B/op
Main.singleListAndInstanceof:·gc.churn.PS_Survivor_Space.norm thrpt 600 81.693 ± 7.269 B/op
Main.separateLists:·gc.count thrpt 600 2440.000 counts
Main.singleListAndInstanceof:·gc.count thrpt 600 2289.000 counts
Main.separateLists:·gc.time thrpt 600 4501.000 ms
Main.singleListAndInstanceof:·gc.time thrpt 600 4236.000 ms
更新:下面是基准代码和结果,其中数组设置代码提取到单独的方法中并从测量中删除。正如预期的那样,instanceof
在这种情况下速度较慢 - 以上差异可能与列表设置中的分支预测问题有关。 (虽然这些也很有趣,但它们可能应该单独提出问题)
package test;
import org.openjdk.jmh.annotations.*;
import java.util.ArrayList;
import java.util.Random;
public class Main {
public static void main(String[] args) throws Exception {
org.openjdk.jmh.Main.main(args);
}
@State(Scope.Thread)
public static class SeparateListsState {
public ArrayList<Visible> visibles;
public ArrayList<Highlightable> highlightables;
public ArrayList<Selectable> selectables;
@Setup(Level.Invocation)
public void doSetup() {
visibles = new ArrayList<>();
highlightables = new ArrayList<>();
selectables = new ArrayList<>();
Random random = new Random(9698426994L + 8879);
for (int i = 0; i < 10_000; i++) {
switch (random.nextInt(9)) {
case 0:
visibles.add(new Visible1(i));
break;
case 1:
highlightables.add(new Highlightable1(i));
break;
case 2:
selectables.add(new Selectable1(i));
break;
case 3:
visibles.add(new Visible2(i));
break;
case 4:
highlightables.add(new Highlightable2(i));
break;
case 5:
selectables.add(new Selectable2(i));
break;
case 6:
visibles.add(new Visible3(i));
break;
case 7:
highlightables.add(new Highlightable3(i));
break;
case 8:
selectables.add(new Selectable3(i));
break;
}
}
}
}
@Benchmark
@BenchmarkMode(Mode.Throughput)
@Warmup(iterations = 5, time = 1)
@Measurement(iterations = 150, time = 1)
@Fork(value = 2)
public static long separateLists(SeparateListsState state) {
long vsum = 0;
long hsum = 0;
long ssum = 0;
for (Visible e : state.visibles) {
vsum += e.visibleValue();
}
for (Highlightable e : state.highlightables) {
vsum += e.visibleValue();
hsum += e.highlightableValue();
}
for (Selectable e : state.selectables) {
vsum += e.visibleValue();
hsum += e.highlightableValue();
ssum += e.selectableValue();
}
return vsum * hsum * ssum;
}
@State(Scope.Thread)
public static class SingleListAndInstanceofState {
public ArrayList<Visible> visibles;
@Setup(Level.Invocation)
public void doSetup() {
visibles = new ArrayList<>();
Random random = new Random(9698426994L + 8879);
for (int i = 0; i < 10_000; i++) {
switch (random.nextInt(9)) {
case 0:
visibles.add(new Visible1(i));
break;
case 1:
visibles.add(new Highlightable1(i));
break;
case 2:
visibles.add(new Selectable1(i));
break;
case 3:
visibles.add(new Visible2(i));
break;
case 4:
visibles.add(new Highlightable2(i));
break;
case 5:
visibles.add(new Selectable2(i));
break;
case 6:
visibles.add(new Visible3(i));
break;
case 7:
visibles.add(new Highlightable3(i));
break;
case 8:
visibles.add(new Selectable3(i));
break;
}
}
}
}
@Benchmark
@BenchmarkMode(Mode.Throughput)
@Warmup(iterations = 5, time = 1)
@Measurement(iterations = 150, time = 1)
@Fork(value = 2)
public static long singleListAndInstanceof(SingleListAndInstanceofState state) {
long vsum = 0;
long hsum = 0;
long ssum = 0;
for (Visible e : state.visibles) {
if (e instanceof Selectable) {
vsum += e.visibleValue();
hsum += ((Selectable) e).highlightableValue();
ssum += ((Selectable) e).selectableValue();
} else if (e instanceof Highlightable) {
vsum += e.visibleValue();
hsum += ((Highlightable) e).highlightableValue();
} else {
vsum += e.visibleValue();
}
}
return vsum * hsum * ssum;
}
}
abstract class Visible {
abstract int visibleValue();
}
abstract class Highlightable extends Visible {
abstract int highlightableValue();
}
abstract class Selectable extends Highlightable {
abstract int selectableValue();
}
class Visible1 extends Visible {
private int v;
Visible1(int v) {
this.v = v;
}
@Override int visibleValue() { return v; }
}
class Highlightable1 extends Highlightable {
private int v;
Highlightable1(int v) {
this.v = v;
}
@Override int visibleValue() { return v*2; }
@Override int highlightableValue() { return v*3; }
}
class Selectable1 extends Selectable {
private int v;
Selectable1(int v) {
this.v = v;
}
@Override int visibleValue() { return v*4; }
@Override int highlightableValue() { return v*5; }
@Override int selectableValue() { return v*6; }
}
class Visible2 extends Visible {
private int v;
Visible2(int v) {
this.v = v;
}
@Override int visibleValue() { return v*7; }
}
class Highlightable2 extends Highlightable {
private int v;
Highlightable2(int v) {
this.v = v;
}
@Override int visibleValue() { return v*8; }
@Override int highlightableValue() { return v*9; }
}
class Selectable2 extends Selectable {
private int v;
Selectable2(int v) {
this.v = v;
}
@Override int visibleValue() { return v*10; }
@Override int highlightableValue() { return v*11; }
@Override int selectableValue() { return v*12; }
}
class Visible3 extends Visible {
private int v;
Visible3(int v) {
this.v = v;
}
@Override int visibleValue() { return v*13; }
}
class Highlightable3 extends Highlightable {
private int v;
Highlightable3(int v) {
this.v = v;
}
@Override int visibleValue() { return v*14; }
@Override int highlightableValue() { return v*15; }
}
class Selectable3 extends Selectable {
private int v;
Selectable3(int v) {
this.v = v;
}
@Override int visibleValue() { return v*16; }
@Override int highlightableValue() { return v*17; }
@Override int selectableValue() { return v*18; }
}
结果:
Main.separateLists thrpt 300 4211.552 ± 23.791 ops/s
Main.singleListAndInstanceof thrpt 300 3920.251 ± 15.478 ops/s
Main.separateLists:L1-dcache-load-misses:u thrpt 2 3046.033 #/op
Main.singleListAndInstanceof:L1-dcache-load-misses:u thrpt 2 1089.122 #/op
Main.separateLists:L1-dcache-loads:u thrpt 2 1090745.006 #/op
Main.singleListAndInstanceof:L1-dcache-loads:u thrpt 2 1125243.609 #/op
Main.separateLists:L1-icache-load-misses:u thrpt 2 150.542 #/op
Main.singleListAndInstanceof:L1-icache-load-misses:u thrpt 2 143.304 #/op
Main.separateLists:L1-icache-loads:u thrpt 2 600852.620 #/op
Main.singleListAndInstanceof:L1-icache-loads:u thrpt 2 700771.042 #/op
Main.separateLists:LLC-load-misses:u thrpt 2 1299.520 #/op
Main.singleListAndInstanceof:LLC-load-misses:u thrpt 2 636.764 #/op
Main.separateLists:LLC-loads:u thrpt 2 14408.815 #/op
Main.singleListAndInstanceof:LLC-loads:u thrpt 2 10429.768 #/op
Main.separateLists:LLC-stores:u thrpt 2 18999.178 #/op
Main.singleListAndInstanceof:LLC-stores:u thrpt 2 15370.582 #/op
Main.separateLists:branch-misses:u thrpt 2 22578.062 #/op
Main.singleListAndInstanceof:branch-misses:u thrpt 2 29257.959 #/op
Main.separateLists:branches:u thrpt 2 258026.890 #/op
Main.singleListAndInstanceof:branches:u thrpt 2 284911.889 #/op
Main.separateLists:cycles:u thrpt 2 1915774.770 #/op
Main.singleListAndInstanceof:cycles:u thrpt 2 1974841.023 #/op
Main.separateLists:dTLB-load-misses:u thrpt 2 101.573 #/op
Main.singleListAndInstanceof:dTLB-load-misses:u thrpt 2 99.982 #/op
Main.separateLists:dTLB-loads:u thrpt 2 1090174.103 #/op
Main.singleListAndInstanceof:dTLB-loads:u thrpt 2 1129185.929 #/op
Main.separateLists:iTLB-load-misses:u thrpt 2 4.432 #/op
Main.singleListAndInstanceof:iTLB-load-misses:u thrpt 2 3.955 #/op
Main.separateLists:iTLB-loads:u thrpt 2 600301.665 #/op
Main.singleListAndInstanceof:iTLB-loads:u thrpt 2 703339.482 #/op
Main.separateLists:instructions:u thrpt 2 1974603.052 #/op
Main.singleListAndInstanceof:instructions:u thrpt 2 2040460.093 #/op
Main.separateLists:stalled-cycles-backend:u thrpt 2 808914.974 #/op
Main.singleListAndInstanceof:stalled-cycles-backend:u thrpt 2 685615.056 #/op
Main.separateLists:stalled-cycles-frontend:u thrpt 2 186013.216 #/op
Main.singleListAndInstanceof:stalled-cycles-frontend:u thrpt 2 272207.204 #/op
Main.separateLists:·gc.alloc.rate thrpt 300 346.891 ± 1.166 MB/sec
Main.singleListAndInstanceof:·gc.alloc.rate thrpt 300 358.297 ± 0.614 MB/sec
Main.separateLists:·gc.alloc.rate.norm thrpt 300 310744.294 ± 0.107 B/op
Main.singleListAndInstanceof:·gc.alloc.rate.norm thrpt 300 328992.302 ± 0.110 B/op
Main.separateLists:·gc.churn.PS_Eden_Space thrpt 300 349.387 ± 14.305 MB/sec
Main.singleListAndInstanceof:·gc.churn.PS_Eden_Space thrpt 300 360.039 ± 9.075 MB/sec
Main.separateLists:·gc.churn.PS_Eden_Space.norm thrpt 300 313154.953 ± 13018.012 B/op
Main.singleListAndInstanceof:·gc.churn.PS_Eden_Space.norm thrpt 300 330629.833 ± 8345.712 B/op
Main.separateLists:·gc.churn.PS_Survivor_Space thrpt 300 0.092 ± 0.012 MB/sec
Main.singleListAndInstanceof:·gc.churn.PS_Survivor_Space thrpt 300 0.094 ± 0.011 MB/sec
Main.separateLists:·gc.churn.PS_Survivor_Space.norm thrpt 300 82.348 ± 10.661 B/op
Main.singleListAndInstanceof:·gc.churn.PS_Survivor_Space.norm thrpt 300 86.465 ± 10.417 B/op
Main.separateLists:·gc.count thrpt 300 1196.000 counts
Main.singleListAndInstanceof:·gc.count thrpt 300 1235.000 counts
Main.separateLists:·gc.time thrpt 300 2178.000 ms
Main.singleListAndInstanceof:·gc.time thrpt 300 2355.000 ms
正如 NoDataFound 在评论中建议的那样,您不仅在比较遍历列表的性能,还在比较列表填充方法。您需要将这部分代码放入设置方法中 - 否则您可能会受到三个 ArrayList
实例(除其他外)的调整大小操作的影响。
您还应该放弃使用 Random
来填充列表,或者至少在两个实现中使用相同的种子对其进行实例化 - 否则您不会在两个实现中创建可重复的元素顺序.
核心的一点是:你的测量有问题。您的问题的第一个版本不仅测量结果不同 "things",而且即使是更新后的问题也有一个大问题,即 10_000
!
的样本量(太低了)
这根本不是一个合理的样本量。至少,不是一个人。您应该检查一下您看到的 10K、50K、100K、100 万……循环迭代。
你看,你犯了很多人在使用 Java 时犯的错误:他们认为 源代码 方面的这个或那个构造决定了性能。
这只是部分正确。您会看到,真正的 性能提升来自 JVM 中的 JIT 编译器将在运行时根据其收集的分析信息进行的无数优化。
我认为,JIT 启动并将字节码转换为高度优化的机器代码的默认阈值类似于 90K(?) 方法 invocations/loop 次迭代。
请理解这一点:除非您的代码执行 100K 或更多规模的操作,否则 JIT 甚至不会考虑您的代码 "worth optimising"。但随后它会开始发挥作用,这会对整体性能产生巨大影响。然后,您在源代码中放入了什么……可能不再重要了。
因此,没有一个测量可以告诉您 "best" 解决方案是什么。这取决于您经历的迭代次数。
因此,真正的答案是:java 性能基准测试很难,你必须对你所做的事情以及你从结果中得出的结论非常勤奋。
除此之外,真正的答案是:性能是一个奢侈的问题。当然,应该避免白白消耗 CPU 个周期的愚蠢错误。但除此之外,您的主要目标始终是编写易于阅读和理解的简单代码。然后,当您注意到 "this feels sluggish" 或 "this doesn't meet our SLAs" 时,您就会仔细定义实验来衡量正在发生的事情,以确定导致性能问题的那段代码。仅作记录:您知道 JIT 可以优化哪些代码最好……惊喜:简单直接的代码,看起来像 90% 的优秀 java 编码员倾向于编写的代码。
我认为将实现不同接口的对象分成几个列表并在之后迭代这些列表会比将所有对象转储到一个列表中然后通过 instanceof
切换更快。例如。这个:
ArrayList<Visible> visibles = new ArrayList<>();
ArrayList<Highlightable> highlightables = new ArrayList<>();
ArrayList<Selectable> selectables = new ArrayList<>();
// populate the lists
// Visible is an interface, Highlightable is also interface (extends Visible),
// Selectable extends Highlightable
// All interfaces have 3 concrete subclasses each,
// to test situations when JVM is not able to optimize too much due to small number of classes
for (Visible e : visibles) {
vsum += e.visibleValue();
}
for (Highlightable e : highlightables) {
vsum += e.visibleValue();
hsum += e.highlightableValue();
}
for (Selectable e : selectables) {
vsum += e.visibleValue();
hsum += e.highlightableValue();
ssum += e.selectableValue();
}
应该比
快ArrayList<Visible> visibles = new ArrayList<>();
// populate the list
for (Visible e : visibles) {
if (e instanceof Selectable) {
vsum += e.visibleValue();
hsum += ((Selectable) e).highlightableValue();
ssum += ((Selectable) e).selectableValue();
} else if (e instanceof Highlightable) {
vsum += e.visibleValue();
hsum += ((Highlightable) e).highlightableValue();
} else {
vsum += e.visibleValue();
}
}
不过好像不是这样的:
Main.separateLists thrpt 30 1546.898 ± 32.312 ops/s
Main.singleListAndInstanceof thrpt 30 1673.733 ± 29.804 ops/s
我在下面添加了基准测试的完整源代码。
instanceof
版本更快的原因可能是什么?即使我们假设 isntanceof
指令是免费的,那么两个版本至少在性能上应该是相等的(因为它们仍然向列表添加元素并迭代它们)。
package test;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Warmup;
import java.util.ArrayList;
import java.util.Random;
public class Main {
public static void main(String[] args) throws Exception {
org.openjdk.jmh.Main.main(args);
}
@Benchmark
@Warmup(iterations = 5, time = 1)
@Measurement(iterations = 15, time = 1)
@Fork(value = 2)
public static long separateLists() {
ArrayList<Visible> visibles = new ArrayList<>(3_500);
ArrayList<Highlightable> highlightables = new ArrayList<>(3_500);
ArrayList<Selectable> selectables = new ArrayList<>(3_500);
Random random = new Random();
for (int i = 0; i < 10_000; i++) {
switch (random.nextInt(9)) {
case 0:
visibles.add(new Visible1(i));
break;
case 1:
highlightables.add(new Highlightable1(i));
break;
case 2:
selectables.add(new Selectable1(i));
break;
case 3:
visibles.add(new Visible2(i));
break;
case 4:
highlightables.add(new Highlightable2(i));
break;
case 5:
selectables.add(new Selectable2(i));
break;
case 6:
visibles.add(new Visible3(i));
break;
case 7:
highlightables.add(new Highlightable3(i));
break;
case 8:
selectables.add(new Selectable3(i));
break;
}
}
long listSize = visibles.size() + highlightables.size() + selectables.size();
long vsum = 0;
long hsum = 0;
long ssum = 0;
for (Visible e : visibles) {
vsum += e.visibleValue();
}
for (Highlightable e : highlightables) {
vsum += e.visibleValue();
hsum += e.highlightableValue();
}
for (Selectable e : selectables) {
vsum += e.visibleValue();
hsum += e.highlightableValue();
ssum += e.selectableValue();
}
return listSize + vsum * hsum * ssum;
}
@Benchmark
@Warmup(iterations = 5, time = 1)
@Measurement(iterations = 15, time = 1)
@Fork(value = 2)
public static long singleListAndInstanceof() {
ArrayList<Visible> visibles = new ArrayList<>(10_000);
Random random = new Random();
for (int i = 0; i < 10_000; i++) {
switch (random.nextInt(9)) {
case 0:
visibles.add(new Visible1(i));
break;
case 1:
visibles.add(new Highlightable1(i));
break;
case 2:
visibles.add(new Selectable1(i));
break;
case 3:
visibles.add(new Visible2(i));
break;
case 4:
visibles.add(new Highlightable2(i));
break;
case 5:
visibles.add(new Selectable2(i));
break;
case 6:
visibles.add(new Visible3(i));
break;
case 7:
visibles.add(new Highlightable3(i));
break;
case 8:
visibles.add(new Selectable3(i));
break;
}
}
long listSize = visibles.size();
long vsum = 0;
long hsum = 0;
long ssum = 0;
for (Visible e : visibles) {
if (e instanceof Selectable) {
vsum += e.visibleValue();
hsum += ((Selectable) e).highlightableValue();
ssum += ((Selectable) e).selectableValue();
} else if (e instanceof Highlightable) {
vsum += e.visibleValue();
hsum += ((Highlightable) e).highlightableValue();
} else {
vsum += e.visibleValue();
}
}
return listSize + vsum * hsum * ssum;
}
}
abstract class Visible {
abstract int visibleValue();
}
abstract class Highlightable extends Visible {
abstract int highlightableValue();
}
abstract class Selectable extends Highlightable {
abstract int selectableValue();
}
class Visible1 extends Visible {
private int v;
Visible1(int v) {
this.v = v;
}
@Override int visibleValue() { return v; }
}
class Highlightable1 extends Highlightable {
private int v;
Highlightable1(int v) {
this.v = v;
}
@Override int visibleValue() { return v*2; }
@Override int highlightableValue() { return v*3; }
}
class Selectable1 extends Selectable {
private int v;
Selectable1(int v) {
this.v = v;
}
@Override int visibleValue() { return v*4; }
@Override int highlightableValue() { return v*5; }
@Override int selectableValue() { return v*6; }
}
class Visible2 extends Visible {
private int v;
Visible2(int v) {
this.v = v;
}
@Override int visibleValue() { return v*7; }
}
class Highlightable2 extends Highlightable {
private int v;
Highlightable2(int v) {
this.v = v;
}
@Override int visibleValue() { return v*8; }
@Override int highlightableValue() { return v*9; }
}
class Selectable2 extends Selectable {
private int v;
Selectable2(int v) {
this.v = v;
}
@Override int visibleValue() { return v*10; }
@Override int highlightableValue() { return v*11; }
@Override int selectableValue() { return v*12; }
}
class Visible3 extends Visible {
private int v;
Visible3(int v) {
this.v = v;
}
@Override int visibleValue() { return v*13; }
}
class Highlightable3 extends Highlightable {
private int v;
Highlightable3(int v) {
this.v = v;
}
@Override int visibleValue() { return v*14; }
@Override int highlightableValue() { return v*15; }
}
class Selectable3 extends Selectable {
private int v;
Selectable3(int v) {
this.v = v;
}
@Override int visibleValue() { return v*16; }
@Override int highlightableValue() { return v*17; }
@Override int selectableValue() { return v*18; }
}
基准输出:
Main.separateLists thrpt 600 1690.522 ± 6.570 ops/s
Main.singleListAndInstanceof thrpt 600 1751.375 ± 4.368 ops/s
Main.separateLists:L1-dcache-load-misses:u thrpt 2 2298.258 #/op
Main.singleListAndInstanceof:L1-dcache-load-misses:u thrpt 2 627.451 #/op
Main.separateLists:L1-dcache-loads:u thrpt 2 1217756.290 #/op
Main.singleListAndInstanceof:L1-dcache-loads:u thrpt 2 1135982.650 #/op
Main.separateLists:L1-icache-load-misses:u thrpt 2 113.599 #/op
Main.singleListAndInstanceof:L1-icache-load-misses:u thrpt 2 99.896 #/op
Main.separateLists:L1-icache-loads:u thrpt 2 656048.382 #/op
Main.singleListAndInstanceof:L1-icache-loads:u thrpt 2 694074.004 #/op
Main.separateLists:LLC-load-misses:u thrpt 2 872.681 #/op
Main.singleListAndInstanceof:LLC-load-misses:u thrpt 2 355.666 #/op
Main.separateLists:LLC-loads:u thrpt 2 12036.496 #/op
Main.singleListAndInstanceof:LLC-loads:u thrpt 2 7445.434 #/op
Main.separateLists:LLC-stores:u thrpt 2 15277.223 #/op
Main.singleListAndInstanceof:LLC-stores:u thrpt 2 10649.517 #/op
Main.separateLists:branch-misses:u thrpt 2 22463.763 #/op
Main.singleListAndInstanceof:branch-misses:u thrpt 2 29940.958 #/op
Main.separateLists:branches:u thrpt 2 254018.586 #/op
Main.singleListAndInstanceof:branches:u thrpt 2 275450.951 #/op
Main.separateLists:cycles:u thrpt 2 1988517.839 #/op
Main.singleListAndInstanceof:cycles:u thrpt 2 1921584.057 #/op
Main.separateLists:dTLB-load-misses:u thrpt 2 66.212 #/op
Main.singleListAndInstanceof:dTLB-load-misses:u thrpt 2 64.442 #/op
Main.separateLists:dTLB-loads:u thrpt 2 1217929.340 #/op
Main.singleListAndInstanceof:dTLB-loads:u thrpt 2 1135799.981 #/op
Main.separateLists:iTLB-load-misses:u thrpt 2 4.179 #/op
Main.singleListAndInstanceof:iTLB-load-misses:u thrpt 2 3.876 #/op
Main.separateLists:iTLB-loads:u thrpt 2 656595.175 #/op
Main.singleListAndInstanceof:iTLB-loads:u thrpt 2 693913.010 #/op
Main.separateLists:instructions:u thrpt 2 2273646.245 #/op
Main.singleListAndInstanceof:instructions:u thrpt 2 2045332.939 #/op
Main.separateLists:stalled-cycles-backend:u thrpt 2 773671.154 #/op
Main.singleListAndInstanceof:stalled-cycles-backend:u thrpt 2 619477.824 #/op
Main.separateLists:stalled-cycles-frontend:u thrpt 2 184604.485 #/op
Main.singleListAndInstanceof:stalled-cycles-frontend:u thrpt 2 271938.450 #/op
Main.separateLists:·gc.alloc.rate thrpt 600 217.266 ± 0.846 MB/sec
Main.singleListAndInstanceof:·gc.alloc.rate thrpt 600 222.747 ± 0.556 MB/sec
Main.separateLists:·gc.alloc.rate.norm thrpt 600 202181.035 ± 2.986 B/op
Main.singleListAndInstanceof:·gc.alloc.rate.norm thrpt 600 200083.395 ± 4.720 B/op
Main.separateLists:·gc.churn.PS_Eden_Space thrpt 600 217.792 ± 3.841 MB/sec
Main.singleListAndInstanceof:·gc.churn.PS_Eden_Space thrpt 600 223.528 ± 4.973 MB/sec
Main.separateLists:·gc.churn.PS_Eden_Space.norm thrpt 600 202704.197 ± 3508.997 B/op
Main.singleListAndInstanceof:·gc.churn.PS_Eden_Space.norm thrpt 600 200804.794 ± 4414.457 B/op
Main.separateLists:·gc.churn.PS_Survivor_Space thrpt 600 0.095 ± 0.008 MB/sec
Main.singleListAndInstanceof:·gc.churn.PS_Survivor_Space thrpt 600 0.091 ± 0.008 MB/sec
Main.separateLists:·gc.churn.PS_Survivor_Space.norm thrpt 600 88.896 ± 7.778 B/op
Main.singleListAndInstanceof:·gc.churn.PS_Survivor_Space.norm thrpt 600 81.693 ± 7.269 B/op
Main.separateLists:·gc.count thrpt 600 2440.000 counts
Main.singleListAndInstanceof:·gc.count thrpt 600 2289.000 counts
Main.separateLists:·gc.time thrpt 600 4501.000 ms
Main.singleListAndInstanceof:·gc.time thrpt 600 4236.000 ms
更新:下面是基准代码和结果,其中数组设置代码提取到单独的方法中并从测量中删除。正如预期的那样,instanceof
在这种情况下速度较慢 - 以上差异可能与列表设置中的分支预测问题有关。 (虽然这些也很有趣,但它们可能应该单独提出问题)
package test;
import org.openjdk.jmh.annotations.*;
import java.util.ArrayList;
import java.util.Random;
public class Main {
public static void main(String[] args) throws Exception {
org.openjdk.jmh.Main.main(args);
}
@State(Scope.Thread)
public static class SeparateListsState {
public ArrayList<Visible> visibles;
public ArrayList<Highlightable> highlightables;
public ArrayList<Selectable> selectables;
@Setup(Level.Invocation)
public void doSetup() {
visibles = new ArrayList<>();
highlightables = new ArrayList<>();
selectables = new ArrayList<>();
Random random = new Random(9698426994L + 8879);
for (int i = 0; i < 10_000; i++) {
switch (random.nextInt(9)) {
case 0:
visibles.add(new Visible1(i));
break;
case 1:
highlightables.add(new Highlightable1(i));
break;
case 2:
selectables.add(new Selectable1(i));
break;
case 3:
visibles.add(new Visible2(i));
break;
case 4:
highlightables.add(new Highlightable2(i));
break;
case 5:
selectables.add(new Selectable2(i));
break;
case 6:
visibles.add(new Visible3(i));
break;
case 7:
highlightables.add(new Highlightable3(i));
break;
case 8:
selectables.add(new Selectable3(i));
break;
}
}
}
}
@Benchmark
@BenchmarkMode(Mode.Throughput)
@Warmup(iterations = 5, time = 1)
@Measurement(iterations = 150, time = 1)
@Fork(value = 2)
public static long separateLists(SeparateListsState state) {
long vsum = 0;
long hsum = 0;
long ssum = 0;
for (Visible e : state.visibles) {
vsum += e.visibleValue();
}
for (Highlightable e : state.highlightables) {
vsum += e.visibleValue();
hsum += e.highlightableValue();
}
for (Selectable e : state.selectables) {
vsum += e.visibleValue();
hsum += e.highlightableValue();
ssum += e.selectableValue();
}
return vsum * hsum * ssum;
}
@State(Scope.Thread)
public static class SingleListAndInstanceofState {
public ArrayList<Visible> visibles;
@Setup(Level.Invocation)
public void doSetup() {
visibles = new ArrayList<>();
Random random = new Random(9698426994L + 8879);
for (int i = 0; i < 10_000; i++) {
switch (random.nextInt(9)) {
case 0:
visibles.add(new Visible1(i));
break;
case 1:
visibles.add(new Highlightable1(i));
break;
case 2:
visibles.add(new Selectable1(i));
break;
case 3:
visibles.add(new Visible2(i));
break;
case 4:
visibles.add(new Highlightable2(i));
break;
case 5:
visibles.add(new Selectable2(i));
break;
case 6:
visibles.add(new Visible3(i));
break;
case 7:
visibles.add(new Highlightable3(i));
break;
case 8:
visibles.add(new Selectable3(i));
break;
}
}
}
}
@Benchmark
@BenchmarkMode(Mode.Throughput)
@Warmup(iterations = 5, time = 1)
@Measurement(iterations = 150, time = 1)
@Fork(value = 2)
public static long singleListAndInstanceof(SingleListAndInstanceofState state) {
long vsum = 0;
long hsum = 0;
long ssum = 0;
for (Visible e : state.visibles) {
if (e instanceof Selectable) {
vsum += e.visibleValue();
hsum += ((Selectable) e).highlightableValue();
ssum += ((Selectable) e).selectableValue();
} else if (e instanceof Highlightable) {
vsum += e.visibleValue();
hsum += ((Highlightable) e).highlightableValue();
} else {
vsum += e.visibleValue();
}
}
return vsum * hsum * ssum;
}
}
abstract class Visible {
abstract int visibleValue();
}
abstract class Highlightable extends Visible {
abstract int highlightableValue();
}
abstract class Selectable extends Highlightable {
abstract int selectableValue();
}
class Visible1 extends Visible {
private int v;
Visible1(int v) {
this.v = v;
}
@Override int visibleValue() { return v; }
}
class Highlightable1 extends Highlightable {
private int v;
Highlightable1(int v) {
this.v = v;
}
@Override int visibleValue() { return v*2; }
@Override int highlightableValue() { return v*3; }
}
class Selectable1 extends Selectable {
private int v;
Selectable1(int v) {
this.v = v;
}
@Override int visibleValue() { return v*4; }
@Override int highlightableValue() { return v*5; }
@Override int selectableValue() { return v*6; }
}
class Visible2 extends Visible {
private int v;
Visible2(int v) {
this.v = v;
}
@Override int visibleValue() { return v*7; }
}
class Highlightable2 extends Highlightable {
private int v;
Highlightable2(int v) {
this.v = v;
}
@Override int visibleValue() { return v*8; }
@Override int highlightableValue() { return v*9; }
}
class Selectable2 extends Selectable {
private int v;
Selectable2(int v) {
this.v = v;
}
@Override int visibleValue() { return v*10; }
@Override int highlightableValue() { return v*11; }
@Override int selectableValue() { return v*12; }
}
class Visible3 extends Visible {
private int v;
Visible3(int v) {
this.v = v;
}
@Override int visibleValue() { return v*13; }
}
class Highlightable3 extends Highlightable {
private int v;
Highlightable3(int v) {
this.v = v;
}
@Override int visibleValue() { return v*14; }
@Override int highlightableValue() { return v*15; }
}
class Selectable3 extends Selectable {
private int v;
Selectable3(int v) {
this.v = v;
}
@Override int visibleValue() { return v*16; }
@Override int highlightableValue() { return v*17; }
@Override int selectableValue() { return v*18; }
}
结果:
Main.separateLists thrpt 300 4211.552 ± 23.791 ops/s
Main.singleListAndInstanceof thrpt 300 3920.251 ± 15.478 ops/s
Main.separateLists:L1-dcache-load-misses:u thrpt 2 3046.033 #/op
Main.singleListAndInstanceof:L1-dcache-load-misses:u thrpt 2 1089.122 #/op
Main.separateLists:L1-dcache-loads:u thrpt 2 1090745.006 #/op
Main.singleListAndInstanceof:L1-dcache-loads:u thrpt 2 1125243.609 #/op
Main.separateLists:L1-icache-load-misses:u thrpt 2 150.542 #/op
Main.singleListAndInstanceof:L1-icache-load-misses:u thrpt 2 143.304 #/op
Main.separateLists:L1-icache-loads:u thrpt 2 600852.620 #/op
Main.singleListAndInstanceof:L1-icache-loads:u thrpt 2 700771.042 #/op
Main.separateLists:LLC-load-misses:u thrpt 2 1299.520 #/op
Main.singleListAndInstanceof:LLC-load-misses:u thrpt 2 636.764 #/op
Main.separateLists:LLC-loads:u thrpt 2 14408.815 #/op
Main.singleListAndInstanceof:LLC-loads:u thrpt 2 10429.768 #/op
Main.separateLists:LLC-stores:u thrpt 2 18999.178 #/op
Main.singleListAndInstanceof:LLC-stores:u thrpt 2 15370.582 #/op
Main.separateLists:branch-misses:u thrpt 2 22578.062 #/op
Main.singleListAndInstanceof:branch-misses:u thrpt 2 29257.959 #/op
Main.separateLists:branches:u thrpt 2 258026.890 #/op
Main.singleListAndInstanceof:branches:u thrpt 2 284911.889 #/op
Main.separateLists:cycles:u thrpt 2 1915774.770 #/op
Main.singleListAndInstanceof:cycles:u thrpt 2 1974841.023 #/op
Main.separateLists:dTLB-load-misses:u thrpt 2 101.573 #/op
Main.singleListAndInstanceof:dTLB-load-misses:u thrpt 2 99.982 #/op
Main.separateLists:dTLB-loads:u thrpt 2 1090174.103 #/op
Main.singleListAndInstanceof:dTLB-loads:u thrpt 2 1129185.929 #/op
Main.separateLists:iTLB-load-misses:u thrpt 2 4.432 #/op
Main.singleListAndInstanceof:iTLB-load-misses:u thrpt 2 3.955 #/op
Main.separateLists:iTLB-loads:u thrpt 2 600301.665 #/op
Main.singleListAndInstanceof:iTLB-loads:u thrpt 2 703339.482 #/op
Main.separateLists:instructions:u thrpt 2 1974603.052 #/op
Main.singleListAndInstanceof:instructions:u thrpt 2 2040460.093 #/op
Main.separateLists:stalled-cycles-backend:u thrpt 2 808914.974 #/op
Main.singleListAndInstanceof:stalled-cycles-backend:u thrpt 2 685615.056 #/op
Main.separateLists:stalled-cycles-frontend:u thrpt 2 186013.216 #/op
Main.singleListAndInstanceof:stalled-cycles-frontend:u thrpt 2 272207.204 #/op
Main.separateLists:·gc.alloc.rate thrpt 300 346.891 ± 1.166 MB/sec
Main.singleListAndInstanceof:·gc.alloc.rate thrpt 300 358.297 ± 0.614 MB/sec
Main.separateLists:·gc.alloc.rate.norm thrpt 300 310744.294 ± 0.107 B/op
Main.singleListAndInstanceof:·gc.alloc.rate.norm thrpt 300 328992.302 ± 0.110 B/op
Main.separateLists:·gc.churn.PS_Eden_Space thrpt 300 349.387 ± 14.305 MB/sec
Main.singleListAndInstanceof:·gc.churn.PS_Eden_Space thrpt 300 360.039 ± 9.075 MB/sec
Main.separateLists:·gc.churn.PS_Eden_Space.norm thrpt 300 313154.953 ± 13018.012 B/op
Main.singleListAndInstanceof:·gc.churn.PS_Eden_Space.norm thrpt 300 330629.833 ± 8345.712 B/op
Main.separateLists:·gc.churn.PS_Survivor_Space thrpt 300 0.092 ± 0.012 MB/sec
Main.singleListAndInstanceof:·gc.churn.PS_Survivor_Space thrpt 300 0.094 ± 0.011 MB/sec
Main.separateLists:·gc.churn.PS_Survivor_Space.norm thrpt 300 82.348 ± 10.661 B/op
Main.singleListAndInstanceof:·gc.churn.PS_Survivor_Space.norm thrpt 300 86.465 ± 10.417 B/op
Main.separateLists:·gc.count thrpt 300 1196.000 counts
Main.singleListAndInstanceof:·gc.count thrpt 300 1235.000 counts
Main.separateLists:·gc.time thrpt 300 2178.000 ms
Main.singleListAndInstanceof:·gc.time thrpt 300 2355.000 ms
正如 NoDataFound 在评论中建议的那样,您不仅在比较遍历列表的性能,还在比较列表填充方法。您需要将这部分代码放入设置方法中 - 否则您可能会受到三个 ArrayList
实例(除其他外)的调整大小操作的影响。
您还应该放弃使用 Random
来填充列表,或者至少在两个实现中使用相同的种子对其进行实例化 - 否则您不会在两个实现中创建可重复的元素顺序.
核心的一点是:你的测量有问题。您的问题的第一个版本不仅测量结果不同 "things",而且即使是更新后的问题也有一个大问题,即 10_000
!
这根本不是一个合理的样本量。至少,不是一个人。您应该检查一下您看到的 10K、50K、100K、100 万……循环迭代。
你看,你犯了很多人在使用 Java 时犯的错误:他们认为 源代码 方面的这个或那个构造决定了性能。
这只是部分正确。您会看到,真正的 性能提升来自 JVM 中的 JIT 编译器将在运行时根据其收集的分析信息进行的无数优化。
我认为,JIT 启动并将字节码转换为高度优化的机器代码的默认阈值类似于 90K(?) 方法 invocations/loop 次迭代。
请理解这一点:除非您的代码执行 100K 或更多规模的操作,否则 JIT 甚至不会考虑您的代码 "worth optimising"。但随后它会开始发挥作用,这会对整体性能产生巨大影响。然后,您在源代码中放入了什么……可能不再重要了。
因此,没有一个测量可以告诉您 "best" 解决方案是什么。这取决于您经历的迭代次数。
因此,真正的答案是:java 性能基准测试很难,你必须对你所做的事情以及你从结果中得出的结论非常勤奋。
除此之外,真正的答案是:性能是一个奢侈的问题。当然,应该避免白白消耗 CPU 个周期的愚蠢错误。但除此之外,您的主要目标始终是编写易于阅读和理解的简单代码。然后,当您注意到 "this feels sluggish" 或 "this doesn't meet our SLAs" 时,您就会仔细定义实验来衡量正在发生的事情,以确定导致性能问题的那段代码。仅作记录:您知道 JIT 可以优化哪些代码最好……惊喜:简单直接的代码,看起来像 90% 的优秀 java 编码员倾向于编写的代码。