Consuming stack traces noticeably slower in Java 11 than Java 8

I was comparing the performance of JDK 8 and 11 using jmh 1.21 when I ran into some surprising numbers:

Java version: 1.8.0_192, vendor: Oracle Corporation

Benchmark                              Mode  Cnt      Score    Error  Units
MyBenchmark.throwAndConsumeStacktrace  avgt   25  21525.584 ± 58.957  ns/op


Java version: 9.0.4, vendor: Oracle Corporation

Benchmark                              Mode  Cnt      Score     Error  Units
MyBenchmark.throwAndConsumeStacktrace  avgt   25  28243.899 ± 498.173  ns/op


Java version: 10.0.2, vendor: Oracle Corporation

Benchmark                              Mode  Cnt      Score     Error  Units
MyBenchmark.throwAndConsumeStacktrace  avgt   25  28499.736 ± 215.837  ns/op


Java version: 11.0.1, vendor: Oracle Corporation

Benchmark                              Mode  Cnt      Score      Error  Units
MyBenchmark.throwAndConsumeStacktrace  avgt   25  48535.766 ± 2175.753  ns/op

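OpenJDK 11 and 12 perform similarly to OracleJDK 11. I have omitted their numbers for the sake of brevity.

I understand that microbenchmarks do not indicate the performance behavior of real-life applications. Still, I'm curious where this difference is coming from. Any ideas?


Here is the benchmark in its entirety: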

pom.xml:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>jmh</groupId>
    <artifactId>consume-stacktrace</artifactId>
    <version>1.0-SNAPSHOT</version>
    <packaging>jar</packaging>
    <name>JMH benchmark sample: Java</name>

    <dependencies>
        <dependency>
            <groupId>org.openjdk.jmh</groupId>
            <artifactId>jmh-core</artifactId>
            <version>${jmh.version}</version>
        </dependency>
        <dependency>
            <groupId>org.openjdk.jmh</groupId>
            <artifactId>jmh-generator-annprocess</artifactId>
            <version>${jmh.version}</version>
            <scope>provided</scope>
        </dependency>
    </dependencies>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <jmh.version>1.21</jmh.version>
        <javac.target>1.8</javac.target>
        <uberjar.name>benchmarks</uberjar.name>
    </properties>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-enforcer-plugin</artifactId>
                <version>1.4.1</version>
                <executions>
                    <execution>
                        <id>enforce-versions</id>
                        <goals>
                            <goal>enforce</goal>
                        </goals>
                        <configuration>
                            <rules>
                                <requireMavenVersion>
                                    <version>3.0</version>
                                </requireMavenVersion>
                            </rules>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.8.0</version>
                <configuration>
                    <compilerVersion>${javac.target}</compilerVersion>
                    <source>${javac.target}</source>
                    <target>${javac.target}</target>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>3.2.1</version>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                            <finalName>${uberjar.name}</finalName>
                            <transformers>
                                <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                                    <mainClass>org.openjdk.jmh.Main</mainClass>
                                </transformer>
                            </transformers>
                            <filters>
                                <filter>
                                    <!--
                                            Shading signed JARs will fail without this.
                                            
                                    -->
                                    <artifact>*:*</artifact>
                                    <excludes>
                                        <exclude>META-INF/*.SF</exclude>
                                        <exclude>META-INF/*.DSA</exclude>
                                        <exclude>META-INF/*.RSA</exclude>
                                    </excludes>
                                </filter>
                            </filters>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
        <pluginManagement>
            <plugins>
                <plugin>
                    <artifactId>maven-clean-plugin</artifactId>
                    <version>2.6.1</version>
                </plugin>
                <plugin>
                    <artifactId>maven-deploy-plugin</artifactId>
                    <version>2.8.2</version>
                </plugin>
                <plugin>
                    <artifactId>maven-install-plugin</artifactId>
                    <version>2.5.2</version>
                </plugin>
                <plugin>
                    <artifactId>maven-jar-plugin</artifactId>
                    <version>3.1.0</version>
                </plugin>
                <plugin>
                    <artifactId>maven-javadoc-plugin</artifactId>
                    <version>3.0.0</version>
                </plugin>
                <plugin>
                    <artifactId>maven-resources-plugin</artifactId>
                    <version>3.1.0</version>
                </plugin>
                <plugin>
                    <artifactId>maven-site-plugin</artifactId>
                    <version>3.7.1</version>
                </plugin>
                <plugin>
                    <artifactId>maven-source-plugin</artifactId>
                    <version>3.0.1</version>
                </plugin>
                <plugin>
                    <artifactId>maven-surefire-plugin</artifactId>
                    <version>2.22.0</version>
                </plugin>
            </plugins>
        </pluginManagement>
    </build>
</project>

src/main/java/jmh/MyBenchmark.java:

package jmh;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.infra.Blackhole;

import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.concurrent.TimeUnit;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class MyBenchmark
{
    @Benchmark
    public void throwAndConsumeStacktrace(Blackhole bh)
    {
        try
        {
            throw new IllegalArgumentException("I love benchmarks");
        }
        catch (IllegalArgumentException e)
        {
            StringWriter sw = new StringWriter();
            e.printStackTrace(new PrintWriter(sw));
            bh.consume(sw.toString());
        }
    }
}

Here is the Windows-specific script I use. It should be trivial to port it to other platforms:

set JAVA_HOME=C:\Program Files\Java\jdk1.8.0_192
call mvn -V -Djavac.target=1.8 clean install
"%JAVA_HOME%\bin\java" -jar target\benchmarks.jar

set JAVA_HOME=C:\Program Files\Java\jdk-9.0.4
call mvn -V -Djavac.target=9 clean install
"%JAVA_HOME%\bin\java" -jar target\benchmarks.jar

set JAVA_HOME=C:\Program Files\Java\jdk-10.0.2
call mvn -V -Djavac.target=10 clean install
"%JAVA_HOME%\bin\java" -jar target\benchmarks.jar

set JAVA_HOME=C:\Program Files\Java\oracle-11.0.1
call mvn -V -Djavac.target=11 clean install
"%JAVA_HOME%\bin\java" -jar target\benchmarks.jar

My runtime environment is:

Apache Maven 3.6.0 (97c98ec64a1fdfee7767ce5ffb20918da4f719f3; 2018-10-24T14:41:47-04:00)
Maven home: C:\Program Files\apache-maven-3.6.0\bin\..
Default locale: en_CA, platform encoding: Cp1252
OS name: "windows 10", version: "10.0", arch: "amd64", family: "windows"

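More specifically, I am running Microsoft Windows [Version 10.0.17763.195].

I suspect this is due to several changes.

The 8->9 regression happened while switching to StackWalker for generating the stack traces (JDK-8150778). Unfortunately, this made VM native code intern a lot of strings, and StringTable became the bottleneck. If you profile OP's benchmark, you will see a profile like the one in JDK-8151751. It should be enough to perf record -g the entire JVM that runs the benchmark, and then look into perf report. (Hint, hint, you can do it yourself next time!)

The 10->11 regression must have happened later. I suspect it is due to StringTable preparations for switching to a fully concurrent hash table (JDK-8195100, which, as Claes points out, is not entirely in 11) or something else (class data sharing changes?).

Either way, interning on the fast path is a bad idea, and the patch for JDK-8151751 should have dealt with both regressions.

Watch this:

8u191: 15108 ± 99 ns/op [so far so good]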

-   54.55%     0.37%  java     libjvm.so           [.] JVM_GetStackTraceElement 
   - 54.18% JVM_GetStackTraceElement                          
      - 52.22% java_lang_Throwable::get_stack_trace_element   
         - 48.23% java_lang_StackTraceElement::create         
            - 17.82% StringTable::intern                      
            - 13.92% StringTable::intern                      
            - 4.83% Klass::external_name                      
            + 3.41% Method::line_number_from_bci              

"head": 22382 ± 134 ns/op [regression]

-   69.79%     0.05%  org.sample.MyBe  libjvm.so  [.] JVM_InitStackTraceElement
   - 69.73% JVM_InitStackTraceElementArray                    
      - 69.14% java_lang_Throwable::get_stack_trace_elements  
         - 66.86% java_lang_StackTraceElement::fill_in        
            - 38.48% StringTable::intern                      
            - 21.81% StringTable::intern                      
            - 2.21% Klass::external_name                      
              1.82% Method::line_number_from_bci              
              0.97% AccessInternal::PostRuntimeDispatch<G1BarrierSet::AccessBarrier<573

"head" + JDK-8151751 patch: 7511 ± 26 ns/op [woo-hoo, better than 8u]

-   22.53%     0.12%  org.sample.MyBe  libjvm.so  [.] JVM_InitStackTraceElement
   - 22.40% JVM_InitStackTraceElementArray                    
      - 20.25% java_lang_Throwable::get_stack_trace_elements  
         - 12.69% java_lang_StackTraceElement::fill_in        
            + 6.86% Method::line_number_from_bci              
              2.08% AccessInternal::PostRuntimeDispatch<G1BarrierSet::AccessBarrier
           2.24% InstanceKlass::method_with_orig_idnum        
           1.03% Handle::Handle        
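
The profiles above put most of the time in building StackTraceElement objects rather than in throwing. A rough way to see the same split from plain Java, without perf, is to time the three stages separately. This is only a sketch: the class and method names are made up for illustration, and the hand-rolled System.nanoTime loop is far cruder than JMH, so treat the printed numbers as indicative at best.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

public class StackTraceCost {
    /** Throws and catches an exception; the stack is captured but never turned into StackTraceElements. */
    static Throwable throwOnly() {
        try {
            throw new IllegalArgumentException("I love benchmarks");
        } catch (IllegalArgumentException e) {
            return e;
        }
    }

    /** Forces the JVM to build the StackTraceElement[] array for this throwable. */
    static int materialize(Throwable t) {
        return t.getStackTrace().length;
    }

    /** Same consumption path as the benchmark in the question. */
    static String print(Throwable t) {
        StringWriter sw = new StringWriter();
        t.printStackTrace(new PrintWriter(sw));
        return sw.toString();
    }

    public static void main(String[] args) {
        int iterations = 10_000;
        long sink = 0; // accumulated and printed so the loops are not optimized away
        for (int round = 1; round <= 3; round++) { // earlier rounds double as warm-up
            long t0 = System.nanoTime();
            for (int i = 0; i < iterations; i++) sink += throwOnly().getMessage().length();
            long t1 = System.nanoTime();
            for (int i = 0; i < iterations; i++) sink += materialize(throwOnly());
            long t2 = System.nanoTime();
            for (int i = 0; i < iterations; i++) sink += print(throwOnly()).length();
            long t3 = System.nanoTime();
            System.out.printf("round %d: throw only %d ns/op, + getStackTrace %d ns/op, + printStackTrace %d ns/op%n",
                    round, (t1 - t0) / iterations, (t2 - t1) / iterations, (t3 - t2) / iterations);
        }
        System.out.println("sink = " + sink);
    }
}
```

If the gap between the first and second column grows from one JDK to the next while the first column stays flat, that points at StackTraceElement creation, which is where the interning shows up in the profiles.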

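I investigated the issue with async-profiler, which can draw cool flame graphs demonstrating where the CPU time is spent.

As @AlekseyShipilev pointed out, the slowdown between JDK 8 and JDK 9 is mainly the result of the StackWalker changes. Also, G1 has become the default GC since JDK 9. If we explicitly set -XX:+UseParallelGC (the default in JDK 8), the scores will be slightly better.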
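
When comparing runs across JDKs it helps to confirm which collector a given JVM actually picked, since the default also depends on the machine. A minimal sketch using the standard java.lang.management API follows; the class name ShowCollector is made up for illustration, and the exact collector names are whatever the JVM reports (on HotSpot I would expect names starting with "G1" or "PS", but that is an assumption, not something the benchmark output above shows).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class ShowCollector {
    /** Returns the names of the garbage collectors registered in the running JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(System.getProperty("java.version") + " -> " + collectorNames());
    }
}
```

Running it once with default options and once with -XX:+UseParallelGC on each JDK shows whether two benchmark runs were really using the same collector.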

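But the most interesting part is the slowdown in JDK 11. Here is what async-profiler shows (clickable SVG).

The main difference between the two profiles is the size of the java_lang_Throwable::get_stack_trace_elements block, which is dominated by StringTable::intern. Apparently StringTable::intern takes much longer on JDK 11.

Let's zoom in:

Note that StringTable::intern in JDK 11 calls do_intern, which in turn allocates a new java.lang.String object. That looks suspicious. Nothing of this kind is seen in the JDK 10 profile. Time to look at the source code.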

stringTable.cpp (JDK 11)

oop StringTable::intern(Handle string_or_null_h, jchar* name, int len, TRAPS) {
  // shared table always uses java_lang_String::hash_code
  unsigned int hash = java_lang_String::hash_code(name, len);
  oop found_string = StringTable::the_table()->lookup_shared(name, len, hash);
  if (found_string != NULL) {
    return found_string;
  }
  if (StringTable::_alt_hash) {
    hash = hash_string(name, len, true);
  }
  return StringTable::the_table()->do_intern(string_or_null_h, name, len,
                                       |     hash, CHECK_NULL);
}                                      |
                       ----------------
                      |
                      v
oop StringTable::do_intern(Handle string_or_null_h, const jchar* name,
                           int len, uintx hash, TRAPS) {
  HandleMark hm(THREAD);  // cleanup strings created
  Handle string_h;

  if (!string_or_null_h.is_null()) {
    string_h = string_or_null_h;
  } else {
    string_h = java_lang_String::create_from_unicode(name, len, CHECK_NULL);
  }

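The function in JDK 11 first looks for the string in the shared StringTable, does not find it, then goes to do_intern and immediately creates a new String object.

In the JDK 10 sources, after the call to lookup_shared, there was an additional lookup in the main table, which returned the existing string without creating a new object: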

  found_string = the_table()->lookup_in_main_table(index, name, len, hashValue);

This refactoring was a result of JDK-8195097 "Make it possible to process StringTable outside safepoint".
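
The difference between the two orderings can be modeled in a few lines of Java. This is a toy model, not HotSpot code: InternModel and its methods are hypothetical names, the table is a plain HashMap of buckets, and it assumes (as the excerpt above suggests) that the JDK 11 path allocates the String before it consults the main table.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class InternModel {
    private final Map<Integer, List<String>> buckets = new HashMap<>();
    private int allocations; // counts String objects created by this table

    int allocations() { return allocations; }

    private String newString(char[] name) {
        allocations++;
        return new String(name);
    }

    /** Compares raw characters against an existing entry without creating a String. */
    private static boolean sameChars(String s, char[] name) {
        if (s.length() != name.length) return false;
        for (int i = 0; i < name.length; i++) {
            if (s.charAt(i) != name[i]) return false;
        }
        return true;
    }

    private String lookup(char[] name) {
        List<String> bucket = buckets.get(Arrays.hashCode(name));
        if (bucket != null) {
            for (String s : bucket) {
                if (sameChars(s, name)) return s;
            }
        }
        return null;
    }

    private void insert(char[] name, String s) {
        buckets.computeIfAbsent(Arrays.hashCode(name), k -> new ArrayList<>()).add(s);
    }

    /** JDK 10 style: look in the main table first, allocate only on a miss. */
    String internLookupFirst(char[] name) {
        String found = lookup(name);
        if (found != null) return found;
        String created = newString(name);
        insert(name, created);
        return created;
    }

    /** JDK 11 style (as modeled here): allocate first, then discover the entry already exists. */
    String internAllocateFirst(char[] name) {
        String candidate = newString(name); // becomes garbage whenever the lookup hits
        String found = lookup(name);
        if (found != null) return found;
        insert(name, candidate);
        return candidate;
    }

    public static void main(String[] args) {
        char[] name = "throwAndConsumeStacktrace".toCharArray();
        InternModel lookupFirst = new InternModel();
        InternModel allocateFirst = new InternModel();
        for (int i = 0; i < 1000; i++) {
            lookupFirst.internLookupFirst(name);
            allocateFirst.internAllocateFirst(name);
        }
        System.out.println("lookup first:   " + lookupFirst.allocations() + " allocations");
        System.out.println("allocate first: " + allocateFirst.allocations() + " allocations");
    }
}
```

Both variants return the same canonical string; the only difference is how many throwaway objects are created on the way, which is what makes repeated interning of the same method names more expensive in the second ordering.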

TL;DR While interning method names in JDK 11, HotSpot creates redundant String objects. This has happened after JDK-8195097.