Why is an extra var2 byte array used in the hashCode method of the StringLatin1 utility class?

Question

The current code is:

public static int hashCode(byte[] value) {
    int h = 0;
    byte[] var2 = value;
    int var3 = value.length;

    for(int var4 = 0; var4 < var3; ++var4) {
        byte v = var2[var4];
        h = 31 * h + (v & 255);
    }

    return h;
}

The code could be:

public static int hashCode(byte[] value) {
    int h = 0;
    int var2 = value.length;

    for(int var3 = 0; var3 < var2; ++var3) {
        byte v = value[var3];
        h = 31 * h + (v & 255);
    }

    return h;
}

In the java.lang package there is a utility class named StringLatin1. This class has a hashCode method, which is called from the String class's hashCode method when the current string value is Latin-1.

PS: I am using Java 11.

Answer 1

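Whatever the "current code" you posted is, it is not the real code; it is decompiled code, which may vary from decompiler to decompiler, so you cannot rely on it.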

Answer 2

This is the standard pattern for a for-each loop.

When you write

for(Type variable: expression) {
    // body
}

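the expression is evaluated exactly once at the start of the loop, and the resulting collection or array reference is remembered for the whole loop. This also means that if expression is a variable and that variable is assigned in the loop body, the assignment has no effect on the ongoing loop.

The relevant part of the specification says: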
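A minimal sketch of that behavior follows. The class and method names are hypothetical and chosen only for illustration; the loop body reassigns the very variable used as the loop expression, yet the iteration still runs over the array that was referenced when the loop started.

```java
// Hypothetical demo class; the names are illustrative only.
class ForEachSnapshotDemo {

    // Sums the elements of the array that `data` referred to at loop entry.
    static int sumOriginal(int[] data) {
        int sum = 0;
        for (int x : data) {
            // Reassigning the variable has no effect on the running loop,
            // because the loop iterates over the reference captured at entry.
            data = new int[] {100, 200, 300, 400};
            sum += x;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumOriginal(new int[] {1, 2, 3})); // prints 6
    }
}
```

The hidden copy of the reference is what makes this work, and it is the same role that var2 plays in the decompiled method.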

…
Otherwise, the Expression necessarily has an array type, T[].

Let L1 ... Lm be the (possibly empty) sequence of labels immediately preceding the enhanced for statement.

The enhanced for statement is equivalent to a basic for statement of the form:
T[] #a = Expression;
L1: L2: ... Lm:
for (int #i = 0; #i < #a.length; #i++) {
    {VariableModifier} TargetType Identifier = #a[#i];
    Statement
}
#a and #i are automatically generated identifiers that are distinct from any other identifiers (automatically generated or otherwise) that are in scope at the point where the enhanced for statement occurs.

TargetType is the declared type of the local variable in the header of the enhanced for statement.

If you compare the decompiled version

public static int hashCode(byte[] value) {
    int h = 0;
    byte[] var2 = value;
    int var3 = value.length;

    for(int var4 = 0; var4 < var3; ++var4) {
        byte v = var2[var4];
        h = 31 * h + (v & 255);
    }

    return h;
}

with the actual source code

public static int hashCode(byte[] value) {
    int h = 0;
    for (byte v : value) {
        h = 31 * h + (v & 0xff);
    }
    return h;
}

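you will recognize the translation. var2, var3, and var4 are all synthetic variables. Things to note:

In principle, the compiler could analyze the scenario and recognize that value is a local variable that is not assigned in the loop body, so no extra variable is needed here. However, the savings compared with simply following the standard translation strategy are not worth implementing the additional logic.
Likewise, it is the compiler's decision whether to remember the invariant array size in another local variable. As shown above, the specification does not mandate it.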
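To make the equivalence concrete, here is a small sketch that places the for-each form next to a hand-written expansion following the specification's translation scheme. The class name, method names, and the sample string are assumptions made for this illustration; they are not part of the JDK.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; only the loop bodies mirror the code discussed above.
class Latin1HashDemo {

    // Source-level form, using the enhanced for statement.
    static int hashForEach(byte[] value) {
        int h = 0;
        for (byte v : value) {
            h = 31 * h + (v & 0xff);
        }
        return h;
    }

    // Hand-written expansion: a copy of the array reference plus an index,
    // as in the basic for statement given by the specification.
    static int hashExpanded(byte[] value) {
        int h = 0;
        byte[] a = value;
        for (int i = 0; i < a.length; i++) {
            byte v = a[i];
            h = 31 * h + (v & 0xff);
        }
        return h;
    }

    public static void main(String[] args) {
        String s = "caf\u00e9"; // every character fits in Latin-1
        byte[] latin1 = s.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(hashForEach(latin1) == hashExpanded(latin1)); // prints true
        System.out.println(hashForEach(latin1) == s.hashCode());         // prints true
    }
}
```

Both methods compute the same value, and for a string whose characters all fit in Latin-1 that value matches String.hashCode(), since each byte masked with 0xff equals the corresponding char value.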

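You could say it is a weakness of the decompiler that it fails to recognize the for-each loop and translate it back. However, there is generally ambiguity when trying to map compiled code to source-code constructs, because many variants exist that produce the same code.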


java

java-11