Why does shifting a variable by more than its width in bits zero it out?

Question

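This question is inspired by other questions on Stack Overflow. Today, while browsing Stack Overflow, I came across the issue of bit-shifting a variable by a value k that is >= the width of that variable in bits. This means shifting a 32-bit int by 32 or more bit positions.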

Unexpected C/C++ bitwise shift operators outcome

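From these questions it appears that if we shift a number by k bits, where k >= the bit width w of the variable, only the least significant log2(w) bits of k are used. For a 32-bit int, the shift count is masked to its low 5 bits.

So in general, if w = the width of the variable in bits, x >> k becomes x >> (k % w). For an int, this is x >> (k % 32).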

The count is masked to five bits, which limits the count range to 0 to 31.
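The masking rule quoted above can be written out in portable C. The following is only a minimal sketch: the helper names `shr_masked32` and `shl_masked32` are hypothetical, and the code assumes a 32-bit operand. Because the count is reduced before the shift, the C shift itself is always in range and therefore well defined.

```c
#include <stdint.h>

/* Hypothetical helper: shift right the way a CPU that masks the count
   to five bits would. The count is reduced modulo 32 first, so the C
   shift below never sees an out-of-range count. */
uint32_t shr_masked32(uint32_t x, unsigned k)
{
    /* k & 31 equals k % 32 because 32 is a power of two */
    return x >> (k & 31u);
}

/* Same idea for a left shift. */
uint32_t shl_masked32(uint32_t x, unsigned k)
{
    return x << (k & 31u);
}
```

With these helpers, `shr_masked32(0xFEDCBA89u, 36)` gives `0x0FEDCBA8`, the same as a shift by 4, and `shl_masked32(0xFEDCBA98u, 32)` returns its input unchanged.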

So I wrote a small program to observe the behavior this should theoretically produce. The comments show the resulting shift amount % 32.

#include <stdio.h>
#include <stdlib.h>

#define PRINT_INT_HEX(x) printf("%s\t%#.8x\n", #x, x);

int main(void)
{
    printf("==============================\n");
    printf("Testing x << k, x >> k, where k >= w\n");

    int      lval = 0xFEDCBA98 << 32;
    //int      lval = 0xFEDCBA98 << 0;

    int      aval = 0xFEDCBA89 >> 36;
    //int      aval = 0xFEDCBA89 >> 4;

    unsigned uval = 0xFEDCBA89 >> 40;
    //unsigned uval = 0xFEDCBA89 >> 8;

    PRINT_INT_HEX(lval)
    PRINT_INT_HEX(aval)
    PRINT_INT_HEX(uval)

    putchar('\n');

    return EXIT_SUCCESS;
}

And the output does not match the expected behavior of the shift instructions!

==============================
Testing x << k, x >> k, where k >= w
lval    00000000 
aval    00000000
uval    00000000

=====================================================================

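Actually, I was confusing this with Java. In C/C++, shifting an int by a number of bits greater than or equal to its bit width may reduce the count to k % w, but the C standard does not guarantee this. No rule says this behavior must always occur. It is undefined behavior.

In Java, however, this is exactly what happens. It is a rule of the Java programming language.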

Answer 1

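The linked question states explicitly that shifting by an amount greater than or equal to the bit width of the type being shifted invokes undefined behavior, which the standard defines as "behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements".

When you invoke undefined behavior, anything can happen. The program may crash, it may output strange results, or it may appear to work properly. In addition, the way undefined behavior manifests itself can change if you use a different compiler, or different optimization settings on the same compiler.

The C standard says the following about the bitwise shift operators in section 6.5.7p3: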

The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
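One way to stay clear of this clause is to test the count before shifting. The sketch below is illustrative only: `checked_shl` is a hypothetical helper, not a standard function, and it computes the width as `sizeof(unsigned) * CHAR_BIT`, which assumes `unsigned` has no padding bits.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: performs x << k only when the count is valid
   under 6.5.7p3, that is, 0 <= k < width of the left operand.
   Returns false and leaves *out untouched otherwise. */
bool checked_shl(unsigned x, int k, unsigned *out)
{
    /* Assumption: unsigned has no padding bits. */
    const int width = (int)(sizeof(unsigned) * CHAR_BIT);

    if (k < 0 || k >= width)
        return false;   /* the shift would be undefined, so refuse */

    *out = x << k;
    return true;
}
```

The caller then decides what an out-of-range count should mean, instead of leaving that decision to whatever the compiler and CPU happen to do.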

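In this case, the compiler may reduce the shift amount modulo the bit width, as you suggested, or it may treat the operation as mathematically shifting by that amount, which leaves all bits 0. Both are valid results because the standard does not specify the behavior.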
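The "mathematical" interpretation can also be written with fully defined behavior. This is a sketch under stated assumptions: `shr_math32` is a hypothetical name, the operand is an unsigned 32-bit value, and the shift is a logical one. It is not a description of what any particular compiler emits.

```c
#include <stdint.h>

/* Hypothetical helper: logical right shift "as in mathematics".
   Once the count reaches the width, every bit has been shifted out,
   so the result is 0. Smaller counts use the ordinary C shift. */
uint32_t shr_math32(uint32_t x, unsigned k)
{
    return (k >= 32u) ? 0u : (x >> k);
}
```

Here `shr_math32(0xFEDCBA89u, 36)` is 0, matching the zeros printed by the program in the question, whereas a count masked to five bits would have produced a shift by 4.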

Answer 2

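One reason it is undefined is that the 8086, the original x86, did not mask any bits of the shift count. It actually performed the shifts, using one clock tick per position.

Intel then realized that allowing 255+ clock ticks for a shift instruction might not be a good idea. They probably considered things like maximum interrupt response time, for example.

From my old 80286 manual: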

To reduce the maximum execution time, the iAPX 286 does not allow shift counts greater than 31. If a shift count greater than 31 is attempted, only the bottom five bits of the shift count are used. The iAPX 86 uses all 8 bits of the shift count.

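So the exact same program gives different results on a PC/XT and a PC/AT.

So what should the language standard say?

Java solved this by not relying on the underlying hardware. C instead chose to leave the behavior undefined.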
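The difference described in the manual excerpt can be modeled in a few lines of C. This is a simplified illustration, not a cycle-accurate emulator: the function names are hypothetical, the operand is assumed to be a 16-bit register, and the count is assumed to come from an 8-bit register.

```c
#include <stdint.h>

/* Shift a 16-bit value left one position per step, mirroring the
   one-position-at-a-time behavior described above. */
static uint16_t shl_steps(uint16_t x, unsigned steps)
{
    while (steps-- > 0)
        x = (uint16_t)(x << 1);   /* truncate back to 16 bits */
    return x;
}

/* 8086-style model: all 8 bits of the count are used. */
uint16_t shl_8086(uint16_t x, uint8_t count)
{
    return shl_steps(x, count);
}

/* 80286-style model: only the bottom five bits of the count are used. */
uint16_t shl_286(uint16_t x, uint8_t count)
{
    return shl_steps(x, count & 31u);
}
```

In this model, `shl_8086(1, 32)` is 0 because the bit is shifted out entirely, while `shl_286(1, 32)` is 1 because the masked count is 0. For counts below 32 the two models agree.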

c

bit-shift

undefined-behavior