Understanding shift operations on types that are too small

I have a bit-shift problem in my C code, which I have reduced to the following example:

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <inttypes.h>

int main(int argc, char **argv)
{
  char a = 1;
  int i=0;
  uint64_t b;
  for(i=0; i<64; i++)
  {
    b=0;
    printf("i=%d\n", i);
    b = (a<< ((uint64_t) i));
    printf("%" PRIu64 "\n", b);
  }
  return 0;

}

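For reasons that are not apparent in this MWE, a is a char, and I want to generate powers of 2 from it, up to 2^63. This fails because something strange happens when a is a char rather than a uint64_t. The obvious fix is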

b = ((uint64_t)a << ((uint64_t) i));
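
A minimal sketch of that fix wrapped in a helper (the name pow2_from_char is hypothetical and not part of the question). Because the left operand is converted to uint64_t before the shift, the shift is carried out in a 64-bit unsigned type and is well defined for every count from 0 to 63:

```c
#include <stdint.h>

/* Hypothetical helper: shift a left by n bits in 64-bit unsigned arithmetic.
 * The cast applies to the left operand, so the result type is uint64_t.
 * The caller must keep n below 64; larger counts are still undefined. */
static uint64_t pow2_from_char(char a, unsigned n)
{
    return (uint64_t)a << n;
}
```

Calling pow2_from_char(1, i) for i from 0 to 63 yields the powers of two from 1 up to 9223372036854775808 (2^63). Casting only the right operand, as in the original loop, would not help, since the result type depends on the left operand alone.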

To understand what is actually going on, I wrote the MWE above, which gives me this output (shown in part):

i=30  
1073741824  
i=31  
18446744071562067968  
i=32  
1  
i=33  
2  

Now I wonder how to explain the jump from i=30 to i=31. And how does the result become 1 again at i=32?

In case it matters, I compiled the code above with gcc (gcc (SUSE Linux) 4.8.5).

Because this is undefined behavior.

C99

shift-expression:

The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.

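Here your char a is promoted to int by the integer promotions. On your machine the width of int is less than 64 (most likely 32), so once the shift count reaches that width the behavior is undefined according to the standard.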

You are shifting by more than the size of the type involved.

First, a (which has type char) is promoted to type int. This is detailed in section 6.3.1.1 of the C standard:

The following may be used in an expression wherever an int or unsigned int may be used:

  • An object or expression with an integer type (other than int or unsigned int) whose integer conversion rank is less than or equal to the rank of int and unsigned int.
  • A bit-field of type _Bool, int, signed int, or unsigned int.

If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.
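
The promotion can be observed directly. The sketch below is an illustration rather than part of the original answer: the function names are hypothetical, it relies on _Generic and therefore needs a C11 compiler, and it assumes the usual case where int can represent every char value.

```c
/* Returns 1 if a char operand is promoted to int by unary plus, else 0.
 * _Generic (C11) selects on the type of its operand without evaluating it. */
static int char_promotes_to_int(void)
{
    char a = 1;
    return _Generic(+a, int: 1, default: 0);
}

/* sizeof sees the promoted type too: sizeof(+a) is sizeof(int), not 1. */
static int promoted_size_matches_int(void)
{
    char a = 1;
    return sizeof(+a) == sizeof(int);
}
```

On a typical platform both functions return 1, which is why the shift in the question is performed on an int even though a is declared as char.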

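Assuming int is 32 bits on your system, shifting the value 1 left by more than 30 bits invokes undefined behavior. This is detailed in section 6.5.7 of the standard: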

3 The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.

4 The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
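
The two paragraphs quoted above can be turned into a guard that refuses any shift the standard leaves undefined. This is only a sketch: checked_shl_int is a hypothetical helper, and it assumes int has no padding bits so that its width equals sizeof(int) * CHAR_BIT.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores value << count in *out and returns true only
 * when both quoted rules are satisfied for type int; otherwise it returns
 * false and leaves *out untouched.
 * Assumes int has no padding bits, so its width is sizeof(int) * CHAR_BIT. */
static bool checked_shl_int(int value, int count, int *out)
{
    const int width = (int)(sizeof(int) * CHAR_BIT);

    if (count < 0 || count >= width)   /* paragraph 3: count out of range */
        return false;
    if (value < 0)                     /* paragraph 4: negative left operand */
        return false;
    if (value > (INT_MAX >> count))    /* paragraph 4: result not representable */
        return false;

    *out = value << count;
    return true;
}
```

With a 32-bit int, checked_shl_int(1, 30, &r) succeeds, while counts of 31 and 32 are rejected: the first because 2^31 is not representable in int, the second because the count equals the width.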

The (somewhat unexpected) rule at work here is that, when it comes to <<,

the type of the result is that of the promoted left operand

This means that the type of a << (uint64_t) i is int, because a, which has type char, is automatically promoted to int.

Your int appears to be a 32-bit two's complement signed type. Therefore, for i greater than or equal to 31, the behavior of the expression is undefined.
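
That the right operand has no influence on the result type can be checked at compile time. The following is an illustrative sketch with hypothetical function names; it uses _Generic, so it assumes a C11 compiler, and it assumes a platform where char promotes to int.

```c
#include <stdint.h>

/* Returns 1 if the shift expression has type int, else 0.
 * The right operand is uint64_t, yet only the left operand decides the type. */
static int char_shift_yields_int(void)
{
    char a = 1;
    uint64_t i = 3;
    return _Generic(a << i, int: 1, default: 0);
}

/* With the left operand cast first, the same shift has type uint64_t. */
static int cast_shift_yields_u64(void)
{
    char a = 1;
    uint64_t i = 3;
    return _Generic((uint64_t)a << i, uint64_t: 1, default: 0);
}
```

This differs from operators such as + or *, where the usual arithmetic conversions would bring both operands to a common type.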

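Several answers have already explained that your code has undefined behavior. So reasoning about what happens generally makes no sense (because anything can happen...).

But... sometimes it is actually fun to do so anyway - keep in mind that this is pure guesswork, not guaranteed to be correct, highly system dependent, and so on.

With all the disclaimers in place...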

Where could the odd number 18446744071562067968 come from?

Let's assume a 32-bit int, and also note that your char a = 1; might as well be int a = 1; because of integer promotion. So we can write:

int a = 1;
int as = a << 31;  // Undefined behavior here as 1*2^31 can't be stored in 32 bit int
uint64_t b = as;
printf("%" PRIu64 "\n", b);

Output:

18446744071562067968

Hmm... so we got that mysterious 18446744071562067968, but why?

Let's print a bit more:

int a = 1;
int as = a << 31;  // Undefined behavior here
uint64_t b = as;
printf("%d\n", as);
printf("%" PRIu64 "\n", b);

Output:

-2147483648
18446744071562067968

So as is negative, which means we effectively did:

uint64_t b = -2147483648;

Since b is unsigned, the above is evaluated as:

uint64_t b = UINT64_MAX + 1 - 2147483648;  // which is 18446744071562067968
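
Unlike the shift, this conversion step is fully defined by the standard: a negative value converted to an unsigned type is reduced modulo 2^64. A minimal sketch that compares the built-in conversion with the same arithmetic done by hand (both function names are hypothetical):

```c
#include <stdint.h>

/* Converts a 32-bit signed value to uint64_t.
 * The conversion is defined by the standard: the result is the value
 * reduced modulo 2^64. */
static uint64_t to_u64(int32_t v)
{
    return (uint64_t)v;
}

/* The same reduction spelled out: for negative v the result is 2^64 - |v|. */
static uint64_t to_u64_by_hand(int32_t v)
{
    if (v >= 0)
        return (uint64_t)v;
    uint64_t magnitude = (uint64_t)(-(int64_t)v);  /* |v|, computed in 64 bits */
    return UINT64_MAX - magnitude + 1;             /* 2^64 - |v| */
}
```

For v = -2147483648 both functions return 18446744071562067968, which is 2^64 - 2^31.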

So now we know where 18446744071562067968 comes from. Not so mysterious after all.

But this leaves another question - why is as negative?

Hmm... let's print some more:

int a = 1;
int as = a << 31;  // Undefined behavior here
uint64_t b = as;
printf("%d\n", as);
printf("%x\n", as);  // Print as in hex
printf("%" PRIu64 "\n", b);

Output:

-2147483648
80000000
18446744071562067968

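So in hex, as is 80000000, which really is 1 shifted left 31 times. The processor simply did what we asked for, i.e. 1 << 31. The C standard does not define it, but your/my processor just happens to implement it that way - other machines might do something else.

80000000 interpreted as a 32-bit two's complement number is -2147483648, so now we know why as is negative.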
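
The same bit pattern and its two's complement reading can be reproduced without relying on undefined behavior. The sketch below is an illustration with hypothetical function names: it builds the pattern with an unsigned shift and then reinterprets the bytes as int32_t, a type that the standard requires to be two's complement with no padding bits.

```c
#include <stdint.h>
#include <string.h>

/* Builds the bit pattern 0x80000000 without undefined behavior by
 * shifting an unsigned 32-bit value. */
static uint32_t top_bit_pattern(void)
{
    return UINT32_C(1) << 31;
}

/* Reinterprets 32 bits as int32_t. int32_t is required to be two's
 * complement with no padding bits, so copying the bytes is well defined. */
static int32_t bits_as_i32(uint32_t bits)
{
    int32_t value;
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

Here bits_as_i32(top_bit_pattern()) equals INT32_MIN, i.e. -2147483648, because in two's complement the top bit carries the weight -2^31.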