Initializer with constant expression having possible overflow in C99

Question

Is this valid C99 code? If so, does it have implementation-defined behavior?

int a;
unsigned long b[] = {(unsigned long)&a+1};

As far as I understand §6.6 of the ISO C99 standard, this might be valid:

An integer constant expression shall have integer type and shall only have operands that are integer constants (...) Cast operators in an integer constant expression shall only convert arithmetic types to integer types, except as part of an operand to the sizeof operator.

More latitude is permitted for constant expressions in initializers. Such a constant expression shall be, or evaluate to, one of the following:

an arithmetic constant expression,

(...)

an address constant for an object type plus or minus an integer constant expression.

However, because the addition could overflow, this might not be considered a constant expression, and therefore might not be valid C99 code.

Could someone confirm whether my reasoning is correct?

Note that both GCC and Clang accept this code without warnings, even with -std=c99 -pedantic. However, when casting to unsigned int instead of unsigned long, i.e. with the following code:

int a;
unsigned long b[] = {(unsigned int)&a+1};

both compilers complain that the expression is not a compile-time constant.

Answer 1

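A similar issue came up in this clang developers thread: Function pointer is compile-time constant when cast to long but not int? The rationale given there is that the standard does not require the compiler to support this (the case is not covered by any of the bullets in 6.6p7). An implementation is allowed to support it, but supporting truncated addresses would be burdensome: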

I assume that sizeof(int) < sizeof(void(*)()) == sizeof(long) on your target. The problem is that the tool chain almost certainly can't express a truncated address as a relocation.

C only requires the implementation to support initializer values that are either (1) constant binary data, (2) the address of some object, or (3) or an offset added to the address of some object. We're allowed, but not required, to support more esoteric things like subtracting two addresses or multiplying an address by a constant or, as in your case, truncating the top bits of an address away. That kind of calculation would require support from the entire tool chain from assembler to loader, including various file formats along the way. That support generally doesn't exist.

Your case, which casts a pointer to an integer type, does not fit any of the cases under 6.6 paragraph 7:

More latitude is permitted for constant expressions in initializers. Such a constant expression shall be, or evaluate to, one of the following:

an arithmetic constant expression,

a null pointer constant,

an address constant, or

an address constant for an object type plus or minus an integer constant expression.

But, as mentioned in the post, a compiler is allowed to support other forms of constant expressions:

An implementation may accept other forms of constant expressions.

but neither clang nor gcc accepts this one.
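
To make the four forms listed in 6.6 paragraph 7 concrete, here is a minimal sketch with one file-scope initializer per form. The names a, n, p_null, p_addr, p_off and the helper forms_ok are made up for illustration and do not come from the thread.

```c
#include <assert.h>
#include <stddef.h>

static int a[4];

/* 1. An arithmetic constant expression. */
static int n = 2 + 3;

/* 2. A null pointer constant. */
static int *p_null = NULL;

/* 3. An address constant: address of an object with static storage duration. */
static int *p_addr = &a[0];

/* 4. An address constant plus an integer constant expression. */
static int *p_off = &a[0] + 2;

/* Hypothetical helper: returns 1 if every initializer holds the expected value. */
int forms_ok(void)
{
    assert(p_off - p_addr == 2);
    return n == 5 && p_null == NULL && p_addr == a && p_off == &a[2];
}
```

Every initializer here stays either purely arithmetic or purely a pointer, so the linker only ever sees a symbol plus an offset. The (unsigned int)&a + 1 initializer from the question falls outside all four forms.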

Answer 2

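First of all, your initializer is not necessarily a constant expression. If a has local scope, it is assigned an address at run time, when it gets pushed on the stack. C11 6.6/7 says that for a pointer to be a constant expression, it has to be an address constant, which is defined in 6.6/9 as: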

An address constant is a null pointer, a pointer to an lvalue designating an object of static storage duration, or a pointer to a function designator; it shall be created explicitly using the unary & operator or an integer constant cast to pointer type, or implicitly by the use of an expression of array or function type.

(Emphasis mine)
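
As a rough illustration of the storage-duration requirement in 6.6/9, the sketch below contrasts a static object with an automatic one. The names g, pg and local_address_differs are hypothetical, chosen only for this example.

```c
#include <assert.h>

static int g;          /* static storage duration */
static int *pg = &g;   /* &g is an address constant, so this initializer is fine */

/* Hypothetical helper: takes the address of an automatic object at run time. */
int local_address_differs(void)
{
    int local = 0;      /* automatic storage duration */
    int *pl = &local;   /* fine: initializers of automatic objects need not be constant */

    /* static int *bad = &local;
       would not be an address constant, because local lacks static
       storage duration; compilers are expected to reject it. */

    assert(pg == &g);
    return pl != pg;    /* two distinct objects, so the pointers compare unequal */
}
```

The commented-out line is the case described above: the address of local only exists once the function is running, so it cannot initialize an object with static storage duration.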

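As for whether your code is standard C: yes, it is. Pointer conversions to integers are allowed, although they may come with various forms of poorly specified behavior. This is specified in 6.3.2.3/6: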

Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined. The result need not be in the range of values of any integer type.

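To safely ensure that the pointer fits in the integer, you need to use uintptr_t. But I don't think pointer-to-integer conversion was the reason you posted this question.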
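
A minimal sketch of the uintptr_t route, assuming the implementation provides that type (it is optional in the standard, though commonly available). The helper name round_trip_ok is made up for this example, and the conversion is done at run time rather than in a static initializer.

```c
#include <assert.h>
#include <stdint.h>

static int a;

/* Hypothetical helper: converts &a to uintptr_t and back again. */
int round_trip_ok(void)
{
    /* The numeric value is implementation-defined, but uintptr_t is wide
       enough to hold any valid pointer to void. */
    uintptr_t bits = (uintptr_t)(void *)&a;

    /* Converting back yields a pointer that compares equal to the original. */
    int *back = (int *)(void *)bits;

    return back == &a;
}
```

Going through void * matters because the round-trip guarantee for uintptr_t is stated in terms of pointers to void.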

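As for whether integer overflow would prevent it from being a compile-time constant, I'm not sure where you got that idea. I don't believe your reasoning is correct: for example, (INT_MAX + INT_MAX) is a compile-time constant and it is also certain to overflow. (GCC gives you a warning.) If it overflows, it invokes undefined behavior.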

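As for why you get errors about the expression not being a compile-time constant, I don't know. I can't reproduce it on gcc 4.9.1. I tried declaring a with both static and automatic storage duration, but it made no difference.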

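It sounds like you accidentally compiled as C90, in which case gcc will tell you "error: initializer element is not computable at load time". Or maybe there was a compiler bug that has since been fixed in my version of gcc.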

Answer 3

A conforming implementation is not required to accept this code. You quoted the relevant passage in your question:

More latitude is permitted for constant expressions in initializers. Such a constant expression shall be, or evaluate to, one of the following:

an arithmetic constant expression,

a null pointer constant,

an address constant, or

an address constant for an object type plus or minus an integer constant expression.

(unsigned long)&x is none of those things. It is not an arithmetic constant expression because of C11 6.6/8:

Cast operators in an arithmetic constant expression shall only convert arithmetic types to arithmetic types

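(pointer types are not arithmetic types, 6.2.5/18); and it is not an address constant, because all address constants are pointers (6.6/9). Finally, a pointer plus or minus an ICE is another pointer, so it is not that either.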

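However, 6.6/10 says that an implementation may accept other forms of constant expressions. I'm not sure whether this means the original code should be called ill-formed (ill-formed code requires a diagnostic). Clearly your compiler is accepting some other constant expressions here.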

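The next issue is that converting a pointer to an integer is implementation-defined. It may also be undefined if there is no integer representation corresponding to the particular pointer. (6.3.2.3/6)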

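Finally, the + 1 on the end makes no difference. unsigned long arithmetic is well-defined for addition and subtraction, so the whole expression is OK if and only if (unsigned long)&x is OK.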
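
To illustrate the point about unsigned arithmetic, here is a small sketch. The names wrapped and add_one are hypothetical. Unsigned addition is reduced modulo ULONG_MAX + 1, so the result is always representable and the question's overflow concern does not arise for the + 1 itself.

```c
#include <assert.h>
#include <limits.h>

/* The sum wraps around to 0, which is a representable value, so this
   remains an ordinary arithmetic constant expression. */
static const unsigned long wrapped = ULONG_MAX + 1UL;

/* Hypothetical helper: the same addition performed at run time. */
unsigned long add_one(unsigned long x)
{
    return x + 1UL;   /* well-defined for every x, wraps modulo ULONG_MAX + 1 */
}
```

Calling add_one(ULONG_MAX) returns 0, matching the value stored in wrapped.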


c

c99

undefined-behavior

language-lawyer

constant-expression