Static storage union and named member initialization in C

Section 6.7.8, paragraph 10 of the ISO/IEC 9899:1999 standard describes how a union with static storage duration is initialized:

If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static storage duration is not initialized explicitly, then:

  • if it has pointer type, it is initialized to a null pointer;
  • if it has arithmetic type, it is initialized to (positive or unsigned) zero;
  • if it is an aggregate, every member is initialized (recursively) according to these rules;
  • if it is a union, the first named member is initialized (recursively) according to these rules.

Suppose we have the following code, in which the second member of the union has a larger memory footprint than the first:

#include <stdint.h>

typedef struct
{
    uint32_t a;
    uint32_t b;
    uint32_t c;
    uint32_t d;
} my_structure_t;

typedef union
{
    uint8_t *first_member;
    my_structure_t later_member;
} my_union_t;

static my_union_t data;

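Is it defined anywhere in the C standard how the memory occupied by later_member will be initialized? Because of the statement below, I suspect this is implementation-defined behavior, but I need confirmation and, ideally, links to gcc, clang, or ghs documentation that describes it.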

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1311.pdf

Then comes DR_016 which (in question 2) starts of with: This one is relevant only for hardware on which either null pointer or floating point zero is /not/ represented as all zero bits. It deals with

union { char *p; int i; } x;

and then states:

If the null pointer is represented as, say, 0x80000000, then there is no way to implicitly initialize this object. Either the p member contains the null pointer, or the i member contains 0, but not both. So the behavior of this translation unit is undefined. This is a bad state of affairs. I assume it was not the Committee's intention to prohibit a large class of implicitly initialized unions; this would render a great deal of existing code nonconforming.

The question is mainly about the C99 standard, but comparisons with other C standards are welcome.

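In theory only the pointer is zeroed, but in practice the whole .bss section is usually zeroed. To be sure, just change the order of the members so that the larger member comes first.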
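A minimal sketch of the reordering suggested above. The names my_union_reordered_t, reordered_data, and check_reordered_zero are hypothetical and only serve this illustration. With the struct as the first named member, the clause quoted in the question applies to it recursively, so all four fields are guaranteed to start at zero.

```c
#include <assert.h>
#include <stdint.h>

typedef struct
{
    uint32_t a;
    uint32_t b;
    uint32_t c;
    uint32_t d;
} my_structure_t;

/* Hypothetical variant of my_union_t with the member order swapped. */
typedef union
{
    my_structure_t later_member;   /* now the first named member */
    uint8_t *first_member;
} my_union_reordered_t;

/* Static storage duration, no explicit initializer. */
static my_union_reordered_t reordered_data;

/* Checks what the standard guarantees after reordering:
   every arithmetic field of the first named member is zero. */
void check_reordered_zero(void)
{
    assert(reordered_data.later_member.a == 0);
    assert(reordered_data.later_member.b == 0);
    assert(reordered_data.later_member.c == 0);
    assert(reordered_data.later_member.d == 0);
}
```

The trade-off is that the pointer member is then no longer the one covered by the rule, so on a platform where a null pointer is not all zero bits, first_member would not be guaranteed to compare equal to NULL.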

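Because only the first named member of a union is initialized, any remaining bytes that belong only to some other member stay uninitialized and hold indeterminate values, while any trailing padding is set to 0.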

The same clause you mentioned above is worded slightly differently in C11:

if it is a union, the first named member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;

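Because a union can hold only one member at a time, it is not a problem that the bytes corresponding to the other members are not set, since the standard says you should not read them anyway.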
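If the larger member does need a defined starting value, one option (not part of the original answer, just a sketch) is to initialize it explicitly with a C99 designated initializer rather than rely on the implicit rule. The names explicit_data and check_explicit_zero are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct
{
    uint32_t a;
    uint32_t b;
    uint32_t c;
    uint32_t d;
} my_structure_t;

typedef union
{
    uint8_t *first_member;
    my_structure_t later_member;
} my_union_t;

/* The designated initializer selects later_member explicitly;
   { 0 } sets a to 0 and the remaining fields are zero-initialized. */
static my_union_t explicit_data = { .later_member = { 0 } };

/* Checks that every field of the explicitly initialized member is zero. */
void check_explicit_zero(void)
{
    assert(explicit_data.later_member.a == 0);
    assert(explicit_data.later_member.b == 0);
    assert(explicit_data.later_member.c == 0);
    assert(explicit_data.later_member.d == 0);
}
```

With this form later_member is the member that holds a value after initialization, so reading its fields does not depend on what the implementation does with the bytes beyond first_member.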

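According to the C99 standard, for a union object that has static storage duration and is not explicitly initialized, the compiler must initialize the first named member (recursively). So if some other member is larger than the first named member, you should not make any assumptions about the values of members other than the first: per the standard, the compiler initializes only the first named member, and the remaining bytes of any member larger than it are left uninitialized.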

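The C standard says nothing about data segments (initialized or uninitialized), the stack, the heap, and so on. These are all architecture- and platform-specific. For object initialization (in the case of static storage duration), the C standard only specifies what is initialized to 0/NULL and what is not; it does not specify which segment an object of a given storage duration goes into. The standard is a specification for compilers, and a good compiler should follow it. Typically, zero-initialized static data goes into .bss (Block Started by Symbol) and non-zero-initialized data goes into .data (the data segment). So you may well find that the members of the later_member structure (the second member of the union my_union_t) have the value 0, but that may not always be the case.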
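To see what a particular toolchain actually does, the object representation of the union can be inspected byte by byte. This is only a sketch: count_nonzero_bytes and check_guaranteed_part are hypothetical helper names, and a result of 0 from count_nonzero_bytes on one platform is an observation about that platform, not something the C99 clause promises.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct
{
    uint32_t a;
    uint32_t b;
    uint32_t c;
    uint32_t d;
} my_structure_t;

typedef union
{
    uint8_t *first_member;
    my_structure_t later_member;
} my_union_t;

static my_union_t data;   /* no explicit initializer */

/* Counts the nonzero bytes in the object representation of data.
   Inspecting an object through unsigned char is always permitted. */
size_t count_nonzero_bytes(void)
{
    const unsigned char *bytes = (const unsigned char *)&data;
    size_t count = 0;

    for (size_t i = 0; i < sizeof data; ++i) {
        if (bytes[i] != 0) {
            ++count;
        }
    }
    return count;
}

/* The part the standard does guarantee: the first named member,
   a pointer, is initialized to a null pointer. */
void check_guaranteed_part(void)
{
    assert(data.first_member == NULL);
}
```

Calling count_nonzero_bytes() right at program start and printing the result shows whether the bytes beyond first_member happen to be zero on the implementation in use.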


The C11 standard includes a specification for the padding bytes of a union (per 6.7.9p10) [emphasis added]:

10 If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static or thread storage duration is not initialized explicitly, then: ......

......

  • if it is a union, the first named member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;