Why is the content of a char buffer printed as 4 bytes?

I call recv() to receive data from a socket and print the last byte of the buffer contents in hex:

char rbuff[B_BUF];
while ((r_n=recv(sfd,rbuff,B_BUF,MSG_EOF))>-1)
{
    if (r_n==0)
    {
        break;  /* connection closed; rbuff[r_n-1] would be out of bounds */
    }
    printf("r_n:%d eob_p:%x\n",r_n,rbuff[r_n-1]);
    memset(rbuff,0,B_BUF);
}

The result is:

r_n:1674 eob_p:3c
r_n:1228 eob_p:76
r_n:2456 eob_p:ffffff81
r_n:1228 eob_p:4b
r_n:1228 eob_p:49
r_n:2456 eob_p:57
r_n:1417 eob_p:ffffff82

I am confused about why the result is 4 bytes. I wrote another program that prints the contents of a file saved from the buffer:

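#include <stdio.h>
#include <string.h>

int main(void)
{
    char buff[11686];
    memset(buff, 0, sizeof buff);
    FILE *in = fopen("web/www.sse.com.cn.html", "r");
    if (in == NULL)
    {
        perror("fopen");
        return 1;
    }
    fread(buff, sizeof buff, 1, in);
    fclose(in);
    for (int i = 0; i < 11686; i++)
    {
        printf("buff[%d]:%x\n", i, buff[i]);  /* same %x conversion as in the first example */
    }
    return 0;
}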

The result is:

....
buff[11684]:60
buff[11685]:ffffff82

Why is the content of the char buff 4 bytes in size, as in buff[11685]:ffffff82?

Diagnosis

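In the second example, buff is a char buffer, plain char is a signed type on your machine, and you have negative values stored in buff. When those values are converted to int in the call to printf(), they become negative integers of small magnitude, and that is what gets printed in hex.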

ISO/IEC 9899:2018

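Strictly, the links point to an online draft of C11 rather than C18, in HTML, which makes it possible to link to the relevant paragraphs of the standard. AFAIK, these details did not change between C90, C99, C11 and C18 anyway.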

The standard says that the plain char type is equivalent to either signed char or unsigned char.

§6.2.5 Types ¶15:

The three types char, signed char, and unsigned char are collectively called the character types. The implementation shall define char to have the same range, representation, and behavior as either signed char or unsigned char.45)

45) CHAR_MIN, defined in <limits.h>, will have one of the values 0 or SCHAR_MIN, and this can be used to distinguish the two options. Irrespective of the choice made, char is a separate type from the other two and is not compatible with either.
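
Footnote 45 suggests a simple test for which option an implementation chose. The following is a minimal sketch of that test; the function names plain_char_is_signed and report_char_signedness are made up for this illustration and do not come from the original post.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 when plain char behaves like signed char,
 * 0 when it behaves like unsigned char. Per footnote 45, CHAR_MIN is
 * either 0 or SCHAR_MIN, so a negative CHAR_MIN means "signed". */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Hypothetical helper: prints which option this implementation chose. */
void report_char_signedness(void)
{
    printf("plain char is %s (CHAR_MIN = %d)\n",
           plain_char_is_signed() ? "signed" : "unsigned", CHAR_MIN);
}
```

Calling report_char_signedness() from main() on the asker's machine would presumably report "signed", since that is what the ffffff82 output implies.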

§6.3.1.1 Boolean, characters and integers ¶2,3:

2 The following may be used in an expression wherever an int or unsigned int may be used:

  • An object or expression with an integer type (other than int or unsigned int) whose integer conversion rank is less than or equal to the rank of int and unsigned int.
  • A bit-field of type _Bool, int, signed int, or unsigned int.

If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions.58) All other types are unchanged by the integer promotions.

3 The integer promotions preserve value including sign. As discussed earlier, whether a "plain" char is treated as signed is implementation-defined.

58) The integer promotions are applied only: as part of the usual arithmetic conversions, to certain argument expressions, to the operands of the unary +, -, and ~ operators, and to both operands of the shift operators, as specified by their respective subclauses.
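
One way to observe the integer promotions directly is to compare the size of a char object with the size of the expression +c, since footnote 58 lists the unary + operator among the places where the promotions apply. This is only a sketch; the function names are hypothetical, and it assumes int can represent every char value (true on mainstream platforms), so that the promoted type is int.

```c
#include <stddef.h>

/* Hypothetical helper: size of a plain char object, which is 1 by definition. */
size_t size_of_char_object(void)
{
    char c = 'A';
    return sizeof(c);
}

/* Hypothetical helper: size of the expression +c. Unary + applies the
 * integer promotions to its operand, so the expression has type int
 * even though c itself is a char. */
size_t size_of_promoted_char(void)
{
    char c = 'A';
    return sizeof(+c);
}
```

On a platform with 4-byte int, the first function returns 1 and the second returns 4, which matches the 1-byte versus 4-byte discrepancy in the question.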

§6.5.2.2 Function calls ¶6,7:

6 If the expression that denotes the called function has a type that does not include a prototype, the integer promotions are performed on each argument, and arguments that have type float are promoted to double. These are called the default argument promotions. If the number of arguments does not equal the number of parameters, the behavior is undefined. If the function is defined with a type that includes a prototype, and either the prototype ends with an ellipsis (, ...) or the types of the arguments after promotion are not compatible with the types of the parameters, the behavior is undefined. If the function is defined with a type that does not include a prototype, and the types of the arguments after promotion are not compatible with those of the parameters after promotion, the behavior is undefined, except for the following cases:

  • one promoted type is a signed integer type, the other promoted type is the corresponding unsigned integer type, and the value is representable in both types;
  • both types are pointers to qualified or unqualified versions of a character type or void.

7 If the expression that denotes the called function has a type that does include a prototype, the arguments are implicitly converted, as if by assignment, to the types of the corresponding parameters, taking the type of each parameter to be the unqualified version of its declared type. The ellipsis notation in a function prototype declarator causes argument type conversion to stop after the last declared parameter. The default argument promotions are performed on trailing arguments.

Exegesis

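Note the last two sentences of §6.5.2.2 ¶7: when the char values are promoted by the 'integer promotions', they are promoted to (signed) int, and negative values stay negative. Since your int values are 4 bytes and all machines you are likely to have access to use two's-complement arithmetic, the most significant 3 bytes of the value will each be 0xFF.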
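
The sign extension can be reproduced in isolation. The sketch below uses signed char explicitly so that it does not depend on whether plain char is signed, and it assumes a 32-bit two's-complement int as on the asker's machine. The function name format_promoted is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: formats a signed char roughly the way the
 * question's printf("%x", ...) call did, writing the text into out. */
void format_promoted(signed char c, char *out, size_t out_size)
{
    int promoted = c;  /* integer promotion: value and sign are preserved */

    /* %x expects an unsigned int, so convert explicitly; a negative int
     * becomes a large unsigned value whose upper bytes are all 0xFF. */
    snprintf(out, out_size, "%x", (unsigned int)promoted);
}
```

With a 32-bit int, the byte 0x82 read as a signed char has the value -126, and format_promoted(-126, ...) yields ffffff82, while a non-negative byte such as 0x3c yields 3c, as in the question's output.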

Prescription

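To always print 2 hex digits per character, use %.2X (or %.2x if you prefer; %02X or %02x also work) and pass either (unsigned char)rbuff[r_n-1] or rbuff[r_n-1] & 0xFF as the argument (using the variables from the first example). Or, using the variables from the second example: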

printf("%.2X\n", (unsigned char)buff[i]);
printf("%.2X\n", buff[i] & 0xFF);
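
For dumping a whole buffer, the same cast can be wrapped in a small helper. This is a sketch rather than a definitive implementation; format_byte_hex and dump_hex are hypothetical names, and the 16-bytes-per-line layout is an arbitrary choice.

```c
#include <stdio.h>

/* Hypothetical helper: writes the two-digit hex form of one byte of a
 * char buffer into out, which must have room for 3 chars (2 digits + NUL). */
void format_byte_hex(char byte, char out[3])
{
    /* Convert to unsigned char first so that values above 0x7F are not
     * sign-extended when the integer promotions are applied. */
    snprintf(out, 3, "%.2X", (unsigned int)(unsigned char)byte);
}

/* Hypothetical helper: prints every byte of buf as two hex digits,
 * 16 bytes per line, separated by spaces. */
void dump_hex(const char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
    {
        char hex[3];
        format_byte_hex(buf[i], hex);
        printf("%s%c", hex, (i % 16 == 15 || i + 1 == len) ? '\n' : ' ');
    }
}
```

In the first example, dump_hex(rbuff, (size_t)r_n) could be called after a successful recv(); the byte that previously printed as ffffff81 would then appear as 81.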