Is reading one byte at a time endianness agnostic regardless of value size?

Question

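Say I am reading and writing uint32_t values to and from a stream. If I read/write one byte at a time to/from the stream and shift each byte like the examples below, will the results be consistent regardless of machine endianness?

In the examples here, the stream is a buffer in memory called p.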

static uint32_t s_read_uint32(uint8_t** p)
{
    uint32_t value;
    value  = (*p)[0];
    value |= (((uint32_t)((*p)[1])) << 8);
    value |= (((uint32_t)((*p)[2])) << 16);
    value |= (((uint32_t)((*p)[3])) << 24);
    *p += 4;
    return value;
}

static void s_write_uint32(uint8_t** p, uint32_t value)
{
    (*p)[0] = value & 0xFF;
    (*p)[1] = (value >> 8 ) & 0xFF;
    (*p)[2] = (value >> 16) & 0xFF;
    (*p)[3] = value >> 24;
    *p += 4;
}

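I don't currently have access to a big-endian machine to test this on, but my thinking is that if each byte is written one at a time, each individual byte can be written to or read from the stream independently. The CPU can then handle endianness by hiding those details behind the shift operations. Is this true? If not, could someone explain why?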

Answer 1

If I read/write one byte at a time to/from a stream and shift each byte like the below examples, will the results be consistent regardless of machine endianness?

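Yes. Your s_write_uint32() function stores the bytes of the input value in order from least significant to most significant, regardless of their order in the native representation of that value. Your s_read_uint32() correctly reverses this process, regardless of the underlying representation of uint32_t. These work because

the behavior of the shift operators (<<, >>) is defined in terms of the value of the left operand, not its representation,
& 0xff masks off all bits of the left operand except those of its least-significant byte, regardless of the value's representation (because 0xff has a matching representation), and
the |= operations just put the bytes into the result; the positions are selected appropriately by the preceding left shifts. This might be clearer if += were used instead, but the result would be no different.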
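
To make the point above concrete, here is a minimal sketch that applies the same shift-and-mask logic to a plain buffer and checks the resulting byte layout. The helper names put_u32_le(), get_u32_le(), and layout_is_little_endian() are made up for this illustration and do not come from the question. Because every step is defined on values rather than on the in-memory representation, the check should give the same result on any host.

```c
#include <assert.h>
#include <stdint.h>

/* Same logic as the question's s_write_uint32(), on a plain buffer:
   least-significant byte first. (Hypothetical helper name.) */
static void put_u32_le(uint8_t *buf, uint32_t value)
{
    assert(buf != 0);
    buf[0] = (uint8_t)(value & 0xFF);
    buf[1] = (uint8_t)((value >> 8) & 0xFF);
    buf[2] = (uint8_t)((value >> 16) & 0xFF);
    buf[3] = (uint8_t)(value >> 24);
}

/* Same logic as the question's s_read_uint32(). (Hypothetical helper name.) */
static uint32_t get_u32_le(const uint8_t *buf)
{
    assert(buf != 0);
    return (uint32_t)buf[0]
         | ((uint32_t)buf[1] << 8)
         | ((uint32_t)buf[2] << 16)
         | ((uint32_t)buf[3] << 24);
}

/* Returns 1 if 0x12345678 is stored as 78 56 34 12 and reads back intact. */
static int layout_is_little_endian(void)
{
    uint8_t buf[4];
    put_u32_le(buf, 0x12345678u);
    return buf[0] == 0x78 && buf[1] == 0x56 &&
           buf[2] == 0x34 && buf[3] == 0x12 &&
           get_u32_le(buf) == 0x12345678u;
}
```

Calling layout_is_little_endian() from a test on each target platform is one cheap way to confirm the serialized format without needing a big-endian machine at hand.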

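But note that, to some extent, you are reinventing the wheel. POSIX defines a pair of functions, htonl() and ntohl(), which many non-POSIX systems also support, for dealing with byte-order issues in four-byte numbers. The idea is that when sending, everyone uses htonl() to convert from host byte order (whatever that is) to network byte order (big endian) and sends the resulting four-byte buffer. On receipt, everyone accepts the four bytes as one number, then uses ntohl() to convert from network byte order to host byte order.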
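
As a rough sketch of that approach, assuming a POSIX system that provides <arpa/inet.h>: the wrapper names write_u32_net() and read_u32_net() are invented for this example. Note that the bytes land in big-endian order, so this format is not interchangeable with the little-endian layout produced by the question's functions.

```c
#include <arpa/inet.h>  /* htonl(), ntohl(); POSIX header */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store value in network (big-endian) byte order. (Hypothetical helper.) */
static void write_u32_net(uint8_t *buf, uint32_t value)
{
    assert(buf != 0);
    uint32_t be = htonl(value);   /* host order -> network order */
    memcpy(buf, &be, sizeof be);  /* copy the four bytes verbatim */
}

/* Load a value stored in network byte order. (Hypothetical helper.) */
static uint32_t read_u32_net(const uint8_t *buf)
{
    assert(buf != 0);
    uint32_t be;
    memcpy(&be, buf, sizeof be);  /* accept four bytes as one number */
    return ntohl(be);             /* network order -> host order */
}
```

Using memcpy rather than casting the buffer pointer to uint32_t* avoids alignment and aliasing problems when the buffer is an arbitrary byte stream.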

Answer 2

It will work, but a memcpy followed by a conditional byteswap will give you much better codegen for the write function.

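#include <stdint.h>
#include <string.h>

/* Nonzero on a little-endian host: the first byte of the value 1 is 1. */
#define LE (((char*)&(uint_least32_t){1})[0])

/* Reverses n bytes in place; defined elsewhere. Takes void* so that
   both &value and *p can be passed without a cast. */
void byteswap(void*, size_t);

uint32_t s2_read_uint32(uint8_t** p)
{
    uint32_t value;
    memcpy(&value, *p, sizeof(value));
    if (!LE) byteswap(&value, sizeof(value)); /* stream is little-endian */
    *p += 4;
    return value;
}

void s2_write_uint32(uint8_t** p, uint32_t value)
{
    memcpy(*p, &value, sizeof(value));
    if (!LE) byteswap(*p, sizeof(value)); /* stream is little-endian */
    *p += 4;
}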
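
The answer only declares byteswap() and does not show its body. One possible definition, offered purely as an assumption about what is intended, reverses the bytes in place. This sketch takes void* so that both a uint32_t* and a uint8_t* can be passed without casts.

```c
#include <assert.h>
#include <stddef.h>

/* Reverse the n bytes starting at data, in place.
   (Assumed behavior; the answer does not define this function.) */
void byteswap(void *data, size_t n)
{
    unsigned char *bytes = data;
    assert(bytes != 0 || n == 0);
    for (size_t i = 0; i < n / 2; i++) {
        unsigned char tmp = bytes[i];
        bytes[i] = bytes[n - 1 - i];
        bytes[n - 1 - i] = tmp;
    }
}
```

For the fixed four-byte case, compiler builtins or platform byte-swap routines may generate tighter code, but a plain loop like this stays portable.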

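GCC since the 8 series (but not clang) can eliminate the shifts on little-endian platforms, but you should help it by restrict-qualifying the doubly-indirect pointer to the destination. Otherwise it may assume that a write to (*p)[0] could invalidate *p (uint8_t is a character type and is therefore allowed to alias anything).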

void s_write_uint32(uint8_t** restrict p, uint32_t value)
{
    (*p)[0] = value & 0xFF;
    (*p)[1] = (value >> 8 ) & 0xFF;
    (*p)[2] = (value >> 16) & 0xFF;
    (*p)[3] = value >> 24;
    *p += 4;
}

c

endianness