Read binary data to long int

Question

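I need to read binary data containing a column of numbers (time tags), where each number is recorded using 8 bytes. I know they are recorded in little-endian order. If read correctly, they should decode as (example)

I recognize that the numbers above are around the 2^31 - 1 threshold. I tried to read the data and invert the byte order as follows (length is the total number of bytes, and buffer is a pointer to the array containing the bytes):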

unsigned long int tag;
//uint64_t tag;    
for (int j=0; j<length; j=j+8) //read the whole file in 8-byte blocks
   { tag = 0;  
     for (int i=0; i<=7; i++) //read each block, byte by byte
        {tag ^=  ((unsigned char)buffer[j+i])<<8*i ;} //shift each byte to invert endianness and add them with ^=
   }

When run, the code gives:

  ...  
  2147426467  
  2147426635  
  18446744071562097256  
  similar big numbers   
  ...

The last number is not (2^64 - 1 - correct value). Using uint64_t tag gives the same result. The code works when tag is declared as

unsigned int tag;

but fails for tags greater than 2^32 - 1. At least that makes sense.
I think I need some kind of cast on buffer[i+j], but I don't know how to do it.

(static_cast<uint64_t>(buffer[j+i]))

doesn't work either.
I read a similar question, but I still need some help.

Answer 1

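You are shifting a temporary value. The compiler only makes a temporary as wide as the language rules require, not as wide as tag: here the unsigned char is promoted to int, which is 32 bits in your case. Once you shift a byte by 32 bits or more, it is shifted into oblivion (formally, the behavior is undefined). To fix this, explicitly store the value in a 64-bit integer first. So instead of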

    {tag ^=  ((unsigned char)buffer[j+i])<<8*i ;}

you should use something like this

    {
       unsigned long long tmp = (unsigned char)buffer[j+i];
       tmp <<= 8*i;
       tag ^=  tmp;
    }
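For completeness, here is a minimal sketch of the whole 8-byte decode written as a function. The name decode_le64 is made up for illustration and is not from the question; it assumes the buffer holds char, as in the question, and it is marked constexpr (C++14 or later) only so that it can be checked at compile time.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the question): decodes the 8 bytes starting
// at `bytes` as one little-endian unsigned 64-bit value.
constexpr std::uint64_t decode_le64(const char* bytes)
{
    std::uint64_t tag = 0;
    for (std::size_t i = 0; i < 8; ++i) {
        // Convert to unsigned char first so a negative char is not
        // sign-extended, then widen to 64 bits *before* shifting so that
        // no bits are shifted out of a 32-bit int.
        const std::uint64_t widened = static_cast<unsigned char>(bytes[i]);
        tag |= widened << (8 * i);
    }
    return tag;
}
```

With the question's variables, the body of the outer loop would then presumably reduce to tag = decode_le64(buffer + j);. Using |= instead of ^= gives the same result here, because the shifted bytes never overlap.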

Answer 2

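Let's assume buffer[j+i] is a char, and that char is signed on your platform. Casting to unsigned char converts buffer[j+i] to an unsigned type. However, when the << operator is applied, the unsigned char value is promoted to int, as long as int can hold all values representable by unsigned char.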
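The promotion can be checked at compile time, and it also seems to account for the large number in the question's output. The sketch below assumes a 32-bit int; as_u64 is a made-up helper name. The int whose bit pattern is 0x80007268 has the value -2147454360, and converting it to an unsigned 64-bit type wraps modulo 2^64, which sets the upper 32 bits.

```cpp
#include <cstdint>
#include <type_traits>

// The result of << has the promoted type of its left operand, which is int
// here, not unsigned char and not the type of `tag`.
// (decltype does not evaluate the expression, so nothing is actually shifted.)
static_assert(std::is_same<decltype(static_cast<unsigned char>(0x80) << 24), int>::value,
              "unsigned char is promoted to int before shifting");

// Hypothetical helper: converts an int to an unsigned 64-bit value.
// A negative int wraps modulo 2^64, so its upper 32 bits come out as ones.
constexpr std::uint64_t as_u64(int v)
{
    return static_cast<std::uint64_t>(v);
}

// Assuming 32-bit int: this reproduces the large value printed in the question.
static_assert(as_u64(-2147454360) == 18446744071562097256ULL,
              "negative int is sign-extended when widened");
```

If that reading is right, the low 32 bits of the printed value correspond to 2147512936, which is just above 2^31 - 1.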

Your attempt to cast buffer[j+i] directly to uint64_t fails because, if char is signed, sign extension is still applied before the value is converted to the unsigned type.
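A small sketch of the difference, using signed char explicitly so that it does not depend on whether plain char is signed on a given platform. The two helper names are invented for illustration.

```cpp
#include <cstdint>

// Hypothetical helper: widens directly, as in the question's static_cast.
// A negative value wraps modulo 2^64, so the upper 56 bits become ones.
constexpr std::uint64_t widen_direct(signed char c)
{
    return static_cast<std::uint64_t>(c);
}

// Hypothetical helper: goes through unsigned char first, so only the
// original 8 bits survive and the upper 56 bits stay zero.
constexpr std::uint64_t widen_via_unsigned_char(signed char c)
{
    return static_cast<std::uint64_t>(static_cast<unsigned char>(c));
}

static_assert(widen_direct(-128) == 0xFFFFFFFFFFFFFF80ULL, "sign-extended");
static_assert(widen_via_unsigned_char(-128) == 0x80ULL, "only the byte itself");
```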

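A double cast might work (i.e., cast to unsigned char and then to unsigned long), but using an unsigned long variable to hold the intermediate value should make the intent of the code clearer. For me, the code would look like: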

decltype(tag) val = static_cast<unsigned char>(buffer[j+i]);
tag ^= val << 8*i;
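For comparison, the double-cast variant mentioned above could be sketched as follows. The function name shifted_byte is hypothetical, and std::uint64_t is used instead of unsigned long on the assumption that the tag should be 64 bits wide everywhere (unsigned long is only 32 bits on some platforms).

```cpp
#include <cstdint>

// Hypothetical helper: returns byte `c` moved to byte position `i` (0..7)
// of a 64-bit value. The inner cast drops the sign, the outer cast widens
// to 64 bits, and only then is the shift applied.
constexpr std::uint64_t shifted_byte(char c, unsigned i)
{
    return static_cast<std::uint64_t>(static_cast<unsigned char>(c)) << (8 * i);
}
```

Inside the inner loop this would be used as tag ^= shifted_byte(buffer[j+i], i);.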
