If std::numeric_limits<float>::is_iec559 is true, does that mean that I can extract exponent and mantissa in a well defined way?

Question

I've built a custom version of frexp:

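#include <cstdint>  // uint32_t, int32_t
#include <cstring>  // std::memcpy
#include <limits>   // std::numeric_limits
#include <utility>  // std::make_pair

auto frexp(float f) noexcept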
{
    static_assert(std::numeric_limits<float>::is_iec559);

    constexpr uint32_t ExpMask = 0xff;
    constexpr int32_t ExpOffset = 126;
    constexpr int MantBits = 23;

    uint32_t u;
    std::memcpy(&u, &f, sizeof(float)); // well defined bit transformation from float to int

    int exp = ((u >> MantBits) & ExpMask) - ExpOffset; // extract the 8 bits of the exponent (it has an offset of 126)

    // divide by 2^exp (leaving mantissa intact while placing "0" into the exponent)
    u &= ~(ExpMask << MantBits); // zero out the exponent bits
    u |= ExpOffset << MantBits; // place 126 into exponent bits (representing 0)

    std::memcpy(&f, &u, sizeof(float)); // copy back to f
    return std::make_pair(exp, f);
}

By checking is_iec559, I make sure that float fulfills

the requirements of IEC 559 (IEEE 754) standard.

My question is: does this mean that the bit operations I'm doing are well defined and do what I want? If not, is there a way to fix it?

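I tested it with some random values and it seems to be correct, at least on Windows 10 compiled with msvc and on wandbox. Note, however, that (on purpose) I'm not handling the edge cases of subnormals, NaN, and inf.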
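The random-value test described above could be sketched roughly as follows. This is a minimal sketch, not the code actually used for the question: the names fast_frexp and agrees_with_std_frexp are hypothetical (the routine is renamed so it cannot clash with frexp from <cmath>), the explicit casts to int and the size check are additions, and only normal values are compared because zero, subnormals, NaN, and inf are deliberately not handled.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>
#include <random>
#include <utility>

// The routine from the question, renamed (hypothetical name) to avoid
// clashing with frexp from <cmath>. Only valid for normal floats.
inline std::pair<int, float> fast_frexp(float f) noexcept
{
    static_assert(std::numeric_limits<float>::is_iec559, "float must be IEEE 754");
    static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits wide");

    constexpr std::uint32_t ExpMask = 0xff;
    constexpr std::uint32_t ExpOffset = 126;
    constexpr int MantBits = 23;

    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);

    // Biased exponent field minus 126, computed in signed arithmetic.
    const int exp = static_cast<int>((u >> MantBits) & ExpMask)
                  - static_cast<int>(ExpOffset);

    u &= ~(ExpMask << MantBits); // zero out the exponent bits
    u |= ExpOffset << MantBits;  // exponent field 126 puts |f| into [0.5, 1)

    std::memcpy(&f, &u, sizeof f);
    return {exp, f};
}

// Hypothetical test helper: returns true if fast_frexp matches std::frexp
// on every normal float among `samples` random 32-bit patterns.
inline bool agrees_with_std_frexp(int samples, unsigned seed = 42)
{
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::uint32_t> bits; // all 32-bit patterns

    for (int i = 0; i < samples; ++i) {
        const std::uint32_t u = bits(rng);
        float f;
        std::memcpy(&f, &u, sizeof f);
        if (!std::isnormal(f)) continue; // skip zero, subnormal, inf, NaN

        int expected_exp = 0;
        const float expected_mant = std::frexp(f, &expected_exp);
        const auto result = fast_frexp(f);
        if (result.first != expected_exp || result.second != expected_mant)
            return false;
    }
    return true;
}
```

Calling agrees_with_std_frexp(100000) exercises both positive and negative normal values. A passing run only says something about the platform it ran on; it does not show that the bit layout is guaranteed elsewhere.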

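In case anyone wonders why I'm doing this: in benchmarks I found that this version of frexp is 15 times faster than std::frexp on Windows 10. I haven't tested other platforms yet. But I want to make sure that this does not just work by coincidence and then break at some point in the future.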

Edit:

As mentioned in the comments, endianness could be an issue. Does anybody know?

Answer 1

"Does this mean that the bit operations I'm doing are well defined..."

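TL;DR: by the strict definition of "well defined": no.

Your assumptions are probably correct, but they are not well defined, because nothing guarantees the bit width, or the value representation, of float. From § 3.9.1: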

there are three floating point types: float, double, and long double. The type double provides at least as much precision as float, and the type long double provides at least as much precision as double. The set of values of the type float is a subset of the set of values of the type double; the set of values of the type double is a subset of the set of values of the type long double. The value representation of floating-point types is implementation-defined.

The is_iec559 clause only states:

True if and only if the type adheres to IEC 559 standard

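If a literal-minded genie wrote you a terrible compiler that made float = binary16, double = binary32, and long double = binary64, and made is_iec559 true for all three types, it would still conform to the standard.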
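One way to rule out the genie's compiler is to check the concrete binary32 parameters at compile time rather than relying on is_iec559 alone. The following is a sketch under that assumption. The helper name looks_like_binary32 is hypothetical, and the checks only pin down the format's width, radix, precision, and exponent range. They say nothing about how the bits are laid out in memory.

```cpp
#include <climits>
#include <cstdint>
#include <limits>

// Hypothetical helper: true if T has the parameters of IEEE 754 binary32.
// is_iec559 alone would also be satisfied by binary16 or binary64.
template <typename T>
constexpr bool looks_like_binary32()
{
    using L = std::numeric_limits<T>;
    return L::is_iec559
        && L::radix == 2
        && L::digits == 24          // 23 stored mantissa bits + 1 implicit bit
        && L::max_exponent == 128   // consistent with an 8-bit exponent field
        && L::min_exponent == -125
        && sizeof(T) * CHAR_BIT == 32;
}

// Fails to compile on an implementation where float is not binary32.
static_assert(looks_like_binary32<float>(), "float is not IEEE 754 binary32");
```

On a typical desktop implementation, looks_like_binary32<double>() evaluates to false because double has 53 digits of precision.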

does that mean that I can extract exponent and mantissa in a well defined way?

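TL;DR: by the limited guarantees of the C++ standard: no.

Assume you use float32_t and is_iec559 is true, that you logically deduce from all the rules that it can only be binary32 with no trap representations, and that you correctly argue that memcpy is well defined for conversion between arithmetic types of the same width and does not break strict aliasing. Even with all those assumptions, the behavior may be well defined, but it is only likely, not guaranteed, that you can extract the mantissa this way.

The IEEE 754 standard and two's complement are about bit-string encodings, whereas the behavior of memcpy is described in terms of bytes. While it is plausible to assume that the bit-strings of uint32_t and float32_t are encoded the same way (e.g. with the same endianness), the standard does not guarantee it. If the bit-strings are stored differently and you shift and mask the copied integer representation to get the mantissa, the answer will be incorrect, even though the memcpy behavior is well defined.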

As mentioned in the comments, endianness could be an issue. Does anybody know?

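At least a few architectures have used different endianness for floating-point registers and integer registers. The same link says that, except for small embedded processors, this isn't a concern. I trust Wikipedia entirely on all topics and refuse to do any further research.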
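Since the standard does not settle the question, one pragmatic option is to test the layout on the target platform. This is a minimal sketch, assuming float is 32 bits wide. The function name float_and_int_layout_agree and the probe value are illustrative choices, not part of any library.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns true if copying a float into a uint32_t
// yields the IEEE 754 binary32 bit-string as an integer value, i.e. the
// float and integer byte orders agree on this platform.
inline bool float_and_int_layout_agree() noexcept
{
    static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits wide");

    // -118.625f in binary32: sign = 1, biased exponent = 133 (0x85),
    // stored mantissa = 0x6D4000, giving the bit-string 0xC2ED4000.
    const float probe = -118.625f;
    std::uint32_t u;
    std::memcpy(&u, &probe, sizeof u);
    return u == 0xC2ED4000u;
}
```

A start-up check or unit test could call this once and fall back to std::frexp if it returns false. The probe value has a different value in each of its four bytes, so a byte-order mismatch between the two types would change the result.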
