When is a floating point operation 'invalid'?

Question

Consider the following on a Xeon 15something (msvc15 and 16, i.e. Visual Studio 2017 and 2019):

#include <float.h>   // _status87
#include <stdint.h>  // uint8_t

int main()
{
    unsigned int x;
    uint8_t val;
    float f;

    x = _status87();    // x = 0 here, OK
    f = -1.00e+9;
    x = _status87();    // x = 0 here, OK
    val = uint8_t(f);   // val = 0 here, I can live with that
    x = _status87();    // x = 0 here, OK
    f = -1.00e+10;
    val = uint8_t(f);   // val = 0 here, I can live with that
    x = _status87();    // x = 16 = _SW_INVALID (same value as _EM_INVALID), wtf?
}

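Obviously, some type conversions give 'wrong' results: when the number you want to store is larger than what fits in a variable of the target type, the value cannot be stored. My question is: why is the floating-point status flag set to 'invalid'? Overflow/underflow and/or inexact I could live with, but why 'invalid'? I can't find a definition anywhere of what this particular CPU considers an 'invalid' floating point operation. I also don't understand why the flag is not set for exponent 9 (even though the value doesn't fit and the conversion result is 0), but is set for exponent 10. As far as I can tell, no relevant maximum/minimum is reached at that threshold.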

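More importantly (for me), is there a way to do the conversion without touching the floating-point status register? The reason is that the code I'm working with relies (later on) on the register not being in an 'invalid' state, and I can't reasonably or reliably modify every use of that flag check. Simply resetting the flag is error prone too, because of assumptions made elsewhere, where 'elsewhere' is code I can't touch. I've been looking at boost::numeric_cast, but that doesn't seem to help here, unless I'm missing something?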

But in general, any help on how 'invalid' floating point operations work would be appreciated.

Answer 1

In the generated assembly, we can see that the conversion uses the instruction cvttss2si. The documentation for this instruction reads:

Converts a single-precision floating-point value in the source operand (the second operand) to a signed double-word integer (or signed quadword integer if operand size is 64 bits) in the destination operand (the first operand).

Since the register used here is eax, the doubleword case applies. Further on, it says:

If a converted result is larger than the maximum signed doubleword integer, the floating-point invalid exception is raised.

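In your case, -1e9 can be stored in a signed doubleword (range -2147483648 to 2147483647), but -1e10 cannot. The exception then seems to simply be reflected in the status register that _status87() reads.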
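To make the threshold concrete, here is a minimal sketch of the range test implied by the documentation quoted above. The helper names fits_in_int32 and truncate_to_int32 are hypothetical and not part of any library; the sketch assumes float is an IEEE 754 single-precision value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if truncating f toward zero yields a value that
// fits in a signed 32-bit integer, i.e. the destination range of the
// doubleword form of cvttss2si. NaN compares false and is rejected.
inline bool fits_in_int32(float f)
{
    // Both bounds are exactly representable as float (-2^31 and 2^31).
    return f >= -2147483648.0f && f < 2147483648.0f;
}

// Hypothetical helper: converts only inputs that pass the range test above.
inline std::int32_t truncate_to_int32(float f)
{
    assert(fits_in_int32(f));  // precondition: the result must be representable
    return static_cast<std::int32_t>(f);
}
```

With these definitions, fits_in_int32(-1.0e9f) is true and fits_in_int32(-1.0e10f) is false, which matches the observation that only the second conversion in the question sets the flag.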

Note that according to the C++ standard, the behavior here is undefined, see conv.fpint/1:

A prvalue of a floating-point type can be converted to a prvalue of an integer type. The conversion truncates; that is, the fractional part is discarded. The behavior is undefined if the truncated value cannot be represented in the destination type.

This applies to both values of f, since neither -1e9 nor -1e10 can be represented in uint8_t.
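One way to sidestep both the undefined behavior and the flag might be to test the range before converting, so that the conversion only ever sees representable inputs. The following is a sketch under that assumption, not a definitive fix: to_u8_checked and invalid_flag_set are hypothetical names, and the portable <cfenv> interface stands in for the MSVC-specific _status87(). Whether a compiler keeps floating-point flag state precise under optimization is also an assumption worth verifying for your build settings.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>
#include <cstdint>

// Hypothetical helper: stores the truncated value in `out` and returns true
// only if it is representable in uint8_t; otherwise leaves `out` untouched.
bool to_u8_checked(float f, std::uint8_t& out)
{
    // Reject NaN first: ordered comparisons on NaN may themselves raise the
    // invalid flag, while std::isnan is typically a quiet test.
    if (std::isnan(f))
        return false;

    // Truncation discards the fraction, so only values in (-1, 256) map to 0..255.
    if (f <= -1.0f || f >= 256.0f)
        return false;

    const float truncated = std::trunc(f);
    assert(truncated >= 0.0f && truncated <= 255.0f);  // guaranteed by the range check
    out = static_cast<std::uint8_t>(truncated);
    return true;
}

// Hypothetical helper: true if the floating-point invalid flag is currently set.
bool invalid_flag_set()
{
    return std::fetestexcept(FE_INVALID) != 0;
}
```

A caller would clear the flags with std::feclearexcept(FE_ALL_EXCEPT), call to_u8_checked(-1.0e10f, val), and decide what to do on a false return (for example clamp to 0). Since the out-of-range value never reaches the conversion instruction, the invalid flag should stay clear.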
