Standard-compliant way to define a floating-point equivalence relation

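I am aware of the usual issues with floating-point arithmetic and precision loss, so this is not the usual question about why 0.1 + 0.2 != 0.3 and the like.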

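Rather, I want to implement a binary predicate in C++ (in a 100% standard-compliant way) that implements a true mathematical equivalence relation (i.e. one that is reflexive, transitive, and symmetric), such that two doubles are in the same equivalence class if they represent exactly the same value in all respects, distinguishing corner cases like 0.0 and -0.0 but treating all NaN values as belonging to the same equivalence class. (In particular, the default == is not what I want, because it is non-reflexive for NaN and does not distinguish between 0.0 and -0.0, which I would like to be in different equivalence classes, since they are actually different values and lead to different runtime behavior.)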

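What is the shortest and simplest way to do this that does not rely on type punning in any way or on any implementation-defined behavior? So far I have: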

#include <cmath>

bool equiv(double x, double y)
{   
    return (x == y && (x != 0.0 || std::signbit(x) == std::signbit(y))) ||
           (std::isnan(x) && std::isnan(y));
}

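I believe this handles the corner cases I know of and described above, but are there any other corner cases that I am missing and that this does not handle? And is the binary predicate above guaranteed to define an equivalence relation according to the C++ standard, or is any of its behavior unspecified, implementation-defined, etc.?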

Looks fine.

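You can actually get rid of the function calls on platforms that implement IEEE 754 (Intel, Power, and ARM do), because the special floating-point values can be detected without any calls.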

bool equiv(double x, double y) {
    return (x == y && (x || (1 / x == 1 / y))) || (x != x && y != y);
}

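The above relies on the following IEEE 754 behavior:

  • A non-zero value divided by zero yields a special infinity value that preserves the sign. Hence 1 / -0. yields -infinity. Infinity values with the same sign compare equal.
  • NaN compares unequal to everything, including itself.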
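The two facts in the list above can be checked directly. The following is only a minimal sketch and not part of the original answer: the helper name ieee_facts_hold is made up for illustration, and the static_assert encodes the assumption that double is an IEEE 754 (IEC 60559) type. Note that, strictly speaking, the C++ standard itself leaves division by zero undefined, so the division trick relies on the implementation's IEEE 754 support rather than on the language standard alone.

```cpp
#include <limits>

// Assumption: double is an IEEE 754 (IEC 60559) type; refuse to compile otherwise.
static_assert(std::numeric_limits<double>::is_iec559,
              "this sketch assumes IEEE 754 doubles");

// Hypothetical helper: returns true if the facts used by the shortcut hold here.
bool ieee_facts_hold() {
    const double inf = std::numeric_limits<double>::infinity();

    // volatile discourages the compiler from folding everything at compile time.
    volatile double v_pos_zero = 0.0;
    volatile double v_neg_zero = -0.0;
    volatile double v_nan = std::numeric_limits<double>::quiet_NaN();
    const double pos_zero = v_pos_zero;
    const double neg_zero = v_neg_zero;
    const double nan = v_nan;

    // +0.0 and -0.0 compare equal with ==, which is why the extra test is needed.
    const bool zeros_compare_equal = (pos_zero == neg_zero);
    // Dividing by a signed zero yields an infinity with the matching sign.
    const bool division_keeps_sign =
        (1.0 / pos_zero == inf) && (1.0 / neg_zero == -inf);
    // NaN is the only value that compares unequal to itself.
    const bool nan_unequal_to_itself = (nan != nan) && !(nan == nan);

    return zeros_compare_equal && division_keeps_sign && nan_unequal_to_itself;
}
```

If ieee_facts_hold() returns false (for example under aggressive floating-point optimization flags), the function-free version of equiv should not be trusted on that build.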

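But the original version reads better for most people. Judging from interviewing experience, not every developer knows how special floating-point values arise and behave.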
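To gain some confidence that the two versions agree, one can run both over a table of corner-case values and also check reflexivity and symmetry on that table. This is a sketch under assumptions: the names equiv_signbit, equiv_division, and corner_cases_pass are hypothetical, the sample list is not exhaustive, and the division variant assumes IEEE 754 doubles. The implicit double-to-bool conversion from the short version is spelled out as x != 0.0 here.

```cpp
#include <cmath>
#include <limits>

// The predicate from the question (uses <cmath> classification functions).
bool equiv_signbit(double x, double y) {
    return (x == y && (x != 0.0 || std::signbit(x) == std::signbit(y))) ||
           (std::isnan(x) && std::isnan(y));
}

// The function-free variant; assumes IEEE 754 division by zero semantics.
bool equiv_division(double x, double y) {
    return (x == y && (x != 0.0 || (1 / x == 1 / y))) || (x != x && y != y);
}

// Hypothetical check: both versions agree, and are reflexive and symmetric,
// on a small table of corner-case values.
bool corner_cases_pass() {
    using L = std::numeric_limits<double>;
    const double samples[] = {0.0, -0.0, 1.0, -1.0,
                              L::min(), L::denorm_min(), L::max(),
                              L::infinity(), -L::infinity(), L::quiet_NaN()};
    for (double a : samples) {
        if (!equiv_signbit(a, a)) return false;                        // reflexive
        for (double b : samples) {
            if (equiv_signbit(a, b) != equiv_signbit(b, a)) return false;  // symmetric
            if (equiv_signbit(a, b) != equiv_division(a, b)) return false; // agreement
        }
    }
    return true;
}
```

A table like this cannot prove transitivity in general, but it does exercise the cases the question singles out: signed zeros, infinities, subnormals, and NaN.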

If only NaN had a single representation, this would boil down to memcmp.
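The reason memcmp is not enough can be illustrated with a short sketch. It assumes IEEE 754 doubles, where a NaN may carry either sign bit (and different payloads); the helper names same_bytes and nan_bytes_can_differ are hypothetical. A byte-wise comparison separates 0.0 from -0.0 as desired, but it also separates NaNs that the predicate should treat as equivalent.

```cpp
#include <cmath>
#include <cstring>
#include <limits>

// Hypothetical helper: byte-wise comparison of the object representations.
bool same_bytes(double x, double y) {
    return std::memcmp(&x, &y, sizeof(double)) == 0;
}

// Hypothetical helper: builds two NaNs that differ only in the sign bit and
// reports whether their bytes differ (expected on IEEE 754 platforms).
bool nan_bytes_can_differ() {
    const double a = std::numeric_limits<double>::quiet_NaN();
    // Flip the sign bit of the NaN without changing anything else.
    const double b = std::copysign(a, std::signbit(a) ? 1.0 : -1.0);
    return std::isnan(a) && std::isnan(b) && !same_bytes(a, b);
}
```

So a memcmp-based predicate would split the NaN values into several equivalence classes, which is not what the question asks for.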


Regarding the C++ and C language standards, the book The New C Standard says:

The term IEEE floating point is often heard. This usage came about because the original standards on this topic were published by the IEEE. This standard for binary floating-point arithmetic is what many host processors have been providing for over a decade. However, its use is not mandated by C99.

The representation for binary floating-point specified in this standard is used by the Intel x86 processor family, Sun SPARC, HP PA-RISC, IBM PowerPC, HP–was DEC – Alpha, and the majority of modern processors (some DSP processors support a subset, or make small changes, for cost/performance reasons; while others have more substantial differences e.g., TMS320C3x uses two’s complement). There is also a publicly available software implementation of this standard.

Other representations are still supported by processors (IBM 390 and HP–was DEC – VAX) having an existing customer base that predates the publication the documents on which this standard is based. These representations will probably continue to be supported for some time because of the existing code that relies on it (the IBM 390 and HP–was DEC– Alpha support both their companies respective older representations and the IEC 60559 requirements).

There is a common belief that once the IEC 60559 Standard has been specified all of its required functionality will be provided by conforming implementations. It is possible that a C program’s dependencies on IEC 60559 constructs, which can vary between implementations, will not be documented because of this common, incorrect belief (the person writing documentation is not always the person who is familiar with this standard).

Like the C Standard the IEC 60559 Standard does not fully specify the behavior of every construct. It also provides optional behavior for some constructs, such as when underflow is raised, and has optional constructs that an implementation may or may not make use of, such as double standard. C99 does not always provide a method for finding out an implementation’s behavior in these optional areas. For instance, there are no standard macros describing the various options for handling underflow.

What Every Computer Scientist Should Know About Floating-Point Arithmetic says:

Languages and Compilers

Ambiguity

Ideally, a language definition should define the semantics of the language precisely enough to prove statements about programs. While this is usually true for the integer part of a language, language definitions often have a large grey area when it comes to floating-point. Perhaps this is due to the fact that many language designers believe that nothing can be proven about floating-point, since it entails rounding error. If so, the previous sections have demonstrated the fallacy in this reasoning. This section discusses some common grey areas in language definitions, including suggestions about how to deal with them.

... Another ambiguity in most language definitions concerns what happens on overflow, underflow and other exceptions. The IEEE standard precisely specifies the behavior of exceptions, and so languages that use the standard as a model can avoid any ambiguity on this point.

... Another grey area concerns the interpretation of parentheses. Due to round-off errors, the associative laws of algebra do not necessarily hold for floating-point numbers... Whether or not the language standard specifies that parenthesis must be honored, (x+y)+z can have a totally different answer than x+(y+z), as discussed above.

.... rounding can be a problem. The IEEE standard defines rounding very precisely, and it depends on the current value of the rounding modes. This sometimes conflicts with the definition of implicit rounding in type conversions or the explicit round function in languages.

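The language standard cannot possibly specify the results of floating-point operations because, for example, the rounding mode can be changed at run time using std::fesetround.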
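As an illustration of that point, the same division can produce different results depending on the rounding mode in effect. This is a minimal sketch under assumptions: the helper name divide_rounded is hypothetical, the platform is assumed to define FE_DOWNWARD and FE_UPWARD, and volatile is used as a practical workaround because compiler support for #pragma STDC FENV_ACCESS ON varies.

```cpp
#include <cfenv>

// Hypothetical helper: computes a / b under the given rounding mode and then
// restores the rounding mode that was active before the call.
double divide_rounded(double a, double b, int mode) {
    const int saved = std::fegetround();
    std::fesetround(mode);
    // volatile forces the division to happen at run time, between the two
    // fesetround calls, instead of being folded at compile time.
    volatile double va = a;
    volatile double vb = b;
    volatile double result = va / vb;
    std::fesetround(saved);
    return result;
}
```

For an inexact quotient such as 1.0 / 3.0, divide_rounded(1.0, 3.0, FE_DOWNWARD) is expected to be strictly smaller than divide_rounded(1.0, 3.0, FE_UPWARD) on an IEEE 754 platform.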

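So the C and C++ languages have no choice but to map operations on floating-point types directly to hardware instructions and not interfere, which is what they do. Hence, the languages neither copy the IEEE/IEC standards nor mandate them.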