Do std::min(0.0, 1.0) and std::max(0.0, 1.0) yield undefined behavior?

Question

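The question is pretty clear. Below I give the reasons why I think these expressions might yield undefined behavior. I would like to know whether my reasoning is right or wrong, and why.

Short read:

(IEEE 754) double is not Cpp17LessThanComparable, since < is not a strict weak ordering relation because of NaN. Therefore, the Requires element of std::min<double> and std::max<double> is violated.

Long read:

All references follow n4800. The specifications of std::min and std::max are given in 24.7.8: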

template<class T> constexpr const T& min(const T& a, const T& b);
template<class T> constexpr const T& max(const T& a, const T& b);
Requires: [...] type T shall be Cpp17LessThanComparable (Table 24).

Table 24 defines Cpp17LessThanComparable and says:

Requirement: < is a strict weak ordering relation (24.7)

Section 24.7/4 defines strict weak ordering. In particular, for < it states that "if we define equiv(a, b) as !(a < b) && !(b < a) then equiv(a, b) && equiv(b, c) implies equiv(a, c)".

Now, according to IEEE 754, equiv(0.0, NaN) == true and equiv(NaN, 1.0) == true, but equiv(0.0, 1.0) == false, so we conclude that < is not a strict weak ordering. Therefore, (IEEE 754) double is not Cpp17LessThanComparable, which violates the Requires clause of std::min and std::max.
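The counterexample can be checked mechanically. The following is a minimal sketch, assuming IEEE 754 doubles; the helper names equiv, breaks_transitivity and demo_question_triple are illustrative and do not come from the Standard.

```cpp
#include <cassert>
#include <limits>

// equiv(a, b) as in the quoted wording: neither value is less than the other.
inline bool equiv(double a, double b) {
    return !(a < b) && !(b < a);
}

// True when equiv(a, b) and equiv(b, c) hold but equiv(a, c) does not,
// i.e. when (a, b, c) is a counterexample to transitivity of equivalence.
inline bool breaks_transitivity(double a, double b, double c) {
    return equiv(a, b) && equiv(b, c) && !equiv(a, c);
}

// Usage sketch: the triple (0.0, NaN, 1.0) from the question.
inline void demo_question_triple() {
    const double nan = std::numeric_limits<double>::quiet_NaN();
    assert(breaks_transitivity(0.0, nan, 1.0));   // NaN in the middle breaks it
    assert(!breaks_transitivity(0.0, 0.5, 1.0));  // ordinary values do not
}
```

Every comparison involving NaN is false, so NaN is "equivalent" to everything, which is what breaks transitivity.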

Finally, 15.5.4.11/1 says:

Violation of any preconditions specified in a function’s Requires: element results in undefined behavior [...].

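Update 1:

The point of the question is not to argue that std::min(0.0, 1.0) is undefined and that anything can happen when a program evaluates this expression. It returns 0.0. Period. (I have never doubted it.)

The point is to show a (possible) defect in the Standard. In a laudable quest for precision, the Standard often uses mathematical terminology, and strict weak ordering is just one example. On these occasions, mathematical precision and reasoning must go all the way.

Look, for instance, at Wikipedia's definition of strict weak ordering. It contains four bullet points, and every single one of them starts with "For every x [...] in S...". None of them says "For some values x in S that make sense for the algorithm" (what algorithm?). In addition, the specification of std::min clearly says that "T shall be Cpp17LessThanComparable", which entails that < is a strict weak ordering on T. Therefore, T plays the role of the set S in Wikipedia's page, and the four bullet points must hold when the values of T are considered in their entirety.

Obviously, NaNs are quite different beasts from other double values, but they are still possible values. I do not see anything in the Standard (which is quite big, 1719 pages, hence this question and the language-lawyer tag) that mathematically leads to the conclusion that std::min is fine with doubles provided that NaNs are not involved.

Actually, one can argue that NaNs are fine and that the other doubles are the issue! Indeed, recall that there are several possible NaN double values (2^52 - 1 of them for each sign bit, each one carrying a different payload). Consider the set S containing all these values and one "normal" double, say, 42.0. In symbols, S = { 42.0, NaN_1, ..., NaN_n }. It turns out that < is a strict weak ordering on S (the proof is left to the reader). Is this the set of values the C++ Committee had in mind when specifying std::min, as in "please, do not use any other value, otherwise the strict weak ordering is broken and the behavior of std::min is undefined"? I bet it is not, but I would rather read this in the Standard than speculate about what "some values" means.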
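The claim about restricted sets can be tested by brute force on small finite sets. This is only a sketch: is_strict_weak_order_on and demo_restricted_sets are hypothetical helpers, and a NaN with the sign bit flipped stands in for "NaNs with different payloads", because producing specific payloads portably is not straightforward.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Checks the strict-weak-ordering axioms for operator< restricted to the
// finite set s: irreflexivity, transitivity of <, and transitivity of
// equivalence (incomparability). Asymmetry follows from the first two.
inline bool is_strict_weak_order_on(const std::vector<double>& s) {
    auto equiv = [](double a, double b) { return !(a < b) && !(b < a); };
    for (double a : s) {
        if (a < a) return false;  // irreflexivity
        for (double b : s) {
            for (double c : s) {
                if (a < b && b < c && !(a < c)) return false;
                if (equiv(a, b) && equiv(b, c) && !equiv(a, c)) return false;
            }
        }
    }
    return true;
}

// Usage sketch for the sets discussed above.
inline void demo_restricted_sets() {
    const double nan = std::numeric_limits<double>::quiet_NaN();
    assert(is_strict_weak_order_on({42.0, nan, -nan}));  // one number, only NaNs otherwise
    assert(is_strict_weak_order_on({0.0, 1.0, 42.0}));   // no NaN at all
    assert(!is_strict_weak_order_on({0.0, nan, 1.0}));   // two numbers plus a NaN
}
```

On { 42.0, NaN, ... } every comparison is false, so all elements fall into a single equivalence class and the axioms hold trivially.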

Update 2:

Contrast the declaration of std::min (above) with that of clamp in 24.7.9:

template<class T> constexpr const T& clamp(const T& v, const T& lo, const T& hi);
Requires: The value of lo shall be no greater than hi. For the first form, type T shall be Cpp17LessThanComparable (Table 24). [...]
[Note : If NaN is avoided, T can be a floating-point type. — end note]

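Here we clearly see something that says "std::clamp is fine with doubles provided that NaNs are not involved". I was looking for the same kind of sentence for std::min.

The paragraph [structure.requirements]/8 that Barry mentioned is worth noting. Apparently, it was added post-C++17, coming from P0898R0: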

Required operations of any concept defined in this document need not be total functions; that is, some arguments to a required operation may result in the required semantics failing to be satisfied. [Example: The required < operator of the StrictTotallyOrdered concept (17.5.4) does not meet the semantic requirements of that concept when operating on NaNs. — end example ] This does not affect whether a type satisfies the concept.

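This is a clear attempt to address the issue I am raising here, but in the context of concepts (and, as Barry pointed out, Cpp17LessThanComparable is not a concept). In addition, IMHO, this paragraph also lacks precision.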

Answer 1

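The only possible (not just plausible) interpretation is that the equations apply to the values in the range of the function, that is, to the values actually used by the algorithm.

You might think of a type as defining a set of values, but for a UDT that would not make sense anyway. Your interpretation of the range as every possible value of the type is patently absurd.

There is no problem here.

There might be a very serious problem on implementations where a floating-point value can carry more precision than its type allows: the whole idea of the mathematical value of a floating-point object loses all meaning, because the compiler may decide at any time to change that value in order to drop precision. In fact, no semantics can be defined in that case. Any such implementation is broken, and any program probably works only by accident.

EDIT:

A type does not define a set of values for an algorithm. This is obvious for user-defined data types that have internal invariants which are not formally specified in any code.

The set of values usable in any container or algorithm (containers internally use algorithms on their elements) is a property of that particular use of that container or algorithm. These library components do not share their elements: if you have two set<fraction> objects S1 and S2, the elements of one will not be used by the other. S1 will compare elements in S1, and S2 will compare elements in S2. The two sets exist in different "universes" and their logical properties are isolated. The invariants hold for each one independently; if you insert into S2 an element x2 that is neither less than nor greater than x1 in S1 (and is thus considered equivalent), you do not expect x2 to be found in place of x1 in S1! Containers cannot share data structures, and elements cannot be shared between algorithms (which cannot have static variables of a template type, as these would have an unexpected lifetime).

Sometimes the standard is a riddle in which you must find the correct interpretation (the most plausible, the most useful, the one most likely to have been intended). If committee members are asked to clarify an issue, they will settle on the most X interpretation (X = plausible, useful...), even if it flatly contradicts the previous wording. So when the text is obscure or leads to crazy conclusions, you might as well skip the literal reading and jump straight to the most useful one.

The only solution here is that every use of a templated library component is independent, and that the equations only have to hold during that use.

You do not expect vector<int*> to be invalid because pointers can have invalid values that cannot be copied: only a use of such a value is illegal.

Thus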

vector<int*> v;
v.push_back(new int);
vector<int*> v2 = v; // content must be valid
delete v[0];
v[0] = nullptr; // during the v[0] invocation, (int*)(v[0]) has no valid value

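is valid because the required properties of the element type hold for the short period during which they are required to hold.

In that case we can invoke a member function of a vector knowing that its elements do not satisfy the Assignable concept, because no assignment is allowed, as the no-exception guarantee does not permit it: the value stored in v[0] cannot be used by v[0], since no user-defined operation on the element is allowed in vector<>::operator[].

Library components can only use the specific operations mentioned in the description of the specific function, on the values used in that invocation. Even for a built-in type, a component cannot make up values in any other way: a specific set<int,comp> instance may not compare values to 0 if 0 is not inserted or looked up in that particular instance, as 0 may not even be in the domain of comp.

So built-in and class types are treated uniformly here. The library implementation cannot assume anything about the set of values, even when instantiated with built-in types.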

Answer 2

In the new [concepts.equality], in a slightly different context, we have:

An expression is equality-preserving if, given equal inputs, the expression results in equal outputs. The inputs to an expression are the set of the expression's operands. The output of an expression is the expression's result and all operands modified by the expression.

Not all input values need be valid for a given expression; e.g., for integers a and b, the expression a / b is not well-defined when b is 0. This does not preclude the expression a / b being equality-preserving. The domain of an expression is the set of input values for which the expression is required to be well-defined.

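While this notion of the domain of an expression is not expressed consistently throughout the standard, it is the only reasonable intent: syntactic requirements are properties of the type, semantic requirements are properties of the actual values.

More generally, we also have [structure.requirements]/8: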

Required operations of any concept defined in this document need not be total functions; that is, some arguments to a required operation may result in the required semantics failing to be satisfied. [ Example: The required < operator of the StrictTotallyOrdered concept ([concept.stricttotallyordered]) does not meet the semantic requirements of that concept when operating on NaNs. — end example ] This does not affect whether a type satisfies the concept.

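This refers specifically to concepts, not to named requirements like Cpp17LessThanComparable, but it is the right spirit for understanding how the library is intended to work.

Cpp17LessThanComparable gives the semantic requirement: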

< is a strict weak ordering relation (24.7)

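The only way to violate this is to provide a pair of values that violates the requirements of a strict weak ordering. For a type like double, that would be NaN. min(1.0, NaN) is undefined behavior: we are violating the semantic requirements of the algorithm. But for floating-point values without NaN, < is a strict weak ordering, so that is fine... you can use min, max, sort, all you like.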
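One way to make this domain restriction explicit in code is to check it before calling the algorithm. This is a minimal sketch: checked_min is a hypothetical helper, not something the standard library provides, and the check disappears when NDEBUG is defined.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of the standard library): forwards to
// std::min after checking that both arguments lie in the domain on which
// operator< behaves as a strict weak ordering, i.e. neither is NaN.
inline const double& checked_min(const double& a, const double& b) {
    assert(!std::isnan(a) && !std::isnan(b) && "NaN is outside the domain of <");
    return std::min(a, b);
}
```

With this helper, checked_min(0.0, 1.0) behaves like std::min(0.0, 1.0), while a call such as checked_min(1.0, NaN) trips the assertion in a debug build instead of silently returning an order-dependent result.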

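Going forward, when we start writing algorithms that use operator<=>, this notion of domain is one reason why expressing a syntactic requirement of ConvertibleTo<decltype(x <=> y), weak_ordering> would be the wrong requirement. Having x <=> y be partial_ordering is fine; what is not fine is seeing a pair of values for which x <=> y is partial_ordering::unordered (which, at least, we could diagnose).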

Answer 3

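Disclaimer: I do not know the full C++ standard; I did some research on what it says about floating-point numbers. I do know IEEE 754-2008 floating-point numbers and C++.

Yes, you are right, this is undefined behavior according to the C++17 standard.

Short read:

The standard does not say that std::min(0.0, 1.0); is undefined behavior; it says that constexpr const double& min(const double& a, const double& b); is undefined behavior. That is, it is not the application of the function that is undefined, it is the function declaration itself. This matches the mathematics: as you noted, a minimum function cannot be defined over the full range of IEEE 754 floating-point numbers.

But undefined behavior does not necessarily mean a crash or a compilation error. It only means that the behavior is not defined by the C++ standard, which specifically says that it may be "behaving during translation or program execution in a documented manner characteristic of the environment".

Why you should not use std::min on doubles:

Because I realize that the long read section below can get boring, here is a toy example of the risk of NaNs in comparisons (I do not even try sorting algorithms...):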

#include <iostream>
#include <cmath>
#include <algorithm>

int main(int, char**)
{
    double one = 1.0, zero = 0.0, nan = std::nan("");

    std::cout << "std::min(1.0, NaN) : " << std::min(one, nan) << std::endl;
    std::cout << "std::min(NaN, 1.0) : " << std::min(nan, one) << std::endl;

    std::cout << "std::min_element(1.0, 0.0, NaN) : " << std::min({one, zero, nan}) << std::endl;
    std::cout << "std::min_element(NaN, 1.0, 0.0) : " << std::min({nan, one, zero}) << std::endl;

    std::cout << "std::min(0.0, -0.0) : " << std::min(zero, -zero) << std::endl;
    std::cout << "std::min(-0.0, 0.0) : " << std::min(-zero, zero) << std::endl;
}

When compiled on my MacBook Pro with Apple LLVM version 10.0.0 (clang-1000.10.44.4) (I am being precise because, well, this is undefined behavior, so in theory other compilers may give different results), I get:

$ g++ --std=c++17 ./test.cpp
$ ./a.out
std::min(1.0, NaN) : 1
std::min(NaN, 1.0) : nan
std::min_element(1.0, 0.0, NaN) : 0
std::min_element(NaN, 1.0, 0.0) : nan
std::min(0.0, -0.0) : 0
std::min(-0.0, 0.0) : -0

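This means that, contrary to what you might assume, std::min is not symmetric when NaNs, or even -0.0, are involved. And NaNs do not propagate. Short story: this caused me some pain on a previous project, where I had to implement my own min function to correctly propagate NaNs on both sides, as the project specification required. Because std::min on doubles is not defined!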
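A replacement along those lines could look like the sketch below. It is not the function from that project: nan_propagating_min is a hypothetical name, and preferring -0.0 over +0.0 is an extra assumption made here so that the result does not depend on argument order.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: a minimum that returns NaN whenever either argument
// is NaN, regardless of argument order.
inline double nan_propagating_min(double a, double b) {
    if (std::isnan(a)) return a;
    if (std::isnan(b)) return b;
    // Assumption of this sketch: treat -0.0 as smaller than +0.0 so that
    // min(0.0, -0.0) and min(-0.0, 0.0) agree.
    if (a == b) return std::signbit(a) ? a : b;
    return b < a ? b : a;
}

// Usage sketch: both argument orders give the same kind of result.
inline void demo_nan_propagating_min() {
    const double nan = std::nan("");
    assert(std::isnan(nan_propagating_min(1.0, nan)));
    assert(std::isnan(nan_propagating_min(nan, 1.0)));
    assert(std::signbit(nan_propagating_min(0.0, -0.0)));
    assert(std::signbit(nan_propagating_min(-0.0, 0.0)));
}
```

Note that std::fmin from <cmath> goes the other way: when exactly one argument is NaN it returns the other argument, so it is not a substitute when NaN propagation is required.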

IEEE 754:

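As you have noted, IEEE 754 floating-point numbers (or ISO/IEC/IEEE 60559:2011-06, which is the norm used by the C11 standard, see below, and which more or less copies IEEE 754 for the C language) do not have a strict weak ordering, because NaNs violate the transitivity of incomparability (fourth point of the Wikipedia page).

The fun part is that the IEEE 754 norm was revised in 2008 (now named IEEE 754-2008), and the revision includes a total ordering function. The fact is that neither C++17 nor C11 implements IEEE 754-2008; they implement ISO/IEC/IEEE 60559:2011-06 instead.

But who knows? Maybe that will change in the future.

Long read:

First, let's recall what undefined behavior actually is, from the same standard draft you linked (emphasis mine):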

undefined behavior: behavior for which this document imposes no requirements

[Note 1 to entry: Undefined behavior may be expected when this document omits any explicit definition of behavior or when a program uses an erroneous construct or erroneous data. Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message). Many erroneous program constructs do not engender undefined behavior; they are required to be diagnosed. Evaluation of a constant expression never exhibits behavior explicitly specified as undefined in Clause 4 through Clause 14 of this document (7.7). —end note]

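There is no such thing as "yielding" undefined behavior. It is simply something that is not defined in the C++ standard. This may mean that you can use it at your own risk and get a correct result (as with std::min(0.0, 1.0);), or it might raise a warning or even a compilation error, if you find a compiler that is really careful about floating-point numbers!

About the subset... you say: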

I do not see anything in the Standard (which is quite big, 1719 pages, and hence this question and the language-lawyer tag) that mathematically leads to the conclusion that std::min is fine with doubles provided that NaNs are not involved.

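I have not read the standard myself either, but from the part you posted, it seems that the standard already says this is fine. I mean, if you construct a new type T that wraps doubles and excludes NaNs, then the definition of template<class T> constexpr const T& min(const T& a, const T& b); applied to your new type will have defined behavior, and it will behave exactly as you would expect from a minimum function.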
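Such a wrapper could be sketched as follows. The class name NonNaN, its interface, and the choice to throw std::invalid_argument are all assumptions made for illustration; nothing like it exists in the standard library.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <stdexcept>

// Hypothetical wrapper type: its invariant is that the stored double is
// never NaN, so operator< is a strict weak ordering over every value of
// the type, not just over the values a particular call happens to use.
class NonNaN {
public:
    explicit NonNaN(double v) : value_(v) {
        if (std::isnan(v)) throw std::invalid_argument("NonNaN: value is NaN");
    }
    double get() const { return value_; }
    friend bool operator<(const NonNaN& a, const NonNaN& b) {
        return a.value_ < b.value_;
    }
private:
    double value_;
};

// Usage sketch: std::min and std::max instantiated with the wrapper type.
inline void demo_non_nan_min() {
    const NonNaN zero(0.0), one(1.0);
    assert(std::min(zero, one).get() == 0.0);
    assert(std::max(zero, one).get() == 1.0);
}
```

The check happens once, at construction, so std::min<NonNaN> and std::max<NonNaN> never see a value for which < misbehaves.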

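We can also look at the standard's definition of the operation < on double, which is given in section 25.8, Mathematical functions for floating-point types, and which says the not very helpful: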

The classification / comparison functions behave the same as the C macros with the corresponding names defined in the C standard library. Each function is overloaded for the three floating-point types. See also: ISO C 7.12.3, 7.12.4

What does the C11 standard say? (Because I guess C++17 does not use C18.)

The relational and equality operators support the usual mathematical relationships between numeric values. For any ordered pair of numeric values exactly one of the relationships (less, greater, and equal) is true. Relational operators may raise the "invalid" floating-point exception when argument values are NaNs. For a NaN and a numeric value, or for two NaNs, just the unordered relationship is true.

As for the norm that C11 uses, it is described in Annex F of that standard:

This annex specifies C language support for the IEC 60559 floating-point standard. The IEC 60559 floating-point standard is specifically Binary floating-point arithmetic for microprocessor systems, second edition (IEC 60559:1989), previously designated IEC 559:1989 and as IEEE Standard for Binary Floating-Point Arithmetic (ANSI/IEEE 754−1985). IEEE Standard for Radix-Independent Floating-Point Arithmetic (ANSI/IEEE854−1987) generalizes the binary standard to remove dependencies on radix and word length. IEC 60559 generally refers to the floating-point standard, as in IEC 60559 operation, IEC 60559 format, etc.

c++

floating-point

undefined-behavior

c++-standard-library

language-lawyer