这个有界递归的程序是否有未定义的行为？

Question

我知道无限递归或迭代是未定义的行为，但有界不是。但是，这个程序段对于大多数输入都是错误的。它是否有未定义的行为，为什么或为什么不？如果它有未定义的行为，我是否可以进行一些修改以消除未定义的行为？

#include <cstdint>

using bigint = ::std::uint64_t;
constexpr const bigint add_val = 1442695040888963407;
constexpr const bigint mult_val = 6364136223846793005;

bigint compute(bigint t)
{
   if (t > 1) {
      return add_val + compute(t * mult_val);
   } else {
      return 1;
   }
}

int main(int argc, char const * const argv[])
{
   return compute(argc < 0 ? -argc : argc);
}

这基本上是使用递归循环遍历 a linear congruential random number generator with a period of 2^64 的所有值，因此可以保证最终达到 0 或 1。

我问的是一个非常明确和具体的问题。这个任何人都可以编译并运行完整地调用未定义行为的程序是否符合 C++ 标准？

Answer 1

事实优先，来自标准草案 N4800 §7.1/P4 [expr.pre]（强调我的）：

If during the evaluation of an expression, the result is not mathematically defined or not in the range of representable values for its type, the behavior is undefined. [Note: Treatment of division by zero, forming a remainder using a zero divisor, and all floating-point exceptions vary among machines, and is sometimes adjustable by a library function. — end note]

还有条款 §6.7.1/p2 基本类型 [basic.fundamental]（Emphasis Mine）：

For each of the standard signed integer types, there exists a corresponding (but different) standard unsigned integer type: “unsigned char”, “unsigned short int”, “unsigned int”, “unsigned long int”, and “unsigned long long int”. Likewise, for each of the extended signed integer types, there exists a corresponding extended unsigned integer type. The standard and extended unsigned integer types are collectively called unsigned integer types. An unsigned integer type has the same range exponent N as the corresponding signed integer type. The range of representable values for the unsigned type is 0 to 2^N − 1 (inclusive); arithmetic for the unsigned type is performed modulo 2^N . [Note: Unsigned arithmetic does not overflow. Overflow for signed arithmetic yields undefined behavior (7.1). — end note]

还有 §5.13.2/p2 整数文字 [lex.icon]

The type of an integer literal is the first of the corresponding list in Table 7 in which its value can be represented.

还有 §7.4 常用的算术转换 [expr.arith.conv]

(1.5) — Otherwise, the integral promotions (7.3.6) shall be performed on both operands⁶⁰. Then the following rules shall be applied to the promoted operands:

(1.5.1) — If both operands have the same type, no further conversion is needed.

(1.5.2) — Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank shall be converted to the type of the operand with greater rank.

(1.5.3) — Otherwise, if the operand that has unsigned integer type has rank greater than or equal to the rank of the type of the other operand, the operand with signed integer type shall be converted to the type of the operand with unsigned integer type.

(1.5.4) — Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, the operand with unsigned integer type shall be converted to the type of the operand with signed integer type.

(1.5.5) — Otherwise, both operands shall be converted to the unsigned integer type corresponding to the type of the operand with signed integer type

所以问题是：这个程序中的算术结果是否经过数学定义并且在其类型的可表示值范围内。更具体地说，表达式 1442695040888963407 + compute(t * 6364136223846793005) 是否在其类型的可表示值范围内？

为此，整数文字 1442695040888963407 和 6364136223846793005 的类型必须小于或等于 conversion rank 和 std::uint64_t，以便结果转换为 std::uint64_t。不幸的是，没有这样的保证。

因此，为了让您的程序避免 UB，我会用 LU.

标记整数文字

bigint compute(bigint t)
{
   if (t > 1) {
      return 1442695040888963407LU + compute(t * 6364136223846793005LU);
   } else {
      return 1;
   }
}

现在，至于为什么会出现分段错误，是因为您的堆栈溢出了。虽然，上面的程序理论上没有无限次的递归，也就是UB，递归的次数会耗尽你机器的资源。

Answer 2

with thanks to 101010's standard link,我觉得相关的词组是

4.1 Implementation compliance [intro.compliance]

(2.1) - If a program contains no violations of the rules in this document, a conforming implementation shall, within its resource limits, accept and correctly execute that program.

和

Annex B (informative)

Implementation quantities [implimits]

主要解决编译器限制（最大符号长度、每行字符等），但语言听起来相关：

Because computers are finite, C++ implementations are inevitably limited in the size of the programs they can successfully process. Every implementation shall document those limitations where known. This documentation may cite fixed limits where they exist, say how to compute variable limits as a function of available resources, or say that fixed limits do not exist or are unknown.

The limits may constrain quantities that include those described below or others ...

因此，该标准没有说明实施可能会限制哪些数量，也没有给出除它所描述的这些限制的最小值指南之外的任何内容。

如果你的编译器没有记录它的堆栈深度限制，我猜它可能是 non-conforming 根据 bold 的句子，但是声明 "stack depth is limited by these runtime properties" 可能就足够了..

Does this program that anybody can compile and run in its entirety invoke undefined behavior according to the C++ standard?

没有。但根据 4.1/2.1 的实现，允许编译失败或无法正确执行正确的程序。

Answer 3

该标准的标准作者认识到，不同的实现可以而且应该在它们可以有效处理的程序范围内有所不同。虽然标准可能已经指定质量实现应该尝试有效地处理尽可能广泛的任务——"practicality" 的定义留给实现者的判断——但他们可能认为这样的概念是self-evident.

如果将术语 "Undefined Behavior" 应用于标准不强加任何要求的每个程序，那么所有程序都会调用未定义行为，但有一个警告除外：符合要求的实现必须能够处理至少一个--可能是人为的和无用的--以标准定义的方式编程。因此，如果一个程序不会执行标准描述为未定义行为的任何操作，则标准将定义程序的行为 when 运行通过符合规范的实现运行以标准定义的方式运行任何其他程序。它不会对任何其他情况下的任何程序的行为强加任何行为。

这个有界递归的程序是否有未定义的行为？

Does this program with bounded recursion have undefined behavior?

c++

recursion

undefined-behavior

c++17

4.1 Implementation compliance [intro.compliance]

Annex B (informative)

Implementation quantities [implimits]