Should reading negative into unsigned fail via std::cin (gcc, clang disagree)?

For example,

#include <iostream>

int main() {
  unsigned n{};
  std::cin >> n;
  std::cout << n << ' ' << (bool)std::cin << std::endl;
}

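When the input is -1, clang 6.0.0 outputs 0 0 while gcc 7.2.0 outputs 4294967295 1. I'm wondering who is correct. Or maybe both are correct because the standard does not specify this? By "fail", I mean that (bool)std::cin evaluates to false. clang 6.0.0 also fails on the input -0.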
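To compare implementations without typing into a terminal, the same extraction can be run over a std::istringstream. This is only a sketch: read_unsigned is a hypothetical helper name, not something from the question, and what it returns for "-1" or "-0" depends on the standard library implementation and version in use.

```cpp
#include <sstream>
#include <string>
#include <utility>

// Hypothetical helper: extracts an unsigned from `input` with operator>>,
// as `std::cin >> n` does, and reports whether the stream is still good.
std::pair<unsigned, bool> read_unsigned(const std::string& input) {
  std::istringstream in(input);
  unsigned n{};
  in >> n;
  return {n, static_cast<bool>(in)};
}
```

Calling read_unsigned("-1") and printing both members reproduces the two columns of the program above.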


As of Clang 9.0.0 and GCC 9.2.0, both compilers (using either libstdc++ or libc++ in the case of Clang) agree on the result of the above program, regardless of the C++ version (>= C++11) used, and print

4294967295 1

i.e. they set the value to UINT_MAX (ULLONG_MAX converted to unsigned) and do not set the failbit on the stream.

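The intended semantics of your std::cin >> n command are described here (apparently, std::num_get::get() is called for this operation). There have been some semantic changes in this function, specifically with regard to whether or not to store 0, in C++11 and then again in C++17.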

I'm not entirely sure, but I believe these differences may account for the differing behavior you're seeing.

I think that both are wrong in C++17 [1] and that the expected output should be:

4294967295 0

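While the returned value is correct for the latest versions of both compilers, I think that ios_base::failbit should be set. I also think there is some confusion in the standard about the notion of the "field to be converted", which may account for the current behavior.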

The standard says, in [facet.num.get.virtuals#3.3]:

The sequence of chars accumulated in stage 2 (the field) is converted to a numeric value by the rules of one of the functions declared in the header <cstdlib>:

  • For a signed integer value, the function strtoll.

  • For an unsigned integer value, the function strtoull.

  • For a floating-point value, the function strtold.

So we fall back to std::strtoull, which in this case must return [2] ULLONG_MAX and not set errno (which is what both compilers do).

But in the same block (emphasis mine):

The numeric value to be stored can be one of:

  • zero, if the conversion function does not convert the entire field.

  • the most positive (or negative) representable value, if the field to be converted to a signed integer type represents a value too large positive (or negative) to be represented in val.

  • the most positive representable value, if the field to be converted to an unsigned integer type represents a value that cannot be represented in val.

  • the converted value, otherwise.

The resultant numeric value is stored in val. If the conversion function does not convert the entire field, or if the field represents a value outside the range of representable values, ios_base::failbit is assigned to err.

Notice that all of these talk about the "field to be converted", not the actual value returned by std::strtoull. The field here is the widened sequence of characters '-', '1'.

Since the field represents a value (-1) that cannot be represented by an unsigned, the returned value should be UINT_MAX and failbit should be set on std::cin.
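The reading argued above can be sketched as a small function that decides representability from the field itself rather than from the value std::strtoull returns. Everything here is an illustration under stated assumptions: parse_unsigned_strict and Parsed are hypothetical names, the field is assumed to be a base-10 integer with an optional sign, and the fail member merely stands in for ios_base::failbit. It is not how any library actually implements stage 3.

```cpp
#include <cctype>
#include <cerrno>
#include <climits>
#include <cstdlib>

struct Parsed {
  unsigned value;
  bool fail;  // stands in for ios_base::failbit
};

// Hypothetical helper: applies the "field to be converted" reading.
Parsed parse_unsigned_strict(const char* field) {
  const bool negative = field[0] == '-';
  const char* digits = (field[0] == '-' || field[0] == '+') ? field + 1 : field;
  // Reject anything that is not a digit after the optional sign, so that
  // std::strtoull cannot silently accept whitespace or a second sign.
  if (!std::isdigit(static_cast<unsigned char>(digits[0]))) return {0u, true};
  char* end = nullptr;
  errno = 0;
  const unsigned long long magnitude = std::strtoull(digits, &end, 10);
  if (*end != '\0') return {0u, true};  // entire field not converted
  if (errno == ERANGE || magnitude > UINT_MAX) return {UINT_MAX, true};
  // A negative, non-zero field such as "-1" is not representable in unsigned.
  // "-0" represents zero, so under this reading it is accepted.
  if (negative && magnitude != 0) return {UINT_MAX, true};
  return {static_cast<unsigned>(magnitude), false};
}
```

Under this sketch, parse_unsigned_strict("-1") yields UINT_MAX with fail set, which corresponds to the expected output 4294967295 0 given earlier.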


[1] clang was actually right prior to C++17, because the third bullet in the above quote was:

- the most negative representable value or zero for an unsigned integer type, if the field represents a value too large negative to be represented in val. ios_base::failbit is assigned to err.

[2] std::strtoull returns ULLONG_MAX because (thanks @NathanOliver) of C/7.22.1.4.5:

If the subject sequence has the expected form and the value of base is zero, the sequence of characters starting with the first digit is interpreted as an integer constant according to the rules of 6.4.4.1. [...] If the subject sequence begins with a minus sign, the value resulting from the conversion is negated (in the return type).
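The negation rule quoted above can be checked directly by calling std::strtoull on "-1" and inspecting the return value, errno, and how much of the field was consumed. This is a minimal sketch; run_strtoull and StrtoullResult are hypothetical names introduced only for illustration.

```cpp
#include <cerrno>
#include <cstdlib>

// Result of running std::strtoull over a whole field.
struct StrtoullResult {
  unsigned long long value;  // what std::strtoull returned
  int err;                   // errno after the call (0 if left untouched)
  bool whole_field;          // true if every character was consumed
};

// Hypothetical helper, for illustration only.
StrtoullResult run_strtoull(const char* field) {
  char* end = nullptr;
  errno = 0;  // clear first so a stale ERANGE is not misread
  const unsigned long long v = std::strtoull(field, &end, 10);
  return {v, errno, end != field && *end == '\0'};
}
```

For "-1", the digits are converted to 1 and then negated in unsigned long long, so the result is ULLONG_MAX with errno left at 0. A value that is genuinely out of range, such as a 23-digit number, reports ERANGE instead.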

The question is about differences between the library implementations libc++ and libstdc++, and not so much about differences between the compilers (clang, gcc).

cppreference clears up these inconsistencies pretty well:

The result of converting a negative number string into an unsigned integer was specified to produce zero until C++17, although some implementations followed the protocol of std::strtoull which negates in the target type, giving ULLONG_MAX for "-1", and so produce the largest value of the target type instead. As of C++17, strictly following std::strtoull is the correct behavior.

This summarizes to:

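  • The largest value of the target type (4294967295, i.e. UINT_MAX here) is correct going forward, since C++17 (both compilers now get this right)
  • Previously, under a strict reading of the standard, it should have been 0 (libc++)
  • Some implementations (notably libstdc++) followed the std::strtoull protocol instead (which is now considered the correct behavior)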

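The failbit being set, and why it was set, might be the more interesting question. In libc++ (clang) version 7 it now behaves the same as libstdc++. This seems to suggest that it was deliberately made the same as the behavior going forward (even though this goes against the letter of the standard, under which the value should be zero before C++17), but so far I've been unable to find a changelog or documentation for this change.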

The interesting block of text reads (assuming pre-C++17):

If the conversion function results in a negative value too large to fit in the type of v, the most negative representable value is stored in v, or zero for unsigned integer types.

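According to this, the value is specified to be 0. Additionally, nowhere does it say that this should result in the failbit being set.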