Why are the fast integer types faster than the other integer types?

Question

ISO/IEC 9899:2018 (C18) states under 7.20.1.3:

7.20.1.3 Fastest minimum-width integer types

1 Each of the following types designates an integer type that is usually fastest²⁶⁸⁾ to operate with among all integer types that have at least the specified width.

2 The typedef name int_fastN_t designates the fastest signed integer type with a width of at least N. The typedef name uint_fastN_t designates the fastest unsigned integer type with a width of at least N.

3 The following types are required:

int_fast8_t, int_fast16_t, int_fast32_t, int_fast64_t, uint_fast8_t, uint_fast16_t, uint_fast32_t, uint_fast64_t

All other types of this form are optional.

²⁶⁸⁾ The designated type is not guaranteed to be fastest for all purposes; if the implementation has no clear grounds for choosing one type over another, it will simply pick some integer type satisfying the signedness and width requirements.

But it does not say why these "fast" integer types are faster.

Why are these fast integer types faster than the other integer types?

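I tagged the question with C++ because the fast integer types are also available in C++17, in the header cstdint. Unfortunately, ISO/IEC 14882:2017 (C++17) has no section explaining them; otherwise I would have included that section in the body of the question.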

Note: in C, they are declared in the header stdint.h.

Answer 1

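Imagine a CPU that performs only 64-bit arithmetic operations. Now imagine how you would implement an unsigned 8-bit addition on such a CPU. Getting the right result necessarily takes more than one operation. On such a CPU, 64-bit operations are faster than operations on any other integer width. In this situation, every Xint_fastY_t would presumably be an alias for the 64-bit type.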

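If a CPU supports fast operations on narrow integer types, so that a wider type is no faster than a narrower one, then Xint_fastY_t will not (and should not) be an alias for a type wider than is necessary to represent all Y bits.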

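Out of curiosity, I checked the sizes on one particular implementation (GNU, Linux) on several architectures. These sizes are not the same across all implementations on a given architecture: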

┌────╥───────────────────────────────────────────────────────────┐
│ Y  ║   sizeof(Xint_fastY_t) * CHAR_BIT                         │
│    ╟────────┬─────┬───────┬─────┬────────┬──────┬────────┬─────┤
│    ║ x86-64 │ x86 │ ARM64 │ ARM │ MIPS64 │ MIPS │ MSP430 │ AVR │
╞════╬════════╪═════╪═══════╪═════╪════════╪══════╪════════╪═════╡
│ 8  ║ 8      │ 8   │ 8     │ 32  │ 8      │ 8    │ 16     │ 8   │
│ 16 ║ 64     │ 32  │ 64    │ 32  │ 64     │ 32   │ 16     │ 16  │
│ 32 ║ 64     │ 32  │ 64    │ 32  │ 64     │ 32   │ 32     │ 32  │
│ 64 ║ 64     │ 64  │ 64    │ 64  │ 64     │ 64   │ 64     │ 64  │
└────╨────────┴─────┴───────┴─────┴────────┴──────┴────────┴─────┘

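Note that although operations on the larger types may be faster, such types also take up more space in cache, so using them does not necessarily yield better performance. Furthermore, one cannot always trust that the implementation made the right choice in the first place. As always, measuring is required for optimal results.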

Answer 2

They aren't, at least not reliably.

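The fast types are simply typedefs for regular types, but how they are defined is up to the implementation. They must be at least the requested size, but they can be larger.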

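It is true that on some architectures certain integer types perform better than others. For example, early ARM implementations had memory access instructions for 32-bit words and for unsigned bytes, but none for half-words or signed bytes. The half-word and signed-byte instructions were added later, but their addressing options are still less flexible, because they had to be shoehorned into spare encoding space. Furthermore, all the actual data processing instructions on ARM work on words, so in some cases it may be necessary to mask off smaller values after a calculation to give correct results.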

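However, there is also the competing concern of cache pressure. Even if it takes more instructions to load, store, or process a smaller value, the smaller value may still perform better if it reduces the number of cache misses.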

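The definitions of the types on many common platforms do not seem to have been thought through. In particular, modern 64-bit platforms tend to have good support for 32-bit integers, yet the "fast" types on these platforms are often unnecessarily 64 bits wide.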

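Furthermore, types in C become part of the platform's ABI. So even if a platform vendor discovers that they made dumb choices, those choices are difficult to change later.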

Ignore the "fast" types. If you really care about integer performance, benchmark your code with all the available sizes.

Answer 3

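The fast types are not faster than all other integer types. They are in fact identical to some "normal" integer type (they are just an alias for that type): whichever type happens to be the fastest for holding a value of at least that many bits.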

Which integer type each fast type is an alias for is simply platform-dependent.

c

c++

int

performance

types