Is OpenMP vectorization guaranteed?

Does the OpenMP standard guarantee that #pragma omp simd takes effect, i.e. must compilation fail if the compiler cannot vectorize the code?

#include <cstdint>
void foo(uint32_t r[8], uint16_t* ptr)
{
    const uint32_t C = 1000;
    #pragma omp simd
    for (int j = 0; j < 8; ++j)
        if (r[j] < C)
            r[j] = *(ptr++);
}

gcc and clang both fail to vectorize this loop, yet neither complains (unless you pass -fopt-info-vec-optimized-missed or the like).
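To make the difficulty concrete, here is a minimal scalar sketch of what the loop computes. The names fill_below_threshold and demo_fill and the sample data are hypothetical and not part of the question. The point it illustrates is that the source index advances only when the condition holds, so the element read in one iteration depends on how many earlier iterations took the branch. That is plausibly why the compilers decline to vectorize it.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the question): the same loop without the
// pragma, returning how many elements of src were consumed.
std::size_t fill_below_threshold(uint32_t r[8], const uint16_t* src)
{
    const uint32_t C = 1000;
    std::size_t used = 0;
    for (int j = 0; j < 8; ++j)
        if (r[j] < C)
            r[j] = src[used++];  // source index depends on earlier iterations
    return used;
}

// Runs the helper on made-up sample data and returns the resulting array.
std::vector<uint32_t> demo_fill()
{
    uint32_t r[8] = {5, 2000, 7, 1000, 999, 3000, 0, 1500};
    const uint16_t src[8] = {10, 20, 30, 40, 50, 60, 70, 80};
    fill_below_threshold(r, src);
    return std::vector<uint32_t>(r, r + 8);
}
```

With this sample input, the four elements below 1000 are replaced in order by 10, 20, 30 and 40, while the others are left untouched.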

No, it is not guaranteed. These are the relevant portions of the OpenMP 4.5 standard that I could find (emphasis mine):

(1.3) When any thread encounters a simd construct, the iterations of the loop associated with the construct may be executed concurrently using the SIMD lanes that are available to the thread.

(2.8.1) The simd construct can be applied to a loop to indicate that the loop can be transformed into a SIMD loop (that is, multiple iterations of the loop can be executed concurrently using SIMD instructions).

(Appendix C) The number of iterations that are executed concurrently at any given time is implementation defined.

(1.2.7) implementation defined: Behavior that must be documented by the implementation, and is allowed to vary among different compliant implementations. An implementation is allowed to define this behavior as unspecified.
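Because the construct only permits SIMD execution, a loop annotated with it has to produce the same result whether or not the compiler vectorizes it. The sketch below uses a hypothetical function, clamp_to, whose iterations are independent. Whether it is vectorized still depends on the compiler and its flags. For GCC and Clang, -fopenmp-simd enables the pragma without the OpenMP runtime, and -fopt-info-vec-missed (GCC) or -Rpass-missed=loop-vectorize (Clang) reports loops that were not vectorized.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical example (not from the question): clamp every element to an
// upper bound. Each iteration touches only its own element, so running
// several iterations at once cannot change the result.
void clamp_to(std::vector<uint32_t>& v, uint32_t limit)
{
    uint32_t* p = v.data();
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(v.size());
    #pragma omp simd  // a permission, not a requirement: may be ignored
    for (std::ptrdiff_t i = 0; i < n; ++i)
        p[i] = p[i] < limit ? p[i] : limit;
}
```

If the pragma is ignored altogether, for example when no OpenMP flag is given, the function still compiles and behaves identically. Only the generated instructions differ.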