OpenGL: Maximum shared memory size smaller than hardware specification

Question

If I query the maximum compute shader shared memory size with:

GLint maximum_shared_mem_size;
glGetIntegerv(GL_MAX_COMPUTE_SHARED_MEMORY_SIZE, &maximum_shared_mem_size);

the result is 48 KB. However, according to this whitepaper: https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf

page 13 states the following for my GPU (2080 Ti):

The Turing L1 can be as large as 64 KB in size, combined with a 32 KB per SM shared memory allocation, or it can reduce to 32 KB, allowing 64 KB of allocation to be used for shared memory. Turing’s L2 cache capacity has also been increased.

So I expected OpenGL to return 64 KB as the maximum shared memory size. Is that a wrong assumption? If so, why?

Answer 1

It looks like 48 KB is the expected result, as documented in the Turing Tuning Guide for CUDA:

Turing allows a single thread block to address the full 64 KB of shared memory. To maintain architectural compatibility, static shared memory allocations remain limited to 48 KB, and an explicit opt-in is also required to enable dynamic allocations above this limit. See the CUDA C Programming Guide for details.

So it seems you can either accept the default 48 KB or use CUDA to control the carveout configuration.
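
To make the two limits concrete, here is a minimal C++ sketch of the rule quoted above. It is only an illustration: the constants come from the tuning guide excerpt, and `classify_request` and `max_elements` are hypothetical helpers, not part of OpenGL or CUDA. On the CUDA side, the opt-in itself is done with `cudaFuncSetAttribute` and `cudaFuncAttributeMaxDynamicSharedMemorySize`; as far as I know, OpenGL exposes only the value reported by `GL_MAX_COMPUTE_SHARED_MEMORY_SIZE`.

```cpp
#include <cassert>
#include <cstddef>

// Limits taken from the Turing Tuning Guide excerpt quoted above.
constexpr std::size_t kStaticLimitBytes = 48 * 1024;  // default cap, matches the OpenGL query
constexpr std::size_t kOptInLimitBytes  = 64 * 1024;  // per-block maximum with explicit opt-in

enum class SharedMemFit { Default, NeedsCudaOptIn, TooLarge };

// Hypothetical helper: classifies a per-block shared memory request in bytes.
inline SharedMemFit classify_request(std::size_t bytes) {
    if (bytes <= kStaticLimitBytes) return SharedMemFit::Default;
    if (bytes <= kOptInLimitBytes)  return SharedMemFit::NeedsCudaOptIn;
    return SharedMemFit::TooLarge;
}

// Hypothetical helper: how many elements of a given size fit under a limit.
inline std::size_t max_elements(std::size_t limit_bytes, std::size_t element_size) {
    assert(element_size > 0);  // guard against division by zero
    return limit_bytes / element_size;
}
```

For example, `max_elements(kStaticLimitBytes, sizeof(float))` tells you how large a `shared float` array in a compute shader can be under the 48 KB limit, and `classify_request(56 * 1024)` returns `SharedMemFit::NeedsCudaOptIn`, meaning that request is only reachable through the CUDA opt-in path.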

opengl

gpu

shared-memory

compute-shader