Why does -g -G result in Cuda error 7: too many resources requested for launch?

I have a C++ application built with CUDA Toolkit v9.2 that runs fine when built with -O, but if I build it with -g -G, I get CUDA error 7 at runtime:

too many resources requested for launch

From here, I understand that this means:

the number of registers available on the multiprocessor is being exceeded. Reduce the number of threads per block to solve the problem.

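I don't want to reduce the number of threads per block, because it works in the optimized build. What can I do to make the debug build use fewer registers, closer to what the optimized build uses? And how can I track down where the extra register usage in my application comes from?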

As noted in the comments above, debug builds generally require more resources, for a variety of reasons.
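To see how much more a given kernel needs, one option is to ask the runtime for its compiled resource usage with cudaFuncGetAttributes and compare the numbers between the -O and -g -G builds. The sketch below is only an illustration: `myKernel`, `registersPerBlock` and `reportKernelResources` are made-up names, not part of the original application.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel standing in for one of the application's kernels.
__global__ void myKernel(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] = 2.0f * data[i] + 1.0f;
    }
}

// First-order estimate of the registers one block needs. Real hardware
// allocates registers in per-warp granules, so the true figure can be higher.
int registersPerBlock(int regsPerThread, int threadsPerBlock)
{
    return regsPerThread * threadsPerBlock;
}

// Print the compiled resource usage of myKernel and return whether a launch
// with `threadsPerBlock` threads is within the limit the runtime reports.
bool reportKernelResources(int threadsPerBlock)
{
    cudaFuncAttributes attr;
    cudaError_t err = cudaFuncGetAttributes(&attr, myKernel);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaFuncGetAttributes: %s\n", cudaGetErrorString(err));
        return false;
    }
    printf("registers per thread : %d\n", attr.numRegs);
    printf("local memory (bytes) : %zu\n", attr.localSizeBytes);
    printf("max threads per block: %d\n", attr.maxThreadsPerBlock);
    printf("estimated regs/block : %d\n",
           registersPerBlock(attr.numRegs, threadsPerBlock));
    // maxThreadsPerBlock is the largest block size for which a launch of
    // this kernel would not fail on the current device.
    return threadsPerBlock <= attr.maxThreadsPerBlock;
}
```

Calling `reportKernelResources(512)` from host code in both builds (512 is just an example block size) should show which kernels grow past the limit under -g -G. Compiling with `nvcc -Xptxas -v` prints similar per-kernel register and spill figures at build time.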

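You can use the --maxrregcount option or the __launch_bounds__ qualifier to limit the number of registers the compiler is allowed to use. Note that turning this knob really just means trading one resource for another. Forcing the compiler to use fewer registers generally means it has to spill more. More spilling generally means increased local memory requirements. In extreme cases, you may run into another limit...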
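As a rough sketch of the __launch_bounds__ route: the qualifier tells the compiler the largest block size the kernel will ever be launched with, so it can cap register use accordingly. The kernel `scaleKernel`, the helpers `blocksFor` and `launchScale`, and the block size of 512 are all assumptions for illustration.

```cuda
#include <cuda_runtime.h>

// Assumed block size; substitute the value the optimized build launches with.
constexpr int kThreadsPerBlock = 512;

// __launch_bounds__ promises the compiler that this kernel is never launched
// with more than kThreadsPerBlock threads per block, so it limits register
// use to fit that block size, spilling to local memory if it has to.
__global__ void __launch_bounds__(kThreadsPerBlock)
scaleKernel(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= 2.0f;
    }
}

// Number of blocks needed to cover n elements (rounds up).
int blocksFor(int n, int threadsPerBlock)
{
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}

// Launch the kernel on device memory d_data and return any launch error,
// e.g. cudaErrorLaunchOutOfResources (error 7).
cudaError_t launchScale(float *d_data, int n)
{
    scaleKernel<<<blocksFor(n, kThreadsPerBlock), kThreadsPerBlock>>>(d_data, n);
    return cudaGetLastError();
}
```

The per-file alternative is a compiler flag, for example `nvcc -g -G --maxrregcount=64 app.cu`, where 64 and `app.cu` are placeholder values. That cap applies to every kernel in the file, whereas __launch_bounds__ can be tuned kernel by kernel.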

cuda

cpu-registers