PyTorch crashes CUDA on the wrong line
How can I see which Python line causes a CUDA crash in PyTorch, given that it executes asynchronous code outside the GIL?
Here is a case where PyTorch crashed CUDA: running this code on this dataset, every run under the debugger crashes on a different Python line, which makes debugging very difficult.
I found the answer in a completely unrelated thread on the forums. I could not find an answer through Google, so I am posting it here for future users.
Since CUDA calls are executed asynchronously, you should run your code with
CUDA_LAUNCH_BLOCKING=1 python script.py
This makes sure the right line of code will throw the error message.
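If you start the script from a debugger or IDE where prefixing the command is awkward, one option is to set the variable from inside the script instead. The following is a minimal sketch, not something from the forum thread: it assumes the variable only takes effect if it is set before CUDA is initialized, so it sets it before torch is imported. The helper name enable_blocking_cuda_launches is hypothetical.

```python
import os


def enable_blocking_cuda_launches() -> None:
    """Ask for synchronous CUDA kernel launches in this process (hypothetical helper)."""
    # Assumption: the variable has to be in place before CUDA is initialized,
    # so this runs before torch is imported.
    os.environ["CUDA_LAUNCH_BLOCKING"] = "1"


enable_blocking_cuda_launches()

# Import torch only after the variable is set, then run the script as usual:
# import torch
# ... rest of the script ...
```

Synchronous launches slow the program down, so this is best kept for debugging runs and removed afterwards.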