How to check torch gpu compatibility without initializing CUDA?

Older GPUs don't seem to be supported by torch, despite having a recent enough CUDA version.

In my case, the crash produced the following error:

/home/maxs/dev/mdb/venv38/lib/python3.8/site-packages/torch/cuda/__init__.py:83: UserWarning: 
    Found GPU%d %s which is of cuda capability %d.%d.
    PyTorch no longer supports this GPU because it is too old.
    The minimum cuda capability supported by this library is %d.%d.
    
  warnings.warn(old_gpu_warn.format(d, name, major, minor, min_arch // 10, min_arch % 10))
WARNING:lightwood-16979:Exception: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1. when training model: <lightwood.model.neural.Neural object at 0x7f9c34df1e80>
Process LearnProcess-1:13:
Traceback (most recent call last):
  File "/home/maxs/dev/mdb/venv38/sources/lightwood/lightwood/model/helpers/default_net.py", line 59, in forward
    output = self.net(input)
  File "/home/maxs/dev/mdb/venv38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/maxs/dev/mdb/venv38/lib/python3.8/site-packages/torch/nn/modules/container.py", line 139, in forward
    input = module(input)
  File "/home/maxs/dev/mdb/venv38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/maxs/dev/mdb/venv38/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 96, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/maxs/dev/mdb/venv38/lib/python3.8/site-packages/torch/nn/functional.py", line 1847, in linear
    return torch._C._nn.linear(input, weight, bias)
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

This happens despite:

assert torch.cuda.is_available() == True
torch.version.cuda == '10.2'

How can I check for old GPUs that torch doesn't support without actually try/catch-ing a tensor transfer to the GPU? The transfer would initialize CUDA, which wastes about 2GB of memory per process — something I can't afford, since I'd be running this check in dozens of processes, each of which would waste an extra 2GB just from the initialization.

Based on the code that actually raises this error in torch.cuda.__init__, the following check seems to work:

import torch
from torch.cuda import device_count, get_device_capability

def is_cuda_compatible():
    """Return True if at least one visible GPU meets the minimum compute
    capability this torch build was compiled for, without initializing a
    CUDA context."""
    compatible_device_count = 0
    if torch.version.cuda is not None:
        # Oldest arch compiled into this torch binary, e.g. "sm_37" -> 37
        min_arch = min(
            (int(arch.split("_")[1]) for arch in torch.cuda.get_arch_list()),
            default=35,
        )
        for d in range(device_count()):
            major, minor = get_device_capability(d)
            current_arch = major * 10 + minor
            if (current_arch >= min_arch
                    and torch._C._cuda_getCompiledVersion() > 9000):
                compatible_device_count += 1
    return compatible_device_count > 0

Not sure it's 100% correct, but posting it here for feedback in case anyone else needs it.
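The capability comparison above boils down to parsing the arch tags torch was compiled for (entries like `sm_37` or `compute_70`, where the number is compute capability × 10) and checking that the device's `major * 10 + minor` is at least the oldest one. A minimal standalone sketch of that logic, using a hypothetical arch list so it runs without torch or a GPU:

```python
def min_supported_arch(arch_list, default=35):
    # Entries look like "sm_37" or "compute_70"; the number after the
    # underscore is the compute capability times 10.
    return min((int(arch.split("_")[1]) for arch in arch_list), default=default)

def device_is_supported(major, minor, arch_list):
    # A device with capability (major, minor) is usable only if its arch
    # is at least the oldest arch the binary was built for.
    return major * 10 + minor >= min_supported_arch(arch_list)

# Hypothetical arch list for a build compiled for sm_37 and newer:
archs = ["sm_37", "sm_50", "sm_60", "sm_70", "compute_70"]
print(device_is_supported(3, 0, archs))  # False: cc 3.0 is below the 3.7 minimum
print(device_is_supported(6, 1, archs))  # True: cc 6.1 is at or above 3.7
```

In a real check, `archs` would come from `torch.cuda.get_arch_list()` and `(major, minor)` from `torch.cuda.get_device_capability(d)`, as in the function above.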