How to build CUDA custom C++ extension for PyTorch without CUDA?
My task is to create a CI workflow that builds a PyTorch CUDA extension for this application. So far the application has been deployed by creating a target AWS VM with a CUDA GPU, pushing all the sources there, and running setup.py. I would like to do the build in our CI system instead and deploy prebuilt binaries to production.
When running setup.py in the CI system, I get the error "No CUDA GPUs are available" - which is true, there is no CUDA GPU in the CI system. Is there a way to build a CUDA extension without a CUDA GPU?
Here is the error message:
gcc -pthread -shared -B /usr/local/miniconda/envs/build/compiler_compat -L/usr/local/miniconda/envs/build/lib -Wl,-rpath=/usr/local/miniconda/envs/build/lib -Wl,--no-as-needed -Wl,--sysroot=/ /app/my-app/build/temp.linux-x86_64-3.6/my-extension/my-module.o -L/usr/local/miniconda/envs/build/lib/python3.6/site-packages/torch/lib -lc10 -ltorch -ltorch_cpu -ltorch_python -o build/lib.linux-x86_64-3.6/my-extension/my-module.cpython-36m-x86_64-linux-gnu.so
building 'my-extension.my-module._cuda_ext' extension
creating /app/my-app/build/temp.linux-x86_64-3.6/my-extension/src
Traceback (most recent call last):
  File "setup.py", line 128, in <module>
    'build_ext': BuildExtension
  File "/usr/local/miniconda/envs/build/lib/python3.6/site-packages/setuptools/__init__.py", line 153, in setup
    return distutils.core.setup(**attrs)
  File "/usr/local/miniconda/envs/build/lib/python3.6/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/usr/local/miniconda/envs/build/lib/python3.6/distutils/dist.py", line 955, in run_commands
    self.run_command(cmd)
  File "/usr/local/miniconda/envs/build/lib/python3.6/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "/usr/local/miniconda/envs/build/lib/python3.6/site-packages/setuptools/command/build_ext.py", line 79, in run
    _build_ext.run(self)
  File "/usr/local/miniconda/envs/build/lib/python3.6/distutils/command/build_ext.py", line 339, in run
    self.build_extensions()
  File "/usr/local/miniconda/envs/build/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 653, in build_extensions
    build_ext.build_extensions(self)
  File "/usr/local/miniconda/envs/build/lib/python3.6/distutils/command/build_ext.py", line 448, in build_extensions
    self._build_extensions_serial()
  File "/usr/local/miniconda/envs/build/lib/python3.6/distutils/command/build_ext.py", line 473, in _build_extensions_serial
    self.build_extension(ext)
  File "/usr/local/miniconda/envs/build/lib/python3.6/site-packages/setuptools/command/build_ext.py", line 196, in build_extension
    _build_ext.build_extension(self, ext)
  File "/usr/local/miniconda/envs/build/lib/python3.6/distutils/command/build_ext.py", line 533, in build_extension
    depends=ext.depends)
  File "/usr/local/miniconda/envs/build/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 468, in unix_wrap_ninja_compile
    cuda_post_cflags = unix_cuda_flags(cuda_post_cflags)
  File "/usr/local/miniconda/envs/build/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 377, in unix_cuda_flags
    cflags + _get_cuda_arch_flags(cflags) +
  File "/usr/local/miniconda/envs/build/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 1407, in _get_cuda_arch_flags
    capability = torch.cuda.get_device_capability()
  File "/usr/local/miniconda/envs/build/lib/python3.6/site-packages/torch/cuda/__init__.py", line 291, in get_device_capability
    prop = get_device_properties(device)
  File "/usr/local/miniconda/envs/build/lib/python3.6/site-packages/torch/cuda/__init__.py", line 296, in get_device_properties
    _lazy_init()  # will define _get_device_properties
  File "/usr/local/miniconda/envs/build/lib/python3.6/site-packages/torch/cuda/__init__.py", line 172, in _lazy_init
    torch._C._cuda_init()
RuntimeError: No CUDA GPUs are available
I'm not very familiar with CUDA and only half-proficient in Python (I'm here as the "ops" part of "devops").
This is not a complete solution, since I lack the details to work it out fully, but it should help you or your teammates.
First, according to the source code, torch._C._cuda_init() is never reached if the CUDA arch flags are already set.
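To illustrate the short-circuit, here is a simplified sketch (not the actual PyTorch source) of what _get_cuda_arch_flags in torch/utils/cpp_extension.py does: the device is only queried when the environment variable is absent.

```python
import os

def get_cuda_arch_flags_sketch():
    # Simplified sketch of torch.utils.cpp_extension._get_cuda_arch_flags:
    # if TORCH_CUDA_ARCH_LIST is set, the local GPU is never queried.
    arch_list = os.environ.get("TORCH_CUDA_ARCH_LIST")
    if arch_list is None:
        # Only this fallback path touches torch.cuda, which is what
        # raises "No CUDA GPUs are available" on a GPU-less CI machine.
        import torch
        major, minor = torch.cuda.get_device_capability()
        arch_list = f"{major}.{minor}"
    # Turn each capability like "7.5" into an nvcc -gencode flag.
    return [
        f"-gencode=arch=compute_{a.replace('.', '')},code=sm_{a.replace('.', '')}"
        for a in arch_list.split(";")
    ]
```

This is only meant to show why setting the variable avoids the GPU probe; the real implementation handles named architectures and PTX variants as well.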
So PyTorch is trying to detect the CUDA arch on its own because the user did not specify one.
Here is a relevant thread. As you can see, setting the TORCH_CUDA_ARCH_LIST environment variable to values appropriate for your environment should work for you.
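In the CI job, that would look roughly like the following. The architecture values here (7.5 for T4, 8.0 for A100) are examples only; substitute the compute capabilities of the GPUs your production VMs actually use.

```shell
# Tell PyTorch which CUDA architectures to compile for, instead of
# letting it probe a (non-existent) local GPU at build time.
export TORCH_CUDA_ARCH_LIST="7.5;8.0"

# Then run the build as before, e.g.:
#   python setup.py bdist_wheel
```

Building this way produces binaries only for the listed architectures, so make sure the list covers every GPU model you deploy to.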