CUDA: get required compute capabilities from binary
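Is there a way to get the required compute capability from a binary that uses CUDA? I know the application was built for a specific graphics card (one with compute capability 2.1).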
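Running cuobjdump should help. It tells you which PTX (code that is JIT-compiled at run time) is available in the compiled file, and which SASS (the actual machine code that executes on a specific device) has been precompiled. Below is example output for device code compiled with -arch=sm_20: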
$ cuobjdump quick
Fatbin elf code:
================
arch = sm_20
code version = [1,7]
producer = <unknown>
host = linux
compile_size = 64bit
identifier = quick.cu
Fatbin elf code:
================
arch = sm_20
code version = [1,7]
producer = cuda
host = linux
compile_size = 64bit
identifier = quick.cu
Fatbin ptx code:
================
arch = sm_20
code version = [4,1]
producer = cuda
host = linux
compile_size = 64bit
compressed
identifier = quick.cu
ptxasOptions = --generate-line-info
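If you need to check many binaries, the listing above can be reduced to a list of (code kind, architecture) pairs. The following is a minimal sketch, assuming the default cuobjdump listing keeps the "Fatbin ... code:" and "arch = ..." layout shown above; the function names summarize_fatbin and embedded_targets are hypothetical helpers, not part of the CUDA toolkit.

```python
import re
import subprocess


def summarize_fatbin(listing):
    """Return (kind, arch) pairs found in a cuobjdump listing.

    Assumes each section starts with a line like "Fatbin elf code:" or
    "Fatbin ptx code:" and is followed by an "arch = sm_XX" line, as in
    the sample output above.
    """
    entries = []
    kind = None
    for raw in listing.splitlines():
        line = raw.strip()
        header = re.match(r"Fatbin (\w+) code:", line)
        if header:
            kind = header.group(1)  # "elf" means SASS, "ptx" means PTX
            continue
        arch = re.match(r"arch\s*=\s*(sm_\w+)", line)
        if arch and kind is not None:
            entries.append((kind, arch.group(1)))
            kind = None  # take only the first arch line per section
    return entries


def embedded_targets(binary_path):
    """Run cuobjdump on a binary (hypothetical wrapper) and summarize it."""
    result = subprocess.run(
        ["cuobjdump", binary_path],
        capture_output=True, text=True, check=True,
    )
    return summarize_fatbin(result.stdout)
```

For the sample above, summarize_fatbin would report two elf entries and one ptx entry, all for sm_20. As a rule of thumb, an elf (SASS) entry runs only on devices of the same major compute capability with an equal or higher minor version, while a ptx entry can be JIT-compiled by the driver for that architecture or a newer one. Under that reading, the sm_20 code here should run on the compute capability 2.1 card mentioned in the question.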