Error building OpenCV 2.4.x with CUDA 10.2

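I am trying to build OpenCV 2.4 with CUDA 10.2, which is installed on a Jetson AGX Xavier. I followed this blog post to change the files so that OpenCV can find all the CUDA libraries.

I ran the following command to generate the CMake cache: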

cmake -DCMAKE_INSTALL_PREFIX=~/lib/opencv_2.4/installed -DCMAKE_BUILD_TYPE="Release" -DWITH_CUDA=ON -DCUDA_GENERATION=Volta -D OPENCV_DNN_CUDA=ON -DCUDA_ARCH_BIN=7.5 -DCUDA_HOST_COMPILER=/usr/bin/gcc-8 -DCMAKE_C_COMPILER=gcc-8 -DCMAKE_CXX_COMPILER=g++-8 ..

When I run make or make -j8, I get the following error:

[ 56%] Linking CXX executable ../../bin/opencv_perf_photo
[ 56%] Built target opencv_perf_photo
[ 56%] Built target opencv_gpu_pch_dephelp
[ 57%] Built target pch_Generate_opencv_gpu
[ 58%] Building NVCC (Device) object modules/gpu/CMakeFiles/cuda_compile.dir/src/cuda/cuda_compile_generated_bf_knnmatch.cu.o
In file included from /home/nvidia/opencv-2.4/modules/gpu/include/opencv2/gpu/device/functional.hpp:50:0,
                 from /home/nvidia/opencv-2.4/modules/gpu/include/opencv2/gpu/device/vec_distance.hpp:47,
                 from /home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu:49:
/usr/local/cuda-10.2/include/device_functions.h:54:2: warning: #warning "device_functions.h is an internal header file and must not be used directly.  This file will be removed in a future CUDA release.  Please use cuda_runtime_api.h or cuda_runtime.h instead." [-Wcpp]
 #warning "device_functions.h is an internal header file and must not be used directly.  This file will be removed in a future CUDA release.  Please use cuda_runtime_api.h or cuda_runtime.h instead."
  ^~~~~~~
In file included from /home/nvidia/opencv-2.4/modules/gpu/include/opencv2/gpu/device/functional.hpp:50:0,
                 from /home/nvidia/opencv-2.4/modules/gpu/include/opencv2/gpu/device/vec_distance.hpp:47,
                 from /home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu:49:
/usr/local/cuda-10.2/include/device_functions.h:54:2: warning: #warning "device_functions.h is an internal header file and must not be used directly.  This file will be removed in a future CUDA release.  Please use cuda_runtime_api.h or cuda_runtime.h instead." [-Wcpp]
 #warning "device_functions.h is an internal header file and must not be used directly.  This file will be removed in a future CUDA release.  Please use cuda_runtime_api.h or cuda_runtime.h instead."
  ^~~~~~~
In file included from /home/nvidia/opencv-2.4/modules/gpu/include/opencv2/gpu/device/functional.hpp:50:0,
                 from /home/nvidia/opencv-2.4/modules/gpu/include/opencv2/gpu/device/vec_distance.hpp:47,
                 from /home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu:49:
/usr/local/cuda-10.2/include/device_functions.h:54:2: warning: #warning "device_functions.h is an internal header file and must not be used directly.  This file will be removed in a future CUDA release.  Please use cuda_runtime_api.h or cuda_runtime.h instead." [-Wcpp]
 #warning "device_functions.h is an internal header file and must not be used directly.  This file will be removed in a future CUDA release.  Please use cuda_runtime_api.h or cuda_runtime.h instead."
  ^~~~~~~
In file included from /home/nvidia/opencv-2.4/modules/gpu/include/opencv2/gpu/device/functional.hpp:50:0,
                 from /home/nvidia/opencv-2.4/modules/gpu/include/opencv2/gpu/device/vec_distance.hpp:47,
                 from /home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu:49:
/usr/local/cuda-10.2/include/device_functions.h:54:2: warning: #warning "device_functions.h is an internal header file and must not be used directly.  This file will be removed in a future CUDA release.  Please use cuda_runtime_api.h or cuda_runtime.h instead." [-Wcpp]
 #warning "device_functions.h is an internal header file and must not be used directly.  This file will be removed in a future CUDA release.  Please use cuda_runtime_api.h or cuda_runtime.h instead."
  ^~~~~~~
/home/nvidia/opencv-2.4/modules/gpu/include/opencv2/gpu/device/detail/../warp_shuffle.hpp(60): error: identifier "__shfl" is undefined

/home/nvidia/opencv-2.4/modules/gpu/include/opencv2/gpu/device/detail/../warp_shuffle.hpp(71): error: identifier "__shfl" is undefined

/home/nvidia/opencv-2.4/modules/gpu/include/opencv2/gpu/device/detail/../warp_shuffle.hpp(92): error: identifier "__shfl_down" is undefined

/home/nvidia/opencv-2.4/modules/gpu/include/opencv2/gpu/device/detail/../warp_shuffle.hpp(103): error: identifier "__shfl_down" is undefined

/home/nvidia/opencv-2.4/modules/gpu/include/opencv2/gpu/device/detail/../warp_shuffle.hpp(124): error: identifier "__shfl_up" is undefined

/home/nvidia/opencv-2.4/modules/gpu/include/opencv2/gpu/device/detail/../warp_shuffle.hpp(135): error: identifier "__shfl_up" is undefined

/home/nvidia/opencv-2.4/modules/gpu/include/opencv2/gpu/device/detail/../warp_shuffle.hpp(84): error: identifier "__shfl_down" is undefined
          detected during:
            instantiation of "T cv::gpu::device::shfl_down(T, unsigned int, int) [with T=float]" 
/home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu(75): here
            instantiation of "void cv::gpu::device::bf_knnmatch::findBestMatch<BLOCK_SIZE>(float &, float &, int &, int &, float *, int *) [with BLOCK_SIZE=16]" 
/home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu(401): here
            instantiation of "void cv::gpu::device::bf_knnmatch::matchUnrolledCached<BLOCK_SIZE,MAX_DESC_LEN,Dist,T,Mask>(cv::gpu::PtrStepSz<T>, cv::gpu::PtrStepSz<T>, Mask, int2 *, float2 *) [with BLOCK_SIZE=16, MAX_DESC_LEN=64, Dist=cv::gpu::device::L1Dist<cv::gpu::device::uchar>, T=unsigned char, Mask=cv::gpu::device::SingleMask]" 
/home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu(420): here
            instantiation of "void cv::gpu::device::bf_knnmatch::matchUnrolledCached<BLOCK_SIZE,MAX_DESC_LEN,Dist,T,Mask>(const cv::gpu::PtrStepSz<T> &, const cv::gpu::PtrStepSz<T> &, const Mask &, const cv::gpu::PtrStepSz<int2> &, const cv::gpu::PtrStepSz<float2> &, cudaStream_t) [with BLOCK_SIZE=16, MAX_DESC_LEN=64, Dist=cv::gpu::device::L1Dist<cv::gpu::device::uchar>, T=unsigned char, Mask=cv::gpu::device::SingleMask]" 
/home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu(852): here
            instantiation of "void cv::gpu::device::bf_knnmatch::match2Dispatcher<Dist,T,Mask>(const cv::gpu::PtrStepSz<T> &, const cv::gpu::PtrStepSz<T> &, const Mask &, const cv::gpu::PtrStepSzb &, const cv::gpu::PtrStepSzb &, cudaStream_t) [with Dist=cv::gpu::device::L1Dist<cv::gpu::device::uchar>, T=unsigned char, Mask=cv::gpu::device::SingleMask]" 
/home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu(1149): here
            instantiation of "void cv::gpu::device::bf_knnmatch::matchDispatcher<Dist,T,Mask>(const cv::gpu::PtrStepSz<T> &, const cv::gpu::PtrStepSz<T> &, int, const Mask &, const cv::gpu::PtrStepSzb &, const cv::gpu::PtrStepSzb &, const cv::gpu::PtrStepSzf &, cudaStream_t) [with Dist=cv::gpu::device::L1Dist<cv::gpu::device::uchar>, T=unsigned char, Mask=cv::gpu::device::SingleMask]" 
/home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu(1166): here
            instantiation of "void cv::gpu::device::bf_knnmatch::matchL1_gpu<T>(const cv::gpu::PtrStepSzb &, const cv::gpu::PtrStepSzb &, int, const cv::gpu::PtrStepSzb &, const cv::gpu::PtrStepSzb &, const cv::gpu::PtrStepSzb &, const cv::gpu::PtrStepSzf &, cudaStream_t) [with T=cv::gpu::device::uchar]" 
/home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu(1172): here

/home/nvidia/opencv-2.4/modules/gpu/include/opencv2/gpu/device/detail/../warp_shuffle.hpp(84): error: identifier "__shfl_down" is undefined
          detected during:
            instantiation of "T cv::gpu::device::shfl_down(T, unsigned int, int) [with T=int]" 
/home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu(77): here
            instantiation of "void cv::gpu::device::bf_knnmatch::findBestMatch<BLOCK_SIZE>(float &, float &, int &, int &, float *, int *) [with BLOCK_SIZE=16]" 
/home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu(401): here
            instantiation of "void cv::gpu::device::bf_knnmatch::matchUnrolledCached<BLOCK_SIZE,MAX_DESC_LEN,Dist,T,Mask>(cv::gpu::PtrStepSz<T>, cv::gpu::PtrStepSz<T>, Mask, int2 *, float2 *) [with BLOCK_SIZE=16, MAX_DESC_LEN=64, Dist=cv::gpu::device::L1Dist<cv::gpu::device::uchar>, T=unsigned char, Mask=cv::gpu::device::SingleMask]" 
/home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu(420): here
            instantiation of "void cv::gpu::device::bf_knnmatch::matchUnrolledCached<BLOCK_SIZE,MAX_DESC_LEN,Dist,T,Mask>(const cv::gpu::PtrStepSz<T> &, const cv::gpu::PtrStepSz<T> &, const Mask &, const cv::gpu::PtrStepSz<int2> &, const cv::gpu::PtrStepSz<float2> &, cudaStream_t) [with BLOCK_SIZE=16, MAX_DESC_LEN=64, Dist=cv::gpu::device::L1Dist<cv::gpu::device::uchar>, T=unsigned char, Mask=cv::gpu::device::SingleMask]" 
/home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu(852): here
            instantiation of "void cv::gpu::device::bf_knnmatch::match2Dispatcher<Dist,T,Mask>(const cv::gpu::PtrStepSz<T> &, const cv::gpu::PtrStepSz<T> &, const Mask &, const cv::gpu::PtrStepSzb &, const cv::gpu::PtrStepSzb &, cudaStream_t) [with Dist=cv::gpu::device::L1Dist<cv::gpu::device::uchar>, T=unsigned char, Mask=cv::gpu::device::SingleMask]" 
/home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu(1149): here
            instantiation of "void cv::gpu::device::bf_knnmatch::matchDispatcher<Dist,T,Mask>(const cv::gpu::PtrStepSz<T> &, const cv::gpu::PtrStepSz<T> &, int, const Mask &, const cv::gpu::PtrStepSzb &, const cv::gpu::PtrStepSzb &, const cv::gpu::PtrStepSzf &, cudaStream_t) [with Dist=cv::gpu::device::L1Dist<cv::gpu::device::uchar>, T=unsigned char, Mask=cv::gpu::device::SingleMask]" 
/home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu(1166): here
            instantiation of "void cv::gpu::device::bf_knnmatch::matchL1_gpu<T>(const cv::gpu::PtrStepSzb &, const cv::gpu::PtrStepSzb &, int, const cv::gpu::PtrStepSzb &, const cv::gpu::PtrStepSzb &, const cv::gpu::PtrStepSzb &, const cv::gpu::PtrStepSzf &, cudaStream_t) [with T=cv::gpu::device::uchar]" 
/home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu(1172): here

8 errors detected in the compilation of "/tmp/tmpxft_000021a9_00000000-7_bf_knnmatch.compute_72.cpp1.ii".
CMake Error at cuda_compile_generated_bf_knnmatch.cu.o.cmake:264 (message):
  Error generating file
  /home/nvidia/opencv-2.4/build/modules/gpu/CMakeFiles/cuda_compile.dir/src/cuda/./cuda_compile_generated_bf_knnmatch.cu.o


modules/gpu/CMakeFiles/opencv_gpu.dir/build.make:503: recipe for target 'modules/gpu/CMakeFiles/cuda_compile.dir/src/cuda/cuda_compile_generated_bf_knnmatch.cu.o' failed
make[2]: *** [modules/gpu/CMakeFiles/cuda_compile.dir/src/cuda/cuda_compile_generated_bf_knnmatch.cu.o] Error 1
CMakeFiles/Makefile2:4741: recipe for target 'modules/gpu/CMakeFiles/opencv_gpu.dir/all' failed
make[1]: *** [modules/gpu/CMakeFiles/opencv_gpu.dir/all] Error 2
Makefile:162: recipe for target 'all' failed
make: *** [all] Error 2 

I also tried -DCUDA_ARCH_BIN=7.2 and got the same error. How can I fix this?

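OpenCV 2.4 is not intended to be used with CUDA Toolkit 10.2. It was originally designed for versions 4.1 and 4.2. If you can downgrade to CUDA 9, you might be able to compile that code and run it. Otherwise, you will have to rewrite those kernels to remove the deprecated instructions, which are no longer supported and will not run on your Volta GPU.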
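
To give a rough idea of what such a rewrite involves: the errors in the log come from warp_shuffle.hpp calling __shfl, __shfl_up and __shfl_down, which are not available when compiling for compute capability 7.x; CUDA 9 and later provide __shfl_sync, __shfl_up_sync and __shfl_down_sync instead, which take an extra mask of participating lanes. The following is only a minimal sketch of that substitution, not the actual OpenCV patch. The names shfl_down_compat, warpSumKernel and warp_sum_demo are hypothetical, and the full mask 0xFFFFFFFF assumes all 32 lanes of the warp execute the call, which you would have to verify for each kernel you change.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical compatibility wrapper (not part of OpenCV): selects the
// shuffle intrinsic that the installed toolkit provides.
template <typename T>
__device__ __forceinline__ T shfl_down_compat(T val, unsigned int delta, int width = 32)
{
#if defined(CUDART_VERSION) && CUDART_VERSION >= 9000
    // CUDA 9+: the *_sync variant takes a mask of participating lanes.
    // Assumption: every lane of the warp reaches this call.
    return __shfl_down_sync(0xFFFFFFFFu, val, delta, width);
#else
    // Older toolkits: legacy intrinsic, as used by OpenCV 2.4.
    return __shfl_down(val, delta, width);
#endif
}

// Sums the 32 values of one warp; lane 0 writes the result.
__global__ void warpSumKernel(const float* in, float* out)
{
    float v = in[threadIdx.x];
    for (unsigned int offset = 16; offset > 0; offset /= 2)
        v += shfl_down_compat(v, offset);
    if (threadIdx.x == 0)
        *out = v;
}

// Host helper: launches a single full warp on the values 0..31 and
// returns the sum computed on the device, or -1 on failure.
float warp_sum_demo()
{
    float h_in[32];
    for (int i = 0; i < 32; ++i)
        h_in[i] = static_cast<float>(i);

    float* d_in = NULL;
    float* d_out = NULL;
    float h_out = -1.0f;

    if (cudaMalloc(&d_in, sizeof(h_in)) != cudaSuccess)
        return -1.0f;
    if (cudaMalloc(&d_out, sizeof(float)) != cudaSuccess) {
        cudaFree(d_in);
        return -1.0f;
    }

    cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);
    warpSumKernel<<<1, 32>>>(d_in, d_out);
    // This copy waits for the kernel to finish before reading the result.
    if (cudaMemcpy(&h_out, d_out, sizeof(float), cudaMemcpyDeviceToHost) != cudaSuccess)
        h_out = -1.0f;

    cudaFree(d_in);
    cudaFree(d_out);
    return h_out;
}
```

Calling warp_sum_demo() from a main function and compiling with something like nvcc -arch=sm_72 should exercise the _sync path on a Xavier. The same pattern would have to be applied to the __shfl and __shfl_up wrappers, and other parts of the gpu module may need similar changes.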

Reference: OpenCv Compiling with Cuda