OpenCL/OpenGL 互操作性纹理段错误

OpenCL/OpenGL interoperability texture segfault

我正在尝试将 OpenCL 与 OpenGL 互操作结合使用。在 GPU 上计算路径跟踪算法,然后将 GL 纹理绘制为四边形。在 Intel CPU 上按预期工作,但当我尝试在 GTX 970 上 运行 时,解锁 GL 纹理时出现段错误。不知道这是原因还是 运行ning 内核。我会让代码说明一切。顺便说一句,我正在使用 OpenCL C++ 包装器。

GL 纹理创建

glGenTextures(1, &texture);
glBindTexture(GL_TEXTURE_2D, texture);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_REPEAT);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, framebuffer);
glBindTexture(GL_TEXTURE_2D, 0); //Unbind texture

CL纹理分配

m_textureCL = cl::ImageGL(m_context,
        CL_MEM_READ_WRITE, 
        GL_TEXTURE_2D,
        0,
        texture,
        &errCode);

运行内核函数

//-----------------------------------------------------------------------------
//  Lock texture
//-----------------------------------------------------------------------------
std::vector<cl::Memory> glObjects; //Create vector of GL objects to lock
glObjects.push_back(m_textureCL); //Add created CL texture buffer
glFlush(); //Flush GL queue

errCode = m_cmdQueue.enqueueAcquireGLObjects(&glObjects, NULL, NULL);
if(errCode != CL_SUCCESS) {
    std::cerr << "Error locking texture" << errCode << std::endl;
    return errCode;
}
//-----------------------------------------------------------------------------

//-----------------------------------------------------------------------------
//  Run queue
//-----------------------------------------------------------------------------
errCode = m_cmdQueue.enqueueNDRangeKernel(
        m_kernel,
        cl::NullRange,
        cl::NDRange(height*width),
        cl::NullRange,
        NULL,
        NULL);
if(errCode != CL_SUCCESS) {
    std::cerr << "Error running queue: " << errCode << std::endl;
    return errCode;
}
//---------------------------------------


//-----------------------------------------------------------------------------
//  Unlock
//-----------------------------------------------------------------------------
errCode = m_cmdQueue.enqueueReleaseGLObjects(&glObjects, NULL, NULL);
if(errCode != CL_SUCCESS) {
    std::cerr << "Error unlocking texture: " << errCode << std::endl;
    return errCode;
} <<------ Here's where segfault occurs, can't get past this point

内核函数定义

__kernel void RadianceGPU (
    __write_only image2d_t texture,
    other_stuff...)

写入内核中的纹理

write_imagef(
        texture,
        (int2)(x, height-y-1),
        (float4)(
            clamp(framebuffer[id].x, 0.0f, 1.0f),
            clamp(framebuffer[id].y, 0.0f, 1.0f),
            clamp(framebuffer[id].z, 0.0f, 1.0f),
            1.0f) * 1.0f);

有趣的是 write_imagef() 尽管纹理是 UNSIGNED_BYTE。

编辑: 所以我终于弄清楚了导致问题的原因。它在创建 CL 属性时设置了错误的显示。我刚刚从 GLFW 粘贴到 window,这会导致 Nvidia 驱动程序出现问题。您需要使用 glxGetCurrentDisplay 或 glfwGetX11Display。这修复了段错误。

我不确定这是不是你的问题,但我还是会试一试。

您没有以可移植的方式同步访问 glObjects。来自 OpenCL 1.1:

Prior to calling clEnqueueAcquireGLObjects, the application must ensure that any pending GL operations which access the objects specified in mem_objects have completed. This may be accomplished portably by issuing and waiting for completion of a glFinish command on all GL contexts with pending references to these objects. Implementations may offer more efficient synchronization methods; for example on some platforms calling glFlush may be sufficient, or synchronization may be implicit within a thread, or there may be vendor - specific extensions that enable placing a fence in the GL command stream and waiting for completion of that fence in the CL command queue. Note that no synchronization methods other than glFinish are portable between OpenGL implementations at this time.

基本上,可移植行为需要 glFinish。

在下面已引用的段落中,有更多可能感兴趣的信息:

Similarly, after calling clEnqueueReleaseGLObjects, the application is responsible for ensuring that any pending OpenCL operations which access the objects specified in mem_objects have completed prior to executing subsequent GL commands which reference these objects. This may be accomplished portably by calling clWaitForEvents with the event object returned by clEnqueueRelease GL Objects, or by calling clFinish. As above, some implementations may offer more efficient methods.

这里是引用自https://www.khronos.org/registry/cl/specs/opencl-1.1.pdf#nameddest=section-9.8.6

的文档的link