How to invoke a Python function as a callback inside a C++ thread using pybind11

I designed a C++ system that invokes a user-defined callback from a procedure running in a separate thread. A simplified system.hpp looks like this:

#pragma once

#include <atomic>
#include <chrono>
#include <functional>
#include <thread>

class System
{
public:
  using Callback = std::function<void(int)>;
  System(): t_(), cb_(), stop_(true) {}
  ~System()
  {
    stop();
  }
  bool start()
  {
    if (t_.joinable()) return false;
    stop_ = false;
    t_ = std::thread([this]()
    {
      while (!stop_)
      {
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
        if (cb_) cb_(1234);
      }
    });
    return true;
  }
  bool stop()
  {
    if (!t_.joinable()) return false;
    stop_ = true;
    t_.join();
    return true;
  }
  bool registerCallback(Callback cb)
  {
    if (t_.joinable()) return false;
    cb_ = cb;
    return true;
  }

private:
  std::thread t_;
  Callback cb_;
  std::atomic_bool stop_;
};

It works fine and can be tested with this short example, main.cpp:

#include <iostream>
#include "system.hpp"

// Atomic because the worker thread writes it while main() reads it.
std::atomic<int> g_counter{0};

void foo(int i)
{
  std::cout << i << std::endl;
  g_counter++;
}

int main()
{
  System s;
  s.registerCallback(foo);
  s.start();
  while (g_counter < 3)
  {
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
  }
  s.stop();
  return 0;
}

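It prints 1234 a few times and then stops. However, I ran into a problem when trying to create Python bindings for my System. If I register a Python function as the callback, my program deadlocks after calling System::stop. I investigated the topic a bit, and it seems I am running into an issue with the GIL. Reproducible example: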

binding.cpp:

#include "pybind11/functional.h"
#include "pybind11/pybind11.h"

#include "system.hpp"

namespace py = pybind11;

PYBIND11_MODULE(mysystembinding, m) {
  py::class_<System>(m, "System")
    .def(py::init<>())
    .def("start", &System::start)
    .def("stop", &System::stop)
    .def("registerCallback", &System::registerCallback);
}

Python script:

#!/usr/bin/env python

import mysystembinding
import time

g_counter = 0

def foo(i):
  global g_counter
  print(i)
  g_counter = g_counter + 1

s = mysystembinding.System()
s.registerCallback(foo)
s.start()
while g_counter < 3:
  time.sleep(1)
s.stop()

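I have read the section of the pybind11 docs about acquiring and releasing the GIL on the C++ side. However, I did not manage to get rid of the deadlock that occurs in my case: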

PYBIND11_MODULE(mysystembinding, m) {
  py::class_<System>(m, "System")
    .def(py::init<>())
    .def("start", &System::start)
    .def("stop", &System::stop)
    .def("registerCallback", [](System* s, System::Callback cb)
      {
        s->registerCallback([cb](int i)
        {
          // py::gil_scoped_acquire acquire;
          // py::gil_scoped_release release;
          cb(i);
        });
      });
}

If I call py::gil_scoped_acquire acquire; before invoking the callback, the deadlock occurs anyway. If I call py::gil_scoped_release release; before invoking the callback, I get

Fatal Python error: PyEval_SaveThread: NULL tstate

What should I do to register a Python function as a callback and avoid the deadlock?

Thanks to this discussion and many other resources (1, 2, 3), I found that guarding the functions that start and join the C++ thread with gil_scoped_release seems to solve the problem:

PYBIND11_MODULE(mysystembinding, m) {
  py::class_<System>(m, "System")
    .def(py::init<>())
    .def("start", &System::start, py::call_guard<py::gil_scoped_release>())
    .def("stop", &System::stop, py::call_guard<py::gil_scoped_release>())
    .def("registerCallback", &System::registerCallback);
}

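Apparently the deadlock occurs because Python holds the GIL while calling the bindings that start and join the C++ thread. I am still not sure my reasoning is correct, so I would appreciate comments from any expert.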
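
One way to check this reasoning without building the extension is a small pure-Python model. In the sketch below, `gil` is an ordinary `threading.Lock` standing in for the real GIL, and `ModelSystem` and `demo` are hypothetical names that do not appear in the original code. It assumes, as far as I understand pybind11's behaviour, that the wrapper created for a Python callable acquires the GIL before each call, so the worker blocks for as long as the main thread holds the lock inside `stop`.

```python
import threading
import time

# Hypothetical stand-in for the GIL: a thread must hold this lock to
# "run Python code". A model only, not how CPython implements the GIL.
gil = threading.Lock()


class ModelSystem:
    """Pure-Python model of System: a worker thread invoking a callback."""

    def __init__(self, callback):
        self._callback = callback
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def _run(self):
        while not self._stop.is_set():
            time.sleep(0.01)
            # Assumed behaviour of the callback wrapper: take the "GIL"
            # before calling into Python, release it afterwards.
            with gil:
                self._callback(1234)

    def stop(self, release_gil, timeout=0.5):
        """Join the worker; return True if it exited within `timeout`."""
        self._stop.set()
        if release_gil:
            gil.release()  # models entering py::gil_scoped_release
        try:
            self._thread.join(timeout)
            return not self._thread.is_alive()
        finally:
            if release_gil:
                gil.acquire()  # the guard re-acquires when its scope ends

    def wait(self):
        self._thread.join()


def demo(release_gil):
    """Return (joined_in_time, number_of_callback_calls)."""
    calls = []
    system = ModelSystem(calls.append)
    gil.acquire()    # the main thread is "running Python code"
    system.start()
    gil.release()    # time.sleep() in the script lets other threads run
    time.sleep(0.1)
    gil.acquire()    # back in Python code, holding the "GIL" again
    time.sleep(0.1)  # by now the worker is queued up waiting for the lock
    joined = system.stop(release_gil)
    gil.release()    # let a stuck worker finish so the demo can clean up
    system.wait()
    return joined, len(calls)


if __name__ == "__main__":
    print("stop() holding the lock, joined in time:", demo(False)[0])
    print("stop() releasing the lock, joined in time:", demo(True)[0])
```

In this model, `demo(False)` should report that the join timed out and `demo(True)` that the worker exited in time, which mirrors the effect of py::call_guard<py::gil_scoped_release>() on stop. The sleep durations are arbitrary and only need to be long enough for the worker to reach the lock.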

Calling gil_scoped_release before join() got rid of the deadlock for me.

void Tick::WaitLifeOver() {
  if (thread_.joinable()) {
    thread_.join();
  }
}
PYBIND11_MODULE(tick_pb, m) {
  py::class_<Tick, std::shared_ptr<Tick>>(m, "Tick")
    // ...
    .def("wait_life_over", &Tick::WaitLifeOver,
        py::call_guard<py::gil_scoped_release>());
}

The full code is here: C++ Thread Callback Python Function