Implementation for the "wrapper" wrapper in Haskell FFI

Question

According to the Haskell Wiki, in the Haskell FFI we can use a "wrapper" import to create a C function from a Haskell function:

foreign import ccall "wrapper" createFunPtr :: (Int -> Int) -> IO (FunPtr (Int -> Int))

If I understand correctly, this means that f <- createFunPtr (+ 42) in a do block gives a function pointer f of C type int (*)(int).
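To make this concrete, here is a minimal sketch of a round trip through such a function pointer. The "dynamic" import callFunPtr and the helper roundTrip are not part of the question; they are assumptions added for illustration, and the lambda deliberately captures a local variable to show that the closure's environment survives the trip.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Control.Exception (evaluate)
import Foreign.Ptr (FunPtr, freeHaskellFunPtr)

-- The import from the question: Haskell function -> C function pointer.
foreign import ccall "wrapper"
  createFunPtr :: (Int -> Int) -> IO (FunPtr (Int -> Int))

-- Assumption for illustration: a "dynamic" import that calls a C
-- function pointer from Haskell, so no separate C file is needed.
foreign import ccall "dynamic"
  callFunPtr :: FunPtr (Int -> Int) -> Int -> Int

-- Hypothetical helper: wrap a closure that captures 'offset', call it
-- through the C function pointer, then release the pointer.
roundTrip :: Int -> Int -> IO Int
roundTrip offset x = do
  fp <- createFunPtr (\y -> y + offset)
  r  <- evaluate (callFunPtr fp x)  -- force the call before freeing fp
  freeHaskellFunPtr fp
  return r

main :: IO ()
main = roundTrip 42 1 >>= print  -- prints 43
```

In real use the FunPtr would be handed to a C library rather than called back from Haskell, but the creation step is the same.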

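In Haskell, a function of type Int -> Int may carry local bindings (for example, a lambda expression may refer to variables from an enclosing scope), whereas in C a function pointer is just the memory address of a function, and calling it is little more than a raw jump. So there is nowhere in a FunPtr to store the Haskell function's additional data.

Lambda expressions in C++ are objects, and calling operator() passes an implicit this pointer. But FunPtrs are treated just like ordinary function pointers in C, so there is no way to pass any extra arguments.

So how does GHC implement this "wrapper" import? (My guess is that it writes instructions directly into the code segment in memory to pass the extra arguments, but as far as I recall the code segment is usually read-only.)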

Answer 1

A quick Google search turns up the GHC commentary:

Occasionally, it is convenient to treat Haskell closures as C function pointers. This is useful, for example, if we want to install Haskell callbacks in an existing C library. This functionality is implemented with the aid of adjustor thunks.

An adjustor thunk is a dynamically allocated code snippet that allows Haskell closures to be viewed as C function pointers.

Stable pointers provide a way for the outside world to get access to, and evaluate, Haskell heap objects, with the RTS providing a small range of ops for doing so. So, assuming we've got a stable pointer in our hand in C, we can jump into the Haskell world and evaluate a callback procedure, say. This works OK in some cases where callbacks are used, but does require the external code to know about stable pointers and how to deal with them. We'd like to hide the Haskell-nature of a callback and have it be invoked just like any other C function pointer.

Enter adjustor thunks. An adjustor thunk is a little piece of code that's generated on-the-fly (one per Haskell closure being exported) that, when entered using some 'universal' calling convention (e.g., the C calling convention on platform X), pushes an implicit stable pointer (to the Haskell callback) before calling another (static) C function stub which takes care of entering the Haskell code via its stable pointer.

An adjustor thunk is allocated on the C heap, and is called from within Haskell just before handing out the function pointer to the Haskell (IO) action. User code should never have to invoke it explicitly.

An adjustor thunk differs from a C function pointer in one respect: when the code is through with it, it has to be freed in order to release Haskell and C resources. Failure to do so will result in memory leaks on both the C and Haskell side.

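I remember reading somewhere that wrapper FFI imports are actually the only place where GHC performs runtime code generation.

I believe what the commentary is saying is that your createFunPtr is defined at compile time as something like the following. (I set -ddump-simpl to get the Core for createFunPtr; this is my attempt to decompile it back into Haskell.)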

createFunPtr fun = do stable <- newStablePtr fun
                      pkg_ccall stable :: IO (FunPtr (Int -> Int))

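newStablePtr is part of the StablePtr API, which lets Haskell export references to Haskell objects to foreign code. The GC is allowed to move the function passed to createFunPtr after the adjustor thunk has been created. The adjustor therefore needs a reference to the function that stays valid across GCs, and the stable pointer provides exactly that.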
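The StablePtr API can also be used directly from Haskell, which may help show what the adjustor relies on. The following is a small sketch, not GHC's internal code; the helper name applyViaStablePtr is hypothetical. It turns a closure into an opaque handle of the kind that could be passed to C as a void *, recovers the closure from that handle, and applies it.

```haskell
module Main (main) where

import Foreign.Ptr (Ptr)
import Foreign.StablePtr
  ( castPtrToStablePtr, castStablePtrToPtr
  , deRefStablePtr, freeStablePtr, newStablePtr )

-- Hypothetical helper: send a closure through an opaque stable handle
-- and apply whatever comes back out.
applyViaStablePtr :: (Int -> Int) -> Int -> IO Int
applyViaStablePtr f x = do
  sp <- newStablePtr f                     -- registers f with the RTS
  let handle = castStablePtrToPtr sp :: Ptr ()
  -- 'handle' stays valid even if the GC moves the closure; it must be
  -- treated as opaque and never dereferenced as a real address.
  g <- deRefStablePtr (castPtrToStablePtr handle)
  freeStablePtr sp                         -- f may be collected again
  return (g x)

main :: IO ()
main = do
  let offset = 42
  applyViaStablePtr (\y -> y + offset) 1 >>= print  -- prints 43
```

Until freeStablePtr is called, the closure is kept alive by the RTS, which is why a forgotten stable pointer leaks Haskell heap.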

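pkg_ccall (which is actually rather magical) allocates space for the adjustor on the C heap. This space must later be freed with freeHaskellFunPtr; otherwise both the C heap (which holds the adjustor) and the Haskell heap (which holds the function closure, which cannot be GC'd until the stable pointer is freed) will leak. The contents of the adjustor depend on the platform and on whether GHC was configured (at build time) to use libffi for adjustors. The actual assembly code can be found in the comments in the relevant RTS file, but the gist is generally: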

int adjustor(int arg) {
  return zdmainzdTzdTzucreateAddPtr($stable, arg);
  // with stable "baked" into each adjustor, as a "push <constant>" instruction
}

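zdmainzdTzdTzucreateAddPtr is a stub that dereferences the given stable pointer and calls the Haskell function it yields. (If you pass GHC -v and -keep-tmp-files, you should be able to find the ghc_<some_num>.c file containing the real definition, which has to do some extra bookkeeping.) It is static, baked into the binary, and roughly equivalent to: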

HsInt zdmainzdTzdTzucreateAddPtr(StgStablePtr ptr, HsInt a) {
  HaskellObj fun, arg, app, ret;
  fun = deRefStablePtr(ptr);
  arg = rts_mkInt(a);
  app = rts_apply(fun, arg);
  eval(app, &ret);
  return rts_getInt(ret);
}
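Because the adjustor has to be freed explicitly with freeHaskellFunPtr, one common way to avoid leaks is to tie its lifetime to a scope with bracket. The sketch below assumes the callback is only needed for the duration of one action; withFunPtr and the "dynamic" import callFunPtr are hypothetical names introduced for illustration.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Control.Exception (bracket)
import Foreign.Ptr (FunPtr, freeHaskellFunPtr)

foreign import ccall "wrapper"
  createFunPtr :: (Int -> Int) -> IO (FunPtr (Int -> Int))

-- Assumption for illustration: stands in for a C library that would
-- call the function pointer.
foreign import ccall "dynamic"
  callFunPtr :: FunPtr (Int -> Int) -> Int -> Int

-- Hypothetical helper: allocate the adjustor, run the action, and free
-- the adjustor (and its stable pointer) even if the action throws.
withFunPtr :: (Int -> Int) -> (FunPtr (Int -> Int) -> IO a) -> IO a
withFunPtr f = bracket (createFunPtr f) freeHaskellFunPtr

main :: IO ()
main = do
  -- '$!' forces the call while the function pointer is still alive.
  r <- withFunPtr (* 2) $ \fp -> return $! callFunPtr fp 21
  print r  -- prints 42
```

This pattern only fits callbacks that C does not retain. If a C library stores the pointer for later use, the pointer has to be freed when the callback is deregistered instead.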
