OCaml 中退出循环的惯用异常

Question

在 OCaml 中，可以通过引发异常提前退出命令式循环。

虽然在 OCaml 中命令式循环的使用本身并不惯用，但我想知道什么是最惯用的方式来模拟带有提前退出的命令式循环（采用如果可能，考虑性能等方面）。

例如，an old OCaml FAQ 提到异常 Exit：

Exit: used to jump out of loops or functions.

它仍然是最新的吗？ standard library 只是将其作为通用异常提及：

The Exit exception is not raised by any library function. It is provided for use in your programs.

相关地，this answer 另一个问题提到使用预先计算的 let exit = Exit 异常来避免循环内的分配。还需要吗？

此外，有时人们希望以特定值退出循环，例如raise (Leave 42)。是否有惯用的例外或命名约定来执行此操作？在这种情况下我应该使用引用吗（例如 let res = ref -1 in ... <loop body> ... res := 42; raise Exit）？

最后，在嵌套循环中使用 Exit 可以防止某些情况下人们想要退出多个循环，例如 Java 中的 break <label>。这将需要定义具有不同名称的异常，或者至少使用一个整数来指示应退出多少范围（例如 Leave 2 以指示应退出 2 个级别）。同样，这里是否有惯用的 approach/exception 命名？

Answer 1

Exit 可以（我不确定我是否可以说它是地道的）。但是，如果您使用的是足够新的编译器（自 4.02 起），请确保您使用的是 raise_notrace。

更好的解决方案是使用 OCaml Core library 中的 with_return。它不会有任何范围问题，因为它会为每个嵌套创建一个全新的异常类型。

当然你也可以达到同样的效果，或者直接拿Core的实现源码。

更惯用的是，不要使用异常来缩短迭代，并考虑使用现有算法（find、find_map、exists 等）或如果没有适合您的算法，只需编写一个递归函数。

Answer 2

正如最初在评论中发布的那样，在 OCaml 中提前退出的惯用方法是使用延续。在您希望 early return 到达的位置，您创建一个延续，并将其传递给可能 return early 的代码。这比循环标签更通用，因为您可以退出几乎任何可以访问延续的内容。

此外，如评论中所述，请注意 raise_notrace 的用法，以表示您永远不希望运行时生成其跟踪的异常。

A "naive" 第一次尝试：

module Continuation :
sig
  (* This is the flaw with this approach: there is no good choice for
     the result type. *)
  type 'a cont = 'a -> unit

  (* with_early_exit f passes a function "k" to f. If f calls k,
     execution resumes as if with_early_exit completed
     immediately. *)
  val with_early_exit : ('a cont -> 'a) -> 'a
end =

struct
  type 'a cont = 'a -> unit

  (* Early return is implemented by throwing an exception. The ref
     cell is used to store the value with which the continuation is
     called - this is a way to avoid having to generate an exception
     type that can store 'a for each 'a this module is used with. The
     integer is supposed to be a unique identifier for distinguishing
     returns to different nested contexts. *)
  type 'a context = 'a option ref * int64
  exception Unwind of int64

  let make_cont ((cell, id) : 'a context) =
    fun result -> cell := Some result; raise_notrace (Unwind id)

  let generate_id =
    let last_id = ref 0L in
    fun () -> last_id := Int64.add !last_id 1L; !last_id

  let with_early_exit f =
    let id = generate_id () in
    let cell = ref None in
    let cont : 'a cont = make_cont (cell, id) in
    try
      f cont
    with Unwind i when i = id ->
      match !cell with
      | Some result -> result
        (* This should never happen... *)
      | None        -> failwith "with_early_exit"
end



let _ =
  let nested_function i k = k 15; i in

  Continuation.with_early_exit (nested_function 42)
  |> string_of_int
  |> print_endline

如你所见，上面通过隐藏异常实现了提前退出。 continuation 实际上是一个部分应用的函数，它知道为其创建它的上下文的唯一 id，并且有一个引用单元格来存储结果值，同时将异常抛出到该上下文。上面的代码打印 15。您可以根据需要传递延续 k。您还可以在将函数 f 传递给 with_early_exit 的地方立即定义函数，从而产生类似于在循环上添加标签的效果。我经常用这个。

上面的问题是'a cont的结果类型，我随意设置为unit。实际上，'a cont 类型的函数永远不会 returns，所以我们希望它表现得像 raise —— 在需要任何类型的地方都可用。但是，这不会立即起作用。如果你做类似 type ('a, 'b) cont = 'a -> 'b 的事情，并将其传递给你的嵌套函数，类型检查器将在一个上下文中推断出 'b 的类型，然后强制你仅在具有相同上下文的上下文中调用延续类型，即您将无法执行

之类的操作

(if ... then 3 else k 15)
...
(if ... then "s" else k 16)

因为第一个表达式强制 'b 为 int，但第二个表达式要求 'b 为 string。

为了解决这个问题，我们需要为早期return提供一个类似于raise的函数，即

(if ... then 3 else throw k 15)
...
(if ... then "s" else throw k 16)

这意味着远离纯粹的延续。我们必须取消部分应用上面的 make_cont（我将其重命名为 throw），并改为传递裸上下文：

module BetterContinuation :
sig
  type 'a context

  val throw : 'a context -> 'a -> _
  val with_early_exit : ('a context -> 'a) -> 'a
end =

struct
  type 'a context = 'a option ref * int64
  exception Unwind of int64

  let throw ((cell, id) : 'a context) =
    fun result -> cell := Some result; raise_notrace (Unwind id)

  let generate_id = (* Same *)

  let with_early_exit f =
    let id = generate_id () in
    let cell = ref None in
    let context = (cell, id) in
    try
      f context
    with Unwind i when i = id ->
      match !cell with
      | Some result -> result
      | None        -> failwith "with_early_exit"
end



let _ =
  let nested_function i k = ignore (BetterContinuation.throw k 15); i in

  BetterContinuation.with_early_exit (nested_function 42)
  |> string_of_int
  |> print_endline

表达式 throw k v 可用于需要不同类型的上下文中。

我在我处理的一些大型应用程序中普遍使用这种方法。我什至更喜欢它而不是常规异常。我有一个更复杂的变体，其中 with_early_exit 的签名大致如下：

val with_early_exit : ('a context -> 'b) -> ('a -> 'b) -> 'b

其中第一个函数表示尝试做某事，第二个函数表示可能导致的 'a 类型错误的处理程序。与变体和多态变体一起，这提供了更明确类型的异常处理。它对于多态变体特别强大，因为编译器可以推断出错误变体集。

Jane Street 方法的作用与此处描述的相同，事实上我之前有一个使用 first-class 模块生成异常类型的实现。我不再确定为什么我最终选择了这个——可能会有细微的差别:)

Answer 3

只是回答我问题中其他答案中没有提到的特定部分：

... using a precomputed let exit = Exit exception to avoid allocations inside the loop. Is it still required?

我在 4.02.1+fp 上使用 Core_bench 做了一些微基准测试，结果表明没有显着差异：比较两个相同的循环时，一个包含在循环之前声明的本地 exit另一个没有它，时差很小

此示例中 raise Exit 和 raise_notrace Exit 之间的差异也很小，在某些运行中约为 2%，在其他运行中高达 7%，但很可能在误差范围内这么短的实验。

总的来说，我无法衡量任何明显的差异，所以除非有人举出 Exit/exit 显着影响性能的示例，否则我更喜欢前者，因为它更清晰并且避免创建几乎无用的变量。

最后，我还比较了两种习语的区别：在退出循环之前使用对值的引用，或者创建包含 return 值的特定异常类型。

参考结果值+Exit:

 let res = ref 0 in
 let r =
   try
     for i = 0 to n-1 do
       if a.(i) = v then
        (res := v; raise_notrace Exit)
     done;
     assert false
   with Exit -> !res
 in ...

具有特定异常类型：

 exception Res of int
 let r =
   try
     for i = 0 to n-1 do
       if a.(i) = v then
         raise_notrace (Res v)
     done;
     assert false
   with Res v -> v
 in ...

再一次，差异很小，并且在运行之间变化很大。总体而言，第一个版本（参考 + Exit）似乎略有优势（快 0% 到 10%），但差异不足以推荐一个版本优于另一个版本。

因为前者需要定义一个初始值（可能不存在）或使用选项类型来初始化引用，而后者需要为循环中的每种类型的值定义一个新的异常return ，这里没有理想的解决方案。

Answer 4

关于这点

using a precomputed let exit = Exit exception to avoid allocations inside the loop. Is it still required?

答案是否足够新的 OCaml 版本。这是 OCaml 4.02.0 更新日志的相关摘录。

PR#6203: Constant exception constructors no longer allocate (Alain Frisch)

这是 PR6203：http://caml.inria.fr/mantis/view.php?id=6203

OCaml 中退出循环的惯用异常

Idiomatic exceptions for exiting loops in OCaml

ocaml

loops

idioms

imperative-programming