Profiling F# for performance of recursive functions

Question

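I decided to use F# to solve the second puzzle of Day 1 of Advent of Code 2018 (sum the input cyclically and find the first running sum that repeats), but the performance is lacking and I can't find the cause of the slowdown.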

Problem solved in Python 3

For a given input that requires about 140,000 summations, this code executes in a few seconds.

data = list(map(int, '''
+1
-1
'''.strip().splitlines()))
from itertools import cycle, accumulate
class superset(set):
    def add(self, other):
        super().add(other)
        return other

def mapwhile(func, pred, iterable):
    for i in iterable:
        if not pred(i):
            yield func(i)
            return
        yield func(i)

def last(iterable):
    return list(iterable)[-1]

s = superset([0])
print(last(mapwhile(s.add, lambda x: x not in s, accumulate(cycle(data)))))

Problem solved in F#

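I added a conditional breakpoint on the match expression to time every thousandth i. This code seems to execute ~100 sums/sec and does not reach a solution even after an hour. That is a dramatic slowdown, by a ridiculous margin.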

let input = @"
+1
-1
"
let cycle xs = seq { while true do yield! xs }
let accumusum xs = Seq.scan(fun acc elem -> acc + elem) 0 xs

let rec findfreqcycle i (s:int Set) (data:int seq) = 
    let head, tail = Seq.head data, Seq.tail data
    match s.Contains(head) with
    | false -> findfreqcycle (i+1) (s.Add(head)) (tail)
    | true ->  head


let data = input.Trim().Split('\n') |> Seq.map int |> Seq.toList |> cycle
accumusum data |> findfreqcycle 0 Set.empty

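As far as I can tell, the core idea behind each code sample is more or less the same. The input is parsed eagerly, only once, and a generator function/sequence lazily repeats each number.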

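The only difference is that the function that actually finds the first repeated sum is recursive in the F# example. Memory profiling shows nearly constant memory usage, and tail recursion is active.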

What could I be doing wrong, and how can I better profile these recursive and generator functions for performance?
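As a rough alternative to timing with a conditional breakpoint, one can time fixed-size batches of a lazy sequence and check whether the cost per batch grows. The sketch below is in Python (the language of the reference solution above). The helper name `batch_timings` and its parameters are hypothetical, introduced only for illustration.

```python
import time
from itertools import islice


def batch_timings(iterable, batch_size, max_batches, clock=time.perf_counter):
    """Consume `iterable` in batches and return the seconds spent on each batch.

    `clock` is injectable so the helper can be exercised with a fake timer.
    """
    it = iter(iterable)
    timings = []
    for _ in range(max_batches):
        start = clock()
        # Pull up to `batch_size` elements and count how many actually arrived.
        consumed = sum(1 for _ in islice(it, batch_size))
        elapsed = clock() - start
        if consumed == 0:
            break  # the source is exhausted
        timings.append(elapsed)
    return timings
```

If the per-batch times stay flat, the per-element cost is constant. If they grow steadily from batch to batch, each element is getting more expensive to produce, which points at the sequence pipeline rather than at the recursion itself.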

Answer 1

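As mentioned in the comments, Seq.tail is extremely inefficient, especially when you use it in a loop the way you do. The reason is that it creates a new sequence that iterates over the original sequence and skips the first element (so after 1000 iterations, you have to go through more than 1000 nested sequences, each skipping one element).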
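The effect can be imitated in Python. This is only an analogue of the behaviour described above, not F#'s actual implementation of Seq.tail; the names `CountingSource`, `Tail`, and `consume_with_tail` are made up for the illustration. Each `Tail` wrapper restarts its inner sequence and skips one element, so reading the head after k calls to tail pulls k + 1 elements from the source.

```python
from itertools import islice


class CountingSource:
    """Re-iterable source that counts every element it hands out."""

    def __init__(self, items):
        self.items = items
        self.produced = 0

    def __iter__(self):
        for x in self.items:
            self.produced += 1
            yield x


class Tail:
    """Lazy view that re-iterates `inner` from the start and skips one element."""

    def __init__(self, inner):
        self.inner = inner

    def __iter__(self):
        return islice(iter(self.inner), 1, None)


def head(seq):
    """Return the first element of a re-iterable sequence."""
    return next(iter(seq))


def consume_with_tail(source, n):
    """Read n elements using the head/tail pattern from the question."""
    seq = source
    out = []
    for _ in range(n):
        out.append(head(seq))
        seq = Tail(seq)  # one more wrapper layer on every step
    return out
```

Reading the first 100 elements of a 1000-element source this way pulls 1 + 2 + ... + 100 = 5050 elements from the source, so the total work grows quadratically with the number of elements consumed.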

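The head-and-tail pattern works much better with lists, because functional lists are designed for this kind of processing. In your case, you could do something like this (following the same pattern as your original function):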

let rec findfreqcycle sum (s:int Set) input data = 
    match data with 
    | x::xs when s.Contains (sum + x) -> (sum + x)
    | x::xs -> findfreqcycle (sum + x) (s.Add (sum + x)) input xs
    | [] ->  findfreqcycle sum s input input

let data = input.Trim().Split('\n') |> Seq.map int |> Seq.toList 
// Seed the set with the starting frequency 0, which the puzzle counts as already seen
findfreqcycle 0 (Set.singleton 0) data data

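I changed it so that it uses pattern matching (on lists). I also changed the code so that it takes a finite list and starts over when it reaches the end. As a consequence, it also sums the numbers on the fly (rather than using Seq.scan, which would not work here because I am not using an infinite list).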
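For comparison, the same restart-at-the-end idea can be sketched in Python. The function name `first_repeated_frequency` is hypothetical. The sketch assumes the starting frequency 0 counts as already seen, so it seeds the set with 0, and it never terminates for an input whose running sum never repeats.

```python
def first_repeated_frequency(changes):
    """Return the first running sum that occurs twice, cycling over `changes`."""
    if not changes:
        raise ValueError("changes must not be empty")
    total = 0
    seen = {0}  # assumption: the starting frequency 0 counts as seen
    while True:
        # Walk the finite list; when it ends, the outer loop starts it again.
        for change in changes:
            total += change
            if total in seen:
                return total
            seen.add(total)
```

For example, `first_repeated_frequency([3, 3, 4, -2, -4])` returns 10: the running sums are 3, 6, 10, 8, 4 on the first pass, then 7 and 10 on the second.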

On the input from Pastebin, I get the result 448 in about 0.17 seconds.

Answer 2

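Following Tomas' answer, I decided to try an implementation using Seq.scan and Seq.pick, and this is what I got. He's right, it's not great. On the upside, it executes in ~0.3 seconds.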

let cycle xs = seq { while true do yield! xs }    
let accumusum xs = Seq.scan(fun acc elem -> acc + elem) 0 xs

let tryfind (sum, s:int Set) =
    match s.Contains(sum) with
    | true -> Some(sum)
    | false -> None

let scanstate (sum, s:int Set) el =
    el, s.Add(sum)

let findfreqcycle (data:int seq) =
    let seen = Seq.scan scanstate (Seq.head data, Set.empty) (Seq.tail data)
    Seq.pick tryfind seen

let data = cycle <| (input.Trim().Split('\n') |> Seq.map int |> Seq.toList)
accumusum data |> findfreqcycle

Answer 3

The OP already has an accepted answer, but I thought I would propose a few variations.

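The task asks for a running aggregate (a Set) over the input values, while still allowing an early exit once the Set is in a state where we cannot add a number to it because we have already seen that number.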

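Normally we use fold to aggregate a state, but fold does not let us exit early. That is why the suggestion was to use scan, which is a streaming fold, together with pick, which allows an early exit.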

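An alternative is to write a fold that allows short-cutting once a certain state has been reached: val foldAndCheck: ('a -> 'b -> CheckResult<'a, 'c>) -> 'a -> 'b seq -> 'c option. A fold is like a for loop that aggregates all values; foldAndCheck is like a for loop that aggregates values up to a point and then returns a result.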

It could look like this:

type [<Struct>] CheckResult<'T, 'U> =
  | Continue of c:'T
  | Done     of d:'U

// val foldAndCheck: ('a -> 'b -> CheckResult<'a, 'c>) -> 'a -> 'b seq -> 'c option
let foldAndCheck f z (s : _ seq) =
  let f = OptimizedClosures.FSharpFunc<_, _, _>.Adapt f
  use e = s.GetEnumerator ()
  let rec loop s =
    if e.MoveNext () then
      match f.Invoke (s, e.Current) with
      | Continue ss -> loop ss
      | Done     rr -> Some rr 
    else
      None
  loop z

let cycle xs = seq { while true do yield! xs }

let run (input : string) =
  let folder s v = if Set.contains v s then Done v else Continue (Set.add v s)
  input.Trim().Split('\n') 
  |> Seq.map int 
  |> cycle
  |> Seq.scan (+) 0
  |> foldAndCheck folder Set.empty
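
The same early-exit fold can be sketched in Python, mirroring the Python solution in the question. The names `Continue`, `Done`, `fold_and_check`, and `first_repeat` are illustrative only. The sketch assumes Python 3.8+ for the `initial` argument of `accumulate`. Copying a `frozenset` on every step is O(n), unlike F#'s persistent Set, so this version shows the control flow rather than the performance.

```python
from itertools import accumulate, cycle


class Continue:
    """Keep folding with a new state."""

    def __init__(self, state):
        self.state = state


class Done:
    """Stop folding and return a result."""

    def __init__(self, result):
        self.result = result


def fold_and_check(folder, initial, items):
    """Fold `items` until `folder` returns Done; return None if it never does."""
    state = initial
    for item in items:
        step = folder(state, item)
        if isinstance(step, Done):
            return step.result
        state = step.state
    return None


def first_repeat(changes):
    """Return the first running sum seen twice, or None for empty input."""

    def folder(seen, value):
        # Immutable update: build a new set instead of mutating the old one.
        return Done(value) if value in seen else Continue(seen | {value})

    running = accumulate(cycle(changes), initial=0)
    return fold_and_check(folder, frozenset(), running)
```

Here `first_repeat([1, -1])` returns 0, because the running sums are 0, 1, 0 and the second 0 is already in the set.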

When I run it on my machine, I get numbers like these:

Result: Some 448
Took  : 280 ms
CC    : (31, 2, 1)

(CC is the number of garbage collections in generations 0, 1, and 2.)

Then I wrote an F# program that I believe is equivalent to the Python program, in that it uses a mutable set and a mapWhile function:

open System.Collections.Generic

let addAndReturn (set : HashSet<_>) =
  fun v ->
    set.Add v |> ignore
    v

let mapWhile func pred (s : _ seq) =
  seq {
    // F#'s `for v in s` does not support short-cutting (there is no break),
    //  so use an explicit enumerator and a while loop instead
    use e = s.GetEnumerator ()
    let mutable cont = true
    while cont && e.MoveNext () do
      let v = e.Current
      if not (pred v) then
        cont <- false
        yield func v
      else
        yield func v
  }

let cycle xs = seq { while true do yield! xs }

let accumulate xs = Seq.scan (+) 0 xs

let last xs = Seq.last xs

let run (input : string) =
  let data = input.Trim().Split('\n') |> Seq.map int 
  let s = HashSet<int> ()

  data
  |> cycle
  |> accumulate
  |> mapWhile (addAndReturn s) (fun x -> s.Contains x |> not)
  |> last

Performance numbers:

Result: 448
Took  : 50 ms
CC    : (1, 1, 1)

If we allow mutation + seq, a solution could look like this:

open System.Collections.Generic

let cycle xs = seq { while true do yield! xs }

let run (input : string) =
  let s = HashSet<int> ()

  input.Trim().Split('\n')
  |> Seq.map int 
  |> cycle
  |> Seq.scan (+) 0
  |> Seq.find (fun v -> s.Add v |> not)

It runs like this:

Result: 448
Took  : 40 ms
CC    : (1, 1, 1)

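There are other cool tricks that could improve the search performance even further, but they are not worth the effort, because at this point most of the cost is in parsing the integers.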
