Haskell 的懒惰是如何工作的？

Question

考虑这个将列表中的所有元素加倍的函数：

doubleMe [] = []
doubleMe (x:xs) = (2*x):(doubleMe xs)

然后考虑表达式

doubleMe (doubleMe [a,b,c])

很明显，在运行时，这首先扩展为：

doubleMe ( (2*a):(doubleMe [b,c]) )

（很明显，因为据我所知，不存在其他可能性）。

但我的问题是：为什么 现在扩展到

2*(2*a) : doubleMe( doubleMe [b,c] )

而不是

doubleMe( (2*a):( (2*b) : doubleMe [c] ) )

?

凭直觉，我知道答案：因为 Haskell 是懒惰的。但是有人可以给我更准确的答案吗？

列表是否有什么特别之处导致了这种情况，或者这个想法比列表更普遍？

Answer 1

doubleMe (doubleMe [a,b,c]) 不会扩展为 doubleMe ( (2*a):(doubleMe [b,c]) )。它扩展为：

case doubleMe [a,b,c] of
  [] -> []
  (x:xs) -> (2*x):(doubleMe xs)

即外层函数调用先展开。这是惰性语言和严格语言之间的主要区别：扩展函数调用时，您不会首先评估参数 - 而是用函数调用主体替换函数调用，并暂时保留参数。

现在 doubleMe 需要扩展，因为模式匹配需要在计算之前知道其操作数的结构，所以我们得到：

case (2*a):(doubleMe [b,c]) of
  [] -> []
  (x:xs) -> (2*x):(doubleMe xs)

现在模式匹配可以替换为第二个分支的主体，因为我们现在知道第二个分支是匹配的。所以我们用 (2*a) 代替 x，用 doubleMe [b, c] 代替 xs，得到：

(2*(2*a)):(doubleMe (doubleMe [b,c]))

这就是我们得出该结果的方式。

Answer 2

现在是提取等式推理的好时机，这意味着我们可以用一个函数代替它的定义（模重命名事物以避免冲突）。不过，为了简洁起见，我打算将 doubleMe 重命名为 d：

d [] = []                           -- Rule 1
d (x:xs) = (2*x) : d xs             -- Rule 2

d [1, 2, 3] = d (1:2:3:[])
            = (2*1) : d (2:3:[])    -- Rule 2
            = 2 : d (2:3:[])        -- Reduce
            = 2 : (2*2) : d (3:[])  -- Rule 2
            = 2 : 4 : d (3:[])      -- Reduce
            = 2 : 4 : (2*3) : d []  -- Rule 2
            = 2 : 4 : 6 : d []      -- Reduce
            = 2 : 4 : 6 : []        -- Rule 1
            = [2, 4, 6]

所以现在如果我们用 doubleMe/d 的 2 层执行此操作：

d (d [1, 2, 3]) = d (d (1:2:3:[]))
                = d ((2*1) : d (2:3:[]))    -- Rule 2 (inner)
                = d (2 : d (2:3:[]))        -- Reduce
                = (2*2) : d (d (2:3:[]))    -- Rule 2 (outer)
                = 4 : d (d (2:3:[]))        -- Reduce
                = 4 : d ((2*2) : d (3:[]))  -- Rule 2 (inner)
                = 4 : d (4 : d (3:[]))      -- Reduce
                = 4 : 8 : d (d (3:[]))      -- Rule 2 (outer) / Reduce
                = 4 : 8 : d (6 : d [])      -- Rule 2 (inner) / Reduce
                = 4 : 8 : 12 : d (d [])     -- Rule 2 (outer) / Reduce
                = 4 : 8 : 12 : d []         -- Rule 1 (inner)
                = 4 : 8 : 12 : []           -- Rule 1 (outer)
                = [4, 8, 12]

或者，您可以选择在不同的时间点减少，结果

d (d [1, 2, 3]) = d (d (1:2:3:[]))
                = d ((2*1) : d (2:3:[]))
                = (2*(2*1)) : d (d (2:3:[]))
                = -- Rest of the steps left as an exercise for the reader
                = (2*(2*1)) : (2*(2*2)) : (2*(2*3)) : []
                = (2*2) : (2*4) : (2*6) : []
                = 4 : 6 : 12 : []
                = [4, 6, 12]

这是此计算的两种可能的扩展，但并不特定于列表。您可以将其应用于树类型：

data Tree a = Leaf a | Node a (Tree a) (Tree a)

Leaf 和 Node 上的模式匹配分别类似于 [] 和 : 上的匹配，如果您考虑 [=23= 的列表定义]

data [] a = [] | a : [a]

我之所以说这是两种可能的扩展，是因为它的扩展顺序取决于您所使用的编译器的特定运行时和优化。如果它看到可以使您的程序执行得更快的优化，它可以选择该优化。这就是为什么懒惰通常是一个福音，你不必考虑事情发生的顺序，因为编译器会为你考虑。这在没有纯度的语言中是不可能的，例如 C#/Java/Python/etc。您不能重新排列计算，因为这些计算可能会产生依赖于顺序的副作用。但是当执行纯计算时，您不会有副作用，因此编译器可以更轻松地优化您的代码。

Answer 3

之所以这样做，是因为列表的定义方式和惰性。当您请求列表的头部时，它会评估您请求的第一个元素并保存其余元素供以后使用。所有列表处理操作都建立在 head:rest 概念之上，因此永远不会出现中间结果。

Answer 4

您“显而易见”的第一步实际上并不那么明显。事实上发生的事情是这样的：

doubleMe (...)
doubleMe ( { [] | (_:_) }? )
doubleMe ( doubleMe (...)! )

只有在那个时候它才真正“进入”内部函数。所以它继续

doubleMe ( doubleMe (...) )
doubleMe ( doubleMe( { [] | (_:_) }? ) )
doubleMe ( doubleMe( a:_ ! ) )
doubleMe ( (2*a) : doubleMe(_) )
doubleMe ( (2*a):_ ! )

现在，外部 doubleMe 函数有了 [] | (_:_) 问题的“答案”，这是对内部函数中的任何内容进行求值的唯一原因。

其实下一步也不一定是你想的那样：要看你怎么评价外在的结果！例如，如果整个表达式是 tail $ doubleMe ( doubleMe [a,b,c] )，那么它实际上会展开得更像

tail( { [] | (_:_) }? )
tail( doubleMe(...)! )
tail( doubleMe ( { [] | (_:_) }? ) )
...
tail( doubleMe ( doubleMe( a:_ ! ) ) )
tail( doubleMe ( _:_ ) )
tail( _ : doubleMe ( _ ) )
doubleMe ( ... )

即它实际上永远不会真正达到 2*a!

Answer 5

写 \lambda y.m 表示 doubleMe 的抽象版本，t 表示列表 [a,b,c]。那么你要减少的术语是

\y.m (\y.m t)

也就是说，有两个redex。 Haskell 更喜欢首先触发最外层的 redexes，因为它是一种正常的有序语言。然而，事实并非如此。 doubleMe 并不是真正的 \y.m，只有当 "argument" 具有正确的形状（列表的形状）时才真正有一个 redex。由于这还不是一个 redex，并且 (\y.m) 内部没有任何 redex，我们移到应用程序的右侧。因为 Haskell 也更愿意首先评估最左边的 redexes。现在，t 确实具有列表的形状，因此 redex (\y.m t) 触发。

\y.m (a : (\y.m t'))

然后我们回到顶部，重新做一遍整个事情。除了这次，最外层的术语有一个 redex。

Answer 6

doubleMe [] = []
doubleMe (x:xs) = (2*x):(doubleMe xs)

doubleMe (doubleMe [a,b,c])

我认为不同的人会以不同的方式扩展这些。我并不是说它们会产生不同的结果或其他任何东西，只是在正确操作的人中并没有真正的标准符号。以下是我的做法：

-- Let's manually compute the result of *forcing* the following expression.
-- ("Forcing" = demanding that the expression be evaluated only just enough
-- to pattern match on its data constructor.)
doubleMe (doubleMe [a,b,c])

    -- The argument to the outer `doubleMe` is not headed by a constructor,
    -- so we must force the inner application of `doubleMe`.  To do that, 
    -- first force its argument to make it explicitly headed by a
    -- constructor.
    = doubleMe (doubleMe (a:[b,c]))

    -- Now that the argument has been forced we can tell which of the two
    -- `doubleMe` equations applies to it: the second one.  So we use that
    -- to rewrite it.
    = doubleMe (2*a : doubleMe [b,c])

    -- Since the argument to the outer `doubleMe` in the previous expression
    -- is headed by the list constructor `:`, we're done with forcing it.
    -- Now we use the second `doubleMe` equation to rewrite the outer
    -- function application. 
    = 2*2*a : doubleMe (doubleMe [b, c])

    -- And now we've arrived at an expression whose outermost operator
    -- is a data constructor (`:`).  This means that we've successfully 
    -- forced the expression, and can stop here.  There wouldn't be any
    -- further evaluation unless some consumer tried to match either of 
    -- the two subexpressions of this result.

这与sepp2k 和leftaroundabout 的答案相同，只是他们写的很有趣。 sepp2k 的答案有一个 case 表达式似乎无处不在 - doubleMe 的多方程定义被隐式重写为单个 case 表达式。 leftaroundabout 的答案中有一个 { [] | (_:_) }? 的东西，显然是 "I have to force the argument until it looks like either [] or (_:_)" 的符号。

bhelkir 的回答与我的相似，但它也递归地强制结果的所有子表达式，除非您有消费者要求，否则不会发生这种情况。

所以没有不尊重任何人，但我更喜欢我的。 :-P

Answer 7

其他人已经回答了一般问题。让我就这一点补充几点：

Is there something special about lists that causes this, or is the idea more general than that just lists?

不，列表并不特殊。 Haskell 中的每个 data 类型都有惰性语义。让我们尝试一个使用整数对类型 (Int, Int).

的简单示例

let pair :: (Int,Int)
    pair = (1, fst pair)
 in snd pair

上面，fst,snd 是一对投影，return 是一对的 first/second 分量。还要注意 pair 是一个递归定义的对。是的，在 Haskell 中你可以递归定义一切，而不仅仅是函数。

在惰性语义下，上面的表达式大致是这样计算的：

snd pair
= -- definition of pair
snd (1, fst pair)
= -- application of snd
fst pair
= -- definition of pair
fst (1, fst pair)
= -- application of fst
1

相比之下，使用热切语义，我们会这样评估它：

snd pair
= -- definition of pair
snd (1, fst pair)
= -- must evaluate arguments before application, expand pair again
snd (1, fst (1, fst pair))
= -- must evaluate arguments
snd (1, fst (1, fst (1, fst pair)))
= -- must evaluate arguments
...

在eager evaluation中，我们坚持在应用fst/snd之前评估参数，我们得到了一个无限循环的程序。在某些语言中，这会触发 "stack overflow" 错误。

在惰性求值中，我们很快应用函数，即使参数没有被完全求值。这使得 snd (1, infiniteLoop) return 1 立即。

因此，惰性求值并不特定于列表。 Haskell 中的任何事物都是惰性的：树、函数、元组、记录、用户定义的 data 类型等

(挑剔：如果程序员真的要求它们，则可以定义具有严格/急切评估组件的类型。这可以使用严格注释或使用未装箱类型等扩展来完成。虽然有时这些有它们的用途，它们在 Haskell 程序中并不常见。）

Haskell 的懒惰是如何工作的？

How does Haskell's laziness work?

haskell