有人可以向我解释这些 Iterator、Yield monad 类型和函数的含义吗,就像我 5 岁一样?

Could someone explain to me what these Iterator, Yield monad types and functions mean like I am 5?

下面是代码。 applicative, functor, traversable 和 monad 都有一定的了解。迭代器和 yield 类型以及 yield 函数是我最难理解的。例如,Iterator 中的 i or r 和 Yield 中的 i or a 是什么? traverseY 到底做了什么,签名是什么意思?以及这里如何应用 Monad Cont 也让我很困惑。感谢您的阅读,我们将不胜感激。

import Control.Monad.Cont
import Control.Monad.Random
import System.Random.Shuffle


data Iterator i o r =
    Result r
  | Susp o (i -> Iterator i o r)
--------------------------------------------------------------------------------------------
newtype Yield i o r a = Yield { unY :: Cont (Iterator i o r) a }

instance Functor (Yield i o r) where
  fmap f (Yield m) = Yield (fmap f m)

instance Applicative (Yield i o r) where
  pure a = Yield (pure a)
  (Yield mf) <*> (Yield ma) = Yield (mf <*> ma)

instance Monad (Yield i o r) where
  return a = Yield (return a)
  (Yield m) >>= k = Yield (m >>= \a -> unY (k a))

instance MonadCont (Yield i o r) where
  callCC c = Yield (callCC (\k -> unY (c (\a -> Yield (k a)))))

runYield :: Yield i o r r -> Iterator i o r
runYield (Yield m) = runCont m Result

yield :: o -> Yield i o r i
yield o = callCC (\k -> Yield (cont (\_ -> Susp o (\i -> runYield (k i)))))

-------------------------------------------------------------------------------------------
data Tree a = Empty | Node a (Tree a) (Tree a)

traverseY :: Tree a -> Yield (Tree a) (a,Tree a,Tree a) r ()
traverseY Empty = return ()
traverseY (Node a t1 t2) = do t <- yield (a, t1, t2); traverseY t

一个 Iterator i o r 表示一个过程,它重复获取类型 i 的值并输出类型 o 的值,最终在 [=19= 之一之后中断迭代]s 改为 returning a r。例如

accum :: Iterator Int Int Int
accum = go 0
  where go n | n >= 100 = Result n
             | otherwise = Susp n (\i -> go (n + i))
-- potential usage
main :: IO ()
main = go accum -- iterator returns modified copies instead of mutating, so prepare to replace the iterator through iteration/recursion
  where go (Result r) = putStrLn $ "Final total: " ++ show r -- check whether iterator has values/extract values by pattern matching; a finished iterator can return extra data of type r if it likes
        go (Susp o i) = do
          putStrLn $ "Running total: " ++ show o
          putStr $ "Add: "
          n <- readLn
          go (i n) -- bidirectional communication! get "incremented" iterator by feeding in an input value (you could write no-input iterators by setting i = ())

是一个跟踪某个累加器 n 的迭代器。它将每个输入添加到累加器,每次输出新的累加器。一旦累加器达到 100,它就停止接受输入并且 return 是最终值。请注意,由于一切都是不可变的,为了保持状态,Iterator 每次更改状态时都必须 return 自身的新版本 。依次“遍历”accum 的人必须使用 returned Iterator 而不是 accum 本身。在 Python 中,您可以将 accum 写为:

def accum(): # calling accum() instead creates a new object that mutates itself through iteration
  sum = 0
  while sum < 100: sum = sum + (yield sum)
  return sum

# same usage
def main():
  gen = accum()
  o = next(gen)
  try: # I was hoping the Python version would help understanding... but I think I overestimated how pythonic Python would be...
    while True:
      print(f"Running total: {o}")
      o = gen.send(int(input("Add: ")))
  except StopIteration as e:
    print(f"Final total: {e.value}")

Yield i o r r 被用作 Iterator i o r 的“生成器”。您可以直接为 Iterator 编写 Yield 接口的模拟:

instance Functor (Iterator i o) where
  fmap f (Result x) = Result (f x)
  fmap f (Susp o i) = Susp o (fmap f . i)
instance Applicative (Iterator i o) where
  pure = Result
  liftA2 = liftM2
instance Monad (Iterator i o) where
  Result r >>= f = f r
  Susp o i >>= f = Susp o ((>>= f) . i)

-- yieldI x is the iterator that sends x to the caller and receives a value in return
-- in Python: def yieldI(x): return yield x
yieldI :: o -> Iterator i o i
yieldI x = Susp x Result

例如DFS 示例中的生成器是

data Tree a = Empty | Node a (Tree a) (Tree a)
dfsI :: Tree a -> Iterator b a (Tree b) -- yield elements of the tree (o = a) *and also* receive new elements to replace them (i = b), building a new tree (r = Tree b)
-- dfs = traverse yieldI
dfsI Empty = Result Empty
dfsI (Node l m r) = do
  l' <- yieldI l
  m' <- dfsI m
  r' <- dfsI r
  return (Node l' m' r')

这里直接使用Iterator i o r的问题是效率低下。 (请记住 Haskell 是惰性计算以理解以下内容。)如果您“连接”许多迭代器,例如 ((x >>= f) >>= g) >>= h,那么当您尝试对其进行计算时,您 运行 就会遇到麻烦。假设 x 的计算结果为 Susp o i。然后评估更大的表达式首先执行三个函数调用,进入 >>=s,评估 xSusp o i,然后 创建新的暂停函数调用 到产生 Susp o (\a -> ((i a >>= f) >>= g) >>= h)。当您遍历此迭代器时(即提取 lambda 并使用一些参数调用它),每次迭代 必须遍历挂在 [= 上的所有 >>= 24=]。哎呀。 (也许用更熟悉的术语来说,我们已经将迭代器串联实现为另一个“包装器”,当包装器上有包装器时,这会变得很糟糕......)

使用 Cont 是避免这种情况的“标准技巧”。这个想法是,我们不直接处理迭代器 x,而是处理它的绑定函数(包装在 ContYield 新类型中)x' = \f -> x >>= f。请注意,将单子计算转换为其绑定函数是可逆的,因为 x' return = x >>= return = x.

newtype Yield i o r a = Yield { unYield :: (a -> Iterator i o r) -> Iterator i o r }
instance Functor (Yield i o r) where
  fmap f (Yield r) = Yield (\k -> r $ k . f)
instance Applicative (Yield i o r) where
  pure x = Yield (\k -> k x)
  liftA2 = liftM2
instance Monad (Yield i o r) where
  Yield r >>= f = Yield (\k -> r $ \x -> unYield (f x) k)
-- newtype Yield i o r a = Yield { unYield :: (a -> Iterator i o r) -> Iterator i o r }
-- compare         (>>=) :: Iterator i o a -> (a -> Iterator i o r) -> Iterator i o r
-- giving
liftIY :: Iterator i o a -> Yield i o r a
liftIY x = Yield (x >>=)
-- and the law x >>= return = x inspires
runYield :: Yield i o r r -> Iterator i o r
runYield (Yield r) = r return
-- e.g.
yield :: o -> Yield i o r i
-- yield = liftIY . yieldI
yield x = Yield (\k -> Susp x (\i -> k i)) -- note this is the same as yours after you drill through newtype baggage

而不是 ((x >>= f) >>= g) >>= h,使用 Yield 将使像 (((x' >>= liftIY . f) >>= liftIY . g) >>= liftIY . h) return 这样的术语出现在 运行 时间。计算结果仅为 x' >>= _someFunction,因此之前的所有包装器都合并为一个,希望能提高效率。 (这个折叠的包装器将在您迭代时“自我替换”,依次经历 fgh 指定的行为。这是在 Yield 中编码的的 >>=.)

-- magical free efficiency by replacing Iterator -> Yield, yieldI -> yield
dfs :: Tree a -> Yield b a r (Tree b)
dfs = traverse yield
-- when using Yields as builders, you will treat Yield i o _ r like Iterator i o r
-- the middle parameter should be polymorphic for builders like dfs
-- while the final consumer (particularly the "standard consumer" runYield) fixes it to something (runYield sets it to the final return type)
-- (this behavior is a loosely typed reflection of the Codensity monad)

在最终使用的地方,需要将Yield做成Iterator才可以迭代。你的 dfsDirect:

  1. 使用 dfs = traverse yieldI 构建 Yield 而不会招致“意外的二次左结合性”(技术术语 :))
  2. 使用 runYieldYield 构建 Iterator。此迭代器遍历 tr 和 yields/replaces 它的元素。
  3. 通过 loop 迭代该迭代器,这...
  4. 通过行 loop (Susp o i) = loop (i [o])dfs 进行“对话”:当它接收到 o 时,它会将 [o] 发送回迭代器,迭代器将其放入 Tree 正在建设中。
  5. 迭代器耗尽后,接收一个新的 Tree,其中每个元素都替换为单例列表 (loop (Result r) = _)。
  6. 通过 Foldable Tree 实例将所有列表连接在一起。

这是一种非常愚蠢的做事方式,因为订单不是来自 dfs,而是来自最后一步中使用的 Foldable Tree 实例。 Iterator 只是用作美化的 fmap 函数。如果 Foldable 实例不同(例如 BFS,甚至只是中序与预序),但您将 dfs 保留为预序 DFS(因此它不再写为 traverse yield),dfsDirect 不会按照 dfs 定义的实际顺序输出!您可以编写一个 正确地 Iterator 转换为列表的函数。

-- note the usage of () as an input type for "plain" iterators
-- since we cannot know what to pass in otherwise 
iToList :: Iterator () o r -> ([o], r)
iToList (Result r) = ([], r)
iToList (Susp o i) = (o : os, r)
  where ~(os, r) = iToList (i ())

traverseY也有点奇怪。如果它接收到 Node(作为初始值或作为迭代器输入),它会返回 Node 的字段,并且在 Empty 上返回 return。它实际上并没有“遍历”它的输入;您只需将那棵树作为输入发送,就可以将其发送到一棵全新的树中。我假设这个想法是,当你迭代它时,你发回它之前 returned 的 Trees 之一,所以它迭代 path 到一片树叶。国际海事组织写

会更好
data Direction = L | R
path :: Tree a -> Yield Direction a r () -- outputs root and goes in direction as told until it runs out of tree
path Empty = pure ()
path (Node x l r) = do
  d <- yield x
  case d of
    L -> path l
    R -> path r
-- potential use
elemBST :: Ord a => a -> Tree a -> Bool
elemBST x xs = go (runYield $ path xs)
  where go (Result ()) = False -- iteration ended without success
        go (Susp y i) = case compare x y of
          LT -> go (i L) -- go left
          EQ -> True     -- end
          GT -> go (i R) -- go right