堆栈溢出，两个函数在应用程序解析器中相互调用

Question

我正在学习 data61 的课程：https://github.com/data61/fp-course。在解析器一中，以下实现将导致 parse (list1 (character *> valueParser 'v')) "abc" 堆栈溢出。

现有代码：

data List t =
  Nil
  | t :. List t
  deriving (Eq, Ord)

-- Right-associative
infixr 5 :.

type Input = Chars

data ParseResult a =
    UnexpectedEof
  | ExpectedEof Input
  | UnexpectedChar Char
  | UnexpectedString Chars
  | Result Input a
  deriving Eq

instance Show a => Show (ParseResult a) where
  show UnexpectedEof =
    "Unexpected end of stream"
  show (ExpectedEof i) =
    stringconcat ["Expected end of stream, but got >", show i, "<"]
  show (UnexpectedChar c) =
    stringconcat ["Unexpected character: ", show [c]]
  show (UnexpectedString s) =
    stringconcat ["Unexpected string: ", show s]
  show (Result i a) =
    stringconcat ["Result >", hlist i, "< ", show a]

instance Functor ParseResult where
  _ <$> UnexpectedEof =
    UnexpectedEof
  _ <$> ExpectedEof i =
    ExpectedEof i
  _ <$> UnexpectedChar c =
    UnexpectedChar c
  _ <$> UnexpectedString s =
    UnexpectedString s
  f <$> Result i a =
    Result i (f a)

-- Function to determine is a parse result is an error.
isErrorResult ::
  ParseResult a
  -> Bool
isErrorResult (Result _ _) =
  False
isErrorResult UnexpectedEof =
  True
isErrorResult (ExpectedEof _) =
  True
isErrorResult (UnexpectedChar _) =
  True
isErrorResult (UnexpectedString _) =
  True

-- | Runs the given function on a successful parse result. Otherwise return the same failing parse result.
onResult ::
  ParseResult a
  -> (Input -> a -> ParseResult b)
  -> ParseResult b
onResult UnexpectedEof _ = 
  UnexpectedEof
onResult (ExpectedEof i) _ = 
  ExpectedEof i
onResult (UnexpectedChar c) _ = 
  UnexpectedChar c
onResult (UnexpectedString s)  _ = 
  UnexpectedString s
onResult (Result i a) k = 
  k i a

data Parser a = P (Input -> ParseResult a)

parse ::
  Parser a
  -> Input
  -> ParseResult a
parse (P p) =
  p

-- | Produces a parser that always fails with @UnexpectedChar@ using the given character.
unexpectedCharParser ::
  Char
  -> Parser a
unexpectedCharParser c =
  P (\_ -> UnexpectedChar c)

--- | Return a parser that always returns the given parse result.
---
--- >>> isErrorResult (parse (constantParser UnexpectedEof) "abc")
--- True
constantParser ::
  ParseResult a
  -> Parser a
constantParser =
  P . const

-- | Return a parser that succeeds with a character off the input or fails with an error if the input is empty.
--
-- >>> parse character "abc"
-- Result >bc< 'a'
--
-- >>> isErrorResult (parse character "")
-- True
character ::
  Parser Char
character = P p
  where p Nil = UnexpectedString Nil
        p (a :. as) = Result as a

-- | Parsers can map.
-- Write a Functor instance for a @Parser@.
--
-- >>> parse (toUpper <$> character) "amz"
-- Result >mz< 'A'
instance Functor Parser where
  (<$>) ::
    (a -> b)
    -> Parser a
    -> Parser b
  f <$> P p = P p'
    where p' input = f <$> p input 

-- | Return a parser that always succeeds with the given value and consumes no input.
--
-- >>> parse (valueParser 3) "abc"
-- Result >abc< 3
valueParser ::
  a
  -> Parser a
valueParser a = P p
  where p input = Result input a

-- | Return a parser that tries the first parser for a successful value.
--
--   * If the first parser succeeds then use this parser.
--
--   * If the first parser fails, try the second parser.
--
-- >>> parse (character ||| valueParser 'v') ""
-- Result >< 'v'
--
-- >>> parse (constantParser UnexpectedEof ||| valueParser 'v') ""
-- Result >< 'v'
--
-- >>> parse (character ||| valueParser 'v') "abc"
-- Result >bc< 'a'
--
-- >>> parse (constantParser UnexpectedEof ||| valueParser 'v') "abc"
-- Result >abc< 'v'
(|||) ::
  Parser a
  -> Parser a
  -> Parser a
P a ||| P b = P c
  where c input
          | isErrorResult resultA = b input
          | otherwise = resultA
            where resultA = a input

infixl 3 |||

我的代码：

instance Monad Parser where
  (=<<) ::
    (a -> Parser b)
    -> Parser a
    -> Parser b
  f =<< P a = P p
    where p input = onResult (a input) (\i r -> parse (f r) i)

instance Applicative Parser where
  (<*>) ::
    Parser (a -> b)
    -> Parser a
    -> Parser b
  P f <*> P a = P b
    where b input = onResult (f input) (\i f' -> f' <$> a i)

list ::
  Parser a
  -> Parser (List a)
list p = list1 p ||| pure Nil

list1 ::
  Parser a
  -> Parser (List a)
list1 p = (:.) <$> p <*> list p

但是，如果我将 list 更改为不使用 list1，或者在 list1 中使用 =<<，它就可以正常工作。如果 <*> 使用 =<<，它也有效。我觉得这可能是尾递归的问题。

更新：

如果我在这里使用惰性模式匹配

  P f <*> ~(P a) = P b
    where b input = onResult (f input) (\i f' -> f' <$> a i)

它工作正常。这里的模式匹配是问题所在。我不明白这个...请帮助！

Answer 1

If I use lazy pattern matching P f <*> ~(P a) = ... then it works fine. Why?

这个问题就是。您也可以使用 newtype 而不是 data 来修复它：newtype Parser a = P (Input -> ParseResult a).(*)

list1 的定义想知道 both 解析器参数到 <*>，但实际上第一个失败的时候（当输入耗尽时）我们不需要知道第二个！但是因为我们强制了它，它会强制它的第二个参数，而那个参数会强制它的第二个解析器，无穷无尽。(**) 也就是说，p 将失败输入已耗尽，但我们有 list1 p = (:.) <$> p <*> list p 强制 list p，即使在前面的 p 失败时它不会运行。这就是无限循环的原因，也是您使用惰性模式进行修复的原因。

What is the difference between data and newtype in terms of laziness?

(*)newtype'd 类型总是只有一个数据构造函数，并且其上的模式匹配确实 not 实际上强制值，所以它隐式地像懒惰的模式。尝试 newtype P = P Int、let foo (P i) = 42 in foo undefined，看看它是否有效。

(**) 这发生在解析器仍处于准备、组合状态时；在组合、组合的解析器甚至在实际输入上达到运行之前。这意味着还有另一种解决问题的第三种方法：定义

list1 p = (:.) <$> p <*> P (\s -> parse (list p) s)

无论 <*> 的惰性如何，无论使用 data 还是 newtype，这都应该有效。

有趣的是，上面的定义意味着解析器将在运行时间内实际创建，具体取决于输入，这是 Monad 的定义特征，而不是应该静态知道的 Applicative，在进步。但这里的区别在于 Applicative 取决于输入的隐藏状态，而不是 "returned" 值。

堆栈溢出，两个函数在应用程序解析器中相互调用

Stack overflow with two functions calling each other in Applicative parser

stack-overflow

recursion

parsing

haskell

functional-programming