Why can't GHC reason about some infinite lists?
Another question got me thinking about Haskell's ability to work with infinite lists. There are plenty of other questions and answers about infinite lists on Stack Overflow, and I understand why we can't have a general solution for all infinite lists, but why can't Haskell reason about some infinite lists?
Let's use the example from the first linked question:
list1 = [1..]
list2 = [x | x <- list1, x <= 4]
print list2
$ [1,2,3,4
@user2297560 writes in the comments:
Pretend you're GHCI. Your user gives you an infinite list and asks you to find all the values in that list that are less than or equal to 4. How would you go about doing it? (Keep in mind that you don't know that the list is in order.)
In this case, the user did not give you an infinite list. GHC generated it! In fact, it generated it following its own rules. The Haskell 2010 Standard states the following:
enumFrom :: a -> [a] -- [n..]
For the types Int and Integer, the enumeration functions have the following meaning:
- The sequence enumFrom e1 is the list [e1, e1 + 1, e1 + 2, …].
In his answer to another question, @chepner writes:
You know that the list is monotonically increasing, but Haskell does not.
These users' statements don't seem to me to be in line with the standard. Haskell created the list in an ordered fashion, using a monotonic increase. Haskell should know that the list is both ordered and monotonic. So why can't it reason about this infinite list to turn [x | x <- list1, x <= 4] into takeWhile (<= 4) list1 automatically?
The answer is nothing more specific than "It doesn't use takeWhile because it doesn't use takeWhile." The spec says:
Translation: List comprehensions satisfy these identities, which may
be used as a translation into the kernel:
[ e | True ] = [ e ]
[ e | q ] = [ e | q, True ]
[ e | b, Q ] = if b then [ e | Q ] else []
[ e | p <- l, Q ] = let ok p = [ e | Q ]
                        ok _ = []
                    in concatMap ok l
[ e | let decls, Q ] = let decls in [ e | Q ]
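That is, the meaning of a list comprehension is given by translation into a simpler language with if-expressions, let-bindings, and calls to concatMap. We can figure out what your example means by translating it through the following steps: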
[x | x <- [1..], x <= 4]
-- apply rule 4 --
let ok x = [ x | x <= 4 ]
    ok _ = []
in concatMap ok [1..]
-- eliminate unreachable clause in ok --
let ok x = [ x | x <= 4 ]
in concatMap ok [1..]
-- apply rule 2 --
let ok x = [ x | x <= 4, True ]
in concatMap ok [1..]
-- apply rule 3 --
let ok x = if x <= 4 then [ x | True ] else []
in concatMap ok [1..]
-- apply rule 1 --
let ok x = if x <= 4 then [ x ] else []
in concatMap ok [1..]
-- inline ok --
concatMap (\x -> if x <= 4 then [ x ] else []) [1..]
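To make the operational difference concrete, here is a small sketch comparing the desugared form with the takeWhile version the question hoped for. The names desugared, viaTakeWhile, filterUnsorted and takeWhileUnsorted are made up for illustration. The last two definitions show why the two forms cannot be swapped in general: on a list that is not sorted, takeWhile stops at the first failure and loses later matches.

```haskell
-- The comprehension [x | x <- [1..], x <= 4] after applying the Report's rules.
desugared :: [Integer]
desugared = concatMap (\x -> if x <= 4 then [x] else []) [1..]

-- What the question hoped GHC would produce instead.
viaTakeWhile :: [Integer]
viaTakeWhile = takeWhile (<= 4) [1..]

-- On an unsorted list the two forms give different answers, so the
-- rewrite is only valid when the source list is known to be ascending.
filterUnsorted, takeWhileUnsorted :: [Integer]
filterUnsorted    = [x | x <- [1, 5, 2], x <= 4]   -- [1,2]
takeWhileUnsorted = takeWhile (<= 4) [1, 5, 2]     -- [1]
```

In GHCi, take 4 desugared and viaTakeWhile both evaluate to [1,2,3,4]. However, length viaTakeWhile returns 4, while length desugared never returns, because concatMap has no reason to stop asking [1..] for more elements.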
In theory, one could imagine a rewrite rule such as
{-# RULES
  "filterEnumFrom" forall (n :: Int) (m :: Int).
    filter (< n) (enumFrom m) = [m..(n-1)]
  #-}
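This would automatically turn expressions such as filter (< 4) (enumFrom 1) into [1..3]. So it is possible. However, there is a glaring problem: any variation on this exact syntactic pattern won't match. The result is that you end up defining a bunch of rules, and you can never be sure whether they are firing. If you can't rely on the rules, you end up just not using them. (Also note that I've specialized the rule to Int; as was briefly posted in a comment, it may break down in subtle ways for other types.)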
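As one illustration of how such a rule could break for other types, the sketch below writes both sides of the rule as ordinary functions and compares them at Int and at Double. The helper names ruleLhs and ruleRhs are hypothetical, and Double is only one example of a problematic type, not necessarily the one the comment had in mind. The difference arises because [m .. (n - 1)] and filter (< n) disagree when the bound is fractional.

```haskell
-- Left-hand side of the rule, generalised beyond Int (hypothetical helper).
ruleLhs :: (Ord a, Enum a) => a -> a -> [a]
ruleLhs n m = filter (< n) (enumFrom m)

-- Right-hand side of the rule, generalised the same way.
ruleRhs :: (Num a, Enum a) => a -> a -> [a]
ruleRhs n m = [m .. (n - 1)]

-- At Int the two sides agree on this example.
intLhs, intRhs :: [Int]
intLhs = take 3 (ruleLhs 4 1)   -- [1,2,3]
intRhs = ruleRhs 4 1            -- [1,2,3]

-- At Double they disagree: 4.0 is less than 4.2, but [1 .. 3.2] stops at 3.0.
doubleLhs, doubleRhs :: [Double]
doubleLhs = take 4 (ruleLhs 4.2 1)   -- [1.0,2.0,3.0,4.0]
doubleRhs = ruleRhs 4.2 1            -- [1.0,2.0,3.0]
```

The take calls are only there so that the left-hand sides can be inspected; without them the filter would keep scanning for further matches.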
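At the end of the day, to perform more advanced analysis, GHC would have to attach some tracking information to lists recording how they were generated. That would either make lists a less lightweight abstraction, or mean that GHC needs some special machinery just for optimizing lists at compile time. Neither of these options is nice.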
That said, you can always add your own tracking information by building a list type on top of lists.
{-# LANGUAGE GADTs #-}
data List a where
  EnumFromTo :: Enum a => a -> Maybe a -> List a
  Filter :: (a -> Bool) -> List a -> List a
  Unstructured :: [a] -> List a
This may end up being easier to optimize.
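A minimal sketch of how such a type might be interpreted follows. Because an opaque a -> Bool cannot be inspected, the sketch adds one hypothetical constructor, FilterLE, that stores the bound as data; neither it nor the toList function is part of the original suggestion. The optimized case also assumes that the enumeration is ascending with respect to Ord, which holds for Integer but is not guaranteed by the Enum class itself.

```haskell
{-# LANGUAGE GADTs #-}

-- The list type from the text, plus one hypothetical constructor (FilterLE)
-- whose predicate is stored as data rather than as an opaque function.
data List a where
  EnumFromTo   :: Enum a => a -> Maybe a -> List a
  Filter       :: (a -> Bool) -> List a -> List a
  FilterLE     :: Ord a => a -> List a -> List a   -- keep elements <= bound
  Unstructured :: [a] -> List a

-- Interpret a List as an ordinary Haskell list.
toList :: List a -> [a]
-- Assumption: the enumeration is ascending with respect to Ord, so once
-- an element exceeds the bound, no later element can satisfy it.
toList (FilterLE b (EnumFromTo lo hi)) = takeWhile (<= b) (toList (EnumFromTo lo hi))
-- Without structural knowledge of the source, fall back to a plain filter.
toList (FilterLE b xs)                 = filter (<= b) (toList xs)
toList (EnumFromTo lo Nothing)         = enumFrom lo
toList (EnumFromTo lo (Just hi))       = enumFromTo lo hi
toList (Filter p xs)                   = filter p (toList xs)
toList (Unstructured xs)               = xs

-- The question's example; this list is finite, unlike the plain comprehension.
example :: [Integer]
example = toList (FilterLE 4 (EnumFromTo 1 Nothing))
```

Here example evaluates to [1,2,3,4] and terminates. The cost is the one described above: the programmer, not GHC, has to carry the extra structure around and decide which combinations are safe to optimize.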