ByteString regex match with AllTextMatches result type

I'm having trouble with the following example ghci interaction from Chapter 8 of Real World Haskell. Going by rampion's answer to a related question, the expected output should be:

> :m +Text.Regex.Posix Data.ByteString.Char8
> getAllTextMatches $ pack "foo" =~ pack "o" :: [(Int, Int)]
[(1,1),(2,1)]

Instead, I get a missing-instance error:

> getAllTextMatches $ pack "foo" =~ pack "o" :: [(Int, Int)]
    No instance for (RegexContext
                       Regex ByteString (AllTextMatches [] (Int, Int)))
      arising from a use of `=~'
    Possible fix:
      add an instance declaration for
      (RegexContext Regex ByteString (AllTextMatches [] (Int, Int)))
    In the second argument of `($)', namely `pack "foo" =~ pack "o"'
    In the expression:
        getAllTextMatches $ pack "foo" =~ pack "o" :: [(Int, Int)]
    In an equation for `it':
        it = getAllTextMatches $ pack "foo" =~ pack "o" :: [(Int, Int)]

What gives?

Start with your error:

No instance for (RegexContext
                   Regex ByteString (AllTextMatches [] (Int, Int)))

Looking at what we have in scope, there is only one RegexContext instance defined for AllTextMatches:

λ :info AllTextMatches
...
instance RegexLike a b => RegexContext a b (AllTextMatches [] b)
  -- Defined in ‘Text.Regex.Base.Context’

And there are only two RegexLike instances in scope:

λ :i RegexLike
...
instance RegexLike Regex String
  -- Defined in ‘Text.Regex.Posix.String’
instance RegexLike Regex ByteString
  -- Defined in ‘Text.Regex.Posix.ByteString’

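Since we're dealing with ByteStrings here, we have to use the RegexLike Regex ByteString instance. That lets us infer that a and b in AllTextMatches' RegexContext instance are Regex and ByteString, so the only instance available is: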

RegexContext Regex ByteString (AllTextMatches [] ByteString)

So if we ask for a list of ByteStrings instead:

λ getAllTextMatches (pack "foo" =~ pack "o") :: [ByteString]
["o","o"]

It works!

But that doesn't give us anything like (MatchOffset, MatchLength), which is a bit disappointing.
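To make that constraint concrete, here is a minimal sketch as a standalone program, assuming the regex-posix and bytestring packages are available. The helper name allTexts is made up for illustration. Because the instance head is AllTextMatches [] b, the element type of the result has to be the same type as the text being searched, so a ByteString subject can only produce [ByteString], and a String subject can only produce [String]:

```haskell
import qualified Data.ByteString.Char8 as B
import Text.Regex.Posix

-- Hypothetical helper (not part of regex-posix): every substring of the
-- subject that the pattern matches. The result element type must equal
-- the subject type, as the AllTextMatches instance requires.
allTexts :: B.ByteString -> B.ByteString -> [B.ByteString]
allTexts subject pat = getAllTextMatches (subject =~ pat)

main :: IO ()
main = do
  -- ByteString in, list of ByteString out.
  print (allTexts (B.pack "foo") (B.pack "o"))
  -- The same instance, this time with String on both sides.
  print (getAllTextMatches ("foo" =~ "o") :: [String])
```

Both lines should print ["o","o"]. Asking for [(Int, Int)] through getAllTextMatches would instead reproduce the missing-instance error from the question.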

I'm guessing you're trying to run this example?

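There's a fair amount of discussion in the comments there, but in summary, the API of Text.Regex.Posix has clearly changed since RWH was written. That's not surprising: it's now at 0.95.2, and when RWH was first published in November 2008 the latest version was 0.72.3.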

These days, to get a list of all the match offsets and lengths, use getAllMatches:

λ getAllMatches (pack "foo" =~ pack "o") :: [(Int, Int)]
[(1,1),(2,1)]
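The same call can be wrapped in a small standalone program. This is only a sketch, again assuming regex-posix and bytestring are installed; matchSpans and slice are hypothetical helper names, not library functions. It also shows that the (offset, length) pairs carry enough information to recover the matched text that getAllTextMatches would have returned:

```haskell
import qualified Data.ByteString.Char8 as B
import Text.Regex.Posix

-- Hypothetical helper: (offset, length) of every match in the subject.
-- The Ints correspond to MatchOffset and MatchLength.
matchSpans :: B.ByteString -> B.ByteString -> [(Int, Int)]
matchSpans subject pat = getAllMatches (subject =~ pat)

-- Hypothetical helper: cut one (offset, length) span out of the subject.
slice :: B.ByteString -> (Int, Int) -> B.ByteString
slice subject (off, len) = B.take len (B.drop off subject)

main :: IO ()
main = do
  let subject = B.pack "foo"
      spans   = matchSpans subject (B.pack "o")
  print spans                        -- [(1,1),(2,1)]
  print (map (slice subject) spans)  -- ["o","o"]
```

Keeping the spans and slicing on demand is one way to get both the positions and the text from a single match pass.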