String concatenation with each line in a text file in Haskell

Question

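I am trying to write a program that reads a given text file as input and produces the same text file, with the string on each line concatenated with the length of that string.

To do this, I created a text file with one string per line. I have managed to write code that takes one line from the text file and outputs that line with its length appended, but I can't write a recursive version of this code that keeps doing this for every line of the text file until there are no more lines. If I were working with lists this would not be a problem, but I can't use pattern matching on a text file full of strings.

I just need to get the second code example to apply itself to the whole list, but I can't manage it. How can I change my code to make it work without using functors/fmap? I'm really sorry for the silly question; I'm quite new to programming.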

import System.IO
main :: IO()
main = do
   file <- openFile ".txt" ReadMode
   x <- hGetContents file >>=
   xs <- hGetLine 
   if null xs 
    then return ()
    else do
        putStrLn $ xs ++ " has a length of " ++ show (length xs)
   hClose file

or

main :: IO()
main = do
file <- openFile ".txt" ReadMode
x <- hGetLine file
xs <- hGetLine file
if null x 
  then return ()
  else do 
      putStrLn $ x ++ " has a length of " ++ show (length x)
      putStrLn $ xs ++ " has a length of " ++ show (length xs)
 hClose file

Answer 1

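To write this recursively, you need a function that calls itself. You already have a function that could call itself, main, but you don't want to open the file more than once, so it's best to split the part that opens (and closes) the file from the part that reads lines, using a helper function: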

import System.IO

main :: IO ()
main = do
  file <- openFile "test.txt" ReadMode
  processLines file
  hClose file

Now we can write the recursive processLines function:

processLines :: Handle -> IO ()
processLines file = do
  x <- hGetLine file
  putStrLn $ x ++ " has a length of " ++ show (length x)
  processLines file

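This works, but it calls itself unconditionally, so it keeps reading lines until it reaches the end of the file and throws an exception. We can fix this with the function hIsEOF: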

processLines :: Handle -> IO ()
processLines file = do
  eof <- hIsEOF file
  if eof
    then return ()
    else do
      x <- hGetLine file
      putStrLn $ x ++ " has a length of " ++ show (length x)
      processLines file
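For comparison with the list case mentioned in the question, here is a minimal sketch of the same recursion written over a list with pattern matching. The name annotateAll is hypothetical and is not part of the answer's program; the empty-list case plays the role that the hIsEOF check plays for a file handle.

```haskell
-- Hypothetical pure counterpart of processLines, shown only for comparison.
annotateAll :: [String] -> [String]
-- Base case: no lines left (corresponds to hIsEOF returning True).
annotateAll []       = []
-- Recursive case: handle one line, then recurse on the rest
-- (corresponds to hGetLine followed by the recursive call).
annotateAll (x : xs) =
  (x ++ " has a length of " ++ show (length x)) : annotateAll xs
```

Loaded in GHCi, annotateAll ["alpha", "beta"] should give the two annotated strings in order.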

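Because processLines is just a helper function with no sensible use outside of main, most Haskell programmers would pull it into main using a where clause or a let definition, and probably give it a shorter throwaway name such as loop, process, or go. With a where clause, the result is: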

main :: IO ()
main = do
  file <- openFile "test.txt" ReadMode
  process file
  hClose file

  where
    process file = do
      eof <- hIsEOF file
      if eof
        then return ()
        else do
          x <- hGetLine file
          putStrLn $ x ++ " has a length of " ++ show (length x)
          process file

One advantage of using let is that you can define the helper with the file variable in scope, so you don't have to pass it as an argument:

main :: IO ()
main = do
  file <- openFile "test.txt" ReadMode
  let loop = do
        eof <- hIsEOF file
        if eof
          then return ()
          else do
            x <- hGetLine file
            putStrLn $ x ++ " has a length of " ++ show (length x)
            loop
  loop
  hClose file

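Haskell offers more sensible ways to write this program as a loop, avoiding explicit recursion and the need to define a helper function. For example: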

-- using `whileM_` from `monad-loops`
import Control.Monad.Loops (whileM_)
import System.IO
main1 :: IO ()
main1 = withFile "test.txt" ReadMode $ \h ->
  whileM_ (not <$> hIsEOF h) $ do
      x <- hGetLine h
      putStrLn $ x ++ " has a length of " ++ show (length x)
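If the monad-loops package is not available, a loop combinator of this kind can be sketched by hand. The version below is a minimal stand-in, not the library's actual source; whileLoop and printLengths are hypothetical names introduced only for this illustration.

```haskell
import System.IO

-- Hand-rolled stand-in for whileM_ (hypothetical name).
-- Runs `body` repeatedly for as long as `cond` yields True.
whileLoop :: Monad m => m Bool -> m a -> m ()
whileLoop cond body = go
  where
    go = do
      continue <- cond
      if continue
        then body >> go      -- run the body once, then check again
        else return ()       -- condition failed: stop

-- Same logic as main1, parameterised over the file path (hypothetical name).
printLengths :: FilePath -> IO ()
printLengths path = withFile path ReadMode $ \h ->
  whileLoop (not <$> hIsEOF h) $ do
    x <- hGetLine h
    putStrLn $ x ++ " has a length of " ++ show (length x)
```

Calling printLengths "test.txt" from main would then behave like main1, assuming the file exists.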

or:

-- using `forM_` from `Control.Monad`
import Control.Monad (forM_)
main2 :: IO ()
main2 = do
  contents <- readFile "test.txt"
  let lns = lines contents
  forM_ lns $ \x ->
    putStrLn $ x ++ " has a length of " ++ show (length x)

However, none of these is a very good way to write this program. Fundamentally, you have a pure transformation on one line of input:

annotate :: String -> String
annotate x = x ++ " has a length of " ++ show (length x)

and you want to apply it to all lines in the file, which you can do with a one-liner (admittedly using fmap on the IO functor):

main = putStr =<< unlines . map annotate . lines <$> readFile "test.txt"

or, more explicitly:

main = do
  content <- readFile "test.txt"
  let lns = lines content
      lns' = [annotate l | l <- lns]
  putStr $ unlines lns'
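One practical consequence of keeping the transformation pure is that it can be checked without touching any file. The sketch below packages the whole-text transformation as a single function; annotateText is a hypothetical name, and it uses only map on lists rather than fmap on IO.

```haskell
-- The per-line transformation from the answer.
annotate :: String -> String
annotate x = x ++ " has a length of " ++ show (length x)

-- Hypothetical helper: the whole-file transformation as one pure function.
-- Split into lines, annotate each one, and join them back together.
annotateText :: String -> String
annotateText = unlines . map annotate . lines
```

With this in place, main could be written as readFile "test.txt" >>= putStr . annotateText, or as interact annotateText to read from standard input instead.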

Answer 2

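It seems you want a recursive solution. But the main action is not well suited to calling itself recursively, because it has specific duties that should happen only once: opening and closing the file.

So you need a separate recursive action, which assumes that file management is done by some layer above it and only deals with an already-prepared file handle. Say we name it processFileHandle.

With this type signature: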

processFileHandle :: Handle -> IO ()  -- for now

But wait a minute! We have this text transformation to do:

xs ++ " has a length of " ++ show (length xs)

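Do we want to hardcode this into our processFileHandle function? Absolutely not! We want to keep text processing separate from file I/O. That way, we avoid having to rewrite processFileHandle every time the kind of line transformation we need changes.

So a better type signature is: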

processFileHandle :: (String -> String) ->  Handle -> IO ()

We supply the line transformation as an extra functional argument. In our case, this is:

transformLine1 :: String -> String
transformLine1 str =
    let  ln = length str
    in   str ++ " has a length of " ++ (show ln)
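To illustrate why the extra argument pays off, here is a sketch of a second transformation with the same type. transformLine2 is a hypothetical example that does not appear in the original program; any function of type String -> String could be passed to processFileHandle in the same way.

```haskell
import Data.Char (toUpper)

-- The transformation from the answer, repeated so the block stands alone.
transformLine1 :: String -> String
transformLine1 str = str ++ " has a length of " ++ show (length str)

-- Hypothetical alternative: put the length first and upper-case the text.
-- It has the same type, so it can replace transformLine1 without any
-- change to the file-handling code.
transformLine2 :: String -> String
transformLine2 str = show (length str) ++ ": " ++ map toUpper str
```

Switching behaviour would then be a matter of passing transformLine2 instead of transformLine1 as the first argument.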

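Now, to go on with processFileHandle, we need a way to detect the end-of-file condition gracefully. Such a function must necessarily have the type signature: Handle -> IO Bool

So we submit this type signature to the Hoogle specialized search engine, and Hoogle points us towards the hIsEOF library function, which is exactly what we need.

We can now write our main action, which is simply: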

main :: IO ()
main = do
    fh <- openFile  "foo.txt"  ReadMode
    processFileHandle transformLine1 fh
    hClose fh

And now, since we can test for end of file, we can provide the code for processFileHandle:

processFileHandle :: (String -> String) -> Handle -> IO ()
processFileHandle fn fh =
  do
      atTheEnd <- hIsEOF fh    -- are we done ?
      if atTheEnd then
                      return ()  -- nothing left to do
                  else
                      do
                          line0 <- hGetLine fh
                          let  line1 = fn line0
                          putStrLn line1
                          processFileHandle fn fh  -- recursive call
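The same recursion can also return the transformed lines instead of printing them, which makes the result easier to inspect. The following is a sketch under that assumption; collectLines and transformFile are hypothetical names, and withFile is used in place of the openFile/hClose pair so that the handle is closed even if reading throws.

```haskell
import System.IO

-- Hypothetical variant of processFileHandle that collects its results.
collectLines :: (String -> String) -> Handle -> IO [String]
collectLines fn fh = do
  atTheEnd <- hIsEOF fh
  if atTheEnd
    then return []                      -- nothing left to read
    else do
      line0 <- hGetLine fh              -- read one line
      rest  <- collectLines fn fh       -- recursive call for the remainder
      return (fn line0 : rest)

-- Hypothetical wrapper: open the file, collect the lines, close the file.
transformFile :: (String -> String) -> FilePath -> IO [String]
transformFile fn path = withFile path ReadMode (collectLines fn)
```

Because the recursive call is not in tail position, this sketch is best suited to files of modest size.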

Testing:

$ 
$ cat foo.txt
alpha
beta
epsilon
eta
$ 
$ ghc --version
The Glorious Glasgow Haskell Compilation System, version 8.8.4
$ 
$ ghc q68324502.hs -o ./q68324502.x
[1 of 1] Compiling Main             ( q68324502.hs, q68324502.o )
Linking ./q68324502.x ...
$ 
$ ./q68324502.x
alpha has a length of 5
beta has a length of 4
epsilon has a length of 7
eta has a length of 3
$
