How to filter Strings from a list in Haskell

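I am trying to write a program that reads a text file, splits the text into a list of words, and then builds a list of tuples, each holding a word and the number of times it appears in the text. I then need to be able to remove certain words from the list and print the final list.

I have tried several ways of filtering Strings out of a list of Strings in Haskell, without success. The filter function seems best suited to what I want to do, but I am not sure how to apply it.

So far, this is my code for splitting the text read from the file into a list of Strings: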

toWords :: String -> [String]
toWords s = words s

I then added this to remove specific Strings from the list:

toWords :: String -> [String]
toWords s = words s
toWords s = filter (`elem` "an")
toWords s = filter (`elem` "the")
toWords s = filter (`elem` "for")

I know this is wrong, but I am not sure how to do it correctly. Can anyone please help me?

This is my full code so far:

main = do  
       contents <- readFile "testFile.txt"
       let lowContents = map toLower contents
       let outStr = countWords (lowContents)
       let finalStr = sortOccurrences (outStr)
       print outStr

-- Counts all the words.
countWords :: String -> [(String, Int)]
countWords fileContents = countOccurrences (toWords fileContents)

-- Splits words.
toWords :: String -> [String]
toWords s = words s
toWords s = filter (`elem` "an")
toWords s = filter (`elem` "the")
toWords s = filter (`elem` "for")

-- Counts, how often each string in the given list appears.
countOccurrences :: [String] -> [(String, Int)]
countOccurrences xs = map (\xs -> (head xs, length xs)) . group . sort $ xs

-- Sort list in order of occurrences.
sortOccurrences :: [(String, Int)] -> [(String, Int)]
sortOccurrences sort = sortBy comparing snd

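filter is a well-known higher-order function in Haskell. It is worth studying, because functions of this kind are very useful.

Perhaps what you are looking for is something like this: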

toWords s = filter (condition) s 

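"condition" is itself a function: a predicate that takes one element and returns a Bool. filter keeps exactly the elements for which it returns True.

As a small example, if you have a list of numbers and only want to keep those greater than 10, it would look like this: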

filterNumbers n = filter (>10) n
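
To try this out in isolation, here is a minimal sketch. The name keepLarge and the sample list are made up for illustration; only filter itself comes from the Prelude.

```haskell
-- Keep only the numbers greater than 10.
-- keepLarge is a hypothetical name used for this illustration.
keepLarge :: [Int] -> [Int]
keepLarge = filter (> 10)

main :: IO ()
main = print (keepLarge [3, 12, 7, 25, 10])
-- prints [12,25]
```

Note that 10 itself is dropped, because the predicate is strictly "greater than".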

This keeps every word except the forbidden ones:

toWords s = filter (\w -> w `notElem` ["an","the","for"]) (words s)
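
As a runnable sketch of the same idea, the forbidden words can be pulled out into a named list. The name stopWords and the sample sentence are assumptions made for this example. For comparison, (`elem` "an") in the question tests whether a single Char occurs in the String "an", so it does not fit a list of words.

```haskell
import Data.Char (toLower)

-- Hypothetical list of words to drop; extend as needed.
stopWords :: [String]
stopWords = ["an", "the", "for"]

-- Split the text into words, then drop every word found in stopWords.
toWords :: String -> [String]
toWords s = filter (`notElem` stopWords) (words s)

main :: IO ()
main = print (toWords (map toLower "The cat waited for an hour"))
-- prints ["cat","waited","hour"]
```

Lowercasing before filtering matters here: without it, "The" would not match "the" and would be kept.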

Equivalent variants of toWords:

-- explicit not
toWords s = filter (\w -> not (w `elem` ["an","the","for"])) (words s)
-- using and (&&) instead of elem
toWords s = filter (\w -> w/="an" && w/="the" && w/="for") (words s)
-- using where to define a custom predicate
toWords s = filter predicate (words s)
     where predicate w = w/="an" && w/="the" && w/="for"
-- pointfree
toWords = filter (flip notElem ["an","the","for"]) . words
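
Putting the pieces together, the whole program from the question could look roughly like the sketch below. It is one possible arrangement, not the only one. The helper wordCounts and the sample text are assumptions added for illustration, and sortOccurrences is written as sortBy (comparing snd), which is my guess at what the question's version intended.

```haskell
import Data.Char (toLower)
import Data.List (group, sort, sortBy)
import Data.Ord (comparing)

-- Split into words and drop the unwanted ones (pointfree variant).
toWords :: String -> [String]
toWords = filter (`notElem` ["an", "the", "for"]) . words

-- Count how often each word appears: sort so equal words are adjacent,
-- group them, then pair each word with the size of its group.
countOccurrences :: [String] -> [(String, Int)]
countOccurrences xs = [(w, length ws) | ws@(w:_) <- group (sort xs)]

-- Sort by number of occurrences, ascending.
sortOccurrences :: [(String, Int)] -> [(String, Int)]
sortOccurrences = sortBy (comparing snd)

-- Hypothetical helper: lowercase, split, filter, count, sort.
wordCounts :: String -> [(String, Int)]
wordCounts = sortOccurrences . countOccurrences . toWords . map toLower

main :: IO ()
main = do
  -- A literal sample keeps the sketch self-contained; the question
  -- reads the text with readFile "testFile.txt" instead.
  let contents = "The cat and the dog and the bird"
  print (wordCounts contents)
  -- prints [("bird",1),("cat",1),("dog",1),("and",2)]
```

Because sortBy is stable, words with the same count stay in alphabetical order. For a descending order, the comparison could be flipped, for example with sortBy (flip (comparing snd)).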