How to filter Strings from a list in Haskell
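I'm trying to write a program that reads a text file, splits the text into a list of words, and then builds a list of tuples pairing each word with the number of times it occurs in the text. I then need to be able to remove certain words from the list and print the final list.
I have tried several ways of filtering strings out of a list of strings in Haskell, without success. The filter function looks like the best fit for what I want to do, but I'm not sure how to use it.
This is the code I have so far for splitting the text read from the file into a list of strings: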
toWords :: String -> [String]
toWords s = words s
Then I added this to remove specific strings from the list:
toWords :: String -> [String]
toWords s = words s
toWords s = filter (`elem` "an")
toWords s = filter (`elem` "the")
toWords s = filter (`elem` "for")
I know this is wrong, but I'm not sure how to do it properly. Could anyone please help me?
Here is my full code so far:
main = do
    contents <- readFile "testFile.txt"
    let lowContents = map toLower contents
    let outStr = countWords (lowContents)
    let finalStr = sortOccurrences (outStr)
    print outStr
-- Counts all the words.
countWords :: String -> [(String, Int)]
countWords fileContents = countOccurrences (toWords fileContents)
-- Splits words.
toWords :: String -> [String]
toWords s = words s
toWords s = filter (`elem` "an")
toWords s = filter (`elem` "the")
toWords s = filter (`elem` "for")
-- Counts, how often each string in the given list appears.
countOccurrences :: [String] -> [(String, Int)]
countOccurrences xs = map (\xs -> (head xs, length xs)) . group . sort $ xs
-- Sort list in order of occurrences.
sortOccurrences :: [(String, Int)] -> [(String, Int)]
sortOccurrences sort = sortBy comparing snd
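filter is a well-known higher-order function in Haskell. It is worth a look, since functions of that kind are very useful.
Maybe you are looking for something like this:
toWords s = filter condition (words s)
Here condition is itself a function: it takes one element of the list and returns True for the elements you want to keep.
As a small example, if you have a list of numbers and only want the ones greater than 10, it would look like this:
filterNumbers n = filter (>10) n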
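To make the role of the predicate concrete, here is a minimal sketch. The names keepLarge and longWords are hypothetical, chosen only for this illustration; filter and length come from the Prelude.

```haskell
-- Keep only the numbers strictly greater than 10.
-- (keepLarge is a hypothetical name used for illustration.)
keepLarge :: [Int] -> [Int]
keepLarge = filter (> 10)

-- The same idea on strings: keep only words longer than three characters.
-- (longWords is likewise hypothetical.)
longWords :: [String] -> [String]
longWords = filter (\w -> length w > 3)
```

In GHCi, keepLarge [3, 12, 10, 25] evaluates to [12,25], and longWords ["an", "apple", "for", "trees"] evaluates to ["apple","trees"]. In both cases the predicate receives one element at a time and decides whether it stays.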
This keeps every word except the forbidden ones:
toWords s = filter (\w -> w `notElem` ["an","the","for"]) (words s)
Equivalent variants:
-- explicit not
toWords s = filter (\w -> not (w `elem` ["an","the","for"])) (words s)
-- using and (&&) instead of elem
toWords s = filter (\w -> w/="an" && w/="the" && w/="for") (words s)
-- using where to define a custom predicate
toWords s = filter predicate (words s)
    where predicate w = w/="an" && w/="the" && w/="for"
-- pointfree
toWords = filter (flip notElem ["an","the","for"]) . words
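Putting the pieces together, the following is a rough sketch of how the whole pipeline from the question might look with the filtering in place. It rests on a few assumptions: stopWords and wordFrequencies are hypothetical names introduced here, the descending sort order is a guess at what "order of occurrences" means, and sortOccurrences is written with sortBy (comparing ...) in parentheses, since the question's version passes comparing and snd to sortBy as separate arguments. Note also that the attempt in the question, filter (`elem` "an"), tests whether a single character occurs in the string "an", which is why it does not fit a list of words.

```haskell
import Data.Char (toLower)
import Data.List (group, sort, sortBy)
import Data.Ord (Down (..), comparing)

-- Words to drop (hypothetical name; same three words as in the answer above).
stopWords :: [String]
stopWords = ["an", "the", "for"]

-- Split the text into words and drop the stop words.
toWords :: String -> [String]
toWords = filter (`notElem` stopWords) . words

-- Count how often each word appears; the result is ordered alphabetically.
countOccurrences :: [String] -> [(String, Int)]
countOccurrences xs = [(w, length ws) | ws@(w : _) <- group (sort xs)]

-- Sort by count, most frequent first (descending order is an assumption).
sortOccurrences :: [(String, Int)] -> [(String, Int)]
sortOccurrences = sortBy (comparing (Down . snd))

-- Lowercase, split, filter, count and sort in one step (hypothetical helper).
wordFrequencies :: String -> [(String, Int)]
wordFrequencies = sortOccurrences . countOccurrences . toWords . map toLower
```

With these definitions, the main from the question could read the file as before and then call print (wordFrequencies contents). Because the filtering happens in toWords, the stop words are removed before counting, so they never appear in the resulting list.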