减少此 Haskell 函数中的分配（和工作）的建议

Question

我的（更大的）Haskell 代码中有以下函数（有一些支持代码可以清楚地说明是什么）：

import qualified Data.Set as S
import qualified Data.IntMap.Strict as M
import Data.Ord
import Data.Monoid

data Atom = Neg { index :: Int }
          | Pos { index :: Int }
          deriving (Eq, Ord, Show, Read) 

newtype Clause = Clause { atoms :: S.Set Atom }
  deriving (Eq, Show, Read)

instance Ord Clause where
  compare = comparing (Down . S.size . atoms) <> comparing atoms

newtype Form = Form { clauses :: S.Set Clause }
  deriving (Eq, Ord, Show, Read)

type Interpretation = M.IntMap Bool

-- the function of interest
interpret :: Interpretation -> Form -> Maybe Bool
interpret interp = evalForm
  where evalAtom x@(Pos _) = M.lookup (index x) interp
        evalAtom x@(Neg _) = not <$> M.lookup (index x) interp
        evalClause (Clause x)
          | S.member (Just False) evaluated = Just False
          | evaluated == S.singleton (Just True) = Just True
          | otherwise = Nothing
          where evaluated = S.map evalAtom x
        evalForm (Form x)
          | S.member (Just True) evaluated = Just True
          | evaluated == S.singleton (Just False) = Just False
          | otherwise = Nothing
          where evaluated = S.map evalClause x

分析了我的 Haskell 程序后，我发现这个 interpret 函数的分配占我程序中所有分配的近 40%（以及 [=20] 的大约 40% =] 工作）。

有什么方法可以减少 interpret 的工作量或分配的工作量？这可能会为我赢得巨大的性能提升（这是我真正需要的，因为我需要多次运行这段代码，用于实验）。

Answer 1

我会尝试 S.foldr。

从你的代码来看，这些似乎是 AND 子句，所以我假设一个空子句是错误的。

evalClause (Clause x) = S.foldr f (Just False) $ S.map evalAtom x
     where f b@(Just False) _              = b
           f (Just True)    y              = y
           f Nothing        y@(Just False) = y
           f Nothing        y              = Nothing

和 evalForm 的类似物。

使用列表而不是集合可能也有好处。集合，作为实施，是严格的，并且（我认为）不会触发像 fusion/deforestation/etc 这样的一些优化。列表是延迟生成的，在这种代码中应该表现得更好。

evalClause (Clause x) = foldr f (Just False) . map evalAtom $ S.toList x
     ...

Answer 2

一个观察：

A Maybe Bool 只能有三个可能的值 - Nothing、Just False 和 Just True.

evaluated 在 evalClause 和 evalForm 中都有类型 Set (Maybe Bool) 可以用适合 Int.

的三个位来表示

我会定义：

data MaybeBool = Nuthin | JustFalse | JustTrue
  deriving (Eq, Ord, Enum, Bounded, Show, Read)

并更改 intepret return 的签名 MaybeBool

然后将evaluated定义为这样的位集：

import Data.Bits

evaluated = foldl' combine 0 (map evalAtom (S.toList x))
  where combine s a = s .|. (1 `shiftLeft` fromEnum a)

evaluated 将是一个介于 0 和 7 之间的 Int，如果 Nutin 在集合中，则位 0 设置，如果 JustFalse 在集合中，位 1 设置，位 2 设置如果 JustTrue 在集合中。这将从您的计算中消除集合的分配。

减少此 Haskell 函数中的分配（和工作）的建议

Suggestions for reducing allocations (and work) in this Haskell function

optimization

performance

haskell

allocation