Haskell vs C 速度:Eratosthenes 筛法
Haskell vs C speed: Sieve of Eratosthenes
Haskell 一个是使用优化的 Data.IntSet 实现的,复杂度为 O(lg n)。然而,尽管 Haskell 代码已经针对偶数情况进行了优化,但 n = 2000000 的速度差异为 15 倍(之前为 30 倍)。我想知道 whether/why 我在 Haskell 中的实现不完善。
原版Haskell
primesUpTo :: Int -> [Int]
primesUpTo n = 2 : put S.empty [3,5..n]
where put :: S.IntSet -> [Int] -> [Int]
put _ [] = []
put comps (x:xs) =
if S.member x comps
then put comps xs
else x : put (S.union comps multiples) xs
where multiples = S.fromList [x*2, x*3 .. n]
更新
fromDistinctAscList
速度提高 4 倍。 2-3-5-7 轮加速 50%。
primesUpTo :: Int -> [Int]
primesUpTo n = 2 : 3 : 5 : 7 : put S.empty (takeWhile (<=n) (spin wheel 11))
where put :: S.IntSet -> [Int] -> [Int]
put _ [] = []
put comps (x:xs) =
if S.member x comps
then put comps xs
else x : put (S.union comps multiples) xs
where multiples = S.fromDistinctAscList [x*x, x*(x+2) .. n]
spin (x:xs) n = n : spin xs (n + x)
wheel = 2:4:2:4:6:2:6:4:2:4:6:6:2:6:4:2:6:4:6:8:4:2:4:2:4:8:6:4:6:2:4:6:2:6:6:4:2:4:6:2:6:4:2:4:2:10:2:10:wheel
基准测试
所有时间均由 *nix time
命令测量,真实 space
Haskell original : 2e6: N/A; 2e7: >30s
Haskell optimized: 2e6: 0.396s; 2e7: 6.273s
C++ Set (ordered): 2e6: 4.694s; 2e7: >30s
C++ Bool Array : 2e6: 0.039s; 2e7: 0.421s
Haskell optimized 比 C++ Bool 慢 10~15x,比 C++ Set 快 10x。
源代码
C 编译器选项:g++ 5.3.1,g++ -std=c++11
Haskell 选项:ghc 7.8.4,ghc
C代码(布尔数组)http://pastebin.com/W0s7cSWi
prime[0] = prime[1] = false;
for (int i=2; i<=limit; i++) { //edited
if (!prime[i]) continue;
for (int j=2*i; j<=n; j+=i)
prime[j] = false;
}
C码(套)http://pastebin.com/sNpghrU4
nonprime.insert(1);
for (int i=2; i<=limit; i++) { //edited
if (nonprime.count(i) > 0) continue;
for (int j=2*i; j<=n; j+=i)
nonprime.insert(j);
}
Haskell代码http://pastebin.com/HuMqwvRW
代码如上所写。
I would like to know whether/why my implementation in Haskell is imperfect.
而不是 fromList
you better use fromDistinctAscList
执行 线性 。您还可以添加 仅以 x*x 而非 x*2 开头的奇数倍数 ,因为所有较小的奇数倍数均已添加。就风格而言,右折叠可能比递归更适合。
这样做,当 n 等于 2,000,000 时,我的性能提高了 3 倍以上:
import Data.IntSet (member, union, empty, fromDistinctAscList)
sieve :: Int -> [Int]
sieve n = 2: foldr go (const []) [3,5..n] empty
where
go i run obs
| member i obs = run obs
| otherwise = i: run (union obs inc)
where inc = fromDistinctAscList [i*i, i*(i + 2)..n]
然而,数组具有 O(1) 访问和 缓存友好 内存分配。使用可变数组,我发现您的 Haskell 代码的性能提高了 15 倍以上(n 再次等于 2,000,000):
{-# LANGUAGE FlexibleContexts #-}
import Data.Array.ST (STUArray)
import Control.Monad (forM_, foldM)
import Control.Monad.ST (ST, runST)
import Data.Array.Base (newArray, unsafeWrite, unsafeRead)
sieve :: Int -> [Int]
sieve n = reverse $ runST $ do
arr <- newArray (0, n) False :: ST s (STUArray s Int Bool)
foldM (go arr) [2] [3,5..n]
where
go arr acc i = do
b <- unsafeRead arr i
if b then return acc else do
forM_ [i*i, i*(i + 2).. n] $ \k -> unsafeWrite arr k True
return $ i: acc
Haskell 一个是使用优化的 Data.IntSet 实现的,复杂度为 O(lg n)。然而,尽管 Haskell 代码已经针对偶数情况进行了优化,但 n = 2000000 的速度差异为 15 倍(之前为 30 倍)。我想知道 whether/why 我在 Haskell 中的实现不完善。
原版Haskell
primesUpTo :: Int -> [Int]
primesUpTo n = 2 : put S.empty [3,5..n]
where put :: S.IntSet -> [Int] -> [Int]
put _ [] = []
put comps (x:xs) =
if S.member x comps
then put comps xs
else x : put (S.union comps multiples) xs
where multiples = S.fromList [x*2, x*3 .. n]
更新
fromDistinctAscList
速度提高 4 倍。 2-3-5-7 轮加速 50%。
primesUpTo :: Int -> [Int]
primesUpTo n = 2 : 3 : 5 : 7 : put S.empty (takeWhile (<=n) (spin wheel 11))
where put :: S.IntSet -> [Int] -> [Int]
put _ [] = []
put comps (x:xs) =
if S.member x comps
then put comps xs
else x : put (S.union comps multiples) xs
where multiples = S.fromDistinctAscList [x*x, x*(x+2) .. n]
spin (x:xs) n = n : spin xs (n + x)
wheel = 2:4:2:4:6:2:6:4:2:4:6:6:2:6:4:2:6:4:6:8:4:2:4:2:4:8:6:4:6:2:4:6:2:6:6:4:2:4:6:2:6:4:2:4:2:10:2:10:wheel
基准测试
所有时间均由 *nix time
命令测量,真实 space
Haskell original : 2e6: N/A; 2e7: >30s
Haskell optimized: 2e6: 0.396s; 2e7: 6.273s
C++ Set (ordered): 2e6: 4.694s; 2e7: >30s
C++ Bool Array : 2e6: 0.039s; 2e7: 0.421s
Haskell optimized 比 C++ Bool 慢 10~15x,比 C++ Set 快 10x。
源代码
C 编译器选项:g++ 5.3.1,g++ -std=c++11
Haskell 选项:ghc 7.8.4,ghc
C代码(布尔数组)http://pastebin.com/W0s7cSWi
prime[0] = prime[1] = false;
for (int i=2; i<=limit; i++) { //edited
if (!prime[i]) continue;
for (int j=2*i; j<=n; j+=i)
prime[j] = false;
}
C码(套)http://pastebin.com/sNpghrU4
nonprime.insert(1);
for (int i=2; i<=limit; i++) { //edited
if (nonprime.count(i) > 0) continue;
for (int j=2*i; j<=n; j+=i)
nonprime.insert(j);
}
Haskell代码http://pastebin.com/HuMqwvRW 代码如上所写。
I would like to know whether/why my implementation in Haskell is imperfect.
而不是 fromList
you better use fromDistinctAscList
执行 线性 。您还可以添加 仅以 x*x 而非 x*2 开头的奇数倍数 ,因为所有较小的奇数倍数均已添加。就风格而言,右折叠可能比递归更适合。
这样做,当 n 等于 2,000,000 时,我的性能提高了 3 倍以上:
import Data.IntSet (member, union, empty, fromDistinctAscList)
sieve :: Int -> [Int]
sieve n = 2: foldr go (const []) [3,5..n] empty
where
go i run obs
| member i obs = run obs
| otherwise = i: run (union obs inc)
where inc = fromDistinctAscList [i*i, i*(i + 2)..n]
然而,数组具有 O(1) 访问和 缓存友好 内存分配。使用可变数组,我发现您的 Haskell 代码的性能提高了 15 倍以上(n 再次等于 2,000,000):
{-# LANGUAGE FlexibleContexts #-}
import Data.Array.ST (STUArray)
import Control.Monad (forM_, foldM)
import Control.Monad.ST (ST, runST)
import Data.Array.Base (newArray, unsafeWrite, unsafeRead)
sieve :: Int -> [Int]
sieve n = reverse $ runST $ do
arr <- newArray (0, n) False :: ST s (STUArray s Int Bool)
foldM (go arr) [2] [3,5..n]
where
go arr acc i = do
b <- unsafeRead arr i
if b then return acc else do
forM_ [i*i, i*(i + 2).. n] $ \k -> unsafeWrite arr k True
return $ i: acc