为什么 Prelude.drop 比普通的快？

Question

我做了一些运行：

main = print $ head . drop (2^30) $ [1..]

——既有Prelude.drop又有琐碎的drop。 Prelude 变体始终快 45% 左右，我不知道为什么。我什至从 base-4.11.1.0 中提取了 GHC.List.drop 的定义，但它的性能比我的普通代码还差！怎么回事？

这就是我正在做的事情：

% stack ghc -- --version           
The Glorious Glasgow Haskell Compilation System, version 8.2.2

Prelude.drop:

% git checkout drop-prelude
% git clean --force
% cat ListTest.hs
module Main where

main = print $ head . drop (2^30) $ [1..]
% stack ghc -- ListTest.hs
[1 of 1] Compiling Main             ( ListTest.hs, ListTest.o )
Linking ListTest ...
% time ./ListTest
1073741825
./ListTest  18.76s user 0.09s system 99% cpu 18.906 total

一个琐碎的drop:

% git checkout drop-naive 
% git clean --force
% cat ListTest.hs
module Main where

dropNaive :: Int -> [a] -> [a]
dropNaive 0 xs = xs
dropNaive n [ ] = [ ]
dropNaive n (x: xs) = dropNaive (pred n) xs

main = print $ head . dropNaive (2^30) $ [1..]
% stack ghc -- ListTest.hs
[1 of 1] Compiling Main             ( ListTest.hs, ListTest.o )
Linking ListTest ...
% time ./ListTest
1073741825      
./ListTest  31.56s user 0.12s system 99% cpu 31.774 total

drop 来自 GHC.List:

% git checkout drop-ghc 
% git clean --force
% cat ListTest.hs 
{-# LANGUAGE BangPatterns #-}

module ListTest where

dropGhc :: Int -> [a] -> [a]
{-# INLINE dropGhc #-}
dropGhc n xs
  | n <= 0     = xs
  | otherwise  = unsafeDrop n xs
  where
    unsafeDrop :: Int -> [a] -> [a]
    unsafeDrop !_ []     = []
    unsafeDrop 1  (_:xs) = xs
    unsafeDrop m  (_:xs) = unsafeDrop (m - 1) xs
% cat ListTestRunner.hs 
import ListTest

main = print $ head . dropGhc (2^30) $ [1..]
% stack ghc -- ListTestRunner.hs
[1 of 2] Compiling ListTest         ( ListTest.hs, ListTest.o )
[2 of 2] Compiling Main             ( ListTestRunner.hs, ListTestRunner.o )
Linking ListTestRunner ...
% time ./ListTestRunner
1073741825            
./ListTestRunner  35.35s user 0.14s system 99% cpu 35.591 total

Answer 1

我以前犯过这个错误，所以我有一个怀疑...

看起来您只是没有在启用优化的情况下进行编译。 Prelude 是通过优化编译并链接进来的，因此当您使用 Prelude 的 drop 时，您隐式地使用了优化代码。本地复制（在将常量从 2^30 减少到 2^25 之后——没有理由让自己等那么久），启用优化导致 dropGhc 和前奏 drop 具有相同的时机，并且 dropNaive 并没有明显变差。 dropGhc 和 dropNaive 之间的措辞略有不同，可能会对生成的代码产生微妙的影响；对于为什么一个会比另一个更好，我没有很好的直觉。

另请注意，base 是一个特殊的包，因为它是作为 ghc 构建过程的一部分构建的，使用它自己的构建系统，而不是像普通的那样由 Cabal 构建包。正如@sjakobi 在评论中指出的那样，there is a setting in ghc's build configuration 将 base 的优化级别设置为 O2。

为什么 Prelude.drop 比普通的快？

Why is Prelude.drop faster than a trivial one?

optimization

haskell

singly-linked-list