我有一个37MB bin文件,我正试图将其转换为ppm序列。它工作得很好,我正在尝试将它作为一个练习来学习一些分析以及更多关于Haskell中懒惰的字节串的内容。我的程序似乎轰炸了concatMap,它用于复制每个字节三次,因此我有R、G和B。代码相当直接--每2048个字节我就编写一个新的头:
{-# LANGUAGE OverloadedStrings #-}
import System.IO
import System.Environment
import Control.Monad
import qualified Data.ByteString.Lazy.Char8 as B
main :: IO ()
main = do [from, to] <- getArgs
withFile from ReadMode $ \inH ->
withFile to WriteMode $ \outH ->
loop (B.hGet inH 2048) (process outH) B.null
loop :: (Monad m) => m a -> (a -> m ()) -> (a -> Bool) -> m ()
loop inp outp done = inp >>= \x -> unless (done x) (outp x >> loop inp outp done)
process :: Handle -> B.ByteString -> IO ()
process h bs | B.null bs = return ()
| otherwise = B.hPut h header >> B.hPut h bs'
where header = "P6\n32 64\n255\n" :: B.ByteString
bs' = B.concatMap (B.replicate 3) bs这使得它在5s上有了一点进展。这并不可怕,我唯一的比较是我非常天真的C实现,它在4s下做了一些--所以,或者理想地说,在下面已经是我的目标了。
以下是上述代码中的RTS:
33,435,345,688 bytes allocated in the heap
14,963,640 bytes copied during GC
54,640 bytes maximum residency (77 sample(s))
21,136 bytes maximum slop
2 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 64604 colls, 0 par 0.20s 0.25s 0.0000s 0.0001s
Gen 1 77 colls, 0 par 0.00s 0.01s 0.0001s 0.0006s
INIT time 0.00s ( 0.00s elapsed)
MUT time 5.09s ( 5.27s elapsed)
GC time 0.21s ( 0.26s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 5.29s ( 5.52s elapsed)
%GC time 3.9% (4.6% elapsed)
Alloc rate 6,574,783,667 bytes per MUT second
Productivity 96.1% of total user, 92.1% of total elapsed非常糟糕的结果。当我删除concatMap并每隔2048字节复制一次头部的所有内容时,它实际上是即时的:
70,983,992 bytes allocated in the heap
48,912 bytes copied during GC
54,640 bytes maximum residency (2 sample(s))
19,744 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 204 colls, 0 par 0.00s 0.00s 0.0000s 0.0000s
Gen 1 2 colls, 0 par 0.00s 0.00s 0.0001s 0.0001s
INIT time 0.00s ( 0.00s elapsed)
MUT time 0.01s ( 0.07s elapsed)
GC time 0.00s ( 0.00s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 0.02s ( 0.07s elapsed)
%GC time 9.6% (2.9% elapsed)
Alloc rate 5,026,838,892 bytes per MUT second
Productivity 89.8% of total user, 22.3% of total elapsed所以我想我的问题有两个方面:
谢谢。
编辑
这是最后的代码和RTS,如果有人感兴趣!在阅读了-prof -auto-all -caf-all的剖析与优化章节之后,我还通过使用ghc的分析器和真实世界Haskell找到了额外的瓶颈。
{-# LANGUAGE OverloadedStrings #-}
import System.IO
import System.Environment
import Control.Monad
import Data.Monoid
import qualified Data.ByteString.Builder as BU
import qualified Data.ByteString.Lazy.Char8 as BL
main :: IO ()
main = do [from, to] <- getArgs
withFile from ReadMode $ \inH ->
withFile to WriteMode $ \outH ->
loop (BL.hGet inH 2048) (process outH) BL.null
loop :: (Monad m) => m a -> (a -> m ()) -> (a -> Bool) -> m ()
loop inp outp done = inp >>= \x -> unless (done x) (outp x >> loop inp outp done)
upConcatMap :: Monoid c => (Char -> c) -> BL.ByteString -> c
upConcatMap f bs = mconcat . map f $ BL.unpack bs
process :: Handle -> BL.ByteString -> IO ()
process h bs | BL.null bs = return ()
| otherwise = BU.hPutBuilder h frame
where header = "P6\n32 64\n255\n"
bs' = BU.toLazyByteString $ upConcatMap trip bs
frame = BU.lazyByteString $ mappend header bs'
trip c = let b = BU.char8 c in mconcat [b, b, b]6,383,263,640 bytes allocated in the heap
18,596,984 bytes copied during GC
54,640 bytes maximum residency (2 sample(s))
31,056 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 11165 colls, 0 par 0.06s 0.06s 0.0000s 0.0001s
Gen 1 2 colls, 0 par 0.00s 0.00s 0.0001s 0.0002s
INIT time 0.00s ( 0.00s elapsed)
MUT time 0.69s ( 0.83s elapsed)
GC time 0.06s ( 0.06s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 0.75s ( 0.89s elapsed)
%GC time 7.4% (7.2% elapsed)
Alloc rate 9,194,103,284 bytes per MUT second
Productivity 92.6% of total user, 78.0% of total elapsed发布于 2014-05-06 18:44:48
那构建器呢
这个版本对我来说快了5倍:
process :: Handle -> B.ByteString -> IO ()
process h bs
| B.null bs = return ()
| otherwise = B.hPut h header >> B.hPutBuilder h bs'
where header = "P6\n32 64\n255\n" :: B.ByteString
bs' = mconcat $ map triple $ B.unpack bs
triple c = let b = B.char8 c in mconcat [b, b, b]它分配的垃圾少得多。
添加:供参考,运行时统计数据:
4,642,746,104 bytes allocated in the heap
390,110,640 bytes copied during GC
63,592 bytes maximum residency (2 sample(s))
21,648 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 8992 colls, 0 par 0.54s 0.63s 0.0001s 0.0017s
Gen 1 2 colls, 0 par 0.00s 0.00s 0.0002s 0.0002s
INIT time 0.00s ( 0.00s elapsed)
MUT time 0.98s ( 1.13s elapsed)
GC time 0.54s ( 0.63s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 1.52s ( 1.76s elapsed)
%GC time 35.4% (36.0% elapsed)
Alloc rate 4,718,237,910 bytes per MUT second
Productivity 64.6% of total user, 55.9% of total elapsed发布于 2014-05-06 19:00:25
使用Builder将您的ByteString与较小的连接起来,它将运行得更快。在文档里。
从源代码来看,concatMap通过了一个列表:
concatMap :: (Word8 -> ByteString) -> ByteString -> ByteString
concatMap f = concat . foldr ((:) . f) []concat必须做大量的工作。看来Builder的建议是好的。
https://stackoverflow.com/questions/23501384
复制相似问题