ByteString concatMap performance

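Stack Overflow user

Asked 2014-05-06 17:38:50

2 answers, 215 views, 0 followers, score 5

I have a 37MB bin file that I am trying to convert to a sequence of ppms. It works fine, and I am using it as an exercise to learn some profiling and more about lazy bytestrings in Haskell. My program seems to bomb out on the concatMap, which is used to replicate each byte three times so that I have R, G and B. The code is fairly straightforward: every 2048 bytes I write a new header.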

{-# LANGUAGE OverloadedStrings #-}

import System.IO
import System.Environment
import Control.Monad
import qualified Data.ByteString.Lazy.Char8 as B


main :: IO ()
main = do [from, to] <- getArgs
          withFile from ReadMode $ \inH ->
            withFile to WriteMode $ \outH ->
                loop (B.hGet inH 2048) (process outH) B.null


loop :: (Monad m) => m a -> (a -> m ()) -> (a -> Bool) -> m ()
loop inp outp done = inp >>= \x -> unless (done x) (outp x >> loop inp outp done)


process :: Handle -> B.ByteString -> IO ()
process h bs | B.null bs = return ()
             | otherwise = B.hPut h header >> B.hPut h bs'
                           where header = "P6\n32 64\n255\n" :: B.ByteString
                                 bs'    = B.concatMap (B.replicate 3) bs

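This finishes in a little over 5s. That isn't horrible; my only point of comparison is my very naive C implementation, which takes a little under 4s, so matching that, or ideally beating it, is my goal.

Here is the RTS output from the code above: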

  33,435,345,688 bytes allocated in the heap
      14,963,640 bytes copied during GC
          54,640 bytes maximum residency (77 sample(s))
          21,136 bytes maximum slop
               2 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0     64604 colls,     0 par    0.20s    0.25s     0.0000s    0.0001s
  Gen  1        77 colls,     0 par    0.00s    0.01s     0.0001s    0.0006s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    5.09s  (  5.27s elapsed)
  GC      time    0.21s  (  0.26s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time    5.29s  (  5.52s elapsed)

  %GC     time       3.9%  (4.6% elapsed)

  Alloc rate    6,574,783,667 bytes per MUT second

  Productivity  96.1% of total user, 92.1% of total elapsed

A pretty poor result. When I remove the concatMap and simply copy everything across, writing a header every 2048 bytes, it is practically instant:

      70,983,992 bytes allocated in the heap
          48,912 bytes copied during GC
          54,640 bytes maximum residency (2 sample(s))
          19,744 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0       204 colls,     0 par    0.00s    0.00s     0.0000s    0.0000s
  Gen  1         2 colls,     0 par    0.00s    0.00s     0.0001s    0.0001s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    0.01s  (  0.07s elapsed)
  GC      time    0.00s  (  0.00s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time    0.02s  (  0.07s elapsed)

  %GC     time       9.6%  (2.9% elapsed)

  Alloc rate    5,026,838,892 bytes per MUT second

  Productivity  89.8% of total user, 22.3% of total elapsed
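One quick way to read the two reports side by side is to compare the "bytes allocated in the heap" lines. The sketch below only does that division; the two constants are copied from the reports above, and the names are made up for illustration.

```haskell
-- Heap allocation figures copied from the two RTS reports above.
allocWithConcatMap, allocPlainCopy :: Double
allocWithConcatMap = 33435345688
allocPlainCopy     = 70983992

-- | How many times more heap the concatMap run allocated than the plain copy.
allocRatio :: Double
allocRatio = allocWithConcatMap / allocPlainCopy
```

The ratio comes out at roughly 470, which suggests the time is going into allocation done by the byte tripling rather than into file I/O.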

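So I guess my question is twofold:

How can I improve the overall performance?
If the bottleneck had not been so obvious, how could I have gone about tracking it down?

Thanks.

Edit

Here is the final code and RTS output, in case anyone is interested. After reading the Profiling and Optimization chapter of Real World Haskell, I was also able to find additional bottlenecks by using ghc's profiler with -prof -auto-all -caf-all.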

{-# LANGUAGE OverloadedStrings #-}

import System.IO
import System.Environment
import Control.Monad
import Data.Monoid
import qualified Data.ByteString.Builder    as BU
import qualified Data.ByteString.Lazy.Char8 as BL


main :: IO ()
main = do [from, to] <- getArgs
          withFile from ReadMode $ \inH ->
              withFile to WriteMode $ \outH ->
                  loop (BL.hGet inH 2048) (process outH) BL.null


loop :: (Monad m) => m a -> (a -> m ()) -> (a -> Bool) -> m ()
loop inp outp done = inp >>= \x -> unless (done x) (outp x >> loop inp outp done)


upConcatMap :: Monoid c => (Char -> c) -> BL.ByteString -> c
upConcatMap f bs = mconcat . map f $ BL.unpack bs


process :: Handle -> BL.ByteString -> IO ()
process h bs | BL.null bs = return ()
             | otherwise = BU.hPutBuilder h frame
                           where header = "P6\n32 64\n255\n"
                                 bs'    = BU.toLazyByteString $ upConcatMap trip bs
                                 frame  = BU.lazyByteString $ mappend header bs'
                                 trip c = let b = BU.char8 c in mconcat [b, b, b]

   6,383,263,640 bytes allocated in the heap
      18,596,984 bytes copied during GC
          54,640 bytes maximum residency (2 sample(s))
          31,056 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0     11165 colls,     0 par    0.06s    0.06s     0.0000s    0.0001s
  Gen  1         2 colls,     0 par    0.00s    0.00s     0.0001s    0.0002s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    0.69s  (  0.83s elapsed)
  GC      time    0.06s  (  0.06s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time    0.75s  (  0.89s elapsed)

  %GC     time       7.4%  (7.2% elapsed)

  Alloc rate    9,194,103,284 bytes per MUT second

  Productivity  92.6% of total user, 78.0% of total elapsed
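For anyone who wants to repeat the profiling step, here is a minimal sketch of manual cost-centre annotations applied to the original concatMap version. The helper names tripleBytes and withHeader, the cost-centre names, and the file names in the comments are invented for illustration. Only -prof -auto-all -caf-all come from my own build; newer GHC releases spell the last two -fprof-auto and -fprof-cafs.

```haskell
-- Build and run with profiling (Main.hs, in.bin, out.ppm are placeholder names):
--   ghc -prof -fprof-auto -fprof-cafs -rtsopts Main.hs
--   ./Main in.bin out.ppm +RTS -p -s
-- The -p run writes a .prof report with time and allocation per cost centre.
import qualified Data.ByteString.Lazy.Char8 as BL

-- | The byte tripling from the original program, wrapped in its own
-- cost centre so it shows up as a separate line in the profile.
tripleBytes :: BL.ByteString -> BL.ByteString
tripleBytes bs = {-# SCC "triple" #-} BL.concatMap (BL.replicate 3) bs

-- | Prepend the PPM header, again under a separate cost centre.
withHeader :: BL.ByteString -> BL.ByteString
withHeader body = {-# SCC "header" #-} BL.append header body
  where
    header = BL.pack "P6\n32 64\n255\n"
```

The SCC pragmas are ignored in a non-profiled build, so they can stay in the source while experimenting.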

performance

haskell

bytestring

2 Answers

Stack Overflow user

Accepted answer

Posted 2014-05-06 18:44:48

How about a Builder?

This version is 5 times faster for me:

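-- Assumes both modules are imported under the alias B, e.g.
--   import qualified Data.ByteString.Builder    as B
--   import qualified Data.ByteString.Lazy.Char8 as B
-- (plus Data.Monoid for mconcat on older versions of base).
process :: Handle -> B.ByteString -> IO ()
process h bs
  | B.null bs = return ()
  | otherwise = B.hPut h header >> B.hPutBuilder h bs'
  where header = "P6\n32 64\n255\n" :: B.ByteString
        bs'    = mconcat $ map triple $ B.unpack bs
        triple c = let b = B.char8 c in mconcat [b, b, b]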

It allocates much less garbage.

Added: for reference, the runtime statistics:

   4,642,746,104 bytes allocated in the heap
     390,110,640 bytes copied during GC
          63,592 bytes maximum residency (2 sample(s))
          21,648 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0      8992 colls,     0 par    0.54s    0.63s     0.0001s    0.0017s
  Gen  1         2 colls,     0 par    0.00s    0.00s     0.0002s    0.0002s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    0.98s  (  1.13s elapsed)
  GC      time    0.54s  (  0.63s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time    1.52s  (  1.76s elapsed)

  %GC     time      35.4%  (36.0% elapsed)

  Alloc rate    4,718,237,910 bytes per MUT second

  Productivity  64.6% of total user, 55.9% of total elapsed
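As a self-contained sketch of the same idea (not the exact code measured above, and not benchmarked), the tripling can also be written as a fold over the bytes that appends Builders directly, without unpacking to a list of Char first. The names tripleBuilder and frameBuilder are made up for illustration; the header string is the one from the question.

```haskell
import qualified Data.ByteString.Builder as BB
import qualified Data.ByteString.Lazy    as BL

-- | Emit every input byte three times (one grey value -> R, G, B).
-- Hypothetical helper name; folds over the bytes and appends Builders.
tripleBuilder :: BL.ByteString -> BB.Builder
tripleBuilder = BL.foldr step mempty
  where
    step w rest = mconcat [BB.word8 w, BB.word8 w, BB.word8 w, rest]

-- | One PPM frame: the header from the question followed by the tripled pixels.
frameBuilder :: BL.ByteString -> BB.Builder
frameBuilder pixels =
  mappend (BB.string7 "P6\n32 64\n255\n") (tripleBuilder pixels)
```

Inside process, a frame could then be written with BB.hPutBuilder h (frameBuilder bs), so the header and the pixels go out through a single Builder.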

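Score: 2

Stack Overflow user

Posted 2014-05-06 19:00:25

Use a Builder to assemble your ByteString from the smaller pieces; it will run much faster. This is covered in the documentation.

Looking at the source, concatMap goes through an intermediate list: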

concatMap :: (Word8 -> ByteString) -> ByteString -> ByteString
concatMap f = concat . foldr ((:) . f) []

concat then has to do a lot of work. The Builder suggestion looks like a good one.

Score: 1
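To illustrate what avoiding the intermediate list could look like, here is a sketch on strict ByteStrings that uses Data.ByteString.unfoldrN to write the tripled bytes into one buffer whose size is known up front. This is a different technique from the Builder suggestion, tripleStrict is a made-up name, and I have not measured its speed.

```haskell
import qualified Data.ByteString as BS

-- | Repeat every byte three times, filling a single output buffer.
-- Hypothetical helper; output byte i is input byte (i `div` 3).
tripleStrict :: BS.ByteString -> BS.ByteString
tripleStrict bs = fst (BS.unfoldrN (3 * BS.length bs) step 0)
  where
    -- unfoldrN stops after the requested number of bytes, so the index
    -- passed to BS.index always stays below BS.length bs.
    step i = Just (BS.index bs (i `div` 3), i + 1)
```

Because the output length is exactly three times the input length, no list of 3-byte ByteStrings is built and nothing has to be concatenated afterwards.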

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud Xiaowei's IT-domain engine.

Original link:

https://stackoverflow.com/questions/23501384
