My data currently looks like this:
3-150
2-151
4-152
5-154
7-154
1-155
9-155
6-156
This is just artificial "tick" data: the first number is the tick value and the second is the time in seconds after midnight.
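Since this is stock data, I need to group the ticks into "bars"; that is, all ticks that fall within a given time interval must be grouped together.
One example is 4-second bars: ticks at 0-3 seconds after midnight form one bar, and ticks at 4-7 seconds after midnight form the next.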
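The bucketing rule described above can be sketched on plain lists, without any streaming library. This is only an illustration: `barIndex` and `groupBars` are hypothetical names that appear nowhere else in this thread, and the sketch assumes ticks arrive in non-decreasing time order with non-negative timestamps.

```haskell
import Data.Function (on)
import Data.List (groupBy)

-- A tick: value first, then seconds after midnight.
data MyData = MyData Int Int deriving (Eq, Show)

-- Index of the bar that a timestamp falls into, for a bar width in seconds.
-- Assumes non-negative timestamps, so `quot` rounds toward the bar start.
barIndex :: Int -> Int -> Int
barIndex width t = t `quot` width

-- Group a time-ordered list of ticks into consecutive bars of the given width.
-- Bars that contain no ticks simply do not appear in the result.
groupBars :: Int -> [MyData] -> [[MyData]]
groupBars width = groupBy ((==) `on` key)
  where
    key (MyData _ t) = barIndex width t
```

For 4-second bars, `map (barIndex 4) [0, 3, 4, 7]` evaluates to `[0,0,1,1]`: seconds 0-3 share one bar and seconds 4-7 share the next.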
I have a conduit pipeline that looks like this, which computes bars of a single size:
{-# LANGUAGE OverloadedStrings #-}
import Data.Maybe (isJust, fromJust)
import qualified Data.ByteString.Char8 as C
import Control.Applicative ((<$>), (<*>))
import Data.Conduit -- the core library
import qualified Data.Conduit.List as CL -- some list-like functions
import qualified Data.Conduit.Binary as CB -- bytes
import qualified Data.Conduit.Text as CT

data MyData = MyData Int Int
    deriving (Show)

binaryToData :: C.ByteString -> Maybe MyData
binaryToData bn = do
    let parts = C.split '-' bn
    case parts of
        (a:b:[]) -> MyData <$> (fst <$> (C.readInt a)) <*> (fst <$> (C.readInt b))
        _ -> Nothing

streamGenerator =
    CB.sourceFile "sample.txt" =$=
    CB.lines =$=
    CL.map binaryToData =$=
    CL.filter isJust =$=
    CL.map fromJust =$=
    CL.groupBy (\(MyData _ x) (MyData _ y) -> (x `quot` 4) == (y `quot` 4))

main :: IO ()
main = do
    mlines <- runResourceT $ streamGenerator $$ CL.consume
    print mlines

However, I need bars of several sizes from the stream at the same time. For example, for every 2-second bar I also need a 4-second bar. If the 2-second bar falls in the middle of a 4-second bar, I want to emit the previous 4-second bar.
Here is what I mean:
Standard bars (the numbers are the seconds after midnight that each bar should contain):
2 second bar : 0-1, 2-3, 4-5, etc...
4 second bar : 0-3, 4-7, 8-11, etc...
combo: (0-1, null), (2-3, 0-3), (4-5, 0-3), (6-7, 4-7), etc...
So instead of what my current conduit produces when grouping into 2- and 4-second bars separately:
4 second bar : [[MyData 3 150,MyData 2 151],[MyData 4 152,MyData 5 154,MyData 7 154,MyData 1 155,MyData 9 155],[MyData 6 156]]
2 second bar : [[MyData 3 150,MyData 2 151],[MyData 4 152],[MyData 5 154,MyData 7 154,MyData 1 155,MyData 9 155],[MyData 6 156]]
I would like the conduit to stream this:
[([MyData 3 150,MyData 2 151], [MyData 3 150,MyData 2 151])
,([MyData 4 152], [MyData 3 150,MyData 2 151])
,([MyData 5 154,MyData 7 154,MyData 1 155,MyData 9 155], [MyData 4 152,MyData 5 154,MyData 7 154,MyData 1 155,MyData 9 155])
,([MyData 6 156],[MyData 4 152,MyData 5 154,MyData 7 154,MyData 1 155,MyData 9 155])]
But I can't seem to do this without doing something ugly.
Posted on 2012-12-24 06:46:56
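If you don't mind, I'll answer your question using my pipes library, since that is what I'm comfortable with. You can translate this solution to conduit if you prefer.
The clean solution requires pushback, which pipes does not have yet, so I went ahead and implemented it (I will include it in an extension library in the near future):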
import Control.Monad
import Control.Proxy
import Control.Proxy.Trans.State

-- Pushback primitives, soon to be in a `pipes` library
require :: (Monad m, Proxy p) => a' -> StateP [a] p a' a b' b m a
require a' = StateP $ \s -> runIdentityP $ do
    case s of
        [] -> do
            a <- request a'
            return (a, s)
        a:as -> do
            return (a, as)

pushback :: (Monad m, Proxy p) => a -> StateP [a] p a' a b' b m ()
pushback a = StateP $ \as -> runIdentityP $ return ((), a:as)

evalPushback = evalStateK []

With these, the solution is simple:
data MyData = MyData Int Int deriving (Eq, Show)

-- Consumes ticks up until the deadline or the end of input
-- Returns the list of all ticks before the deadline
ticksUntil
    :: (Monad m, Proxy p)
    => Int -> () -> Consumer (StateP [Maybe MyData] p) (Maybe MyData) m [MyData]
ticksUntil deadline () = go where
    go = do
        x <- require ()
        case x of
            Just m@(MyData _ time) ->
                if (time < deadline)
                then do
                    ms <- go
                    return (m:ms)
                else do
                    pushback x
                    return []
            Nothing -> return []

bars
    :: (Monad m, Proxy p)
    => () -> Pipe (StateP [Maybe MyData] p) (Maybe MyData) ([MyData], [MyData]) m r
bars () = loop1 2 [] where
    -- First half of a 4-second window
    loop1 deadline b4 = do
        b2 <- (ticksUntil deadline >-> unitU) ()
        respond (b2, b4)
        loop2 (deadline + 2) b2 b4
    -- Second half of a 4-second window
    loop2 deadline b2 b4 = do
        b2' <- (ticksUntil deadline >-> unitU) ()
        let b4' = b2 ++ b2'
        respond (b2', b4')
        loop1 (deadline + 2) b4'

sample :: [MyData]
sample = [
    MyData 3 150,
    MyData 2 151,
    MyData 4 152,
    MyData 5 154,
    MyData 7 154,
    MyData 1 155,
    MyData 9 155,
    MyData 6 156]

-- Use the same trick as conduit: Nothing signals termination
source :: (Monad m, Proxy p) => () -> Producer p (Maybe MyData) m ()
source () = runIdentityP $ do
    (fromListS sample >-> mapD Just) ()
    respond Nothing

main = runProxy $
        source                     -- feed sample data
    >-> evalPushback bars          -- group the data into bars
    >-> filterD (/= ([], []))      -- Ignore empty bars
    >-> printD                     -- print outgoing bars

The magic is in the bars function. It simply alternates between two states: loop1 is the first state, which expects the first 2-second bar of a 4-second window, and loop2 is the second state, which expects the second 2-second bar of that window.
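The hardest part of implementing this was not writing the code but understanding your specification. Fortunately, I think I understood what you meant, because my code produces exactly the behavior shown in your test example: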
>>> main
([MyData 3 150,MyData 2 151],[MyData 3 150,MyData 2 151])
([MyData 4 152],[MyData 3 150,MyData 2 151])
([MyData 5 154,MyData 7 154,MyData 1 155,MyData 9 155],[MyData 4 152,MyData 5 154,MyData 7 154,MyData 1 155,MyData 9 155])
([MyData 6 156],[MyData 4 152,MyData 5 154,MyData 7 154,MyData 1 155,MyData 9 155])
If you are interested in pipes, I recommend checking out the pipes library, in particular the tutorial at Control.Proxy.Tutorial, which explains many of the idioms I used in my code.
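For readers who want to try the two-state idea without a streaming library, here is a minimal sketch on plain lists. It is not the pipes code from this answer. `pairedBars` is a hypothetical name, `span` stands in for the pushback primitives by leaving the first tick at or past the deadline at the head of the remaining input, and the sketch assumes the ticks are already in non-decreasing time order.

```haskell
-- A tick: value first, then seconds after midnight.
data MyData = MyData Int Int deriving (Eq, Show)

-- Split off the ticks strictly before the deadline; the rest stays unconsumed.
ticksUntil :: Int -> [MyData] -> ([MyData], [MyData])
ticksUntil deadline = span (\(MyData _ t) -> t < deadline)

-- Pair every 2-second bar with the latest 4-second bar known at that point.
-- Stops as soon as the input is exhausted.
bars :: [MyData] -> [([MyData], [MyData])]
bars = firstHalf 2 []
  where
    -- First half of a 4-second window: report the previous 4-second bar.
    firstHalf _ _ [] = []
    firstHalf deadline b4 ticks =
      let (b2, rest) = ticksUntil deadline ticks
      in (b2, b4) : secondHalf (deadline + 2) b2 rest
    -- Second half: the 4-second bar is now complete, so report it.
    secondHalf _ _ [] = []
    secondHalf deadline b2 ticks =
      let (b2', rest) = ticksUntil deadline ticks
          b4' = b2 ++ b2'
      in (b2', b4') : firstHalf (deadline + 2) b4' rest

-- Drop the windows in which neither bar holds any ticks.
pairedBars :: [MyData] -> [([MyData], [MyData])]
pairedBars = filter (/= ([], [])) . bars

sample :: [MyData]
sample =
  [ MyData 3 150, MyData 2 151, MyData 4 152, MyData 5 154
  , MyData 7 154, MyData 1 155, MyData 9 155, MyData 6 156 ]
```

Evaluating `pairedBars sample` in GHCi should give the same four pairs as the output above. Like the original, the sketch walks every 2-second deadline from midnight onward, which is cheap for one day of seconds but wasteful for sparse data.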
https://stackoverflow.com/questions/13984369