I am trying to parse a multi-line log like this:
[xxx] This is 1
[xxx] This is also 1
[yyy] This is 2

I defined these types:
{-# LANGUAGE OverloadedStrings #-}
module Parser where

import Prelude hiding (takeWhile)
-- Only unpack is used; importing all of Data.Text unqualified
-- clashes with Prelude names such as map.
import Data.Text (unpack)
import Data.Attoparsec.Text as T

data ID = ID String deriving (Eq, Show)
data Entry = Entry ID String deriving (Eq, Show)
data Block = Block ID [String]
data Log = Log [Block]

And I defined these parsers:
parseID :: Parser ID
parseID = do
  char '['
  id <- takeTill (== ']')
  char ']'
  return $ ID $ unpack id

parseEntry :: Parser Entry
parseEntry = do
  id <- parseID
  char ' '
  content <- takeTill isEndOfLine
  return $ Entry id (unpack content)

This works when I run something like parseOnly parseEntry entryString and get an Entry back.
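The problem comes when I try to parse a log like the one at the top. I would get an [Entry], but what I want is a [Block].
Also, when two or more consecutive lines share the same ID (such as xxx), they should be stored in the same block, so for the log above I want to get back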
[block1, block2]
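-- block1 == Block (ID "xxx") ["This is 1", "This is also 1"]
-- block2 == Block (ID "yyy") ["This is 2"]

How can I make the parser either create a new block or add to the last generated one, depending on whether the ID changed?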
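An obvious solution is to simply generate an [Entry] and then use a fold to convert it into a [Block] with the appropriate logic. That means two passes, one over the log and one over the [Entry], which seems inefficient for large logs and also feels like the wrong approach (based on my limited knowledge of attoparsec).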
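For reference, the two-pass approach described above could look roughly like the sketch below. It uses groupBy from Data.List rather than an explicit fold, and the names entryId, entryText and toBlocks are illustrative helpers that do not appear in the original code.

```haskell
import Data.Function (on)
import Data.List (groupBy)

data ID = ID String deriving (Eq, Show)
data Entry = Entry ID String deriving (Eq, Show)
data Block = Block ID [String] deriving (Eq, Show)

-- Illustrative accessors (hypothetical names, not part of the question).
entryId :: Entry -> ID
entryId (Entry i _) = i

entryText :: Entry -> String
entryText (Entry _ s) = s

-- Turn each run of consecutive entries sharing an ID into one Block.
-- groupBy never yields empty groups, so the run@(e : _) pattern always matches.
toBlocks :: [Entry] -> [Block]
toBlocks entries =
  [ Block (entryId e) (map entryText run)
  | run@(e : _) <- groupBy ((==) `on` entryId) entries
  ]
```

Applied to the three entries of the sample log, toBlocks yields one block for xxx holding two lines and one block for yyy holding one line.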
Any other ideas?

EDIT

Bob Dalgleish's solution essentially works (thanks a lot!); it just needed a few tweaks. This is my final solution:
data ID = ID String deriving (Eq, Show)
data Entry = Entry ID String deriving (Eq, Show)
data Block = Block ID [String] deriving (Eq, Show)
data Log = Log [Block] deriving (Eq, Show)

parseID :: Parser ID
parseID = do
  char '['
  id <- takeTill (== ']')
  char ']'
  return $ ID $ unpack id

parseEntry :: Parser Entry
parseEntry = do
  id <- parseID
  char ' '
  content <- takeTill isEndOfLine
  return $ Entry id (unpack content)

-- Parse an entry only if it carries the given ID; otherwise fail so that
-- many' backtracks and the enclosing block ends.
parseEntryFor :: ID -> Parser Entry
parseEntryFor blockId = do
  id <- parseID
  if blockId == id
    then do
      char ' '
      content <- takeTill isEndOfLine
      endOfLine <|> endOfInput  -- (<|>) comes from Control.Applicative
      return $ Entry id (unpack content)
    else fail "nonmatching id"

parseBlock :: Parser Block
parseBlock = do
  Entry entryId s <- parseEntry
  endOfLine <|> endOfInput
  entries <- many' (parseEntryFor entryId)
  return $ Block entryId (s : [s' | Entry _ s' <- entries])

Answered 2019-01-31 20:45:19
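You need a parser for Block. It accepts one Entry, then looks for further entries with the same id; when the id differs, it backtracks and returns what it has collected so far.
First, let's introduce a conditional Entry parser: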
parseEntryFor :: ID -> Parser Entry
parseEntryFor blockId = do
  id <- parseID
  if blockId == id
    then do
      char ' '
      content <- takeTill isEndOfLine
      -- endOfLine means every line, including the last, must end with a newline
      endOfLine
      return $ Entry id (unpack content)
    else fail "nonmatching id"

-- | A Block consists of one or more entries with the same ID.
parseBlock :: Parser Block
parseBlock = do
  Entry entryId s <- parseEntry
  endOfLine
  entries <- many' (parseEntryFor entryId)
  return $ Block entryId (s : [s' | Entry _ s' <- entries])

(This code is untested, since I have only used Parsec.)
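To tie the pieces together, here is a minimal self-contained sketch of a whole-log parser built on the same idea. It is a variant, not the code above: it skips the intermediate Entry values, accepts a final line with or without a trailing newline, and assumes the log contains no blank lines. The names parseContent, parseLineFor, parseLog and sampleLog are introduced here for illustration only.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative ((<|>))
import Data.Attoparsec.Text
import Data.Text (Text, unpack)

data ID = ID String deriving (Eq, Show)
data Block = Block ID [String] deriving (Eq, Show)
data Log = Log [Block] deriving (Eq, Show)

-- "[xxx]" becomes ID "xxx".
parseID :: Parser ID
parseID = ID . unpack <$> (char '[' *> takeTill (== ']') <* char ']')

-- The text after the ID, consuming the line terminator or the end of input.
parseContent :: Parser String
parseContent =
  unpack <$> (char ' ' *> takeTill isEndOfLine <* (endOfLine <|> endOfInput))

-- A line belonging to the block with the given ID; fails on any other ID.
parseLineFor :: ID -> Parser String
parseLineFor blockId = do
  i <- parseID
  if i == blockId then parseContent else fail "nonmatching id"

-- One line that opens the block, then as many same-ID lines as follow.
-- When parseLineFor fails, many' backtracks to the start of that line.
parseBlock :: Parser Block
parseBlock = do
  i <- parseID
  firstLine <- parseContent
  rest <- many' (parseLineFor i)
  pure (Block i (firstLine : rest))

-- The whole log: zero or more blocks, then nothing else.
parseLog :: Parser Log
parseLog = Log <$> many' parseBlock <* endOfInput

-- The sample log from the question.
sampleLog :: Text
sampleLog = "[xxx] This is 1\n[xxx] This is also 1\n[yyy] This is 2"
```

Running parseOnly parseLog sampleLog should give a Right value holding two blocks, the first with both xxx lines and the second with the single yyy line. Because attoparsec always backtracks on failure, the line with the nonmatching ID is left unconsumed for the next parseBlock.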
https://stackoverflow.com/questions/54468116