I'm using parsimonious (a Python PEG parser library) to parse the following text:
text = """
block block_name_0
{
foo
}
block block_name_1
{
bar
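}
"""
It is a series of blocks with a simple requirement on the body (it must be alphanumeric), and together they make up the whole text. Here is the grammar: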
from parsimonious.grammar import Grammar

grammar = Grammar(r"""
file = block+
block = _ "block" _ alphanum _ start_brace _ block_body _ end_brace _
block_body = alphanum+
alphanum = ~"[_A-Za-z0-9]+"
_ = ~"[\\n\\s]*"
start_brace = "{"
end_brace = "}"
""")
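print(grammar.parse(text))
The problem I'm running into is that if there is a parse error in any block after the first one, I get an unhelpful error message. For example, consider the following text: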
text = """
block block_name_0
{
!foo
}
block block_name_1
{
bar
}
"""
This gives a helpful error message:
[omitted stack trace]
File "/lib/parsimonious/expressions.py", line 127, in match
raise error
parsimonious.exceptions.ParseError: Rule 'block_body' didn't match at '!foo
}
However, if I have the following text:
text = """
block block_name_0
{
foo
}
block block_name_1
{
!bar
}
"""
I get this error:
File "/lib/parsimonious/expressions.py", line 112, in parse
raise IncompleteParseError(text, node.end, self)
parsimonious.exceptions.IncompleteParseError: Rule 'file' matched in its entirety, but it didn't consume all the text. The non-matching portion of the text begins with 'block block_name_1
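{' (line 7, column 1).
It looks like the parser matches the first instance of the sequence (the first block), but when it fails on the second block it does not treat the whole parse as a failure, which is what I want it to do. I would like an error similar to the one for block 0, so that I know exactly what is wrong inside the block rather than only that the block as a whole could not be parsed.
Any help would be greatly appreciated!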
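As a rough illustration of the behaviour described above, the following sketch uses Python's standard `re` module as a stand-in for the grammar. This is an analogy only, not parsimonious itself, and the pattern `BLOCK` is a hypothetical regex approximation of the `block` rule. A one-or-more repetition is already satisfied by the first block, so the match succeeds and simply stops before the block that contains `!bar`:

```python
import re

# Hypothetical regex approximation of the `block` rule:
# the word "block", a name, and a braced body made of word characters.
BLOCK = r"\s*block\s+\w+\s*\{\s*\w+\s*\}\s*"

text = """
block block_name_0
{
foo
}
block block_name_1
{
!bar
}
"""

# One-or-more repetition, analogous to `file = block+`.
match = re.match(f"(?:{BLOCK})+", text)

# The repetition succeeded on the first block alone, so the overall match
# is not a failure; it just ends before the second block.
consumed = match.end()
print(f"consumed {consumed} of {len(text)} characters")
print("unconsumed text starts with:", repr(text[consumed:consumed + 18]))

# Requiring the whole input to match only says that it failed, not where.
print("full match:", re.fullmatch(f"(?:{BLOCK})+", text))
```

Because the repetition has already succeeded, the only thing left to report is that some input remains, which is the kind of message `IncompleteParseError` carries.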
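Posted on 2016-11-29 18:59:08
Not an answer for parsimonious, but for better error reporting I suggest you try textX, or use its underlying PEG parser Arpeggio directly (disclaimer: I am the author of these libraries).
Using textX: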
from textx.metamodel import metamodel_from_str
grammar = """
Program: blocks+=Block ;
Block:
'block' name=ID '{'
body=Body
'}'
;
Body: ID+ ;
"""
text = """
block block_name_0
{
foo
}
block block_name_1
{
!bar
}
"""
mm = metamodel_from_str(grammar)
program = mm.model_from_str(text)
textX/Arpeggio will parse as far as it can and pinpoint the exact location of the error:
textx.exceptions.TextXSyntaxError:
Expected ID at position (9, 5) => 'e_1 { *!bar } '.
With textX you also get an AST for free, so you can do, for example:
for block in program.blocks:
    print(block.name, ':', block.body)
For debugging/investigation purposes you also get a nice visualization of grammars and models.
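To make the idea of "parse as far as possible, then report where parsing stopped" concrete, here is a minimal hand-written sketch using only the standard library. It is not how textX or Arpeggio are implemented. The names `SyntaxErrorAt`, `expect`, `expect_id`, `parse_block` and `parse_file` are hypothetical, and the token rules are simplified assumptions. The key point is that the repetition over blocks does not swallow a failure while input remains:

```python
import re

TOKEN_ID = re.compile(r"[A-Za-z_]\w*")
WHITESPACE = re.compile(r"\s*")


class SyntaxErrorAt(Exception):
    """Hypothetical error type recording what was expected and where."""

    def __init__(self, expected, text, pos):
        self.expected = expected
        self.line = text.count("\n", 0, pos) + 1
        self.column = pos - (text.rfind("\n", 0, pos) + 1) + 1
        super().__init__(
            f"Expected {expected} at position ({self.line}, {self.column})"
        )


def skip_ws(text, pos):
    return WHITESPACE.match(text, pos).end()


def expect(text, pos, literal):
    """Consume a literal token or fail at the exact position."""
    pos = skip_ws(text, pos)
    if not text.startswith(literal, pos):
        raise SyntaxErrorAt(repr(literal), text, pos)
    return pos + len(literal)


def expect_id(text, pos):
    """Consume an identifier or fail at the exact position."""
    pos = skip_ws(text, pos)
    m = TOKEN_ID.match(text, pos)
    if m is None:
        raise SyntaxErrorAt("ID", text, pos)
    return m.group(), m.end()


def parse_block(text, pos):
    pos = expect(text, pos, "block")
    name, pos = expect_id(text, pos)
    pos = expect(text, pos, "{")
    word, pos = expect_id(text, pos)  # the body needs at least one ID
    body = [word]
    while True:
        try:
            word, pos = expect_id(text, pos)
        except SyntaxErrorAt:
            break  # no further ID: the body ends here
        body.append(word)
    pos = expect(text, pos, "}")
    return (name, body), pos


def parse_file(text):
    blocks, pos = [], 0
    while True:
        # Any failure inside a block propagates instead of ending the loop.
        block, pos = parse_block(text, pos)
        blocks.append(block)
        if skip_ws(text, pos) == len(text):
            return blocks


text = """
block block_name_0
{
foo
}
block block_name_1
{
!bar
}
"""

try:
    parse_file(text)
except SyntaxErrorAt as exc:
    print(exc)
```

The reported line and column depend on the exact whitespace of the input, so they will not necessarily match the position in the textX message quoted above.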
https://stackoverflow.com/questions/40853040