Why does TextX ignore \n in a string literal, but not in a regular expression?

Stack Overflow user
Asked on 2022-02-21 20:00:32

TL;DR: The issue will be fixed in TextX 3.0. The workaround is to use a regex to match escaped (\) characters, e.g. \n.

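Full question: Using TextX, I am parsing a home-grown markup language in which paragraph and line breaks are significant. I think I am missing something basic when I try to match newlines: why do "\n" and "\n\n" not work, while their regex counterparts /\n/ and /\n\n/ do?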

Note: whitespace is redefined at the parser level to exclude \n, using ws=" \t".

import textx as tx

grammar = r"""
Root:
    content*=Content
;

Content:
    Text | ParagraphBreak | LineBreak
;

ParagraphBreak:
    paragraphbreak="\n\n"
    // paragraphbreak=/\n\n/
;

LineBreak:
    linebreak="\n"  // Will cause parsing error
    // linebreak=/\n/  // Will parse fine
;

Text[noskipws]:  // All text valid
    text=/[^\n]*/
;
"""

parser = tx.metamodel_from_str(grammar, ws=" \t")

source = "Line.\nBreak.\n\n"

parsed_source = parser.model_from_str(source)
print(parsed_source.content)

When I run the above code on my system, using

  • Python 3.10.1
  • Poetry version 1.1.12, with the following from poetry.lock:
    • [package] name = "arpeggio", version = "1.10.2", ..., python-version = "*"
    • [package] name = "textx", version = "2.3.0", ..., python = "*", package.dependencies Arpeggio = ">=1.9.0"

I get the following result:

With paths rooted at: /Users/[redacted]/Library/Caches/pypoetry/virtualenvs

  File ".../[redacted]-py3.10/lib/python3.10/site-packages/textx/model.py", line 291, in _parse
    return self.parser_model.parse(self)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 291, in parse
    result = self._parse(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 370, in _parse
    result = e.parse(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 789, in parse
    result = self._parse(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 945, in _parse
    parser._nm_raise(self, c_pos, parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 1718, in _nm_raise
    raise self.nm
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 485, in _parse
    result = p(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 291, in parse
    result = self._parse(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 423, in _parse
    parser._nm_raise(self, c_pos, parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 1718, in _nm_raise
    raise self.nm
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 409, in _parse
    result = e.parse(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 291, in parse
    result = self._parse(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 370, in _parse
    result = e.parse(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 291, in parse
    result = self._parse(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 370, in _parse
    result = e.parse(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 789, in parse
    result = self._parse(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 898, in _parse
    parser._nm_raise(self, c_pos, parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 1718, in _nm_raise
    raise self.nm
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 409, in _parse
    result = e.parse(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 291, in parse
    result = self._parse(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 370, in _parse
    result = e.parse(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 291, in parse
    result = self._parse(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 370, in _parse
    result = e.parse(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 789, in parse
    result = self._parse(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 898, in _parse
    parser._nm_raise(self, c_pos, parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 1718, in _nm_raise
    raise self.nm
arpeggio.NoMatch: Expected '\n\n' or '\n' or EOF at position (1, 6) => 'Line.* Break.  '.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/[redacted]/scratchpad/TextX/linebreaks.py", line 31, in <module>
    parsed_source = parser.model_from_str(source)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/textx/metamodel.py", line 615, in model_from_str
    model = self._parser_blueprint.clone().get_model_from_str(
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/textx/model.py", line 332, in get_model_from_str
    self.parse(model_str, file_name=file_name)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 1516, in parse
    self.parse_tree = self._parse()
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/textx/model.py", line 294, in _parse
    raise TextXSyntaxError(message=text(e),
textx.exceptions.TextXSyntaxError: None:1:6: error: Expected '\n\n' or '\n' or EOF at position (1, 6) => 'Line.* Break.  '.

I expected to get the same result as with the regex version, namely:

[<textx:Text instance at 0x10129bc40>, <textx:LineBreak instance at 0x101298040>, <textx:Text instance at 0x101298130>, <textx:ParagraphBreak instance at 0x10129aec0>]


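1 Answer

Stack Overflow user

Accepted answer

Posted on 2022-03-01 17:16:01

This issue has been resolved in the current development version. See the corresponding textX issue.

The fix will be part of the upcoming textX 3.0 release.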
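
As a rough illustration of why the regex workaround from the question helps, the sketch below uses only Python's standard library, not textX. It assumes that a string match in textX 2.x compares the grammar text with the input literally, without processing escape sequences. That assumption is consistent with the error message above but is not confirmed by the answer. The helpers literal_match and regex_match are hypothetical stand-ins, not textX or Arpeggio functions.

```python
import re

# The grammar in the question is a raw Python string, so the grammar text
# contains the two characters backslash + "n", not a real newline.
grammar_fragment = r'linebreak="\n"'
assert "\n" not in grammar_fragment

source = "Line.\nBreak.\n\n"


def literal_match(pattern: str, text: str, pos: int) -> bool:
    """Hypothetical string match with no escape processing (an assumption)."""
    return text.startswith(pattern, pos)


def regex_match(pattern: str, text: str, pos: int) -> bool:
    """Regex match: the regex engine itself reads backslash-n as a newline."""
    return re.compile(pattern).match(text, pos) is not None


pattern = r"\n"            # two characters: backslash, n
pos = source.index("\n")   # offset 5, i.e. line 1, column 6 in the error message

print(literal_match(pattern, source, pos))  # False
print(regex_match(pattern, source, pos))    # True
```

Under that assumption, "\n" in the grammar never matches a real newline, while /\n/ does, because the regex engine performs the escape interpretation itself.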

Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/71212348
