首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >优化boost::spirit::qi解析器

优化boost::spirit::qi解析器
EN

Stack Overflow用户
提问于 2014-08-28 07:34:37
回答 1查看 523关注 0票数 2

我有一个解析器,它基本上打印出堆栈机器的操作,给出了一些表达式,我的运算符优先。我的目标是尽可能优化速度。我读过提供this example codean article concerning qi optimizations。我理解主要文章中描述的优化的要点,但是我不清楚如何将其集成到我的代码中。

下面是我的解析器的一个工作示例。我已经尝试过通过使用raw[]提供基本迭代器来对其进行一些优化。phoenix动作调用必须提供字符串或迭代器,通过它们可以创建字符串;这些函数的实际版本不是微不足道的,它们的功能还不能在解析时进行评估:

代码语言:javascript
复制
#include <iostream>
#include <vector>
#include <string>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/qi_char.hpp>
#include <boost/spirit/include/qi_parse.hpp>
#include <boost/spirit/include/phoenix_bind.hpp>
namespace qi = boost::spirit::qi;
namespace phx = boost::phoenix;
using std::endl;
using std::cout;
using std::string;
using std::vector;

void fPushOp(const char* op){
  cout << "PushOp: " << op << endl;
}

void fPushInt(const boost::iterator_range<string::const_iterator>& my_str){
  cout << "PushInt: " << my_str << endl;
}

template<typename Iterator, typename Skipper = qi::space_type>
struct Calculator : public qi::grammar<Iterator, Skipper> {

  qi::rule<Iterator, Skipper>  
    expression, logical_or_expression, logical_and_expression, negate_expression, series_expression,
    single_expression, inclusive_or_expression, exclusive_or_expression, and_expression, equality_expression, 
    relational_expression, shift_expression, additive_expression, multiplicative_expression, 
    term, complement_factor, factor, result, integer, variable, variable_combo, word, prefix;

  qi::rule<Iterator> number;
  Calculator() : Calculator::base_type(result)
  {
    number = 
        qi::raw[
            ("0x" >> +qi::char_("0-9a-fA-F"))     
          | ("0b" >> +qi::char_("0-1"))
          | ("0" >>  +qi::char_("0-7"))
          | (+qi::char_("0-9"))
        ] [phx::bind(&fPushInt, qi::_1)]
        ;

    integer = 
          number
        | ('-' >> number) [phx::bind(&fPushOp, "OP_UNARY_MINUS")]
        ;

    variable =
          ((qi::alpha | qi::char_('_')) 
              >> *(qi::alnum | qi::char_('_')) 
              >> '['
              >>  (+(qi::alnum | qi::char_('_') | qi::char_(',')) 
                | ('\'' >> *~qi::char_('\'') >> '\'')) 
              >> ']')
        | ((qi::alpha | qi::char_('_')) >> *(qi::alnum | qi::char_('_')))
        ;

    variable_combo =
        qi::raw [
          variable >> *(qi::char_('.') >> variable)
        ] [phx::bind(&fPushInt, qi::_1)]
        ;

    word = 
        qi::raw[
          variable
        ] [phx::bind(&fPushInt, qi::_1)]
        ;

    factor =
            ("ceil(" >> expression >> ')')                                                      [phx::bind(&fPushOp, "OP_CEIL")]
        |   ("wrap(" >> expression >> ')')                                                      [phx::bind(&fPushOp, "OP_WRAP")]
        |   ("abs(" >> expression >> ')')                                                       [phx::bind(&fPushOp, "OP_ABS")]
        |   ("count1(" >> expression >> ')')                                                    [phx::bind(&fPushOp, "OP_COUNT1")]
        |   ("pick(" >> expression >> ')')                                                      [phx::bind(&fPushOp, "OP_PICK")]
        |   ("defined(" >> expression >> ')')                                                   [phx::bind(&fPushOp, "OP_DEF")]
        |   ("string_equal(" >> word >> ',' >> word >> ')')                                     [phx::bind(&fPushOp, "OP_STREQ")]
        |   ("string_contains(" >> word >> ',' >> word >> ')')                                  [phx::bind(&fPushOp, "OP_STRCON")]
        |   ("lsl(" >> single_expression >> ',' >> single_expression >> ',' >> number >> ')')   [phx::bind(&fPushOp, "OP_LSL")]
        |   ("lsr(" >> single_expression >> ',' >> single_expression >> ')')                    [phx::bind(&fPushOp, "OP_LSR")]
        |   ("asr(" >> single_expression >> ',' >> single_expression >> ',' >> number >> ')')   [phx::bind(&fPushOp, "OP_ASR")]
        |   ("ror(" >> single_expression >> ',' >> single_expression >> ',' >> number >> ')')   [phx::bind(&fPushOp, "OP_ROR")]
        |   ("rrx(" >> single_expression >> ',' >> single_expression >> ',' >> single_expression >> ',' >> number >> ')')[phx::bind(&fPushOp, "OP_RRX")]
        |   ('(' >> expression >> ')')
        |   variable_combo
        |   integer
        ;
    complement_factor = factor
        | ('~' >> factor) [phx::bind(&fPushOp, "OP_COMPLEMENT")]
        ;
    term = complement_factor
      >> *( (".." >> complement_factor) [phx::bind(&fPushOp, "OP_LEGER")]
          | ('\\' >> complement_factor) [phx::bind(&fPushOp, "OP_MASK")]
          ); 
    multiplicative_expression = term
      >> *( ('/' >> term) [phx::bind(&fPushOp, "OP_DIV")]
          | ('%' >> term) [phx::bind(&fPushOp, "OP_MOD")]
          | ('*' >> term) [phx::bind(&fPushOp, "OP_MUL")]
          );
    additive_expression = multiplicative_expression
      >> *( ('+' >> multiplicative_expression)  [phx::bind(&fPushOp, "OP_ADD")]
          | ('-' >> multiplicative_expression)  [phx::bind(&fPushOp, "OP_SUB")]
          );
    shift_expression = additive_expression
      >> *( (">>" >> additive_expression) [phx::bind(&fPushOp, "OP_SRL")]
          | ("<<" >> additive_expression) [phx::bind(&fPushOp, "OP_SLL")]
          );
    relational_expression = shift_expression
      >> *( ('<' >> shift_expression) [phx::bind(&fPushOp, "OP_LT")]
          | ('>' >> shift_expression) [phx::bind(&fPushOp, "OP_GT")]
          | ("<=" >> shift_expression)[phx::bind(&fPushOp, "OP_LET")]
          | (">=" >> shift_expression)[phx::bind(&fPushOp, "OP_GET")]
          );
    equality_expression = relational_expression 
      >> *( ("==" >> relational_expression)[phx::bind(&fPushOp, "OP_EQ")]
          | ("!=" >> relational_expression)[phx::bind(&fPushOp, "OP_NEQ")] 
          );
    and_expression = equality_expression 
      >> *(('&' >> equality_expression)     [phx::bind(&fPushOp, "OP_AND")]); 
    exclusive_or_expression = and_expression 
      >> *(('^' >> and_expression)          [phx::bind(&fPushOp, "OP_XOR")]); 
    inclusive_or_expression = exclusive_or_expression 
      >> *(('|' >> exclusive_or_expression) [phx::bind(&fPushOp, "OP_OR")]); 
    single_expression = inclusive_or_expression;
    series_expression = inclusive_or_expression 
      >> *((',' >> inclusive_or_expression) [phx::bind(&fPushOp, "OP_SERIES")]);
    negate_expression = series_expression
        | ('!' >> series_expression)        [phx::bind(&fPushOp, "OP_NEGATE")];
    logical_and_expression = negate_expression
      >> *(("&&" >> negate_expression)      [phx::bind(&fPushOp, "OP_LOGICAL_AND")]); 
    logical_or_expression = logical_and_expression 
      >> *(("||" >> logical_and_expression) [phx::bind(&fPushOp, "OP_LOGICAL_OR")]);
    expression = logical_or_expression;

    result = expression;
  }
};

int main(){
  Calculator<string::const_iterator> calc;
  const string expr("0xff0000 >> 3 && 3 + (!9) | (0,200)");
  cout << "Expression: " << expr << endl;

  string::const_iterator it = expr.begin();
  phrase_parse(it, expr.end(), calc, qi::space);

  cout << "Remaining: " << (string(it,expr.end())) << endl;
  return 0;
}

此外,我阅读了the slides from this pdf concerning utree,并试图弄清楚如何在可能的情况下使用utree输出而不是语义操作,因为这样的事情显然是邪恶的。有没有人能给我举一个简单的例子,告诉我如何构造一个utree,然后将它提供给堆栈机器,以便按顺序打印出操作?

EN

回答 1

Stack Overflow用户

发布于 2014-08-28 16:55:57

优化取决于您想要实现的目标。因此,我认为您过早地进行了优化。

例如,如果您想稍后解释符号,那么将variable_combo解析为raw[]输入序列没有任何意义(因为您必须再次解析变量组合,甚至必须在其中预留空格:"foo . bar .tux"在这里是一个有效的变量组合)。

我有相当多的帖子大体上涉及优化Boost精神(例如启动here )。在这里快速观察:

  • 考虑回溯下的正确性;使用语法分析'ceil(3.7'),你会得到:

表达式: ceil(3.7) PushInt: 3 PushInt: ceil剩余:(3.7)

注意当解析失败时,它是如何发出操作码的。另请注意,它会发出错误的操作码

代码语言:javascript
复制
- it pushes `3` instead of `3.7`
- it pushes `ceil` as an PushInt?

因此,它不仅检测到解析失败的时间太晚,而且还忽略了括号,无法发现函数调用,并错误地解析了数字。

关于过早的评估,我将指出这个流行的答案:Boost Spirit: "Semantic actions are evil"?

除此之外,我只是在确认我的怀疑,即您正在进行过早的优化。考虑做某事

#定义BOOST_SPIRIT_DEBUG

然后,在语法构造函数中:

(expression)(logical_or_expression)(logical_and_expression)(negate_expression)(series_expression)(single_expression) (inclusive_or_expression)(exclusive_or_expression)(and_expression)(equality_expression)(relational_expression) (shift_expression)(additive_expression)(multiplicative_expression)(term)(complement_factor)(factor)(result)(integer) (BOOST_SPIRIT_DEBUG_NODES)(变量)(Variable_combo)(Word)(前缀)

来真正了解解析器是如何工作的。

  • 考虑qi::符号,例如:

qi::symbols unary_function;unary_function.add (“char*>”,"OP_WRAP") ("wrap",“OP_WRAP”) ("abs","OP_ABS") ("count1","OP_COUNT1") ("pick","OP_PICK") ("defined","OP_DEF");unary_call = (unary_function >> "(“>> expression >> ')') phx::bind(&fPushOp,phx::qi::_1);

  • traits可能会在内联后为编译器留下更多优化的可能性(与语义操作相反,因为多层模板实例化可能会掩盖某些情况,特别是在涉及bind的情况下)

您可能希望在此处驱动运算符优先级表,如一些spirit示例所示。使用规则层次结构来实施优先级的传统方法会使语法变得复杂。这有两个关键的缺点:

  • 每个规则都引入了一个虚拟派单( X3将来可能不再有此限制)
  • 您的语法变得如此复杂,以至于您已经失去了概述(请参阅第一个项目符号)

建议

我建议

随着语义操作变得笨拙,

  1. 在解析过程中不再进行评估,并且在(后期)回溯(甚至解析器失败;后者可以很容易地检测到;后者可以很容易地检测到)的情况下非常(非常)棘手,但是回溯也可能是良性的,并且很难纠正,因为当语义操作让side effects).
  2. start从最简单的规则构建语法时,随着您添加测试用例

而逐渐构建语法

票数 1
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/25538555

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档