首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >一种简单编程语言的PEG.js语法

一种简单编程语言的PEG.js语法
EN

Code Review用户
提问于 2017-07-31 00:41:56
回答 1查看 558关注 0票数 10

我正在尝试开发一个编译器,并为一种简单的编程语言编写了一个peg.js语法。在语法上偷偷摸摸的高峰:

代码语言:javascript
复制
/* calculates fibonacci sequence */
fn fibonacci(n int32) int32 {
    /* termination */
    if n == 0 or n == 1 {
        return 1;
    } else {
        return fib(n - 1) + fib(n - 2);
    }
}

此代码生成以下解析树(为了简洁起见,省略了locationS):

代码语言:javascript
复制
[ { kind: 'FunctionDefinition',
    location:
     { start:
        { offset: 40,
          line: 2,
          column: 3 },
       end:
        { offset: 186,
          line: 9,
          column: 4 } },
    name: 'fibonacci',
    parameters:
     [ { name:
          { kind: 'Identifier',
            location:
             { start:
                { offset: 53,
                  line: 2,
                  column: 16 },
               end:
                { offset: 54,
                  line: 2,
                  column: 17 } },
            name: 'n',
            toString: [Function] },
         type:
          { kind: 'NamedType',
            location:
             { start:
                { offset: 55,
                  line: 2,
                  column: 18 },
               end:
                { offset: 60,
                  line: 2,
                  column: 23 } },
            name: 'int32' } } ],
    returnType:
     { kind: 'NamedType',
       location:
        { start:
           { offset: 62,
             line: 2,
             column: 25 },
          end:
           { offset: 67,
             line: 2,
             column: 30 } },
       name: 'int32' },
    body:
     { kind: 'Body',
       location:
        { start:
           { offset: 68,
             line: 2,
             column: 31 },
          end:
           { offset: 186,
             line: 9,
             column: 4 } },
       statements:
        [ { kind: 'ConditionalStatement',
            location:
             { start:
                { offset: 94,
                  line: 4,
                  column: 4 },
               end:
                { offset: 182,
                  line: 8,
                  column: 5 } },
            predicate:
             { kind: 'BinaryOperator',
               location:
                { start:
                   { offset: 97,
                     line: 4,
                     column: 7 },
                  end:
                   { offset: 113,
                     line: 4,
                     column: 23 } },
               lhs:
                { kind: 'BinaryOperator',
                  location:
                   { start:
                      { offset: 97,
                        line: 4,
                        column: 7 },
                     end:
                      { offset: 113,
                        line: 4,
                        column: 23 } },
                  lhs:
                   { kind: 'Identifier',
                     location:
                      { start:
                         { offset: 97,
                           line: 4,
                           column: 7 },
                        end:
                         { offset: 98,
                           line: 4,
                           column: 8 } },
                     name: 'n',
                     toString: [Function] },
                  operator: '==',
                  rhs:
                   { kind: 'Number',
                     location:
                      { start:
                         { offset: 102,
                           line: 4,
                           column: 12 },
                        end:
                         { offset: 103,
                           line: 4,
                           column: 13 } },
                     value: 0 } },
               operator: 'or',
               rhs:
                { kind: 'BinaryOperator',
                  location:
                   { start:
                      { offset: 97,
                        line: 4,
                        column: 7 },
                     end:
                      { offset: 113,
                        line: 4,
                        column: 23 } },
                  lhs:
                   { kind: 'Identifier',
                     location:
                      { start:
                         { offset: 107,
                           line: 4,
                           column: 17 },
                        end:
                         { offset: 108,
                           line: 4,
                           column: 18 } },
                     name: 'n',
                     toString: [Function] },
                  operator: '==',
                  rhs:
                   { kind: 'Number',
                     location:
                      { start:
                         { offset: 112,
                           line: 4,
                           column: 22 },
                        end:
                         { offset: 113,
                           line: 4,
                           column: 23 } },
                     value: 1 } } },
            thenBody:
             { kind: 'Body',
               location:
                { start:
                   { offset: 114,
                     line: 4,
                     column: 24 },
                  end:
                   { offset: 134,
                     line: 6,
                     column: 5 } },
               statements:
                [ { kind: 'ReturnStatement',
                    location:
                     { start:
                        { offset: 120,
                          line: 5,
                          column: 5 },
                       end:
                        { offset: 129,
                          line: 5,
                          column: 14 } },
                    expression:
                     { kind: 'Number',
                       location:
                        { start:
                           { offset: 127,
                             line: 5,
                             column: 12 },
                          end:
                           { offset: 128,
                             line: 5,
                             column: 13 } },
                       value: 1 } } ] },
            elseBody:
             { kind: 'Body',
               location:
                { start:
                   { offset: 140,
                     line: 6,
                     column: 11 },
                  end:
                   { offset: 182,
                     line: 8,
                     column: 5 } },
               statements:
                [ { kind: 'ReturnStatement',
                    location:
                     { start:
                        { offset: 146,
                          line: 7,
                          column: 5 },
                       end:
                        { offset: 177,
                          line: 7,
                          column: 36 } },
                    expression:
                     { kind: 'BinaryOperator',
                       location:
                        { start:
                           { offset: 153,
                             line: 7,
                             column: 12 },
                          end:
                           { offset: 176,
                             line: 7,
                             column: 35 } },
                       lhs:
                        { kind: 'FunctionApplication',
                          location:
                           { start:
                              { offset: 153,
                                line: 7,
                                column: 12 },
                             end:
                              { offset: 164,
                                line: 7,
                                column: 23 } },
                          name: 'fib',
                          args:
                           [ { kind: 'BinaryOperator',
                               location:
                                { start:
                                   { offset: 157,
                                     line: 7,
                                     column: 16 },
                                  end:
                                   { offset: 162,
                                     line: 7,
                                     column: 21 } },
                               lhs:
                                { kind: 'Identifier',
                                  location:
                                   { start:
                                      { offset: 157,
                                        line: 7,
                                        column: 16 },
                                     end:
                                      { offset: 158,
                                        line: 7,
                                        column: 17 } },
                                  name: 'n',
                                  toString: [Function] },
                               operator: '-',
                               rhs:
                                { kind: 'Number',
                                  location:
                                   { start:
                                      { offset: 161,
                                        line: 7,
                                        column: 20 },
                                     end:
                                      { offset: 162,
                                        line: 7,
                                        column: 21 } },
                                  value: 1 } } ] },
                       operator: '+',
                       rhs:
                        { kind: 'FunctionApplication',
                          location:
                           { start:
                              { offset: 166,
                                line: 7,
                                column: 25 },
                             end:
                              { offset: 176,
                                line: 7,
                                column: 35 } },
                          name: 'fib',
                          args:
                           [ { kind: 'BinaryOperator',
                               location:
                                { start:
                                   { offset: 170,
                                     line: 7,
                                     column: 29 },
                                  end:
                                   { offset: 175,
                                     line: 7,
                                     column: 34 } },
                               lhs:
                                { kind: 'Identifier',
                                  location:
                                   { start:
                                      { offset: 170,
                                        line: 7,
                                        column: 29 },
                                     end:
                                      { offset: 171,
                                        line: 7,
                                        column: 30 } },
                                  name: 'n',
                                  toString: [Function] },
                               operator: '-',
                               rhs:
                                { kind: 'Number',
                                  location:
                                   { start:
                                      { offset: 174,
                                        line: 7,
                                        column: 33 },
                                     end:
                                      { offset: 175,
                                        line: 7,
                                        column: 34 } },
                                  value: 2 } } ] } } } ] } } ] } } ]

语法本身:

代码语言:javascript
复制
{
  const tree = new Proxy({}, {
    get(target, kind, receiver) {
      return (extra = {}) => {
        return Object.assign({
          kind,
          location: location(),
        }, extra);
      };
    },
  });

  const left = Symbol('left');
  const right = Symbol('right');

  const reservedWords = [
    'if', 'unless', 'fn', 'not', 'and', 'or', 'while', 'until', 'return',
    'var', 'const'
  ];

  const emptyBody = tree.Body({ statements: [] });

  const binaryOperators = {
    '*': { associativity: left, precedence: 60 },
    '/': { associativity: left, precedence: 60 },
    'mod': { associativity: left, precedence: 60 },

    '+': { associativity: left, precedence: 50 },
    '-': { associativity: left, precedence: 50 },
    '|': { associativity: left, precedence: 50 },
    '&': { associativity: left, precedence: 50 },

    'shl': { associativity: left, precedence: 40 },
    'shr': { associativity: left, precedence: 40 },

    '==': { associativity: left, precedence: 30 },
    '!=': { associativity: left, precedence: 30 },
    '<':  { associativity: left, precedence: 30 },
    '>':  { associativity: left, precedence: 30 },
    '<=': { associativity: left, precedence: 30 },
    '>=': { associativity: left, precedence: 30 },

    'and': { associativity: left, precedence: 20 },
    'or': { associativity: left, precedence: 10 },

    '=': { associativity: right, precedence: 0 },
  };

  function isLeftAssociative(operator) {
    return binaryOperators[operator].associativity === left;
  }

  function isRightAssociative(operator) {
    return binaryOperators[operator].associativity === right;
  }

  function precedenceOf(operator) {
    return binaryOperators[operator].precedence;
  }

  function nth(index, array = null) {
    return (element) => element[index];
  }

  function get(array, index, defaultValue = null) {
    if (Array.isArray(array) && array[index] !== undefined) {
      return array[index];
    }

    return defaultValue;
  }

  function value(v) {
    return () => v;
  }

  function checkNotReserved(word) {
    if (reservedWords.includes(word)) {
      throwSyntaxError(`Unexpected reserved word ${word}.`);
    }
  }

  function throwSyntaxError(message) {
    const error = new Error(message);
    error.location = location();
    error.name = "SyntaxError";

    throw error;
  }

  function toString(characters) {
    if (typeof characters === 'string') {
      return characters;
    }
    if (!Array.isArray(characters)) {
      throw new TypeError(`toString() accepts string or array of strings.`);
    }

    return characters
      .map(toString)
      .join('');
  }

  function toInteger(s, base) {
    // TODO: line numbers in manual errors
    s = toString(s);
    if (s[0] === '_' || s[s.length - 1] === '_') {
      throwSyntaxError('Thousand separators allowed only inside numerals.');
    }

    return parseInt(s.replace(/_/g, ''), base);
  }

  function notEmpty(value) {
    return value !== undefined;
  }

  function operatorsToTree({ head, tail }) {
    tail = tail.map((element) => {
      return {
        operator: element[1],
        rhs: element[3],
      };
    });

    function collect(lhs, minPrecedence) {
      let operator, rhs;

      while (tail.length && precedenceOf(tail[0].operator) >= minPrecedence) {
        const tmp = tail.shift();
        operator = tmp.operator;
        rhs = tmp.rhs;

        let lookahead = tail[0];
        while (
          tail.length &&
          (precedenceOf(lookahead.operator) > precedenceOf(operator) ||
           isRightAssociative(lookahead.operator) &&
           precedenceOf(lookahead.operator) === precedenceOf(operator))
        ) {
          rhs = collect(rhs, precedenceOf(lookahead.operator));
          lookahead = tail[0];
        }

        lhs = tree.BinaryOperator({
          lhs,
          operator,
          rhs,
        });
      }

      return lhs;
    }

    return collect(head, 0);
  }
}

Program
  = _ statements: (TopLevelStatement _)* _
  { return statements.map(nth(0)).filter(notEmpty); }

TopLevelStatement
  = Comment
  { }
  / definition: FunctionDefinition
  { return definition; }

FunctionDefinition
  = "fn" _
    name: Identifier _
    parameters: ParameterList _
    returnType: Type _
    body: Body
  {
    return tree.FunctionDefinition({
      name: String(name),
      parameters,
      returnType,
      body,
    });
  }

ParameterList
  = "(" _ ")"
  { return []; }
  / "(" _ head: NameTypePair tail: (_ "," _ NameTypePair)* _ ")" _
  { return [head].concat(tail.map(nth(3))); }

NameTypePair
  = name: Identifier __ type: Type
  { return { name, type }; }

Body
  = "{" _ statements: (Statement _)* "}"
  { return tree.Body({ statements: statements.map(nth(0)).filter(notEmpty) }); }

Statement
  = Comment
  { }
  / EmptyStatement
  / "return" _ expression: Expression _ StatementTerminator
  { return tree.ReturnStatement({ expression }); }
  / keyword: ConditionalKeyword _
    predicate: Expression _
    thenBody: Body _
    elseBody: ("else" _ Body)?
  {
    predicate = keyword === 'unless'
      ? tree.UnaryOperator({ operator: 'not', operand: predicate })
      : predicate;

    return tree.ConditionalStatement({
      predicate,
      thenBody,
      elseBody: get(elseBody, 2, emptyBody),
    });
  }
  / keyword: LoopingKeyword _
    predicate: Expression _
    doBody: Body
  {
    predicate = keyword === 'until'
      ? tree.UnaryOperator({ operator: 'not', operand: predicate })
      : predicate;

    return tree.LoopingStatement({
      predicate,
      doBody,
    });
  }
  / "var" __
    name: Identifier __
    type: Type _
    initial: ("=" _ Expression)? _
    StatementTerminator
  {
    return tree.VariableDeclaration({
      name,
      type,
      initial: get(initial, 3, null),
    });
  }
  / expression: Expression _ StatementTerminator
  { return tree.ExpressionStatement({ expression }); }

EmptyStatement
  = StatementTerminator
  { return tree.EmptyInstruction(); }

StatementTerminator
  = ";"

Type
  = "pointer" _ "!" _ type: Type
  { return tree.PointerToType({ type }); }
  / "array" _ "(" capacity: Expression ")" _ "!" _ type: Type
  { return tree.ArrayType({ type, capacity }); }
  / name: Identifier
  { return tree.NamedType({ name: String(name) }); }

Expression
  = binaryOperator: BinaryOperator
  { return binaryOperator; }

BinaryOperator
  = head: Primary tail: (_ BinaryToken _ Primary)*
  { return operatorsToTree({ head, tail }); }

Primary
  = "(" _ expression: Expression _ ")"
  { return expression; }
  / application: FunctionApplication
  { return application; }
  / operator: UnaryToken _ operand: Primary
  { return tree.UnaryOperator({ operator, operand }); }
  / identifier: Identifier
  { return identifier; }
  / number: Number
  { return number; }
  / string: String
  { return string; }

FunctionApplication
  = name: Identifier _ args: ArgumentList
  {
    return tree.FunctionApplication({
      name: String(name),
      args,
    });
  }

ArgumentList
  = "(" _ ")"
  { return []; }
  / "(" _ head: Expression tail: (_ "," _ Expression)* _ ")" _
  { return [head].concat(tail.map(nth(3))); }

Identifier "identifier"
  = name: ([a-zA-Z][a-zA-Z0-9"-]*)
  {
    name = toString(name);
    checkNotReserved(name);
    return tree.Identifier({
      name: name,
      toString: value(name),
    });
  }

Number
  = "0x" digits: HexadecimalDigits
  { return tree.Number({ value: toInteger(digits, 16) }); }
  / "0b" digits: BinaryDigits
  { return tree.Number({ value: toInteger(digits, 2) }); }
  / digits: Digits
  { return tree.Number({ value: toInteger(digits, 10) }); }

String
  = '"' string: StringCharacter* '"'
  { return tree.String({ string: toString(string) }); }

StringCharacter
  = "\\\\"
  { return "\\"; }
  / '\\"'
  { return '"'; }
  / "\\n"
  { return "\n"; }
  / "\\r"
  { return "\r"; }
  / "\\t"
  { return "\t"; }
  / "\\b"
  { return "\b"; }
  / "\\x" digits: ([0-9a-fA-F][0-9a-fA-F])
  { return String.fromCharCode(toInteger(digits, 16)); }
  / c: [^"]
  { return c; }

WhiteSpace "white space"
  = ([ \n\r\t]+)
  { }

Comment
  = "/*" (!"*/" .)* "*/"
  { }

_
  = __?

__
  = WhiteSpace

ConditionalKeyword = "if" / "unless"
LoopingKeyword = "while" / "until"
UnaryToken = "+" / "-" / "~" / "not"

BinaryToken
  = "*" / "/" / "mod"
  / "+" / "-" / "|" / "&"
  / "shl" / "shr"
  / "==" / "!=" / "<=" / ">=" / "<" / ">"
  / "and"
  / "or"
  / "="

Digits
  = digits: [0-9_]+
  { return toString(digits); }

HexadecimalDigits
  = digits: [0-9a-fA-F_]+
  { return toString(digits); }

BinaryDigits
  = digits: [01_]+
  { return toString(digits); }

这是我第二次尝试使用PEGs,所以请随时指出初学者的错误和/或推荐最佳实践。

EN

回答 1

Code Review用户

发布于 2019-11-15 00:38:19

这段代码看起来很可靠。它使用了许多ecmascript-6特性,如箭头函数、默认参数值、不被重新赋值的值的const以及可以重新分配的值的let等等。注释可以添加以帮助记录代码。

一些琐碎的函数可以简化。

函数值(V){ => v;}

可简化为:

代码语言:javascript
复制
const value = v => () => v;
票数 1
EN
页面原文内容由Code Review提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://codereview.stackexchange.com/questions/171615

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档