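OK, judging by the number of questions I've already posted, I often feel like the dumbest person ever to use ANTLR, but here I am asking for help again.
I'm trying to rewrite an existing grammar to simplify it, and the "simplified" grammar now blows up on whitespace that should be sent to the hidden channel (skip() doesn't work either). It may just be that the lexer tokens are out of order, but I'm stumped (perhaps I don't understand well how to specify the order).
Anyway, here is the entire (somewhat sanitized) grammar: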
grammar ValidatingPolicy;
options {
language = Java;
backtrack = true;
}
// package and imports for the parser
@parser::header {
package org.jason.manager.impl;
import org.jason.manager.RecognitionRuntimeException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
}
// package and imports for the lexer
@lexer::header {
package org.jason.manager.impl;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
}
// member functions and fields for the parser
@parser::members {
private static final Logger log = LoggerFactory.getLogger(ValidatingPolicyParser.class);
@Override
protected Object recoverFromMismatchedToken(IntStream input, int ttype, BitSet follow) throws RecognitionException {
throw new MismatchedTokenException(ttype, input);
}
@Override
public Object recoverFromMismatchedSet(IntStream input, RecognitionException e, BitSet follow) throws RecognitionException {
throw e;
}
@Override
public String getErrorMessage(RecognitionException e, String[] tokenNames) {
// wrap in a runtime exception to escape ANTLR's dungeon
throw new RecognitionRuntimeException(e);
}
}
// member functions and fields for the lexer
@lexer::members {
private static final Logger log = LoggerFactory.getLogger(ValidatingPolicyLexer.class);
}
// validate a group of SHOW constructs
showGroup
: show+ EOF
;
// validate a construct WITHOUT show (MINQ, MOS, etc)
noShow
: simpleIfStatement+ EOF
;
// validate a SHOW construct (COMP or ELIG validation)
show
: SHOW STRING FOR simpleIfStatement+
;
// handle an if statement
simpleIfStatement
// basic if statement
: IF chainedOperation THEN operationGroup (ELSE operationGroup)? ENDIF
// if statement with recursive if statement in THEN or ELSE block
| IF chainedOperation THEN simpleIfStatement (ELSE simpleIfStatement)? ENDIF
| operationGroup
;
// aggregate multiple operations. When evaluated, there is an implicit AND
// when there are multiple groups
operationGroup
: chainedOperation+
;
// chain an operation together optionally with AND/OR
chainedOperation
@init {
log.info("Entered chainedOperation");
}
: operation (AND operation | OR operation)*
;
// aggregate into a single rule that can be referenced up the chain
operation
@init {
log.info("Entered operation");
}
// legal operation
: (booleanLogical | stringLogical | integerLogical | dateLogical | datePeriodLogical)
;
// LOGICAL OPERATIONS
// Logical operators do not have a pass through, but may have limits
// on which particular operators can be used
// compare DATE/DATE_FIELD to DATE/DATE_FIELD
dateLogical
@init {
log.info("Entered dateLogical");
}
: dateOp (EQ|NE|LT|LE|GT|GE) dateOp
;
// compare DATE_PERIOD/DATE_PERIOD_CONSTANT/DATE_PERIOD_FIELD
datePeriodLogical
@init {
log.info("Entered datePeriodLogical");
}
: datePeriodOp (EQ|NE|LT|LE|GT|GE) datePeriodOp
;
// compare INTEGER_FIELD/INTEGER
integerLogical
@init {
log.info("Entered integerLogical");
}
: integerOp (EQ|NE|LT|LE|GT|GE) integerOp
;
// compare BOOLEAN_FIELD/BOOLEAN_CONSTANT
booleanLogical
: booleanOp (EQ|NE) booleanOp
;
// compare STRING_FIELD/STRING
stringLogical
: stringOp (EQ|NE|LT|LE|GT|GE) stringOp
{
System.out.println("stringLogical: matched rule 1");
}
;
dateOp
@init {
log.info("Entered dateOp");
}
// pass through if no math op needs to be performed
: DATE_FIELD|DATE|DATE_CONSTANT
// match a legal math op
| DATE_FIELD|DATE|DATE_CONSTANT ((PLUS|MINUS) DATE_FIELD|DATE|DATE_CONSTANT|DATE_PERIOD_FIELD|DATE_PERIOD_CONSTANT (' ' DATE_PERIOD_CONSTANT)*)*
;
datePeriodOp
// pass through if no math op needs to be performed
: DATE_PERIOD_FIELD|DATE_PERIOD_CONSTANT
// match a legal math op
| DATE_PERIOD_FIELD ((PLUS|MINUS) DATE_FIELD|DATE|DATE_CONSTANT|DATE_PERIOD_FIELD|DATE_PERIOD_CONSTANT+)*
;
integerOp
@init {
log.info("Entered integerOp");
}
// pass through if no math op needs to be performed
: INTEGER_FIELD | INTEGER
// match a legal math op
| INTEGER_FIELD (PLUS|MINUS INTEGER_FIELD|INTEGER)*
;
// booleanOp, stringOp, and waiverOp don't do anything since + and - ops are not
// supported for them
booleanOp
: BOOLEAN_FIELD | BOOLEAN_CONSTANT
;
stringOp
: STRING_FIELD | STRING
;
// these items are not directly referenced by parser rules, so they
// can be fragments
fragment DIGIT: ('0'..'9');
fragment DATE: ;
fragment DATE_PERIOD_CONSTANT: DIGIT+ ' '+ (YEAR | MONTH | WEEK | DAY);
YEAR: ('YEAR'|'YEARS');
MONTH: ('MONTH'|'MONTHS');
WEEK: ('WEEK'|'WEEKS');
DAY: ('DAY'|'DAYS');
DATE_FIELD:('DOB'|'TEST_DATE');
DATE_PERIOD_FIELD:('EMPLOYMENT_PERIOD');
BOOLEAN_FIELD:('CERTIFIED');
INTEGER_FIELD:('AGE'|'OPTION');
STRING_FIELD:('STATE'|'UF_USERID'|'USER_LEVEL');
// various tokens can't be fragments since they are directly referenced by parser rules
COMMENT_START: ';';
BOOLEAN_CONSTANT: ('TRUE'|'FALSE'|'"Y"'|'"N"');
DATE_CONSTANT:('TODAY'|'YESTERDAY'|'TOMMOROW');
SHOW: 'SHOW';
FOR: 'FOR';
IF: 'IF';
THEN: 'THEN';
ELSE: 'ELSE';
ENDIF: 'ENDIF';
AND: 'AND';
OR: 'OR';
EQ: '=';
NE: '<>';
LT: '<';
LE: '<=';
GT: '>';
GE: '>=';
NOT: 'NOT';
HAS: 'HAS';
PLUS: '+';
MINUS: '-';
// Commented ifs seem to take more than one line, even if comments are
// only supposed to be a single line
COMMENTED_IF: COMMENT_START WS* IF (options {greedy=false;} : .)* ENDIF '\r\n'
{
log.info("Lexer: matched COMMENTED IF" + getText());
$channel=HIDDEN;
//skip();
};
// Handle an empty comment such as "; "
EMPTY_COMMENT: COMMENT_START WS* '\r\n'
{
log.info("Lexer: matched EMPTY_COMMENT: " + getText());
$channel=HIDDEN;
};
// Handle a single-line comment. Policies often end with a comment, so be ready for it
SINGLE_COMMENT: COMMENT_START ~('\r'|'\n')* (('\r\n')+| EOF)
{
log.info("Lexer: matched SINGLE_COMMENT: " + getText());
$channel=HIDDEN;
};
INTEGER
// Bart Kiers on SO helped me with this one, basically handle a date period such as
// 4 WEEKS, 1 YEAR 6 MONTHS 2 WEEKS 8 DAYS, etc
: (DATE_PERIOD_CONSTANT)=> DATE_PERIOD_CONSTANT ((' '+ DATE_PERIOD_CONSTANT)=> ' '+ DATE_PERIOD_CONSTANT)*
{
// manually switch the type from INTEGER to DATE_PERIOD_CONSTANT
$type=DATE_PERIOD_CONSTANT;
log.info("Matched DATE_PERIOD_CONSTANT: " + getText());
}
| DIGIT+
{
// match a 6-digit or 8-digit date format (20120101 or 201201)
if ($text.matches("(19|20|21)[0-9]{2}[0-1]\\d{3}") || $text.matches("(19|20|21)\\d{2}(0[1-9]|1[0-2])")) {
log.info("Matched DATE pattern: " + getText());
$type = DATE;
} else {
log.info("Matched INTEGER: " + getText());
}
}
;
STRING
: '"' ID (' ' ID)* '"'
;
ID: ('A'..'Z'|'a'..'z'|DIGIT|','|'!'|'?'|':')+;
WS: (' '+|'\r'|'\n'|'\t')
{
//skip();
$channel=HIDDEN;
};
A "show" construct should look like this:
SHOW "DOES NOT MEET AGE REQUIREMENTS FOR EMPLOYMENT" FOR
AGE < 18
SHOW "TOO YOUNG FOR CERTIFICATION IN KY" FOR
IF STATE="KY" THEN AGE > 21 ENDIF
When I remove the spaces, such as those around strings or operators, it works.
Also, if anyone sees any other stupidity in the grammar, I'd be glad to hear about it.
Jason
Posted on 2013-03-02 13:55:58
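Your lexer is matching the whitespace with an implicit, unnamed lexer rule. That lexer rule is created by the ' ' literal referenced in the parser rule dateOp: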
dateOp
//...
// pass through if no math op needs to be performed
: DATE_FIELD|DATE|DATE_CONSTANT
// match a legal math op
| DATE_FIELD|DATE|DATE_CONSTANT
((PLUS|MINUS) DATE_FIELD|DATE|DATE_CONSTANT|DATE_PERIOD_FIELD|DATE_PERIOD_CONSTANT
(' ' DATE_PERIOD_CONSTANT)* //<--- ' ' becomes a new lexer rule
)*
;
It behaves just like an ordinary lexer rule, so for input such as:
SHOW "DOES NOT MEET AGE REQUIREMENTS FOR EMPLOYMENT" FOR
AGE < 18
the lexer produces the following tokens:
[SHOW : SHOW] [' ' : ] [STRING : "DOES NOT MEET AGE REQUIREMENTS FOR EMPLOYMENT"]
[' ' : ] [FOR : FOR] [INTEGER_FIELD : AGE] [' ' : ] [LT : <] [' ' : ]
[INTEGER : 18]
Note the [' ' : ] tokens. These are the implicit lexer rule at work. The parser doesn't expect these tokens anywhere outside the dateOp rule, so parsing fails.
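To make the implicit rule concrete, here is a rough sketch of what the ' ' literal in dateOp amounts to, written as if it were an explicit lexer rule. The name T__60 is hypothetical (ANTLR 3 assigns its own generated name), and this is an illustration of the effect rather than the exact generated code:

```
// Roughly what the quoted ' ' in the parser rule dateOp introduces:
// an extra token type that matches a single space.
// T__60 is a made-up name for illustration; ANTLR chooses the real one.
T__60 : ' ' ;   // stays on the default channel, so the parser sees it

// The grammar's own WS rule is still there, but a single space can now be
// claimed by the implicit rule above before WS ever hides it.
WS : (' '+ | '\r' | '\n' | '\t') { $channel = HIDDEN; } ;
```

That is consistent with the token dump above: single spaces show up as [' ' : ] tokens instead of being hidden, while the line break after FOR is still swallowed by WS.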
After removing ' ' from the parser rule dateOp, the input above produces the following tokens, as expected:
[SHOW : SHOW] [STRING : "DOES NOT MEET AGE REQUIREMENTS FOR EMPLOYMENT"]
[FOR : FOR] [INTEGER_FIELD : AGE] [LT : <]
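[INTEGER : 18]
I don't know whether removing ' ' from dateOp is acceptable for your grammar. If you need to test for spaces explicitly, consider rewriting as much as possible so that the whitespace test moves into the lexer. Alternatively, the parser can look ahead to check whether the next token is a hidden WS token. For starters, though, I'd recommend cleaning up dateOp as much as possible and seeing where things land.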
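As a starting point for that cleanup, here is one possible sketch of dateOp with the ' ' literal removed. It assumes the INTEGER lexer rule from the question keeps folding space-separated periods such as "1 YEAR 6 MONTHS" into a single DATE_PERIOD_CONSTANT token, so the parser never needs to see a space. The helper rules dateOperand and datePeriodOperand are hypothetical names introduced here, not part of the original grammar:

```
// Sketch only: one operand, then zero or more "+/- operand" pairs.
// Zero repetitions covers the old "pass through" alternative.
dateOp
    : dateOperand ((PLUS | MINUS) (dateOperand | datePeriodOperand))*
    ;

// Hypothetical helper: anything that evaluates to a date.
dateOperand
    : DATE_FIELD | DATE | DATE_CONSTANT
    ;

// Hypothetical helper: anything that evaluates to a date period.
// Multi-part periods arrive as one DATE_PERIOD_CONSTANT token from the lexer.
datePeriodOperand
    : DATE_PERIOD_FIELD | DATE_PERIOD_CONSTANT
    ;
```

The extra parentheses matter because | binds more loosely than sequencing in ANTLR, so an unparenthesized DATE_FIELD|DATE|DATE_CONSTANT (...)* attaches the trailing group to the last alternative only. Whether this matches the intended semantics of the original rule is a guess on my part.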
https://stackoverflow.com/questions/15161141