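ANTLR 3 reports an error when it encounters the pound character ("£") used in French, which is the counterpart of the English hash character ("#"), even though the Unicode values of the three special characters @, # and $ are specified in the lexer/parser rules.
FYI: the Unicode value of pound (French) = the Unicode value of hash (English).
Lexer/parser rules: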
grammar SimpleCalc;
options
{
k = 8;
language = Java;
//filter = true;
}
tokens {
PLUS = '+' ;
MINUS = '-' ;
MULT = '*' ;
DIV = '/' ;
}
/*------------------------------------------------------------------
* PARSER RULES
*------------------------------------------------------------------*/
expr : n1=NUMBER ( exp = ( PLUS | MINUS ) n2=NUMBER )*
{
if ($exp.text.equals("+"))
System.out.println("Plus Result = " + $n1.text + $n2.text);
else
System.out.println("Minus Result = " + $n1.text + $n2.text);
}
;
/*------------------------------------------------------------------
* LEXER RULES
*------------------------------------------------------------------*/
NUMBER : (DIGIT)+ ;
WHITESPACE : ( '\t' | ' ' | '\r' | '\n'| '\u000C' )+ { $channel = HIDDEN; } ;
fragment DIGIT : '0'..'9' | '£' | ('\u0040' | '\u0023' | '\u0024');
The text file is also read as UTF-8:
public static void main(String[] args) throws Exception
{
try
{
args = new String[1];
args[0] = new String("antlr_test.txt");
SimpleCalcLexer lex = new SimpleCalcLexer(new ANTLRFileStream(args[0], "UTF-8"));
CommonTokenStream tokens = new CommonTokenStream(lex);
SimpleCalcParser parser = new SimpleCalcParser(tokens);
parser.expr();
//System.out.println(tokens);
}
catch (Exception e)
{
e.printStackTrace();
}
}
The input file contains only one line:
£3 + 4£
The errors are:
antlr_test.txt line 1:1 no viable alternative at character '£'
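antlr_test.txt line 1:7 no viable alternative at character '£'
What is wrong with my approach? Or am I missing something?
Posted on 2020-09-28 06:12:13
I cannot reproduce what you describe. When I test your grammar unmodified, I get a NumberFormatException, which is expected, because Integer.parseInt("£3") cannot succeed.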
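As a minimal sketch of that failure outside ANTLR: the class and helper names below (PoundDigitDemo, failsToParse, digitsOnly) are hypothetical and are not part of the grammar or the driver. The sketch only shows that Integer.parseInt rejects token text containing "£", and that the text parses once the non-digit characters are stripped.

```java
public class PoundDigitDemo {
    // Hypothetical helper: removes every character that is not an ASCII digit.
    public static String digitsOnly(String tokenText) {
        return tokenText.replaceAll("\\D", "");
    }

    // Returns true when Integer.parseInt throws NumberFormatException for the text.
    public static boolean failsToParse(String text) {
        try {
            Integer.parseInt(text);
            return false;
        } catch (NumberFormatException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String n1 = "\u00A33"; // pound sign followed by 3, as in the input file
        String n2 = "4\u00A3"; // 4 followed by a pound sign

        // The raw token text cannot be parsed as an integer.
        System.out.println("raw text fails to parse: " + failsToParse(n1));

        // After stripping non-digits, both operands parse and can be added.
        int sum = Integer.parseInt(digitsOnly(n1)) + Integer.parseInt(digitsOnly(n2));
        System.out.println("Result = " + sum);
    }
}
```

Note that a token made up only of "£", "@", "#" or "$" would be reduced to an empty string, and Integer.parseInt("") also throws NumberFormatException, so real code would need to handle that case.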
When I change your embedded code to:
{
if ($exp.text.equals("+"))
System.out.println("Result = " + (Integer.parseInt($n1.text.replaceAll("\\D", "")) + Integer.parseInt($n2.text.replaceAll("\\D", ""))));
else
System.out.println("Result = " + (Integer.parseInt($n1.text.replaceAll("\\D", "")) - Integer.parseInt($n2.text.replaceAll("\\D", ""))));
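}
and regenerate the lexer and parser classes (which you may not have done yet) and rerun the driver code, I get the following output:
Result = 7
EDIT
Perhaps the pound sign in the grammar itself is the problem? What if you try:
fragment DIGIT : '0'..'9' | '\u00A3' | ('\u0040' | '\u0023' | '\u0024');
instead of:
fragment DIGIT : '0'..'9' | '£' | ('\u0040' | '\u0023' | '\u0024');
https://stackoverflow.com/questions/64080945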
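One plausible reason the escaped form could help, offered here as an assumption rather than a confirmed diagnosis, is that the grammar file was saved in one encoding and read in another, so the literal '£' in the DIGIT rule no longer matched the '£' in the input. The sketch below uses hypothetical names (PoundSignEncodingDemo, codePoint, misreadAsLatin1). It shows that "£" is U+00A3 while "#" is U+0023, so the two are distinct characters, and that the UTF-8 bytes of "£" decoded as ISO-8859-1 turn into two characters.

```java
import java.nio.charset.StandardCharsets;

public class PoundSignEncodingDemo {
    // Formats the first code point of s in the usual U+XXXX notation.
    public static String codePoint(String s) {
        return String.format("U+%04X", s.codePointAt(0));
    }

    // Simulates text saved as UTF-8 but decoded with ISO-8859-1.
    public static String misreadAsLatin1(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String pound = "\u00A3"; // escaped so this source file stays pure ASCII

        System.out.println(codePoint(pound)); // U+00A3
        System.out.println(codePoint("#"));   // U+0023

        // The two UTF-8 bytes 0xC2 0xA3 become two separate characters.
        String garbled = misreadAsLatin1(pound);
        System.out.println(garbled.length());   // 2
        System.out.println(codePoint(garbled)); // U+00C2
    }
}
```

Writing the character as an escape keeps the grammar file ASCII-only, so the rule no longer depends on how the file is decoded. The input file is a separate matter and still has to be read with the encoding it was saved in, as the driver does with "UTF-8".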