【Problem】
The grammar is written for ANTLR v3 and debugged in ANTLRWorks.
The core part of the grammar is:
fragment ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* ;
//singleInclude : '#include' BLANKS '"' ID '"' '.h';
singleInclude : '#include' '"' ID '"' '.h';
//include : singleInclude WS* -> singleInclude;
include : singleInclude WS*;
//startParse : include* identification+;
//startParse : include+ identification+;
//startParse : identification+;
//startParse : manufacture deviceType deviceRevison ddRevision;
The input being parsed is:
/*
**********************************************************************
** Includes
**********************************************************************
*/
#include "std_defs.h"
#include "com_tbls.h"
#include "rev_defs.h"
#include "fbk_hm.h"
#include "fdiag_FBK2_Start.h"
#include "blk_err.h"
/*
**********************************************************************
********** DEVICE SECTION ********************************************
**********************************************************************
*/
MANUFACTURER 0x1E6D11,
DEVICE_TYPE 0x00FF,
DEVICE_REVISION 5,
DD_REVISION 1
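Debugging this input fails with an error.
【Solution process】
1. Clearly the double quote is not being recognized, which surfaces as MismatchedTokenException(0!=0).
2. I consulted one reference. It explains the exception clearly but, unfortunately, does not help with this particular problem.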
3. Reference:
[antlr-interest] MismatchedTokenException
I did not fully understand it, and it did not help solve the problem.
4. Reference:
Antlr.Runtime.MismatchedTokenException from Envers with generic entities
Not useful.
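5. I then searched for:
antlr MismatchedTokenException(0!=0) double quote
and found:
ANTLR grammar how to capture all characters to end of line
The situation described there is somewhat similar to mine: it looks as if a lexer rule such as COMMENT conflicts with matching the double quote here.
So I took the original grammar: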
grammar DDParserDemo;
options {
    output = AST;
    ASTLabelType = CommonTree; // type of $stat.tree ref etc...
}
//NEWLINE : '\r'? '\n' ;
//NEWLINE : '\r' '\n' ;
fragment NEWLINE : '\r'? '\n' ;
fragment ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* ;
fragment FLOAT
    : ('0'..'9')+ '.' ('0'..'9')* EXPONENT?
    | '.' ('0'..'9')+ EXPONENT?
    | ('0'..'9')+ EXPONENT
    ;
COMMENT
    : '//' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
    | '/*' ( options {greedy=false;} : . )* '*/' {$channel=HIDDEN;}
    ;
//fragment WS : ( ' ' | '\t' | '\r' | '\n') {skip();};
//fragment WS : ( ' ' | '\t' | '\r' | '\n') {$channel=HIDDEN;};
WS : ( ' ' | '\t' | '\r' | '\n') {$channel=HIDDEN;};
STRING : '"' ( ESC_SEQ | ~('\\'|'"') )* '"' ;
CHAR: '\'' ( ESC_SEQ | ~('\''|'\\') ) '\'' ;
fragment EXPONENT : ('e'|'E') ('+'|'-')? ('0'..'9')+ ;
ESC_SEQ
    : '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')
    | UNICODE_ESC
    | OCTAL_ESC
    ;
fragment OCTAL_ESC
    : '\\' ('0'..'3') ('0'..'7') ('0'..'7')
    | '\\' ('0'..'7') ('0'..'7')
    | '\\' ('0'..'7')
    ;
fragment UNICODE_ESC : '\\' 'u' HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT ;
fragment DIGIT : '0'..'9';
//FAKE_TOKEN : '1' '2' '3';
/*
DECIMAL_VALUE : '1'..'9' DIGIT*;
*/
//DECIMAL_VALUE : DIGIT*;
DECIMAL_VALUE : DIGIT+;
//HEX_DIGIT : ('0'..'9'|'a'..'f'|'A'..'F') ;
HEX_DIGIT : (DIGIT|'a'..'f'|'A'..'F') ;
HEX_VALUE : '0x' HEX_DIGIT+;
fragment HEADER_FILENAME : ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'_')*;
/*
BLANKSPACE_TAB
//    : (' ' | '\t'){skip();};
    : (' ' | '\t') {$channel=HIDDEN;};
*/
//fragment BLANK : (' '|'\t')+ {skip();};
//BLANK : (' '|'\t') {skip();};
//BLANK : (' '|'\t');
//BLANK : (' '|'\t') {$channel=HIDDEN;};
//BLANKS : (' '|'\t')+ {$channel=HIDDEN;};
//BLANKS : (' '|'\t')+ {$channel=HIDDEN;};
//BLANKS : (' '|'\t')+;
//BLANK : (' '|'\t') {$channel=HIDDEN;};
//BLANK : (' '|'\t') {skip();};
BLANKS : (' '|'\t')+;
//BLANKS : (' '|'\t')+ {skip();};
//BLANKS : ' '+ {$channel=HIDDEN;};
//singleInclude : '#include' ' '+ '"' ID '.h"' ;
//singleInclude : '#include' ' '+ '"' ID+ '.h"' ;
//singleInclude : '#include' ' '+ '"' HEADER_FILENAME '.h"';
//singleInclude : '#include' ' ' '"' HEADER_FILENAME '.h"';
//singleInclude : '#include "' HEADER_FILENAME '.h"';
//fragment singleInclude : '#include' (' ')+ '"' ID '.h"';
//singleInclude : '#include' (' '|'\t')+ '""' ID '.h"';
//singleInclude : '#include' (' '|'\t')+ '"std_defs.h"';
//singleInclude : '#include' BLANKS '"' ID '"' '.h';
singleInclude : '#include' '"' ID '"' '.h';
//include : singleInclude WS* -> singleInclude;
include : singleInclude WS*;
//startParse : include* identification+;
//startParse : include+ identification+;
//startParse : identification+;
//startParse : manufacture deviceType deviceRevison ddRevision;
startParse : include+ manufacture deviceType deviceRevison ddRevision;
//manufacture : 'MANUFACTURER'^ BLANKS (DECIMAL_VALUE | HEX_VALUE) (','?)! WS*;
//manufacture : 'MANUFACTURER'^ (BLANK+! (DECIMAL_VALUE | HEX_VALUE) (','?)! WS*;
//manufacture : 'MANUFACTURER'^ BLANKS (DECIMAL_VALUE | HEX_VALUE) ','? WS*;
manufacture : 'MANUFACTURER'^ BLANKS (HEX_VALUE | DECIMAL_VALUE) ','? WS*;
deviceType : 'DEVICE_TYPE'^ BLANKS (DECIMAL_VALUE | HEX_VALUE) (','?)! WS*;
deviceRevison : 'DEVICE_REVISION'^ BLANKS (DECIMAL_VALUE | HEX_VALUE)(','?)! WS*;
ddRevision : 'DD_REVISION'^ BLANKS (DECIMAL_VALUE | HEX_VALUE)(','?)! WS*;
//identification : definiton WS* (','?)! WS* -> definiton;
//definiton : (ID)^ ('\t'!|' '!)+ (DECIMAL_VALUE | HEX_VALUE)
//definiton : (ID)^ BLANKSPACE_TAB+ (DECIMAL_VALUE | HEX_VALUE)
//definiton : ID ('\t'!|' '!)+ (DECIMAL_VALUE | HEX_VALUE);
and commented out the STRING rule in it:
/* STRING : '"' ( ESC_SEQ | ~('\\'|'"') )* '"' ; */
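Then I debugged again. This time the first double quote was recognized, but a different
MismatchedTokenException(0!=0)
appeared further along.
Even so, this was a big step toward finally solving the problem.
It showed why the first double quote had not been matched before: the grammar still contained a STRING rule that was defined but never used by any parser rule.
The lexer most likely turned the whole quoted text into a single STRING token, so the parser never received the separate double-quote token it was waiting for.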
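The conflict can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the grammar from this post: the grammar name QuoteDemo and the rule name includeLine are hypothetical. In a combined ANTLR v3 grammar, a quoted literal such as '"' inside a parser rule gets its own implicit token type, while an explicit lexer rule such as STRING produces a different token type. If STRING stays in the grammar, one way to avoid the clash is to let the parser consume the quoted name as a single STRING token.

```antlr
grammar QuoteDemo;   // hypothetical name, for illustration only

// With STRING defined, the lexer emits "std_defs.h" as one STRING token,
// so this parser rule asks for STRING instead of separate '"' literals.
includeLine
    : '#include' STRING
    ;

// A rule written with separate quote literals, like the one below, expects
// several tokens ('"', ID, '.h', '"') and is likely to fail with a
// MismatchedTokenException while STRING is still active in the lexer:
// includeLine : '#include' '"' ID '.h' '"' ;

STRING : '"' ( ~('\\'|'"') )* '"' ;

// Whitespace goes to the hidden channel, so the parser rule above does not mention it.
WS : (' '|'\t'|'\r'|'\n')+ {$channel=HIDDEN;} ;
```

The post takes the opposite route, removing STRING and keeping the quote literals. Either choice works as long as only one rule in the lexer claims text that starts with a double quote.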
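6. The error is now at the ID position. I suspected the cause was another unnecessary rule I had defined earlier:
fragment HEADER_FILENAME : ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'_')*;
So I removed it:
/* fragment HEADER_FILENAME : ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'_')*; */
and tried again. The error remained.
7. Along the way I also ran into a problem that looked like a duplicate definition, which is written up separately.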
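【Summary】
1. Do not casually reuse the default rules that ANTLRWorks inserts when it creates a new .g file,
such as ID, STRING and so on.
Otherwise they may later conflict with the input you actually need to parse.
That is what happened here: the template-generated STRING rule conflicted with recognizing the double quote, which produced
MismatchedTokenException(0!=0)
and stopped the parse from continuing.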
2. The earlier ID definition is in fact usable, i.e.:
ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* ;
works as expected.
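3. However, ID must not be marked as fragment, i.e. do not write:
fragment ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* ;
Otherwise you get MismatchedTokenException(0!=0). A fragment rule is only a helper for other lexer rules and never produces a token of its own, so the parser never receives an ID token.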
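The difference between a token rule and a fragment rule can be shown with a small sketch. It is an illustration only: the grammar name FragmentDemo, the parser rule header, and the helper rules LETTER and DIGIT are hypothetical and do not come from the grammar in this post.

```antlr
grammar FragmentDemo;   // hypothetical name, for illustration only

// The parser rule references ID, so ID must be a real token rule.
header
    : '#include' '"' ID '.h' '"'
    ;

// A normal lexer rule: the lexer emits ID tokens that the parser can match.
ID : LETTER (LETTER | DIGIT)* ;

// Fragment rules never become tokens by themselves.
// They can only be called from other lexer rules, as ID does above.
fragment LETTER : 'a'..'z' | 'A'..'Z' | '_' ;
fragment DIGIT  : '0'..'9' ;

// Whitespace is sent to the hidden channel and is invisible to the parser.
WS : (' '|'\t'|'\r'|'\n')+ {$channel=HIDDEN;} ;
```

A useful rule of thumb: mark a lexer rule as fragment only if no parser rule ever refers to it.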
4. A double quote is written in the ordinary way, as a literal inside single quotes:
'"'
Nothing more is needed.
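5. The MissingTokenException still occurs here. For now my guess is that it is a bug.
For details, see:
【基本解决】antlr v3,用包含{$channel=HIDDEN;}语法,结果解析出错:MissingTokenException ([Mostly solved] ANTLR v3: a grammar using {$channel=HIDDEN;} fails to parse with MissingTokenException)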
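As background for the hidden channel: tokens sent to HIDDEN are filtered out by the default token stream before the parser sees them, so a parser rule cannot match them. The sketch below shows a common arrangement in which one hidden whitespace rule covers all whitespace and the parser rules mention none. It rests on assumptions: the grammar name WhitespaceDemo and the rules deviceSection and value are hypothetical, and the source does not confirm that whitespace handling is the cause of the MissingTokenException in this post.

```antlr
grammar WhitespaceDemo;   // hypothetical name, for illustration only

// Parser rules mention no whitespace: hidden tokens never reach the parser.
deviceSection
    : manufacturer deviceType
    ;

manufacturer : 'MANUFACTURER' value ','? ;
deviceType   : 'DEVICE_TYPE'  value ','? ;

value : HEX_VALUE | DECIMAL_VALUE ;

HEX_VALUE     : '0x' HEX_DIGIT+ ;
DECIMAL_VALUE : DIGIT+ ;

// Helpers used only inside other lexer rules.
fragment HEX_DIGIT : DIGIT | 'a'..'f' | 'A'..'F' ;
fragment DIGIT     : '0'..'9' ;

// The only whitespace rule. No second rule (such as a visible BLANKS)
// competes with it for spaces and tabs.
WS : (' '|'\t'|'\r'|'\n')+ {$channel=HIDDEN;} ;
```

This sketch covers only the first two entries of the device section and leaves out comments and includes. It is meant to show the whitespace arrangement, not to replace the full grammar.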
6. The grammar currently in use is:
grammar DDParserDemo;
options {
    output = AST;
    ASTLabelType = CommonTree; // type of $stat.tree ref etc...
}
//NEWLINE : '\r'? '\n' ;
//NEWLINE : '\r' '\n' ;
fragment NEWLINE : '\r'? '\n' ;
fragment FLOAT
    : ('0'..'9')+ '.' ('0'..'9')* EXPONENT?
    | '.' ('0'..'9')+ EXPONENT?
    | ('0'..'9')+ EXPONENT
    ;
COMMENT
    : '//' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
    | '/*' ( options {greedy=false;} : . )* '*/' {$channel=HIDDEN;}
    ;
//fragment WS : ( ' ' | '\t' | '\r' | '\n') {skip();};
//fragment WS : ( ' ' | '\t' | '\r' | '\n') {$channel=HIDDEN;};
WS : ( ' ' | '\t' | '\r' | '\n') {$channel=HIDDEN;};
/*
STRING : '"' ( ESC_SEQ | ~('\\'|'"') )* '"' ;
*/
CHAR: '\'' ( ESC_SEQ | ~('\''|'\\') ) '\'' ;
fragment EXPONENT : ('e'|'E') ('+'|'-')? ('0'..'9')+ ;
ESC_SEQ
    : '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')
    | UNICODE_ESC
    | OCTAL_ESC
    ;
fragment OCTAL_ESC
    : '\\' ('0'..'3') ('0'..'7') ('0'..'7')
    | '\\' ('0'..'7') ('0'..'7')
    | '\\' ('0'..'7')
    ;
fragment UNICODE_ESC : '\\' 'u' HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT ;
//fragment
DIGIT : '0'..'9';
//FAKE_TOKEN : '1' '2' '3';
/*
DECIMAL_VALUE : '1'..'9' DIGIT*;
*/
//DECIMAL_VALUE : DIGIT*;
DECIMAL_VALUE : DIGIT+;
//HEX_DIGIT : ('0'..'9'|'a'..'f'|'A'..'F') ;
HEX_DIGIT : (DIGIT|'a'..'f'|'A'..'F') ;
HEX_VALUE : '0x' HEX_DIGIT+;
/*
fragment HEADER_FILENAME : ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'_')*;
*/
/*
BLANKSPACE_TAB
//    : (' ' | '\t'){skip();};
    : (' ' | '\t') {$channel=HIDDEN;};
*/
//fragment BLANK : (' '|'\t')+ {skip();};
//BLANK : (' '|'\t') {skip();};
//BLANK : (' '|'\t');
//BLANK : (' '|'\t') {$channel=HIDDEN;};
//BLANKS : (' '|'\t')+ {$channel=HIDDEN;};
//BLANKS : (' '|'\t')+ {$channel=HIDDEN;};
//BLANKS : (' '|'\t')+;
//BLANK : (' '|'\t') {$channel=HIDDEN;};
//BLANK : (' '|'\t') {skip();};
BLANKS : (' '|'\t')+;
//BLANKS : (' '|'\t')+ {skip();};
//BLANKS : ' '+ {$channel=HIDDEN;};
//singleInclude : '#include' ' '+ '"' ID '.h"' ;
//singleInclude : '#include' ' '+ '"' ID+ '.h"' ;
//singleInclude : '#include' ' '+ '"' HEADER_FILENAME '.h"';
//singleInclude : '#include' ' ' '"' HEADER_FILENAME '.h"';
//singleInclude : '#include "' HEADER_FILENAME '.h"';
//fragment singleInclude : '#include' (' ')+ '"' ID '.h"';
//singleInclude : '#include' (' '|'\t')+ '""' ID '.h"';
//singleInclude : '#include' (' '|'\t')+ '"std_defs.h"';
//singleInclude : '#include' BLANKS '"' ID '"' '.h';
//singleInclude : '#include' '"' ID '"' '.h';
//singleInclude : '#include' BLANKS '"' ID '"' '.h';
//singleInclude : '#include' BLANKS '"' ID '.h' '"';
//singleInclude : '#include' BLANKS '"' ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* '.h' '"';
//ID_START : 'a'..'z'|'A'..'Z'|'_';
//fragment ID_START : 'a'..'z'|'A'..'Z'|'_';
//WHOLE_ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*;
//WHOLE_ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'| DIGIT)*;
//WHOLE_ID : ('a'..'z'|'A'..'Z'|'_') (HEX_DIGIT|'_')*;
//fragment
ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* ;
//ID_START : 'a'..'z'|'A'..'Z'|'_';
//WHOLE_ID : (ID_START) (ID_START | DIGIT)*;
//ID_MIDDLE_END : ID_START | DIGIT;
//ID_MIDDLE_END : HEX_DIGIT | '_';
//singleInclude : '#include' BLANKS '"' ID_START ID_MIDDLE_END* '.h' '"';
//singleInclude : '#include' BLANKS '"' ID_START (ID_START | DIGIT)* '.h' '"';
//singleInclude : '#include' BLANKS '"' ID_START (ID_START | DIGIT)+ '.h' '"';
//singleInclude : '#include' BLANKS '"' ID_START '.h' '"';
//singleInclude : '#include' BLANKS '"' WHOLE_ID '.h' '"';
singleInclude : '#include' BLANKS '"' ID '.h' '"';
//include : singleInclude WS* -> singleInclude;
include : singleInclude WS*;
//startParse : include* identification+;
//startParse : include+ identification+;
//startParse : identification+;
//startParse : manufacture deviceType deviceRevison ddRevision;
startParse : include+ manufacture deviceType deviceRevison ddRevision;
//manufacture : 'MANUFACTURER'^ BLANKS (DECIMAL_VALUE | HEX_VALUE) (','?)! WS*;
//manufacture : 'MANUFACTURER'^ (BLANK+! (DECIMAL_VALUE | HEX_VALUE) (','?)! WS*;
//manufacture : 'MANUFACTURER'^ BLANKS (DECIMAL_VALUE | HEX_VALUE) ','? WS*;
manufacture : 'MANUFACTURER'^ BLANKS (HEX_VALUE | DECIMAL_VALUE) ','? WS*;
deviceType : 'DEVICE_TYPE'^ BLANKS (DECIMAL_VALUE | HEX_VALUE) (','?)! WS*;
deviceRevison : 'DEVICE_REVISION'^ BLANKS (DECIMAL_VALUE | HEX_VALUE)(','?)! WS*;
ddRevision : 'DD_REVISION'^ BLANKS (DECIMAL_VALUE | HEX_VALUE)(','?)! WS*;
//identification : definiton WS* (','?)! WS* -> definiton;
//definiton : (ID)^ ('\t'!|' '!)+ (DECIMAL_VALUE | HEX_VALUE)
//definiton : (ID)^ BLANKSPACE_TAB+ (DECIMAL_VALUE | HEX_VALUE)
//definiton : ID ('\t'!|' '!)+ (DECIMAL_VALUE | HEX_VALUE);
which is used to parse:
/*
**********************************************************************
** Includes
**********************************************************************
*/
#include "std_defs.h"
#include "com_tbls.h"
#include "rev_defs.h"
#include "fbk_hm.h"
#include "fdiag_FBK2_Start.h"
#include "blk_err.h"
/*
**********************************************************************
********** DEVICE SECTION ********************************************
**********************************************************************
*/
MANUFACTURER 0x1E6D11,
DEVICE_TYPE 0x00FF,
DEVICE_REVISION 5,
DD_REVISION 1
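When reposting, please credit the source: 在路上 » 【已解决】antlr解析双引号出错:MismatchedTokenException(0!=0) ([Solved] ANTLR fails to parse a double quote: MismatchedTokenException(0!=0))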