【Problem】
An ANTLR v3 grammar, debugged in ANTLRWorks.
The core part of the grammar is:

    fragment
    ID  :   ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
        ;

    //singleInclude : '#include' BLANKS '"' ID '"' '.h';
    singleInclude : '#include' '"' ID '"' '.h';
    //include : singleInclude WS* -> singleInclude;
    include : singleInclude WS*;

    //startParse : include* identification+;
    //startParse : include+ identification+;
    //startParse : identification+;
    //startParse : manufacture deviceType deviceRevison ddRevision;

The input being parsed is:

    /*
    **********************************************************************
    ** Includes
    **********************************************************************
    */
    #include "std_defs.h"
    #include "com_tbls.h"
    #include "rev_defs.h"
    #include "fbk_hm.h"
    #include "fdiag_FBK2_Start.h"
    #include "blk_err.h"

    /*
    **********************************************************************
    ********** DEVICE SECTION ********************************************
    **********************************************************************
    */
    MANUFACTURER 0x1E6D11,
    DEVICE_TYPE 0x00FF,
    DEVICE_REVISION 5,
    DD_REVISION 1

Debugging this fails with an error:
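【Solution process】
1. Clearly the double quote is not being recognized, which produces MismatchedTokenException(0!=0).
2. Reference:
The explanation is very clear, but unfortunately it does not help with this problem.
3. Reference:
[antlr-interest] MismatchedTokenException
I did not really follow it, and it did not help solve the problem.
4. Reference:
Antlr.Runtime.MismatchedTokenException from Envers with generic entities
Not useful.
5. Then I searched for:
antlr MismatchedTokenException(0!=0) double quote
and found:
ANTLR grammar how to capture all characters to end of line
The situation described there is somewhat similar to mine:
it looks as if a lexer rule such as COMMENT conflicts with matching the double quote here.
So I tried taking the original grammar: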

    grammar DDParserDemo;

    options {
        output = AST;
        ASTLabelType = CommonTree; // type of $stat.tree ref etc...
    }

    //NEWLINE : '\r'? '\n' ;
    //NEWLINE : '\r' '\n' ;
    fragment
    NEWLINE : '\r'? '\n' ;

    fragment
    ID  :   ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
        ;

    fragment
    FLOAT
        :   ('0'..'9')+ '.' ('0'..'9')* EXPONENT?
        |   '.' ('0'..'9')+ EXPONENT?
        |   ('0'..'9')+ EXPONENT
        ;

    COMMENT
        :   '//' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
        |   '/*' ( options {greedy=false;} : . )* '*/' {$channel=HIDDEN;}
        ;

    //fragment WS : ( ' ' | '\t' | '\r' | '\n') {skip();};
    //fragment WS : ( ' ' | '\t' | '\r' | '\n') {$channel=HIDDEN;};
    WS : ( ' ' | '\t' | '\r' | '\n') {$channel=HIDDEN;};

    STRING
        :   '"' ( ESC_SEQ | ~('\\'|'"') )* '"'
        ;

    CHAR:   '\'' ( ESC_SEQ | ~('\''|'\\') ) '\''
        ;

    fragment
    EXPONENT : ('e'|'E') ('+'|'-')? ('0'..'9')+ ;

    ESC_SEQ
        :   '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')
        |   UNICODE_ESC
        |   OCTAL_ESC
        ;

    fragment
    OCTAL_ESC
        :   '\\' ('0'..'3') ('0'..'7') ('0'..'7')
        |   '\\' ('0'..'7') ('0'..'7')
        |   '\\' ('0'..'7')
        ;

    fragment
    UNICODE_ESC
        :   '\\' 'u' HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT
        ;

    fragment
    DIGIT : '0'..'9';

    //FAKE_TOKEN : '1' '2' '3';
    /*
    DECIMAL_VALUE
        : '1'..'9' DIGIT*;
    */
    //DECIMAL_VALUE : DIGIT*;
    DECIMAL_VALUE : DIGIT+;

    //HEX_DIGIT : ('0'..'9'|'a'..'f'|'A'..'F') ;
    HEX_DIGIT : (DIGIT|'a'..'f'|'A'..'F') ;
    HEX_VALUE : '0x' HEX_DIGIT+;

    fragment
    HEADER_FILENAME : ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'_')*;

    /*
    BLANKSPACE_TAB
    //  : (' ' | '\t'){skip();};
        : (' ' | '\t') {$channel=HIDDEN;};
    */
    //fragment BLANK : (' '|'\t')+ {skip();};
    //BLANK : (' '|'\t') {skip();};
    //BLANK : (' '|'\t');
    //BLANK : (' '|'\t') {$channel=HIDDEN;};
    //BLANKS : (' '|'\t')+ {$channel=HIDDEN;};
    //BLANKS : (' '|'\t')+ {$channel=HIDDEN;};
    //BLANKS : (' '|'\t')+;
    //BLANK : (' '|'\t') {$channel=HIDDEN;};
    //BLANK : (' '|'\t') {skip();};
    BLANKS : (' '|'\t')+;
    //BLANKS : (' '|'\t')+ {skip();};
    //BLANKS : ' '+ {$channel=HIDDEN;};

    //singleInclude : '#include' ' '+ '"' ID '.h"' ;
    //singleInclude : '#include' ' '+ '"' ID+ '.h"' ;
    //singleInclude : '#include' ' '+ '"' HEADER_FILENAME '.h"';
    //singleInclude : '#include' ' ' '"' HEADER_FILENAME '.h"';
    //singleInclude : '#include "' HEADER_FILENAME '.h"';
    //fragment singleInclude : '#include' (' ')+ '"' ID '.h"';
    //singleInclude : '#include' (' '|'\t')+ '""' ID '.h"';
    //singleInclude : '#include' (' '|'\t')+ '"std_defs.h"';
    //singleInclude : '#include' BLANKS '"' ID '"' '.h';
    singleInclude : '#include' '"' ID '"' '.h';
    //include : singleInclude WS* -> singleInclude;
    include : singleInclude WS*;

    //startParse : include* identification+;
    //startParse : include+ identification+;
    //startParse : identification+;
    //startParse : manufacture deviceType deviceRevison ddRevision;
    startParse : include+ manufacture deviceType deviceRevison ddRevision;

    //manufacture : 'MANUFACTURER'^ BLANKS (DECIMAL_VALUE | HEX_VALUE) (','?)! WS*;
    //manufacture : 'MANUFACTURER'^ (BLANK+! (DECIMAL_VALUE | HEX_VALUE) (','?)! WS*;
    //manufacture : 'MANUFACTURER'^ BLANKS (DECIMAL_VALUE | HEX_VALUE) ','? WS*;
    manufacture : 'MANUFACTURER'^ BLANKS (HEX_VALUE | DECIMAL_VALUE) ','? WS*;
    deviceType : 'DEVICE_TYPE'^ BLANKS (DECIMAL_VALUE | HEX_VALUE) (','?)! WS*;
    deviceRevison : 'DEVICE_REVISION'^ BLANKS (DECIMAL_VALUE | HEX_VALUE)(','?)! WS*;
    ddRevision : 'DD_REVISION'^ BLANKS (DECIMAL_VALUE | HEX_VALUE)(','?)! WS*;

    //identification : definiton WS* (','?)! WS* -> definiton;
    //definiton : (ID)^ ('\t'!|' '!)+ (DECIMAL_VALUE | HEX_VALUE)
    //definiton : (ID)^ BLANKSPACE_TAB+ (DECIMAL_VALUE | HEX_VALUE)
    //definiton : ID ('\t'!|' '!)+ (DECIMAL_VALUE | HEX_VALUE);

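and commenting out its STRING rule:

    /*
    STRING
        :   '"' ( ESC_SEQ | ~('\\'|'"') )* '"'
        ;
    */

Debugging again, the first double quote is now recognized as hoped, but a different
MismatchedTokenException(0!=0)
appears:
Even so, this is a big step toward solving the problem,
because it explains why the first double quote never matched: the grammar defined a STRING rule that no parser rule used.
The lexer applies STRING regardless, so the whole quoted file name was consumed as a single STRING token, and the parser never received the separate '"' token it was waiting for.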
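The effect described in step 5 can be illustrated with a toy lexer. This is only a sketch in Python under a simplifying assumption (the longest match wins at each position); it is not how ANTLR v3 actually builds its lexer, and the rule names QUOTE and DOT_H are hypothetical stand-ins for the implicit tokens ANTLR creates for the literals '"' and '.h'.

```python
import re

def tokenize(text, rules):
    """Toy lexer: at each position the rule with the longest match wins."""
    tokens, pos = [], 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            if m and m.end() - pos > best_len:
                best_name, best_len = name, m.end() - pos
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens

# Hypothetical rule set mirroring the relevant parts of the grammar.
QUOTE = ("QUOTE", r'"')                       # the literal '"' used by singleInclude
STRING = ("STRING", r'"(?:\\.|[^"\\])*"')     # the template-provided STRING rule
ID = ("ID", r'[A-Za-z_][A-Za-z0-9_]*')
DOT_H = ("DOT_H", r'\.h')                     # the literal '.h'

# With STRING present, the quoted name becomes one token.
with_string = tokenize('"std_defs.h"', [QUOTE, STRING, ID, DOT_H])
# Without STRING, the parser gets the separate pieces it expects.
without_string = tokenize('"std_defs.h"', [QUOTE, ID, DOT_H])

print(with_string)
print(without_string)
```

With STRING in the rule list the parser-side sequence '"' ID '.h' '"' can never appear, because the opening quote is never delivered as its own token.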
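6. The error now occurs at the ID position. This looked as if it might come from another superfluous rule I had defined myself:

    fragment
    HEADER_FILENAME : ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'_')*;

So I removed it:

    /*
    fragment
    HEADER_FILENAME : ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'_')*;
    */

and tried again, but the error remained.
7. Along the way I also ran into something resembling a duplicate-definition problem; for details see: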
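【Summary】
1. Do not casually reuse the rules that ANTLRWorks pre-populates when it creates a new .g file,
such as ID, STRING and so on.
Otherwise they may later conflict with what you actually need to process.
That is what happened here: the template-generated STRING rule conflicted with the later matching of the double quote, producing
MismatchedTokenException(0!=0)
and preventing parsing from continuing.
2. The earlier ID definition is in fact usable, i.e.:

    ID  :   ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
        ;

works fine.
3. However, ID must not be marked as fragment, i.e. this cannot be used:

    fragment
    ID  :   ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
        ;

Otherwise MismatchedTokenException(0!=0) is reported. A fragment rule is only a helper for other lexer rules and never produces a token of its own, so a parser rule that waits for an ID token never receives one.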
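Point 3 can be sketched with another toy lexer in Python. It is a rough model under stated assumptions, not ANTLR's real behavior: rules flagged as fragments are simply skipped when producing tokens, unmatched characters are dropped (loosely imitating lexer error recovery), and the names QUOTE, DOT_H and matches_include_name are hypothetical.

```python
import re

def tokenize(text, rules):
    """Toy lexer: fragment rules are skipped, so they never become tokens."""
    tokens, pos = [], 0
    while pos < len(text):
        for name, pattern, is_fragment in rules:
            if is_fragment:
                continue  # helper only; never emitted on its own
            m = re.compile(pattern).match(text, pos)
            if m:
                tokens.append(name)
                pos = m.end()
                break
        else:
            pos += 1  # no rule matched: drop the character and move on
    return tokens

def matches_include_name(tokens):
    """Hypothetical parser check for the sequence: '"' ID '.h' '"'."""
    return tokens == ["QUOTE", "ID", "DOT_H", "QUOTE"]

def rules(id_is_fragment):
    return [
        ("QUOTE", r'"', False),
        ("DOT_H", r'\.h', False),
        ("ID", r'[A-Za-z_][A-Za-z0-9_]*', id_is_fragment),
    ]

plain = tokenize('"std_defs.h"', rules(id_is_fragment=False))
fragment = tokenize('"std_defs.h"', rules(id_is_fragment=True))

print(plain, matches_include_name(plain))
print(fragment, matches_include_name(fragment))
```

In the fragment case the token stream has no ID in it at all, which is the toy equivalent of the parser reporting a mismatched token at the ID position.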
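4. A double quote is indeed written as the ordinary literal:

    '"'

and nothing more is needed.
5. The MissingTokenException still occurs here; for now it looks like a bug.
For details see:
[Mostly solved] antlr v3: a grammar containing {$channel=HIDDEN;} fails to parse with MissingTokenException
6. The grammar currently in use is: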

    grammar DDParserDemo;

    options {
        output = AST;
        ASTLabelType = CommonTree; // type of $stat.tree ref etc...
    }

    //NEWLINE : '\r'? '\n' ;
    //NEWLINE : '\r' '\n' ;
    fragment
    NEWLINE : '\r'? '\n' ;

    fragment
    FLOAT
        :   ('0'..'9')+ '.' ('0'..'9')* EXPONENT?
        |   '.' ('0'..'9')+ EXPONENT?
        |   ('0'..'9')+ EXPONENT
        ;

    COMMENT
        :   '//' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
        |   '/*' ( options {greedy=false;} : . )* '*/' {$channel=HIDDEN;}
        ;

    //fragment WS : ( ' ' | '\t' | '\r' | '\n') {skip();};
    //fragment WS : ( ' ' | '\t' | '\r' | '\n') {$channel=HIDDEN;};
    WS : ( ' ' | '\t' | '\r' | '\n') {$channel=HIDDEN;};

    /*
    STRING
        :   '"' ( ESC_SEQ | ~('\\'|'"') )* '"'
        ;
    */

    CHAR:   '\'' ( ESC_SEQ | ~('\''|'\\') ) '\''
        ;

    fragment
    EXPONENT : ('e'|'E') ('+'|'-')? ('0'..'9')+ ;

    ESC_SEQ
        :   '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')
        |   UNICODE_ESC
        |   OCTAL_ESC
        ;

    fragment
    OCTAL_ESC
        :   '\\' ('0'..'3') ('0'..'7') ('0'..'7')
        |   '\\' ('0'..'7') ('0'..'7')
        |   '\\' ('0'..'7')
        ;

    fragment
    UNICODE_ESC
        :   '\\' 'u' HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT
        ;

    //fragment
    DIGIT : '0'..'9';

    //FAKE_TOKEN : '1' '2' '3';
    /*
    DECIMAL_VALUE
        : '1'..'9' DIGIT*;
    */
    //DECIMAL_VALUE : DIGIT*;
    DECIMAL_VALUE : DIGIT+;

    //HEX_DIGIT : ('0'..'9'|'a'..'f'|'A'..'F') ;
    HEX_DIGIT : (DIGIT|'a'..'f'|'A'..'F') ;
    HEX_VALUE : '0x' HEX_DIGIT+;

    /*
    fragment
    HEADER_FILENAME : ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'_')*;
    */

    /*
    BLANKSPACE_TAB
    //  : (' ' | '\t'){skip();};
        : (' ' | '\t') {$channel=HIDDEN;};
    */
    //fragment BLANK : (' '|'\t')+ {skip();};
    //BLANK : (' '|'\t') {skip();};
    //BLANK : (' '|'\t');
    //BLANK : (' '|'\t') {$channel=HIDDEN;};
    //BLANKS : (' '|'\t')+ {$channel=HIDDEN;};
    //BLANKS : (' '|'\t')+ {$channel=HIDDEN;};
    //BLANKS : (' '|'\t')+;
    //BLANK : (' '|'\t') {$channel=HIDDEN;};
    //BLANK : (' '|'\t') {skip();};
    BLANKS : (' '|'\t')+;
    //BLANKS : (' '|'\t')+ {skip();};
    //BLANKS : ' '+ {$channel=HIDDEN;};

    //singleInclude : '#include' ' '+ '"' ID '.h"' ;
    //singleInclude : '#include' ' '+ '"' ID+ '.h"' ;
    //singleInclude : '#include' ' '+ '"' HEADER_FILENAME '.h"';
    //singleInclude : '#include' ' ' '"' HEADER_FILENAME '.h"';
    //singleInclude : '#include "' HEADER_FILENAME '.h"';
    //fragment singleInclude : '#include' (' ')+ '"' ID '.h"';
    //singleInclude : '#include' (' '|'\t')+ '""' ID '.h"';
    //singleInclude : '#include' (' '|'\t')+ '"std_defs.h"';
    //singleInclude : '#include' BLANKS '"' ID '"' '.h';
    //singleInclude : '#include' '"' ID '"' '.h';
    //singleInclude : '#include' BLANKS '"' ID '"' '.h';
    //singleInclude : '#include' BLANKS '"' ID '.h' '"';
    //singleInclude : '#include' BLANKS '"' ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* '.h' '"';

    //ID_START : 'a'..'z'|'A'..'Z'|'_';
    //fragment ID_START : 'a'..'z'|'A'..'Z'|'_';
    //WHOLE_ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*;
    //WHOLE_ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'| DIGIT)*;
    //WHOLE_ID : ('a'..'z'|'A'..'Z'|'_') (HEX_DIGIT|'_')*;

    //fragment
    ID  :   ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
        ;

    //ID_START : 'a'..'z'|'A'..'Z'|'_';
    //WHOLE_ID : (ID_START) (ID_START | DIGIT)*;
    //ID_MIDDLE_END : ID_START | DIGIT;
    //ID_MIDDLE_END : HEX_DIGIT | '_';
    //singleInclude : '#include' BLANKS '"' ID_START ID_MIDDLE_END* '.h' '"';
    //singleInclude : '#include' BLANKS '"' ID_START (ID_START | DIGIT)* '.h' '"';
    //singleInclude : '#include' BLANKS '"' ID_START (ID_START | DIGIT)+ '.h' '"';
    //singleInclude : '#include' BLANKS '"' ID_START '.h' '"';
    //singleInclude : '#include' BLANKS '"' WHOLE_ID '.h' '"';
    singleInclude : '#include' BLANKS '"' ID '.h' '"';
    //include : singleInclude WS* -> singleInclude;
    include : singleInclude WS*;

    //startParse : include* identification+;
    //startParse : include+ identification+;
    //startParse : identification+;
    //startParse : manufacture deviceType deviceRevison ddRevision;
    startParse : include+ manufacture deviceType deviceRevison ddRevision;

    //manufacture : 'MANUFACTURER'^ BLANKS (DECIMAL_VALUE | HEX_VALUE) (','?)! WS*;
    //manufacture : 'MANUFACTURER'^ (BLANK+! (DECIMAL_VALUE | HEX_VALUE) (','?)! WS*;
    //manufacture : 'MANUFACTURER'^ BLANKS (DECIMAL_VALUE | HEX_VALUE) ','? WS*;
    manufacture : 'MANUFACTURER'^ BLANKS (HEX_VALUE | DECIMAL_VALUE) ','? WS*;
    deviceType : 'DEVICE_TYPE'^ BLANKS (DECIMAL_VALUE | HEX_VALUE) (','?)! WS*;
    deviceRevison : 'DEVICE_REVISION'^ BLANKS (DECIMAL_VALUE | HEX_VALUE)(','?)! WS*;
    ddRevision : 'DD_REVISION'^ BLANKS (DECIMAL_VALUE | HEX_VALUE)(','?)! WS*;

    //identification : definiton WS* (','?)! WS* -> definiton;
    //definiton : (ID)^ ('\t'!|' '!)+ (DECIMAL_VALUE | HEX_VALUE)
    //definiton : (ID)^ BLANKSPACE_TAB+ (DECIMAL_VALUE | HEX_VALUE)
    //definiton : ID ('\t'!|' '!)+ (DECIMAL_VALUE | HEX_VALUE);

to parse the same input file shown in the problem statement.
[Screenshot: debugging result in ANTLRWorks]
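As a quick cross-check outside ANTLR, the shape accepted by the final singleInclude rule ('#include' BLANKS '"' ID '.h' '"') can be approximated with a regular expression. This is only a sketch in Python: the name SINGLE_INCLUDE is hypothetical, and the sample assumes a single space after #include, since the exact whitespace of the original file is not known.

```python
import re

# Regex rendering of: '#include' BLANKS '"' ID '.h' '"'
# BLANKS -> [ \t]+ ; ID -> [A-Za-z_][A-Za-z0-9_]*
SINGLE_INCLUDE = re.compile(r'#include[ \t]+"([A-Za-z_][A-Za-z0-9_]*)\.h"')

# Include lines from the input file (spacing assumed).
sample = '''#include "std_defs.h"
#include "com_tbls.h"
#include "rev_defs.h"
#include "fbk_hm.h"
#include "fdiag_FBK2_Start.h"
#include "blk_err.h"'''

# Collect the ID part of every line that matches the rule shape.
headers = [m.group(1)
           for m in map(SINGLE_INCLUDE.fullmatch, sample.splitlines())
           if m]

print(headers)

# The earlier, broken ordering ('"' ID '"' '.h') corresponds to text like this,
# which is not what the input contains.
print(SINGLE_INCLUDE.fullmatch('#include "std_defs".h'))
```

All six include lines fit the corrected ordering, with '.h' inside the closing quote, while text shaped like the earlier rule does not match.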
Please credit the source when reposting: 在路上 » [Solved] ANTLR fails to parse a double quote: MismatchedTokenException(0!=0)