【Problem】
The ANTLR grammar:

    grammar preprocess;
    //lexer grammar preprocess;

    options{
        language=Java;
    }

    ......

    fragment
    MACRO_TEXT :
        (   '\\'! '\n' {newline();}  // escaped newline
        |   ~'\n'
        )*;

    DIRECTIVE :
        (   '#define' WS* defineMacro=ID WS* defineText=MACRO_TEXT
        )
        {
            String macroKey = defineMacro.getText();
            String macroValue = defineText.getText();
            System.out.println("Found macro: " + macroKey + "=" + macroValue);
            defines.put(macroKey, macroValue);
            skip();
        };

    ......

Compiling it produces this message:

    [13:35:19] warning(149): preprocess.g:0:1: rewrite syntax or operator with no output option; setting output=AST
【Solution process】
1. The message says:

    rewrite syntax or operator with no output option; setting output=AST

The rule that had just been added was:

    fragment
    MACRO_TEXT :
        (   '\\'! '\n' {newline();}  // escaped newline
        |   ~'\n'
        )*;

It contains an exclamation mark. According to the ANTLR documentation, the ! suffix is a tree construction operator,
so ANTLR treats it as rewrite syntax.
The grammar indeed did not set an output option, so, following the hint, output was set to AST:

    options{
        language=Java;
        output=AST;
    }

Compiling again produced different errors:

    [13:43:54] warning(200): preprocess.g:97:44: As a result, alternative(s) 2 were disabled for that input
    [13:43:54] error(133): preprocess__.g:0:1: illegal option output
    [13:43:54] warning(200): D:\DevRoot\IndustrialMobileAutomation\HandheldDataSetter\ANTLR\projects\v1.5\HartEddlParser_local_TFS\preprocess\remove_comment\preprocess.g:97:44: Decision can match input such as "{'\t'..'\n', '\r', ' '}" using multiple alternatives: 1, 2
    As a result, alternative(s) 2 were disabled for that input
2. After reading:
[antlr-interest] ANTLR3.0b2 – rule STRING uses rewrite syntax or operator with no output option
it became clear that the exclamation mark ! is parser syntax, but here it was used in a lexer rule, so the grammar itself was wrong.
The options are either to remove the exclamation mark from the lexer rule, or to turn the lexer token DIRECTIVE into a parser rule.
The rule was changed to:

    fragment
    MACRO_TEXT :
        (   '\\'{skip();} '\n' {newline();}  // escaped newline
        |   ~'\n'
        )*;

    DIRECTIVE :
        (   '#define' WS* defineMacro=ID WS* defineText=MACRO_TEXT
        )
        {
            String macroKey = defineMacro.getText();
            String macroValue = defineText.getText();
            System.out.println("Found macro: " + macroKey + "=" + macroValue);
            defines.put(macroKey, macroValue);
            skip();
        };

With this change the grammar compiles.
At least this particular syntax error is gone.
Note:
What remains is a separate problem:

    [13:50:16] warning(200): preprocess.g:97:44: As a result, alternative(s) 2 were disabled for that input
    Decision can match input such as "{'\t'..'\n', '\r', ' '}" using multiple alternatives: 1, 2
    As a result, alternative(s) 2 were disabled for that input

It still has to be resolved later.
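【Summary】
As summarized earlier in the post on ANTLR syntax:
[Notes] The various syntaxes in ANTLR: Element Sets, Element Labels, Tree construction operators
the exclamation mark ! suffix exists only for parser rules.
In other words, using it in a lexer rule is illegal.
Here, in:

    fragment
    MACRO_TEXT :
        (   '\\'! '\n' {newline();}  // escaped newline
        |   ~'\n'
        )*;

    DIRECTIVE :
        (   '#define' WS* defineMacro=ID WS* defineText=MACRO_TEXT
        )

the exclamation mark ends up inside the lexer rule DIRECTIVE (MACRO_TEXT is a fragment, so it is only ever matched as part of DIRECTIVE),
which is why ANTLR complains.
Removing the exclamation mark ! and using {skip();} instead:

    fragment
    MACRO_TEXT :
        (   '\\'{skip();} '\n' {newline();}  // escaped newline
        |   ~'\n'
        )*;

    DIRECTIVE :
        (   '#define' WS* defineMacro=ID WS* defineText=MACRO_TEXT
        )

makes the message go away. It was meant to have a similar effect of discarding the matched character, but the postscript below shows that it does not actually do so.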
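【Postscript】
1. Testing showed that even with skip() added, the backslash is not discarded.
In other words, the intended drop-the-character effect is not achieved.
The skip() call only affects how the lexer treats what it is matching while it processes the input;
the value obtained through the defineText label still contains the backslash.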
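Why an embedded action cannot drop the character can be sketched with a small model. This is a simplified stand-in, not the ANTLR runtime: SpanToken and matchMacroText are hypothetical names, and the assumption is that a label such as defineText gets its text as a slice of the input between a start index and a stop index.

```java
/** Simplified model (not ANTLR classes) of a label whose text is a slice of the input. */
final class SpanTokenDemo {

    /** Hypothetical token that only remembers where it starts and stops in the input. */
    static final class SpanToken {
        private final String input;
        private final int start; // inclusive
        private final int stop;  // inclusive

        SpanToken(String input, int start, int stop) {
            this.input = input;
            this.start = start;
            this.stop = stop;
        }

        /** The text is recomputed from the input, so every consumed character is included. */
        String getText() {
            return input.substring(start, stop + 1);
        }
    }

    /** Mimics ( '\\' '\n' | ~'\n' )* : consumes up to the first newline that is not escaped. */
    static SpanToken matchMacroText(String input, int start) {
        int pos = start;
        while (pos < input.length()) {
            char c = input.charAt(pos);
            if (c == '\\' && pos + 1 < input.length() && input.charAt(pos + 1) == '\n') {
                // An embedded action would run here. It can set flags,
                // but the two characters are still consumed and pos still advances.
                pos += 2;
            } else if (c == '\n') {
                break; // unescaped newline ends the macro text
            } else {
                pos++;
            }
        }
        return new SpanToken(input, start, pos - 1);
    }

    public static void main(String[] args) {
        SpanToken text = matchMacroText("0, \\\n1\nrest", 0);
        System.out.println(text.getText().contains("\\")); // prints "true"
    }
}
```

Under that assumption, the only place left to remove the backslash is after getText() has returned the string.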
2. Next attempt:

    fragment
    //MACRO_TEXT : ( ('\\'{skip();System.out.println("skip line tail back slash");} '\r'? '\n')
    MACRO_TEXT : ( ('\\'{$channel=HIDDEN;System.out.println("set back slash to hidden");} '\r'? '\n')
        | (~('\r'|'\n'))
        )*;

    DIRECTIVE :
        (   '#define' WS* defineMacro=ID WS* defineText=MACRO_TEXT
        )
        {
            String macroKey = defineMacro.getText();
            String macroValue = defineText.getText();
            System.out.println("Found macro: " + macroKey + "=" + macroValue);
            defines.put(macroKey, macroValue);
            skip();
        };

In the generated Java lexer, _hidden cannot be found, so the Java code fails to compile.
Then the commented-out line was moved above the fragment keyword:

    //MACRO_TEXT : ( ('\\'{skip();System.out.println("skip line tail back slash");} '\r'? '\n')
    fragment
    MACRO_TEXT : ( ('\\'{$channel=HIDDEN;System.out.println("set back slash to hidden");} '\r'? '\n')
        | (~('\r'|'\n'))
        )*;

    DIRECTIVE :
        (   '#define' WS* defineMacro=ID WS* defineText=MACRO_TEXT
        )
        {
            String macroKey = defineMacro.getText();
            String macroValue = defineText.getText();
            System.out.println("Found macro: " + macroKey + "=" + macroValue);
            defines.put(macroKey, macroValue);
            skip();
        };

The problem stays the same.
3. Next attempt, inlining the fragment into DIRECTIVE:

    /*
    //MACRO_TEXT : ( ('\\'{skip();System.out.println("skip line tail back slash");} '\r'? '\n')
    fragment
    MACRO_TEXT : ( ('\\'{$channel=HIDDEN;System.out.println("set back slash to hidden");} '\r'? '\n')
        | (~('\r'|'\n'))
        )*;
    */

    DIRECTIVE :
        (   '#define' WS* defineMacro=ID WS* defineText=( ('\\'{$channel=HIDDEN;System.out.println("set back slash to hidden");} '\r'? '\n') | (~('\r'|'\n')) )*
        )
        {
            String macroKey = defineMacro.getText();
            String macroValue = defineText.getText();
            System.out.println("Found macro: " + macroKey + "=" + macroValue);
            defines.put(macroKey, macroValue);
            skip();
        };

Now this statement in the generated lexer:

    String macroValue = defineText.getText();

no longer compiles, because the label is declared as:

    int defineText;

In short, this looks hard to solve inside the grammar.
4. If nothing else works, the only option left is to write Java code by hand that removes the
backslash plus line break
sequence.
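One way such hand-written code could look, without any regular expression, is a single pass over the string. This is only a sketch: ContinuationStripper and stripContinuations are hypothetical names, and the assumption is that a continuation is a backslash immediately followed by either \n or \r\n.

```java
/** Regex-free sketch: removes "backslash + line break" continuations from a macro value. */
final class ContinuationStripper {

    /**
     * Returns the text with every backslash that is immediately followed by
     * "\n" or "\r\n" removed, together with that line break.
     * A backslash followed by anything else is kept unchanged.
     */
    static String stripContinuations(String text) {
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (c == '\\') {
                int next = i + 1;
                // Optional carriage return before the line feed.
                if (next < text.length() && text.charAt(next) == '\r') {
                    next++;
                }
                if (next < text.length() && text.charAt(next) == '\n') {
                    i = next + 1; // skip backslash, optional \r, and \n
                    continue;
                }
            }
            out.append(c);
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripContinuations("0, \\\r\n1")); // prints "0, 1"
    }
}
```

In a lexer action, a helper like this would be applied to the string returned by defineText.getText() before storing it in defines.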
5. Looking at the ANTLR API for the Java runtime Lexer class:
http://www.antlr3.org/api/Java/org/antlr/runtime/Lexer.html
the next thing to try was setText():

    fragment
    //MACRO_TEXT : ( (('\\'){skip();System.out.println("skip line tail back slash");} '\r'? '\n')
    //MACRO_TEXT : ( ('\\'{$channel=HIDDEN;System.out.println("set back slash to hidden");} '\r'? '\n')
    MACRO_TEXT : ( (('\\'){setText("");System.out.println("set back slash to empty");} '\r'? '\n')
        | (~('\r'|'\n'))
        )*;

    DIRECTIVE :
        (   '#define' WS* defineMacro=ID WS* defineText=MACRO_TEXT
        )
        {
            String macroKey = defineMacro.getText();
            String macroValue = defineText.getText();
            System.out.println("Found macro: " + macroKey + "=" + macroValue);
            defines.put(macroKey, macroValue);
            skip();
        };

The backslash still is not discarded.
6. The plan now is to do the replacement by hand in the action:

    fragment
    //MACRO_TEXT : ( (('\\'){skip();System.out.println("skip line tail back slash");} '\r'? '\n')
    //MACRO_TEXT : ( ('\\'{$channel=HIDDEN;System.out.println("set back slash to hidden");} '\r'? '\n')
    //MACRO_TEXT : ( (('\\'){setText("");System.out.println("set back slash to empty");} '\r'? '\n')
    MACRO_TEXT : ( ('\\' '\r'? '\n')
        | (~('\r'|'\n'))
        )*;

    DIRECTIVE :
        (   '#define' WS* defineMacro=ID WS* defineText=MACRO_TEXT
        )
        {
            String macroKey = defineMacro.getText();
            String macroValue = defineText.getText();
            macroValue = macroValue.replaceAll("\\\r?\n", "");
            System.out.println("Found macro: " + macroKey + "=" + macroValue);
            defines.put(macroKey, macroValue);
            skip();
        };

The result: the backslash was not replaced.
Debugging showed that the actual content is

    \ \r \n

so the Java replacement code above was simply written wrong: its pattern does not match the backslash.
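The difference between the broken pattern and a working one comes down to two layers of escaping (Java string literal, then regex). The sketch below compares them side by side; RegexEscapeDemo, broken and fixed are hypothetical names used only for illustration.

```java
/** Compares the pattern from the failed attempt with a correctly escaped one. */
final class RegexEscapeDemo {

    // Java literal "\\\r?\n" yields the regex: backslash, CR character, '?', LF character.
    // The regex backslash merely quotes the CR, so the pattern means "optional CR, then LF".
    static final String BROKEN = "\\\r?\n";

    // Java literal "\\\\\\r?\\n" yields the regex \\\r?\n :
    // a literal backslash, an optional CR, then LF.
    static final String FIXED = "\\\\\\r?\\n";

    static String broken(String s) {
        return s.replaceAll(BROKEN, "");
    }

    static String fixed(String s) {
        return s.replaceAll(FIXED, "");
    }

    public static void main(String[] args) {
        String value = "0, \\\r\n1"; // 0 , space backslash CR LF 1
        System.out.println(broken(value)); // prints "0, \1"  (backslash survives)
        System.out.println(fixed(value));  // prints "0, 1"
    }
}
```

The broken pattern removes the line break but leaves the backslash in place, which is why the macro value still contained it.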
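After more trial and error, the intended effect finally worked (the working code is in the summary below).
That is as far as this goes for now.
In short, hand-written Java code now handles the line-ending
backslash CR LF
\ \r\n
sequence in multi-line #define values. Note that the final replaceAll call substitutes "\r\n" for each match, so the backslash is dropped while the line break itself is kept.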
【Summary】
The grammar finally used is:

    fragment
    //MACRO_TEXT : ( (('\\'){skip();System.out.println("skip line tail back slash");} '\r'? '\n')
    //MACRO_TEXT : ( ('\\'{$channel=HIDDEN;System.out.println("set back slash to hidden");} '\r'? '\n')
    //MACRO_TEXT : ( (('\\'){setText("");System.out.println("set back slash to empty");} '\r'? '\n')
    MACRO_TEXT : ( ('\\' '\r'? '\n')
        | (~('\r'|'\n'))
        )*;

    DIRECTIVE :
        (   '#define' WS* defineMacro=ID WS* defineText=MACRO_TEXT
        )
        {
            String macroKey = defineMacro.getText();
            String macroValue = defineText.getText();
            String removedBackslashMacroValue = macroValue.replaceAll("\\\\(\\r)?\\n", "\r\n"); //remove \ \r \n
            System.out.println("Found macro: " + macroKey + "=" + removedBackslashMacroValue);
            defines.put(macroKey, removedBackslashMacroValue);
            skip();
        };

The corresponding part of the generated lexer code:

    int defineTextStart186 = getCharIndex();
    int defineTextStartLine186 = getLine();
    int defineTextStartCharPos186 = getCharPositionInLine();
    mMACRO_TEXT();
    defineText = new CommonToken(input, Token.INVALID_TOKEN_TYPE, Token.DEFAULT_CHANNEL, defineTextStart186, getCharIndex()-1);
    defineText.setLine(defineTextStartLine186);
    defineText.setCharPositionInLine(defineTextStartCharPos186);

    }

    String macroKey = defineMacro.getText();
    String macroValue = defineText.getText();
    String removedBackslashMacroValue = macroValue.replaceAll("\\\\(\\r)?\\n", "\r\n"); //remove \ \r \n
    System.out.println("Found macro: " + macroKey + "=" + removedBackslashMacroValue);
    defines.put(macroKey, removedBackslashMacroValue);
    skip();

    }

    state.type = _type;
    state.channel = _channel;

The output after processing:

    Found macro: STR_ACTUATOR_FUNC="Actuator function"
    Found macro: _SET_ACTUATOR_FUNCTION=0x8A /* 138 */
    Found macro: __ALL_RESPONSE_LIST= 0, SUCCESS, [no_command_specific_errors];
    Found macro reference, so replce NON_STR_DEFINE to READ&WRITE
    Found macro reference, so replce _SET_ACTUATOR_FUNCTION to 0x8A /* 138 */
    Found macro reference, so replce __ALL_RESPONSE_LIST to 0, SUCCESS, [no_command_specific_errors];
    Found macro reference, so replce STR_SHOW_SEP_VAR to "Show seperated variables"

Further processing is left for later.
When reposting, please credit: 在路上 » [Solved] ANTLR grammar error: rewrite syntax or operator with no output option; setting output=AST