【Solved】The CommonTree AST obtained from ANTLR does not contain the expected parent-child relationships

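【Problem】

While developing with ANTLR, I had already written the grammar (.g) source file.

Debugging it in ANTLRWorks already produced the expected tree.

However, after generating code via

Generate -> Generate Code

and adding the generated code to an Android project, I inspected the resulting AST with the following code: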

import java.io.IOException;
import org.antlr.runtime.*;
import org.antlr.runtime.tree.CommonTree;

String ddFile = "/mnt/sdcard/hartEddlTestFile_pos.ddl";
CharStream cs = null;
try {
    cs = new ANTLRFileStream(ddFile);
} catch (IOException e1) {
    // cs stays null if the file cannot be read; real code should stop here
    e1.printStackTrace();
}
 
HartEddlLexer lexer = new HartEddlLexer(cs);
CommonTokenStream tokens = new CommonTokenStream();
tokens.setTokenSource(lexer);
HartEddlParser parser = new HartEddlParser(tokens);
 
try {
    HartEddlParser.startParse_return parserResult = parser.startParse();
    // outputTree's children are a flat list of leaves; they do not carry the expected parent-child relationships
    CommonTree outputTree = parserResult.tree;
} catch (RecognitionException e) {
    e.printStackTrace();
}

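It turned out that the children of this AST variable were all of the leaf nodes obtained from parsing the entire input, 384 in total,

rather than the 6 child nodes I had hoped for, each of which would contain its own content.

In other words, the CommonTree AST I obtained did not contain the corresponding parent-child relationships.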
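The symptom above can be checked mechanically by measuring the shape of the tree. The following is only a sketch on a stand-in node class: `ShapeNode`, `TreeShapeCheck` and the even 6 x 64 split are hypothetical names and numbers made up for illustration, not part of ANTLR or of the original grammar. With the ANTLR 3 runtime, the same checks would go through `getChildCount()` and `getChild(i)` of `org.antlr.runtime.tree.Tree`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a tree node; NOT an ANTLR runtime class.
class ShapeNode {
    final String text;
    final List<ShapeNode> children = new ArrayList<>();

    ShapeNode(String text) { this.text = text; }

    ShapeNode add(ShapeNode child) { children.add(child); return this; }
}

class TreeShapeCheck {
    // Number of levels in the tree: a lone node has depth 1.
    static int depth(ShapeNode node) {
        int deepest = 0;
        for (ShapeNode child : node.children) {
            deepest = Math.max(deepest, depth(child));
        }
        return deepest + 1;
    }

    // True when no child of the root has children of its own.
    static boolean isFlat(ShapeNode root) {
        for (ShapeNode child : root.children) {
            if (!child.children.isEmpty()) return false;
        }
        return true;
    }

    // Builds the observed shape: one root with leafCount leaf children.
    static ShapeNode flatTree(int leafCount) {
        ShapeNode root = new ShapeNode("nil");
        for (int i = 0; i < leafCount; i++) root.add(new ShapeNode("token" + i));
        return root;
    }

    // Builds the hoped-for shape: `groups` children, each owning `perGroup` leaves.
    static ShapeNode nestedTree(int groups, int perGroup) {
        ShapeNode root = new ShapeNode("root");
        for (int g = 0; g < groups; g++) {
            ShapeNode group = new ShapeNode("group" + g);
            for (int i = 0; i < perGroup; i++) group.add(new ShapeNode("token" + i));
            root.add(group);
        }
        return root;
    }

    public static void main(String[] args) {
        ShapeNode flat = flatTree(384);     // what the generated parser returned
        ShapeNode nested = nestedTree(6, 64); // what was expected (split is made up)
        System.out.println(flat.children.size() + " children, depth " + depth(flat) + ", flat=" + isFlat(flat));
        System.out.println(nested.children.size() + " children, depth " + depth(nested) + ", flat=" + isFlat(nested));
    }
}
```

A root whose children are all leaves has depth 2 no matter how many leaves there are, which is the signature of the flat tree described here.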

【Solution process】

1. I then read the following passage in The Definitive ANTLR Reference (PDF) by Parr, the author of ANTLR:

Encoding structure in the intermediate-form tree makes walking it much easier and faster than scanning a linear list of symbols as a parser does. Figuring out that 3+4 is an expression from token stream INT + INT is much harder for a computer than looking at a tree node that explicitly says “Hi, I’m an addition expression with two operands.” The most convenient way to encode input structure is with a special tree called an abstract syntax tree (AST). ASTs contain only those nodes associated with input symbols and are, therefore, not parse trees.

Parse trees also record input structure, but they have nodes for all rule references used to recognize the input. Parse trees are much bigger and highly sensitive to changes to the parser grammar.

…

Before learning to build ASTs, let’s consider what ASTs should look like for various input structures. Keep in mind the following primary goals as you read this section and when you design ASTs in general. ASTs should do the following:

• Record the meaningful input tokens (and only the meaningful tokens)

• Encode, in the two-dimensional structure of the tree, the grammatical structure used by the parser to match the associated tokens but not the superfluous rule names themselves

• Be easy for the computer to recognize and navigate
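To make the distinction in the quoted passage concrete, here is a small sketch of the two tree shapes for the input 3+4. `SketchNode` and `AstVsParseTreeSketch` are hypothetical classes written for this illustration (not ANTLR runtime classes), and the rule names `expr` and `atom` are assumed, since the book excerpt does not name them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal tree node; NOT an ANTLR class.
class SketchNode {
    final String label;
    final List<SketchNode> children = new ArrayList<>();

    SketchNode(String label, SketchNode... kids) {
        this.label = label;
        for (SketchNode kid : kids) children.add(kid);
    }

    // LISP-style rendering: a leaf prints as "x", an inner node as "(label child child ...)".
    String toStringTree() {
        if (children.isEmpty()) return label;
        StringBuilder sb = new StringBuilder("(").append(label);
        for (SketchNode child : children) sb.append(' ').append(child.toStringTree());
        return sb.append(')').toString();
    }

    // Total number of nodes in this subtree, including this node.
    int size() {
        int n = 1;
        for (SketchNode child : children) n += child.size();
        return n;
    }
}

class AstVsParseTreeSketch {
    // Parse tree: one inner node per rule invocation (expr and atom are assumed rule names).
    static SketchNode parseTree() {
        return new SketchNode("expr",
                new SketchNode("atom", new SketchNode("3")),
                new SketchNode("+"),
                new SketchNode("atom", new SketchNode("4")));
    }

    // AST: the operator token becomes the root and the operands its children.
    static SketchNode ast() {
        return new SketchNode("+", new SketchNode("3"), new SketchNode("4"));
    }

    public static void main(String[] args) {
        System.out.println(parseTree().toStringTree()); // (expr (atom 3) + (atom 4))
        System.out.println(ast().toStringTree());       // (+ 3 4)
    }
}
```

The parse tree needs 6 nodes for what the AST encodes in 3, and its inner nodes are rule names rather than input tokens, which is the size and sensitivity trade-off the passage describes.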

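I also consulted:

How can I build parse trees not ASTs?

and:

Please help me to create parse tree from java and ANTLR

So I first used Run -> Debug in ANTLRWorks on the existing grammar to generate debug-mode versions of the Lexer and Parser.

(Note: for the differences between normally generated code and debug-mode code, see:

【Problem】Error when executing xxx_return to obtain parserResult while debugging ANTLR on Android)

Then I added the debug-mode files:

xxxLexer.java

xxxParser.java

to the Android project.

Then I wrote the test code: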

import java.io.IOException;
import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
import org.antlr.runtime.debug.*;
 
String ddFile = "/mnt/sdcard/hartEddlTestFile_pos.ddl";
CharStream cs = null;
try {
    cs = new ANTLRFileStream(ddFile);
} catch (IOException e1) {
    // cs stays null if the file cannot be read; real code should stop here
    e1.printStackTrace();
}
 
// Reference: http://blog.sina.com.cn/s/blog_7e4ac8b501015dgf.html
HartEddlLexer lexer = new HartEddlLexer(cs);
CommonTokenStream tokens = new CommonTokenStream(lexer);
 
// create a debug event listener that builds parse trees
ParseTreeBuilder builder = new ParseTreeBuilder("startParse");
// create the parser attached to the token buffer
// and tell it which debug event listener to use
// (this constructor only exists in the debug-mode parser)
HartEddlParser parser = new HartEddlParser(tokens, builder);
try {
    parser.startParse();
    ParseTree parseTree = builder.getTree();
    System.out.println(parseTree.toStringTree());
} catch (RecognitionException e1) {
    e1.printStackTrace();
}

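This finally produced the corresponding ParseTree.

It contains the parent-child relationships between the nodes that we need.

It is a real tree.

【Summary】

From the same ANTLR grammar (.g) file, two kinds of code can be generated:

1. Normal code

In ANTLRWorks, Generate -> Generate Code produces the normal (non-debug) xxxLexer.java and xxxParser.java.

Then use code like this: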

String ddFile = "/mnt/sdcard/hartEddlTestFile_pos.ddl";
CharStream cs = null;
try {
    cs = new ANTLRFileStream(ddFile);
} catch (IOException e1) {
    // cs stays null if the file cannot be read; real code should stop here
    e1.printStackTrace();
}
 
HartEddlLexer lexer = new HartEddlLexer(cs);
CommonTokenStream tokens = new CommonTokenStream();
tokens.setTokenSource(lexer);
HartEddlParser parser = new HartEddlParser(tokens);
try {
    HartEddlParser.startParse_return parserResult = parser.startParse();
    //CommonTree outputTree = parserResult.tree;
    // the cast is harmless and keeps this compiling when getTree() is declared to return Object
    CommonTree outputTree = (CommonTree) parserResult.getTree();
    System.out.println(outputTree);
    // prints the flat list of child nodes
    System.out.println(outputTree.getChildren().toString());
} catch (RecognitionException e) {
    e.printStackTrace();
}

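to obtain the CommonTree AST.

This AST contains only the data obtained from parsing the input,

i.e. the leaf nodes you see while debugging in ANTLRWorks. (In ANTLR 3, a grammar that builds an AST without tree operators such as ^ or rewrite rules (->) produces exactly this kind of flat list of tokens, which is presumably what happened here.)

2. Debug code

In ANTLRWorks, Run -> Debug produces the debug-mode xxxLexer.java and xxxParser.java.

Then use code like this: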

String ddFile = "/mnt/sdcard/hartEddlTestFile_pos.ddl";
CharStream cs = null;
try {
    cs = new ANTLRFileStream(ddFile);
} catch (IOException e1) {
    // cs stays null if the file cannot be read; real code should stop here
    e1.printStackTrace();
}
 
HartEddlLexer lexer = new HartEddlLexer(cs);
CommonTokenStream tokens = new CommonTokenStream(lexer);
 
// How can I build parse trees not ASTs?
// http://www.antlr.org/wiki/pages/viewpage.action?pageId=1760
// create a debug event listener that builds parse trees
ParseTreeBuilder builder = new ParseTreeBuilder("startParse");
// create the parser attached to the token buffer
// and tell it which debug event listener to use
HartEddlParser parser = new HartEddlParser(tokens, builder);
try {
    parser.startParse();
    ParseTree parseTree = builder.getTree();
    System.out.println(parseTree.toStringTree());
} catch (RecognitionException e1) {
    e1.printStackTrace();
}

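to obtain the ParseTree.

This ParseTree contains not only the leaf nodes shown in ANTLRWorks (the tokens produced by lexing and parsing the input)

but also the underlying logical parent-child relationships (which are really just the call hierarchy of the grammar rules).

In this case, the ParseTree is exactly what we need.

Later code can then work on this ParseTree, which carries the parent-child relationships, to extract whatever information and data we need.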
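As a rough idea of what such extraction code could look like, the sketch below walks a parse-tree-shaped structure depth-first, collects every node created by a given rule, and joins the token texts beneath it. `PtNode` and `ParseTreeWalkSketch` are hypothetical stand-ins (not ANTLR classes), and the rule names `variable` and `command` plus the sample tokens are invented, since the original grammar is not shown beyond `startParse`. Against the real ANTLR 3 runtime the same recursion would use `getChildCount()` and `getChild(i)` from `org.antlr.runtime.tree.Tree`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a parse tree node; NOT an ANTLR runtime class.
class PtNode {
    final String label;      // rule name for rule nodes, token text for token nodes
    final boolean isRule;
    final List<PtNode> children = new ArrayList<>();

    private PtNode(String label, boolean isRule) {
        this.label = label;
        this.isRule = isRule;
    }

    static PtNode rule(String name, PtNode... kids) {
        PtNode node = new PtNode(name, true);
        for (PtNode kid : kids) node.children.add(kid);
        return node;
    }

    static PtNode token(String text) {
        return new PtNode(text, false);
    }
}

class ParseTreeWalkSketch {
    // Depth-first search for every rule node with the given rule name.
    static List<PtNode> findByRule(PtNode root, String ruleName) {
        List<PtNode> hits = new ArrayList<>();
        collectRules(root, ruleName, hits);
        return hits;
    }

    private static void collectRules(PtNode node, String ruleName, List<PtNode> hits) {
        if (node.isRule && node.label.equals(ruleName)) hits.add(node);
        for (PtNode child : node.children) collectRules(child, ruleName, hits);
    }

    // Token texts under a node, left to right, joined by single spaces.
    static String leafText(PtNode node) {
        List<String> texts = new ArrayList<>();
        collectTokens(node, texts);
        return String.join(" ", texts);
    }

    private static void collectTokens(PtNode node, List<String> out) {
        if (!node.isRule) out.add(node.label);
        for (PtNode child : node.children) collectTokens(child, out);
    }

    // Invented sample tree: startParse -> two variable rules and one command rule.
    static PtNode sampleTree() {
        return PtNode.rule("startParse",
                PtNode.rule("variable", PtNode.token("VARIABLE"), PtNode.token("v1")),
                PtNode.rule("variable", PtNode.token("VARIABLE"), PtNode.token("v2")),
                PtNode.rule("command", PtNode.token("COMMAND"), PtNode.token("c1")));
    }

    public static void main(String[] args) {
        for (PtNode hit : findByRule(sampleTree(), "variable")) {
            System.out.println(leafText(hit));
        }
    }
}
```

Because each rule invocation is its own subtree, selecting by rule name and then reading the leaves underneath is enough to pull out one logical item at a time, which the flat AST could not offer.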

