If I take the example from the homepage:
The strongest rain ever recorded in India shut down
the financial hub of Mumbai, snapped communication
lines, closed airports and forced thousands of people
to sleep in their offices or walk home during the night,
officials said today.
The Stanford parser:
LexicalizedParser lexicalizedParser = LexicalizedParser.loadModel("edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz");
Tree parse = lexicalizedParser.parse(text);
TreePrint treePrint = new TreePrint("penn, typedDependencies");
treePrint.printTree(parse);
delivers the following tree:
(ROOT
  (S
    (S
      (NP
        (NP (DT The) (JJS strongest) (NN rain))
        (VP
          (ADVP (RB ever))
          (VBN recorded)
          (PP (IN in)
            (NP (NNP India)))))
      (VP
        (VP (VBD shut)
          (PRT (RP down))
          (NP
            (NP (DT the) (JJ financial) (NN hub))
            (PP (IN of)
              (NP (NNP Mumbai)))))
        (, ,)
        (VP (VBD snapped)
          (NP (NN communication) (NNS lines)))
        (, ,)
        (VP (VBD closed)
          (NP (NNS airports)))
        (CC and)
        (VP (VBD forced)
          (NP
            (NP (NNS thousands))
            (PP (IN of)
              (NP (NNS people))))
          (S
            (VP (TO to)
              (VP
                (VP (VB sleep)
                  (PP (IN in)
                    (NP (PRP$ their) (NNS offices))))
                (CC or)
                (VP (VB walk)
                  (NP (NN home))
                  (PP (IN during)
                    (NP (DT the) (NN night))))))))))
    (, ,)
    (NP (NNS officials))
    (VP (VBD said)
      (NP-TMP (NN today)))
    (. .)))
Now I want to split the tree according to its structure to get the clauses. So in this example, I want to split the tree to get the following parts:
How can I do that?
The first suggestion I received was to use a recursive algorithm that prints all root-to-leaf paths.
Here is the code I tried:
public static void main(String[] args) throws IOException {
    LexicalizedParser lexicalizedParser = LexicalizedParser.loadModel("edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz");
    Tree tree = lexicalizedParser.parse("In a ceremony that was conspicuously short on pomp and circumstance at a time of austerity, Felipe, 46, took over from his father, King Juan Carlos, 76.");
    printAllRootToLeafPaths(tree, new ArrayList<String>());
}

private static void printAllRootToLeafPaths(Tree tree, List<String> path) {
    if (tree != null) {
        if (tree.isLeaf()) {
            path.add(tree.nodeString());
        }
        if (tree.children().length == 0) {
            System.out.println(path);
        } else {
            for (Tree child : tree.children()) {
                printAllRootToLeafPaths(child, path);
            }
        }
        path.remove(tree.nodeString());
    }
}
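}
Of course, this code is logically flawed: only leaves are ever added to the path, and leaves have no children, so nothing is recorded on the way down to them. The problem is that all actual words are leaves, so the algorithm just prints each word on its own: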
[The]
[strongest]
[rain]
[ever]
[recorded]
[in]
[India]
[shut]
[down]
[the]
[financial]
[hub]
[of]
[Mumbai]
[,]
[snapped]
[communication]
[lines]
[,]
[closed]
[airports]
[and]
[forced]
[thousands]
[of]
[people]
[to]
[sleep]
[in]
[their]
[offices]
[or]
[walk]
[home]
[during]
[the]
[night]
[,]
[officials]
[said]
[today]
[.]
Posted on 2014-06-24 09:25:57
Take a look at how to print all root-to-leaf paths in a binary tree, or how to split a binary tree.
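As a minimal sketch of the root-to-leaf idea applied to a parse tree (which is n-ary rather than binary), the traversal below pushes every node label onto the path on the way down and pops it again on the way back up, so each printed path runs from the root to a word. `Node`, `parse`, and `collectPaths` are hypothetical names written for this example so that it runs without the Stanford parser; with the real `Tree` class the same recursion should apply over `tree.children()`.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class RootToLeafPaths {

    /** Hypothetical stand-in for a parse-tree node; not part of the Stanford parser. */
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label) { this.label = label; }

        boolean isLeaf() { return children.isEmpty(); }
    }

    /** Reads a Penn-style bracketing such as "(NP (DT The))" into Node objects. */
    static Node parse(String bracketed) {
        String[] tokens = bracketed.replace("(", " ( ").replace(")", " ) ").trim().split("\\s+");
        return read(tokens, new int[] {0});
    }

    private static Node read(String[] tokens, int[] pos) {
        if (!tokens[pos[0]].equals("(")) {
            return new Node(tokens[pos[0]++]);   // a bare token is a leaf, i.e. a word
        }
        pos[0]++;                                // skip "("
        Node node = new Node(tokens[pos[0]++]);  // the label follows the bracket
        while (!tokens[pos[0]].equals(")")) {
            node.children.add(read(tokens, pos));
        }
        pos[0]++;                                // skip ")"
        return node;
    }

    /** Returns one "A > B > word" string per leaf, in left-to-right order. */
    static List<String> collectPaths(Node root) {
        List<String> out = new ArrayList<>();
        walk(root, new ArrayDeque<>(), out);
        return out;
    }

    private static void walk(Node node, Deque<String> path, List<String> out) {
        path.addLast(node.label);                // record every node, not only leaves
        if (node.isLeaf()) {
            out.add(String.join(" > ", path));
        } else {
            for (Node child : node.children) {
                walk(child, path, out);
            }
        }
        path.removeLast();                       // undo by position, not by value
    }

    public static void main(String[] args) {
        Node tree = parse("(S (NP (DT The) (NN rain)) (VP (VBD fell)))");
        for (String path : collectPaths(tree)) {
            System.out.println(path);
        }
        // Prints:
        // S > NP > DT > The
        // S > NP > NN > rain
        // S > VP > VBD > fell
    }
}
```

Popping with `removeLast()` rather than removing by value matters here, because the same label (for example `NP` or `VP`) can occur several times on one path.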
https://stackoverflow.com/questions/24382581
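If the goal is clauses rather than paths, one possible approach is to collect every subtree whose label is a Penn Treebank clause label (S, SBAR, SBARQ, SINV, SQ) and read off its words. This is only one reading of "splitting by structure", and it assumes that a clause corresponds to such a subtree. `Node`, `parse`, and `clauses` are hypothetical names for this sketch, which uses its own small tree type instead of the Stanford `Tree` class so that it runs standalone.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ClauseSplitter {

    /** Clause-level labels from the Penn Treebank tag set. */
    private static final Set<String> CLAUSE_LABELS =
            new HashSet<>(Arrays.asList("S", "SBAR", "SBARQ", "SINV", "SQ"));

    /** Hypothetical stand-in for a parse-tree node; not part of the Stanford parser. */
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label) { this.label = label; }

        boolean isLeaf() { return children.isEmpty(); }
    }

    /** Reads a Penn-style bracketing such as "(NP (DT The))" into Node objects. */
    static Node parse(String bracketed) {
        String[] tokens = bracketed.replace("(", " ( ").replace(")", " ) ").trim().split("\\s+");
        return read(tokens, new int[] {0});
    }

    private static Node read(String[] tokens, int[] pos) {
        if (!tokens[pos[0]].equals("(")) {
            return new Node(tokens[pos[0]++]);   // a bare token is a leaf, i.e. a word
        }
        pos[0]++;                                // skip "("
        Node node = new Node(tokens[pos[0]++]);  // the label follows the bracket
        while (!tokens[pos[0]].equals(")")) {
            node.children.add(read(tokens, pos));
        }
        pos[0]++;                                // skip ")"
        return node;
    }

    /** Returns the words of every clause-labelled subtree, outer clauses first. */
    static List<String> clauses(Node root) {
        List<String> out = new ArrayList<>();
        collect(root, out);
        return out;
    }

    private static void collect(Node node, List<String> out) {
        if (!node.isLeaf() && CLAUSE_LABELS.contains(node.label)) {
            List<String> words = new ArrayList<>();
            leaves(node, words);
            out.add(String.join(" ", words));
        }
        for (Node child : node.children) {
            collect(child, out);                 // keep descending to find nested clauses
        }
    }

    /** Appends the words under the given node in left-to-right order. */
    private static void leaves(Node node, List<String> words) {
        if (node.isLeaf()) {
            words.add(node.label);
            return;
        }
        for (Node child : node.children) {
            leaves(child, words);
        }
    }

    public static void main(String[] args) {
        Node tree = parse("(ROOT (S (S (NP (NN rain)) (VP (VBD closed) (NP (NNS airports))))"
                + " (, ,) (NP (NNS officials)) (VP (VBD said)) (. .)))");
        for (String clause : clauses(tree)) {
            System.out.println(clause);
        }
    }
}
```

Each outer clause still contains the words of the clauses nested inside it. Whether nested clauses should instead be cut out of their parent depends on which parts are wanted, which the question does not spell out.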