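I am using nltk.tree.Tree to read constituency-based parse trees. I need to find the path of nodes that leads from one specific word in the tree to another specific word.
A simple example:
This is the parse tree of the sentence "saw the dog":
(VP (VERB saw) (NP (DET the) (NOUN dog)))
If I want the path between the words the and dog, it should be: DET, NP, NOUN.
I'm not even sure how to start: how do I find the values of the leaves? How do I find the parent of a leaf or node?
Thanks.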
Posted on 2015-02-27 02:37:55
Here is the code:
from nltk.tree import ParentedTree

def get_lca_length(location1, location2):
    # Length of the common prefix of the two tree positions;
    # that prefix is the position of the lowest common ancestor (LCA).
    i = 0
    while i < len(location1) and i < len(location2) and location1[i] == location2[i]:
        i += 1
    return i

def get_labels_from_lca(ptree, lca_len, location):
    # Labels of the nodes from the LCA down to the parent of the leaf at `location`.
    labels = []
    for i in range(lca_len, len(location)):
        labels.append(ptree[location[:i]].label())
    return labels

def findPath(ptree, text1, text2):
    # Note: list.index() returns the first match, so repeated words resolve to their first occurrence.
    leaf_values = ptree.leaves()
    leaf_index1 = leaf_values.index(text1)
    leaf_index2 = leaf_values.index(text2)
    location1 = ptree.leaf_treeposition(leaf_index1)
    location2 = ptree.leaf_treeposition(leaf_index2)
    # Find the depth of the lowest common ancestor (LCA).
    lca_len = get_lca_length(location1, location2)
    # Find the path from the LCA down to the first word.
    labels1 = get_labels_from_lca(ptree, lca_len, location1)
    # Drop the first element (the LCA itself); it is counted in the second part of the path.
    result = labels1[1:]
    # Reverse, because we want to go from the first word up to the LCA.
    result = result[::-1]
    # Add the path from the LCA down to the second word.
    result = result + get_labels_from_lca(ptree, lca_len, location2)
    return result

ptree = ParentedTree.fromstring("(VP (VERB saw) (NP (DET the) (NOUN dog)))")
ptree.pprint()  # pprint() prints the tree itself and returns None
print(findPath(ptree, 'the', "dog"))  # ['DET', 'NP', 'NOUN']

https://stackoverflow.com/questions/28681741
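The code in the answer relies on tree positions: nltk addresses every node by a tuple of child indices counted from the root, so the parent of a node is its position with the last index dropped, and the lowest common ancestor of two leaves sits at the longest common prefix of their positions. The sketch below illustrates that idea in plain Python, without nltk. The nested-tuple `TREE` and the helpers `leaf_positions`, `node_at` and `common_prefix_len` are hypothetical stand-ins made up for this illustration, not part of the nltk API.

```python
# Hypothetical stand-in for an nltk Tree (an assumption for illustration):
# each node is (label, child, child, ...), and a leaf is a plain string.
TREE = ("VP", ("VERB", "saw"), ("NP", ("DET", "the"), ("NOUN", "dog")))


def leaf_positions(tree, prefix=()):
    """Yield (position, word) for every leaf, left to right."""
    for i, child in enumerate(tree[1:]):
        pos = prefix + (i,)
        if isinstance(child, str):
            yield pos, child
        else:
            yield from leaf_positions(child, pos)


def node_at(tree, position):
    """Follow a tuple of child indices down from the root."""
    node = tree
    for i in position:
        node = node[1 + i]  # index 0 holds the label, children start at 1
    return node


def common_prefix_len(pos1, pos2):
    """Length of the shared prefix; that prefix is the position of the LCA."""
    n = 0
    for a, b in zip(pos1, pos2):
        if a != b:
            break
        n += 1
    return n


positions = {word: pos for pos, word in leaf_positions(TREE)}
the_pos, dog_pos = positions["the"], positions["dog"]
lca_pos = the_pos[:common_prefix_len(the_pos, dog_pos)]

print(the_pos, dog_pos)                # (1, 0, 0) (1, 1, 0)
print(node_at(TREE, the_pos[:-1])[0])  # parent of the leaf "the": DET
print(node_at(TREE, lca_pos)[0])       # lowest common ancestor: NP
```

With a real nltk tree, `ptree.leaf_treeposition(i)` returns the same kind of tuple, and indexing the tree with a position (`ptree[position]`) plays the role of `node_at` here.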