
Finding a path in an nltk.tree.Tree

Stack Overflow user
Asked 2015-02-24 03:25:56
1 answer, 1.3K views, 0 followers, 2 votes

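I am using nltk.tree.Tree to read a constituency-based parse tree. I need to find the path of nodes that leads from one specific word in the tree to another specific word.

A simple example:

Here is the parse tree for the sentence "saw the dog":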

(VP (VERB saw) (NP (DET the) (NOUN dog)))

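If I want the path between the words "the" and "dog", it should be: DET, NP, NOUN

I am not even sure how to start: how do I find the values of the leaves? How do I find the parent of a leaf or node?

Thanks.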


1 Answer

Stack Overflow user

Answered 2015-02-27 02:37:55

Here is the code:

from nltk.tree import ParentedTree

def get_lca_length(location1, location2):
    # Length of the common prefix of two tree positions; that prefix is
    # the position of the lowest common ancestor (lca).
    i = 0
    while i < len(location1) and i < len(location2) and location1[i] == location2[i]:
        i += 1
    return i

def get_labels_from_lca(ptree, lca_len, location):
    # Labels of the nodes from the lca (inclusive) down to the parent of the leaf.
    labels = []
    for i in range(lca_len, len(location)):
        labels.append(ptree[location[:i]].label())
    return labels

def findPath(ptree, text1, text2):
    leaf_values = ptree.leaves()
    # index() returns the first match, so a repeated word resolves to its
    # first occurrence in the sentence.
    leaf_index1 = leaf_values.index(text1)
    leaf_index2 = leaf_values.index(text2)

    location1 = ptree.leaf_treeposition(leaf_index1)
    location2 = ptree.leaf_treeposition(leaf_index2)

    # find the length of the lowest common ancestor (lca) position
    lca_len = get_lca_length(location1, location2)

    # find the path from the lca down to node1
    labels1 = get_labels_from_lca(ptree, lca_len, location1)
    # ignore the first element (the lca), because it will be counted in the second part of the path
    result = labels1[1:]
    # reverse, because we want to go from the node up to the lca
    result = result[::-1]

    # add the path from the lca down to node2
    result = result + get_labels_from_lca(ptree, lca_len, location2)
    return result

ptree = ParentedTree.fromstring("(VP (VERB saw) (NP (DET the) (NOUN dog)))")
# pprint() prints the tree itself and returns None, so it is not wrapped in print()
ptree.pprint()
print(findPath(ptree, 'the', "dog"))  # ['DET', 'NP', 'NOUN']

It is based on the list representation of the tree (see here). Also check similar questions.
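To make the idea of a tree position concrete, here is a minimal sketch that does not use nltk at all. It assumes a hypothetical stand-in format, a nested list `[label, child, ...]` with plain strings as leaves, and the helper names (`subtree_at`, `leaf_positions`, `common_prefix_length`) are invented for illustration. A position is a tuple of child indices from the root, which is also what `leaf_treeposition` returns in nltk, and the lowest common ancestor of two leaves sits at the common prefix of their positions.

```python
# Hypothetical stand-in for a parse tree: [label, child, child, ...].
# Leaves are plain strings. This mimics the idea, not nltk's classes.
tree = ["VP", ["VERB", "saw"], ["NP", ["DET", "the"], ["NOUN", "dog"]]]

def subtree_at(node, position):
    """Follow a tuple of child indices down from the root."""
    for index in position:
        node = node[index + 1]  # +1 skips the label stored in slot 0
    return node

def leaf_positions(node, prefix=()):
    """Yield (leaf, position) pairs in left-to-right order."""
    if isinstance(node, str):
        yield node, prefix
        return
    for index, child in enumerate(node[1:]):
        yield from leaf_positions(child, prefix + (index,))

def common_prefix_length(pos1, pos2):
    """Number of leading child indices that two positions share."""
    n = 0
    for a, b in zip(pos1, pos2):
        if a != b:
            break
        n += 1
    return n

# Assumes every word is unique; a repeated word would overwrite its earlier entry.
positions = dict(leaf_positions(tree))
the_pos, dog_pos = positions["the"], positions["dog"]

# The shared prefix of the two positions addresses the lowest common ancestor.
lca_pos = the_pos[:common_prefix_length(the_pos, dog_pos)]
lca_label = subtree_at(tree, lca_pos)[0]

print(the_pos, dog_pos, lca_pos, lca_label)
```

Here "the" sits at position (1, 0, 0) and "dog" at (1, 1, 0); they share only the first index, so the lowest common ancestor is the node at (1,), which is NP.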

Votes: 1
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/28681741
