Understanding and using the Stanford NLP coreference resolution tool (Python 3.7)

Stack Overflow user

Asked 2020-07-04 23:31:48

1 answer, 1.1K views, 0 followers, 2 votes

I am trying to understand the Stanford NLP coreference tool. This is my code, and it runs:

import os
os.environ["CORENLP_HOME"] = "/home/daniel/StanfordCoreNLP/stanford-corenlp-4.0.0"

from stanza.server import CoreNLPClient

text = 'When he came from Brazil, Daniel was fortified with letters from Conan but otherwise did not know a soul except Herbert. Yet this giant man from the Northeast, who had never worn an overcoat or experienced a change of seasons, did not seem surprised by his past.'

# 'coref.algorithm' is the setting that was switched between runs
with CoreNLPClient(annotators=['tokenize', 'ssplit', 'pos', 'lemma', 'ner', 'parse', 'depparse', 'coref'],
                   properties={'annotators': 'coref', 'coref.algorithm': 'neural'},
                   timeout=30000, memory='16G') as client:

    ann = client.annotate(text)

# Collect the metadata of every mention in every coreference chain
chains = ann.corefChain
chain_dict = dict()
for index_chain, chain in enumerate(chains):
    chain_dict[index_chain] = {}
    chain_dict[index_chain]['ref'] = ''
    chain_dict[index_chain]['mentions']=[{'mentionID':mention.mentionID,
                                          'mentionType':mention.mentionType,
                                          'number':mention.number,
                                          'gender':mention.gender,
                                          'animacy':mention.animacy,
                                          'beginIndex':mention.beginIndex,
                                          'endIndex':mention.endIndex,
                                          'headIndex':mention.headIndex,
                                          'sentenceIndex':mention.sentenceIndex,
                                          'position':mention.position,
                                          'ref':'',
                                          } for mention in chain.mention ]


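# Recover the surface text of each mention from its sentence and token offsets
for k, v in chain_dict.items():
    print('key', k)
    mentions = v['mentions']
    for mention in mentions:
        # beginIndex/endIndex are used as a token slice inside the mention's sentence
        words_list = ann.sentence[mention['sentenceIndex']].token[mention['beginIndex']:mention['endIndex']]
        mention['ref'] = ' '.join(t.word for t in words_list)
        print(mention['ref'])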

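I tried three algorithms:

Statistical (the code above, with 'coref.algorithm' set to 'statistical'). Results:

he
this giant man from the Northeast, who had never worn an overcoat or experienced a change of seasons
Daniel
his

Neural:

this giant man from the Northeast, who had never worn an overcoat or experienced a change of seasons
his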

Deterministic (I got the error below):

Starting server with command: java -cp /home/daniel/StanfordCoreNLP/stanford-corenlp-4.0.0/* edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 30000 -threads 5 -maxCharLength 100000 -quiet True -serverProperties corenlp_server-9fedd1e9dfb14c9e.props -preload tokenize,ssplit,pos,lemma,ner,parse,depparse,coref

Traceback (most recent call last):

  File "[…]", line 1, in <module>
    runfile('/home/daniel/Documentos/[…]', wdir='/home/daniel/Documentos/[…]')

  File "/home/daniel/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 827, in runfile
    execfile(filename)

  File "/home/daniel/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 110, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "/home/daniel/Documentos/[…]", line 21, in <module>
    ann = client.annotate(text)

  File "/home/daniel/anaconda3/lib/python3.7/site-packages/stanza/server/client.py", line 470, in annotate
    r = self._request(text.encode('utf-8'), request_properties, **kwargs)

  File "/home/daniel/anaconda3/lib/python3.7/site-packages/stanza/server/client.py", line 404, in _request
    raise AnnotationException(r.text)

AnnotationException: java.lang.RuntimeException: java.lang.IllegalArgumentException: No enum constant

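Questions:

1. Why do I get this error with the deterministic algorithm?
2. Any code that uses Stanford NLP from Python seems to be much slower than comparable code using spaCy or NLTK. I know those libraries have no coreference resolution. But when I use, for example, from nltk.parse.stanford import StanfordDependencyParser for dependency parsing, it is much faster than this StanfordNLP library. Is there any way to speed up CoreNLPClient in Python?
3. I will use this library on long texts. Is it better to process the whole text in smaller pieces? Can long texts cause wrong coreference results (I have seen very strange results from this coreference library on long texts)? Is there an optimal size?
4. Results:

The results from the statistical algorithm seem better. I expected the best results to come from the neural algorithm. Do you agree? The statistical algorithm finds 4 valid mentions, while the neural algorithm finds only 2.

Am I missing something?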

python-3.x

nlp

stanford-stanza

coreference-resolution

1 Answer

Stack Overflow user

Answered 2020-08-14 15:53:06

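1. You can find the list of supported algorithms in the Java documentation: link
2. You may want to start the server once and then keep using it, something like:

# This is the slowest part: the models are loaded here
client = CoreNLPClient(...)
ann = client.annotate(text)
...
client.stop()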
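
The start-once pattern above can be sketched a little more fully. This is a minimal sketch, not a benchmark: it assumes the startup cost is dominated by launching the server and loading models, so that sending many texts through one client amortizes it. `annotate_many`, `run_with_corenlp` and `EchoClient` are hypothetical names introduced for illustration; the CoreNLPClient settings are copied from the question, and `run_with_corenlp` needs stanza plus a local CoreNLP install.

```python
def annotate_many(client, texts):
    """Annotate several texts with one already-running client."""
    return [client.annotate(text) for text in texts]


def run_with_corenlp(texts):
    """Start one CoreNLP server, annotate every text, then shut it down.

    Requires stanza and a local CoreNLP install (CORENLP_HOME set).
    """
    from stanza.server import CoreNLPClient  # imported lazily on purpose

    # Settings copied from the question; adjust them to your setup.
    with CoreNLPClient(
            annotators=['tokenize', 'ssplit', 'pos', 'lemma', 'ner',
                        'parse', 'depparse', 'coref'],
            properties={'annotators': 'coref', 'coref.algorithm': 'neural'},
            timeout=30000, memory='16G') as client:
        return annotate_many(client, texts)


class EchoClient:
    """Stand-in used only to show the call pattern without a server."""

    def annotate(self, text):
        return text.upper()


print(annotate_many(EchoClient(), ['one', 'two']))  # ['ONE', 'TWO']
```

The point of the design is that the `with` block (or the `client.stop()` call) wraps the whole batch rather than each individual text.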

But I cannot give you any clue about 3 and 4.
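
On question 3, one option that could be tried is to split a long text on paragraph boundaries and annotate each piece separately. The sketch below is only an illustration under stated assumptions: `chunk_paragraphs` and its `max_chars` default are hypothetical, paragraphs are assumed to be separated by blank lines, and nothing here shows that chunking improves coreference quality. Coreference chains cannot cross chunk boundaries, so mentions in different chunks will never be linked. The server command in the question uses -maxCharLength 100000, so each chunk would also need to stay under that limit.

```python
def chunk_paragraphs(text, max_chars=5000):
    """Group blank-line-separated paragraphs into chunks of at most max_chars.

    A single paragraph longer than max_chars is kept whole as its own chunk.
    """
    chunks, current, size = [], [], 0
    for paragraph in (p.strip() for p in text.split('\n\n')):
        if not paragraph:
            continue
        # +2 accounts for the blank line that rejoins paragraphs in a chunk
        added = len(paragraph) + (2 if current else 0)
        if current and size + added > max_chars:
            # The current chunk is full: emit it and start a new one
            chunks.append('\n\n'.join(current))
            current, size = [], 0
            added = len(paragraph)
        current.append(paragraph)
        size += added
    if current:
        chunks.append('\n\n'.join(current))
    return chunks


sample = 'First paragraph.\n\nSecond one.\n\nThird paragraph here.'
print(chunk_paragraphs(sample, max_chars=30))
# ['First paragraph.\n\nSecond one.', 'Third paragraph here.']
```

Each chunk could then be passed to `client.annotate` through a single running client.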

Votes: 1

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/62735456
