Q: Coreference resolution with CoreNLP takes a very long time

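Stack Overflow user

Asked 2016-04-21 22:54:06

2 answers, 345 views, 0 followers, 0 votes

I want to resolve the coreferences in a document, and I am trying to run the example provided at the following link: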

import edu.stanford.nlp.hcoref.CorefCoreAnnotations;
import edu.stanford.nlp.hcoref.data.CorefChain;
import edu.stanford.nlp.hcoref.data.Mention;
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.util.CoreMap;

import java.util.Properties;

public class CorefExample {

  public static void main(String[] args) throws Exception {

    Annotation document = new Annotation("Barack Obama was born in Hawaii.  He is the president.  Obama was elected in 2008.");
    Properties props = new Properties();
    props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner,parse,mention,coref");
    StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
    pipeline.annotate(document);
    System.out.println("---");
    System.out.println("coref chains");
    for (CorefChain cc : document.get(CorefCoreAnnotations.CorefChainAnnotation.class).values()) {
      System.out.println("\t"+cc);
    }
    for (CoreMap sentence : document.get(CoreAnnotations.SentencesAnnotation.class)) {
      System.out.println("---");
      System.out.println("mentions");
      for (Mention m : sentence.get(CorefCoreAnnotations.CorefMentionsAnnotation.class)) {
        System.out.println("\t"+m);
      }
    }
  }
}

There is only one sentence to resolve, yet the program runs for about an hour before producing a result. Is that normal?

I ran the program with this option: -Xmx4g

java

stanford-nlp

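2 Answers

Stack Overflow user

Answered 2017-04-07 22:18:25

Have you tried giving it 6 GB of memory? The documentation mentions that newer versions of CoreNLP use a neural network for coreference resolution, so it is slower than the rule-based algorithm and needs more memory. In my case it was slow and ran out of memory on two sentences with 4 GB of RAM.

For CoreNLP 3.7.0, the example command for the English neural system uses 5 GB: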

java -Xmx5g -cp stanford-corenlp-3.7.0.jar:stanford-corenlp-models-3.7.0.jar:* edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,lemma,ner,parse,mention,coref -coref.algorithm neural -file example_file.txt
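When the pipeline is launched from an IDE or a wrapper script, it can be worth confirming that the heap flag actually reached the JVM. The following is a minimal sketch using only the standard library; `HeapCheck` is a hypothetical helper name and not part of CoreNLP.

```java
// Sketch only: HeapCheck is a hypothetical helper, not a CoreNLP class.
class HeapCheck {

  // Maximum heap the JVM will attempt to use, in megabytes.
  static long maxHeapMegabytes() {
    return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
  }

  public static void main(String[] args) {
    // With -Xmx5g this should report a value close to 5120 MB;
    // the exact figure depends on the JVM and garbage collector.
    System.out.println("Max heap available to this JVM: " + maxHeapMegabytes() + " MB");
  }
}
```

If the reported value is far below what was requested, the option was probably passed to the program rather than to the JVM.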

You can also choose the algorithm you prefer with the property:

coref.algorithm

For example:

PropertiesUtils.asProperties("annotators", "your annotators","coref.algorithm","neural");

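There are three methods to choose from.

In the basic sample code, they use "dcoref" instead of "coref" as the coreference annotator. That is the deterministic method, which is faster but less accurate.

Votes: 0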
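The two setups mentioned in this answer could be written as plain `Properties` objects, roughly as sketched here. `CorefProps` and its method names are hypothetical; the neural annotator list is copied from the command above, while the `dcoref` annotator list is an assumption modeled on the basic example and should be checked against the documentation for your CoreNLP version.

```java
import java.util.Properties;

// Sketch only: CorefProps and its method names are hypothetical helpers.
class CorefProps {

  // Neural system: same annotators and coref.algorithm value as the command above.
  static Properties neural() {
    Properties props = new Properties();
    props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner,parse,mention,coref");
    props.setProperty("coref.algorithm", "neural");
    return props;
  }

  // Deterministic system: the "dcoref" annotator replaces "mention,coref".
  // Assumption: this annotator list mirrors the basic example.
  static Properties deterministic() {
    Properties props = new Properties();
    props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner,parse,dcoref");
    return props;
  }

  public static void main(String[] args) {
    System.out.println("neural annotators:        " + neural().getProperty("annotators"));
    System.out.println("deterministic annotators: " + deterministic().getProperty("annotators"));
  }
}
```

Either object would then be passed to `new StanfordCoreNLP(props)`, as in the question's code.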

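Stack Overflow user

Answered 2021-02-04 17:54:59

I did not have this exact problem, but I also ran out of heap space when using CoreNLP with coref in the annotators property. The cause was that I was creating new StanfordCoreNLP(props) many times instead of reusing the same object.

Votes: 0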
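One way to guarantee a single construction is a small lazy holder, sketched here under assumptions. `PipelineHolder` is a hypothetical name, and the type parameter `T` stands in for `StanfordCoreNLP` so the example compiles without the CoreNLP jars; in real use the supplier would be something like `() -> new StanfordCoreNLP(props)`.

```java
import java.util.function.Supplier;

// Sketch only: PipelineHolder is a hypothetical helper, not a CoreNLP class.
// T stands in for StanfordCoreNLP so this compiles without the CoreNLP jars.
final class PipelineHolder<T> {
  private final Supplier<T> factory;
  private T instance;

  PipelineHolder(Supplier<T> factory) {
    this.factory = factory;
  }

  // Builds the expensive object on first use, then returns the same instance.
  synchronized T get() {
    if (instance == null) {
      instance = factory.get();
    }
    return instance;
  }
}

class PipelineHolderDemo {
  public static void main(String[] args) {
    int[] builds = {0};
    // Stand-in factory that counts how many times it is invoked.
    PipelineHolder<Object> holder = new PipelineHolder<>(() -> {
      builds[0]++;
      return new Object();
    });
    for (int i = 0; i < 3; i++) {
      holder.get(); // each "document" reuses the same object
    }
    System.out.println("constructions: " + builds[0]); // prints "constructions: 1"
  }
}
```

Each document is then annotated through `holder.get()`, so the models are loaded once rather than once per call.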

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/36773294
