Question about using coref on its own in StanfordNLP

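Stack Overflow user

Asked 2015-02-04 14:32:11

Answers: 2 | Views: 286 | Followers: 0 | Votes: 0

I want to use the coreference (coref) component of Stanford NLP on its own. That is, I run my own tokenizer and sentence splitter and do all the work that has to happen before coref myself. I build the document Annotation and set all of those annotations on it. But when I then try to run coref, it fails, because I am not using the StanfordCoreNLP class. Here is my code: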

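    edu.stanford.nlp.pipeline.Annotation document = new edu.stanford.nlp.pipeline.Annotation(doc.toString());
    Properties props = new Properties();
    ArrayList<edu.stanford.nlp.ling.CoreLabel> tokenAnnotate = new ArrayList<>();
    // sentence list filled in the loop below (declaration not shown in the original snippet; type assumed)
    ArrayList<edu.stanford.nlp.util.CoreMap> stanfordsnetence = new ArrayList<>();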
    //document.set(edu.stanford.nlp.ling.CoreAnnotations.TextAnnotation.class,doc.toString());
    int countToken=0;
    int countSentence=0;
    for(CoreMap sentence: sentences) {
        ArrayList <edu.stanford.nlp.ling.CoreLabel> tokenAnnotateCoreMap=new ArrayList<>();
        // traversing the words in the current sentence
        // a CoreLabel is a CoreMap with additional token-specific methods
        edu.stanford.nlp.util.CoreMap stanfordCorMap=new edu.stanford.nlp.pipeline.Annotation(sentence.toString());
        int countFirstToken=countToken;
        for (CoreLabel token: sentence.get(com.mobin.tp.textAnnotator.common.dto.CoreAnnotations.TokensAnnotation.class)) {
            // this is the text of the token
            countToken++;
            edu.stanford.nlp.ling.CoreLabel coreLabel=mobinStanfordConverter.mobinToStanfordCorelabelConvertor(token);
            tokenAnnotateCoreMap.add(coreLabel);
            tokenAnnotate.add(coreLabel);
        }
        stanfordCorMap.set(edu.stanford.nlp.ling.CoreAnnotations.TokensAnnotation.class,tokenAnnotateCoreMap);
        stanfordCorMap.set(edu.stanford.nlp.ling.CoreAnnotations.TokenBeginAnnotation.class,countFirstToken);
        stanfordCorMap.set(edu.stanford.nlp.ling.CoreAnnotations.TokenEndAnnotation.class,countToken);
        stanfordCorMap.set(CoreAnnotations.SentenceIndexAnnotation.class,countSentence);
        stanfordsnetence.add(stanfordCorMap);
        countSentence++;
        // this is the parse tree of the current sentence
        //Tree tree = sentence.get(TreeCoreAnnotations.TreeAnnotation.class);
        // this is the Stanford dependency graph of the current sentence
        //SemanticGraph dependencies = sentence.get(SemanticGraphCoreAnnotations.CollapsedCCProcessedDependenciesAnnotation.class);
    }

    document.set(edu.stanford.nlp.ling.CoreAnnotations.TokensAnnotation.class,tokenAnnotate);
    document.set(edu.stanford.nlp.ling.CoreAnnotations.SentencesAnnotation.class,stanfordsnetence);

    Annotator annotator=new ParserAnnotator(false,0);

    annotator.annotate(document);
    annotator=new DeterministicCorefAnnotator(props);
    annotator.annotate(document);

This is the error I get:

attempted to fetch annotator "parse" before the annotator pool was created!

java.lang.AssertionError
at edu.stanford.nlp.dcoref.RuleBasedCorefMentionFinder.getParser(RuleBasedCorefMentionFinder.java:345)
at edu.stanford.nlp.dcoref.RuleBasedCorefMentionFinder.parse(RuleBasedCorefMentionFinder.java:338)
at edu.stanford.nlp.dcoref.RuleBasedCorefMentionFinder.findSyntacticHead(RuleBasedCorefMentionFinder.java:273)
at edu.stanford.nlp.dcoref.RuleBasedCorefMentionFinder.findHead(RuleBasedCorefMentionFinder.java:215)
at edu.stanford.nlp.dcoref.RuleBasedCorefMentionFinder.extractPredictedMentions(RuleBasedCorefMentionFinder.java:88)
at edu.stanford.nlp.pipeline.DeterministicCorefAnnotator.annotate(DeterministicCorefAnnotator.java:89)

stanford-nlp

java

2 Answers

Stack Overflow user

Answered 2015-02-04 18:31:46

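As far as I know, Stanford's NLP library uses a multi-pass sieve algorithm to resolve coreference. You can refer to this answer to see how to use the library, and to this javadoc for the full documentation.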
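To make the "multi-pass sieve" idea concrete, here is a deliberately tiny sketch using only the Java standard library. It is not Stanford's implementation: the class name `ToySieve`, the pronoun list, and the two rules are assumptions chosen for illustration. It only shows the general shape, in which passes run from more precise to less precise and each pass merges mention clusters built by the earlier ones.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class ToySieve {
    // Tiny hand-written pronoun list; an assumption for this toy only.
    private static final Set<String> PRONOUNS =
            new HashSet<>(Arrays.asList("he", "she", "it", "him", "her", "they", "them"));

    private static boolean isPronoun(String mention) {
        return PRONOUNS.contains(mention.toLowerCase(Locale.ROOT));
    }

    // Move every mention in a's cluster into b's cluster.
    private static void merge(int[] cluster, int a, int b) {
        int from = cluster[a];
        int to = cluster[b];
        for (int k = 0; k < cluster.length; k++) {
            if (cluster[k] == from) {
                cluster[k] = to;
            }
        }
    }

    /** Returns, for each mention (in document order), the id of its cluster. */
    public static int[] resolve(List<String> mentions) {
        int n = mentions.size();
        int[] cluster = new int[n];
        for (int i = 0; i < n; i++) {
            cluster[i] = i; // every mention starts in its own cluster
        }

        // Pass 1 (high precision): identical non-pronoun strings corefer.
        for (int i = 1; i < n; i++) {
            if (isPronoun(mentions.get(i))) continue;
            for (int j = 0; j < i; j++) {
                if (!isPronoun(mentions.get(j))
                        && mentions.get(i).equalsIgnoreCase(mentions.get(j))) {
                    merge(cluster, i, j);
                    break;
                }
            }
        }

        // Pass 2 (lower precision): a pronoun joins the nearest preceding non-pronoun.
        for (int i = 1; i < n; i++) {
            if (!isPronoun(mentions.get(i))) continue;
            for (int j = i - 1; j >= 0; j--) {
                if (!isPronoun(mentions.get(j))) {
                    merge(cluster, i, j);
                    break;
                }
            }
        }
        return cluster;
    }

    public static void main(String[] args) {
        List<String> mentions = Arrays.asList("Steve", "He", "Steve", "the barn", "it");
        System.out.println(Arrays.toString(resolve(mentions))); // prints [0, 0, 0, 3, 3]
    }
}
```

The real system works on parsed mentions with many more passes and features; the point here is only that a later, riskier rule builds on clusters that earlier, safer rules have already formed.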

Here is the code I used to test the library:

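// Imports assume the CoreNLP 3.x package layout, where dcoref lives in edu.stanford.nlp.dcoref.
import java.util.Map;
import java.util.Properties;

import edu.stanford.nlp.dcoref.CorefChain;
import edu.stanford.nlp.dcoref.CorefChain.CorefMention;
import edu.stanford.nlp.dcoref.CorefCoreAnnotations.CorefChainAnnotation;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;

public class CoReferenceAnalyzer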
{
    public static void main(String[] args)
    {
        Properties props = new Properties();
        props.put("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

        String text = "My horse, whom I call Steve, is my best friend. He comforts me when I ride him";
        Annotation document = new Annotation(text);
        pipeline.annotate(document);

        Map<Integer, CorefChain> graph = document.get(CorefChainAnnotation.class);
        System.out.println("Graph: " + graph.toString());
        for(Map.Entry<Integer, CorefChain> entry : graph.entrySet())
        {
            CorefChain chain = entry.getValue();
            CorefMention repMention = chain.getRepresentativeMention();
            System.out.println("Chain: " + chain.toString());
            System.out.println("Rep: " + repMention.toString());
        }
    }
}

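You will see output like the following (the loop prints a Chain/Rep pair for every chain; only the first pair is reproduced here):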

Graph: {1=CHAIN1-["Steve" in sentence 1, "He" in sentence 2, "him" in sentence 2], 2=CHAIN2-["My horse , whom I call Steve" in sentence 1], 3=CHAIN3-["My horse" in sentence 1], 4=CHAIN4-["My" in sentence 1, "I" in sentence 1, "my" in sentence 1, "me" in sentence 2, "I" in sentence 2], 6=CHAIN6-["my best friend" in sentence 1], 8=CHAIN8-["He comforts me when I ride him" in sentence 2]}
Chain: CHAIN1-["Steve" in sentence 1, "He" in sentence 2, "him" in sentence 2]
Rep: "Steve" in sentence 1

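Votes: 0

Stack Overflow user

Answered 2016-04-29 10:54:45

I think I ran into the same problem as you. The cause may be that the jar published on Maven differs from the source code of edu.stanford.nlp.simple.Document.java in version 3.6.0.

In the source code, the Document constructor looks like this: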

  public Document(String text) {
    StanfordCoreNLP.getDefaultAnnotatorPool(EMPTY_PROPS, new AnnotatorImplementations());  // cache the annotator pool
    this.impl = CoreNLPProtos.Document.newBuilder().setText(text);
  }

But in the code from the Maven jar, it looks like this:

  public Document(String text) {
    this.impl = CoreNLPProtos.Document.newBuilder().setText(text);
  }

The difference is obvious: the Maven jar's constructor is missing the call that caches the annotator pool.
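Before rebuilding anything, it can help to confirm which jar a class is actually loaded from at runtime. The following is a minimal sketch using only the standard library; the class name `WhichJar` and the helper `locationOf` are made up for illustration, and the default class name is simply the one discussed in this answer.

```java
import java.security.CodeSource;

public class WhichJar {
    // Returns the jar or directory a class was loaded from, or a placeholder
    // when the JVM does not expose one (e.g. for core classes like String).
    static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return src.getLocation().toString();
    }

    public static void main(String[] args) {
        // Class to look up; pass another fully qualified name as the first argument.
        String name = args.length > 0 ? args[0] : "edu.stanford.nlp.simple.Document";
        try {
            // initialize=false: locate the class without running its static initializers
            Class<?> cls = Class.forName(name, false, WhichJar.class.getClassLoader());
            System.out.println(name + " loaded from " + locationOf(cls));
        } catch (ClassNotFoundException e) {
            System.out.println(name + " is not on the classpath");
        }
    }
}
```

Run it with the same classpath as the failing program. If the printed location is the Maven jar rather than a jar built from source, the mismatch described in this answer is a plausible suspect.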

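So the way to solve the problem is to download the source code from https://github.com/stanfordnlp/CoreNLP, build a new jar from it with Ant, and then replace the old jar with the new one.

I hope this works for you.

Votes: 0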

Original page content provided by Stack Overflow.

Original link: https://stackoverflow.com/questions/28314894
