Q: Java query about finding a word in a sentence

Stack Overflow user

Asked 2011-10-13 21:36:21

2 answers · 1.4K views · 0 followers · 2 votes

I am using Stanford's NLP parser (http://nlp.stanford.edu/software/lex-parser.shtml) to split a block of text into sentences and then see which sentences contain a given word.

Here is my code so far:

import java.io.FileReader;
import java.io.IOException;
import java.util.List;

import edu.stanford.nlp.ling.*;
import edu.stanford.nlp.process.*;

public class TokenizerDemo {

    public static void main(String[] args) throws IOException {
        DocumentPreprocessor dp = new DocumentPreprocessor(args[0]);
        for (List sentence : dp) {
            for (Object word : sentence) {
                System.out.println(word);
                System.out.println(word.getClass().getName());
                if (word.equals(args[1])) {
                    System.out.println("yes!\n");
                }
            }
        }
    }
}

I run the code from the command line with "java TokenizerDemo testfile.txt wall".

The contents of testfile.txt are:

Humpty Dumpty sat on a wall. Humpty Dumpty had a great fall.

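So I would expect the program to detect "wall" in the first sentence ("wall" being the second command-line argument). But the program does not detect "wall", because it never prints "yes!". The output of the program is: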

Humpty
edu.stanford.nlp.ling.Word
Dumpty
edu.stanford.nlp.ling.Word
sat
edu.stanford.nlp.ling.Word
on
edu.stanford.nlp.ling.Word
a
edu.stanford.nlp.ling.Word
wall
edu.stanford.nlp.ling.Word
.
edu.stanford.nlp.ling.Word
Humpty
edu.stanford.nlp.ling.Word
Dumpty
edu.stanford.nlp.ling.Word
had
edu.stanford.nlp.ling.Word
a
edu.stanford.nlp.ling.Word
great
edu.stanford.nlp.ling.Word
fall
edu.stanford.nlp.ling.Word
.
edu.stanford.nlp.ling.Word

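The DocumentPreprocessor from the Stanford parser correctly splits the text into two sentences. The problem seems to be in the use of the equals method. Each word has the type "edu.stanford.nlp.ling.Word". I tried to access the word's underlying string so that I could check whether the string equals "wall", but I can't figure out how to access it.

If I write the second for loop as "for (Word word : sentence) {", then I get an incompatible types error message on compilation.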

Tags: java, string, nlp, stanford-nlp, sentence

2 Answers

Stack Overflow user

Accepted answer

Posted 2011-10-13 21:46:35

You can access the String content by calling the word() method on edu.stanford.nlp.ling.Word; for example:

import edu.stanford.nlp.ling.Word;

List<Word> words = ...
for (Word word : words) {
  if (word.word().equals(args[1])) {
    System.err.println("Yes!");
  }
}

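Also note that it is best to use generics when declaring the List, because the compiler or IDE will then typically warn you if you try to compare objects of incompatible types (e.g. a Word with a String).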
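To make the point concrete, here is a minimal sketch of why comparing the token object itself to a String never matches, while comparing its wrapped text does. `Token` is a hypothetical stand-in for edu.stanford.nlp.ling.Word so that the example runs without the Stanford jar; it is not the library class, and the real class may define its own equals. The class and method names are illustrative assumptions.

```java
import java.util.Arrays;
import java.util.List;

public class EqualsDemo {

    // Hypothetical stand-in for edu.stanford.nlp.ling.Word: wraps a String
    // and exposes it through word(). This stand-in inherits equals() from Object.
    public static final class Token {
        private final String text;
        public Token(String text) { this.text = text; }
        public String word() { return text; }
        @Override public String toString() { return text; }
    }

    // Compares the wrapper object to a String. The two are instances of
    // different classes, so this is false even when the text is the same.
    public static boolean matchesObject(Object token, String target) {
        return token.equals(target);
    }

    // Compares the wrapped text to the target, which is the intended check.
    public static boolean matchesText(Token token, String target) {
        return token.word().equals(target);
    }

    public static void main(String[] args) {
        // A typed list lets the loop variable be a Token, so word() is
        // available without a cast.
        List<Token> sentence = Arrays.asList(
                new Token("sat"), new Token("on"), new Token("wall"));
        for (Token token : sentence) {
            System.out.println(token
                    + " object-equals=" + matchesObject(token, "wall")
                    + " text-equals=" + matchesText(token, "wall"));
        }
    }
}
```

With the raw List and Object loop variable from the question, only the first style of comparison compiles without a cast, which is how the mismatch goes unnoticed.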

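Edit

It turns out I was looking at an older version of the NLP API. Looking at the latest DocumentPreprocessor documentation, I see that it implements Iterable<List<HasWord>>, where HasWord defines the word() method. Your code should therefore look something like this: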

DocumentPreprocessor dp = ...
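for (List<HasWord> sentence : dp) {
  // dp yields one List<HasWord> per sentence, so iterate the tokens inside it
  for (HasWord hw : sentence) {
    if (hw.word().equals(args[1])) {
      System.err.println("Yes!");
    }
  }
}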
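Since the original goal was to find which sentences contain a given word, the same nested iteration can be sketched as a small helper that returns the matching sentence indices. This is only an illustration under stated assumptions: `HasText` is a hypothetical stand-in for edu.stanford.nlp.ling.HasWord, a plain `List<List<HasText>>` stands in for the DocumentPreprocessor, and `sentencesContaining` and `sentence` are made-up helper names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SentenceSearch {

    // Hypothetical stand-in for edu.stanford.nlp.ling.HasWord.
    public interface HasText {
        String word();
    }

    // Returns the 0-based indices of the sentences that contain target
    // as one of their tokens (exact, case-sensitive match).
    public static List<Integer> sentencesContaining(
            Iterable<List<HasText>> sentences, String target) {
        List<Integer> hits = new ArrayList<>();
        int index = 0;
        for (List<HasText> sentence : sentences) {
            for (HasText token : sentence) {
                if (token.word().equals(target)) {
                    hits.add(index);
                    break; // one match is enough for this sentence
                }
            }
            index++;
        }
        return hits;
    }

    // Builds a sentence of stand-in tokens from plain strings.
    public static List<HasText> sentence(String... words) {
        List<HasText> tokens = new ArrayList<>();
        for (String w : words) {
            tokens.add(() -> w);
        }
        return tokens;
    }

    public static void main(String[] args) {
        List<List<HasText>> doc = Arrays.asList(
                sentence("Humpty", "Dumpty", "sat", "on", "a", "wall", "."),
                sentence("Humpty", "Dumpty", "had", "a", "great", "fall", "."));
        System.out.println(sentencesContaining(doc, "wall")); // prints [0]
    }
}
```

The match is exact and case-sensitive, so "Wall" would not be found; whether to normalize case is a design choice left to the caller.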

Votes: 2

Stack Overflow user

Posted 2011-10-13 22:10:57

Since the word prints nicely, a simple word.toString().equals(args[1]) would suffice.
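One caveat worth sketching: toString() is meant for display, so this shortcut only works while the printed form is exactly the token text. The example below uses two hypothetical token classes (not Stanford classes) to show how a token type that prints extra information, here an assumed "text/tag" format, defeats the toString() comparison while an explicit word() accessor keeps working.

```java
public class ToStringCaveat {

    // Hypothetical plain token whose toString() is exactly its text.
    public static class PlainToken {
        protected final String text;
        public PlainToken(String text) { this.text = text; }
        public String word() { return text; }
        @Override public String toString() { return text; }
    }

    // Hypothetical token that also carries a tag and prints as "text/tag".
    public static final class TaggedToken extends PlainToken {
        private final String tag;
        public TaggedToken(String text, String tag) {
            super(text);
            this.tag = tag;
        }
        @Override public String toString() { return text + "/" + tag; }
    }

    // Matches on the display form, which depends on how the class prints itself.
    public static boolean matchByToString(PlainToken token, String target) {
        return token.toString().equals(target);
    }

    // Matches on the explicit text accessor, independent of the display form.
    public static boolean matchByWord(PlainToken token, String target) {
        return token.word().equals(target);
    }

    public static void main(String[] args) {
        PlainToken plain = new PlainToken("wall");
        PlainToken tagged = new TaggedToken("wall", "NN");
        System.out.println(matchByToString(plain, "wall"));  // prints true
        System.out.println(matchByToString(tagged, "wall")); // prints false
        System.out.println(matchByWord(tagged, "wall"));     // prints true
    }
}
```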

Votes: 2

Original page content provided by Stack Overflow.

Original link: https://stackoverflow.com/questions/7754950
