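How can I derive more-general, less-general, and equivalent relations from WordNet?
The WordNet similarity in RitaWordnet returns a number such as -1.0, 0.222, or 1.0, but how do I find out whether one word is more general or less general than another? Which tool is best suited for this? Please help.
I get a java.lang.NullPointerException right after it prints "the holonyms are".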
package wordnet;

import rita.wordnet.RiWordnet;

public class Main {
    public static void main(String[] args) {
        try {
            // Would pass in a PApplet normally, but we don't need to here
            RiWordnet wordnet = new RiWordnet();
            wordnet.setWordnetHome("/usr/share/wordnet/dict");
            // Demo finding parts of speech
            String word = "first name";
            System.out.println("\nFinding parts of speech for " + word + ".");
            String[] partsofspeech = wordnet.getPos(word);
            for (int i = 0; i < partsofspeech.length; i++) {
                System.out.println(partsofspeech[i]);
            }
            //word = "eat";
            String pos = wordnet.getBestPos(word);
            System.out.println("\n\nDefinitions for " + word + ":");
            // Get an array of glosses for a word
            String[] glosses = wordnet.getAllGlosses(word, pos);
            // Display all definitions
            for (int i = 0; i < glosses.length; i++) {
                System.out.println(glosses[i]);
            }
            // Demo finding a list of related words (synonyms)
            //word = "first name";
            String[] poss = wordnet.getPos(word);
            for (int j = 0; j < poss.length; j++) {
                System.out.println("\n\nSynonyms for " + word + " (pos: " + poss[j] + ")");
                String[] synonyms = wordnet.getAllSynonyms(word, poss[j], 10);
                for (int i = 0; i < synonyms.length; i++) {
                    System.out.println(synonyms[i]);
                }
            }
            // Demo finding a list of related words
            // X is Hypernym of Y if every Y is of type X
            // Hyponym is the inverse
            //word = "nurse";
            pos = wordnet.getBestPos(word);
            System.out.println("\n\nHyponyms for " + word + ":");
            String[] hyponyms = wordnet.getAllHyponyms(word, pos);
            //System.out.println(hyponyms.length);
            //if(hyponyms!=null)
            for (int i = 0; i < hyponyms.length; i++) {
                System.out.println(hyponyms[i]);
            }
            System.out.println("\n\nHypernyms for " + word + ":");
            String[] hypernyms = wordnet.getAllHypernyms(word, pos);
            //if(hypernyms!=null)
            for (int i = 0; i < hypernyms.length; i++) {
                System.out.println(hypernyms[i]);
            }
            System.out.println("\n\nHolonyms for " + word + ":");
            String[] holonyms = wordnet.getAllHolonyms(word, pos);
            //if(holonyms!=null)
            for (int i = 0; i < holonyms.length; i++) {
                System.out.println(holonyms[i]);
            }
            System.out.println("\n\nmeronyms for " + word + ":");
            String[] meronyms = wordnet.getAllMeronyms(word, pos);
            if(meronyms!=null)
                for (int i = 0; i < meronyms.length; i++) {
                    System.out.println(meronyms[i]);
                }
            System.out.println("\n\nAntonym for " + word + ":");
            String[] antonyms = wordnet.getAllAntonyms(word, pos);
            if(antonyms!=null)
                for (int i = 0; i < antonyms.length; i++) {
                    System.out.println(antonyms[i]);
                }
            String start = "cameras";
            String end = "digital cameras";
            pos = wordnet.getBestPos(start);
            // Wordnet can find relationships between words
            System.out.println("\n\nRelationship between: " + start + " and " + end);
            float dist = wordnet.getDistance(start, end, pos);
            String[] parents = wordnet.getCommonParents(start, end, pos);
            System.out.println(start + " and " + end + " are related by a distance of: " + dist);
            // These words have common parents (hyponyms in this case)
            System.out.println("Common parents: ");
            if (parents != null) {
                for (int i = 0; i < parents.length; i++) {
                    System.out.println(parents[i]);
                }
            }
            //wordnet.
            // System.out.println("\n\nHypernym Tree for " + start);
            // int[] ids = wordnet.getSenseIds(start,wordnet.NOUN);
            // wordnet.printHypernymTree(ids[0]);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Answered 2010-10-20 20:59:34
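RiTa WordNet does provide an API for finding hypernyms (more general), hyponyms (less general), and synonyms. For details, see the following page:
http://www.rednoise.org/rita/wordnet/documentation/index.htm
To understand all of these terms (hypernyms and so on), have a look at the Wikipedia page on WordNet.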
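As a rough illustration of how those lookups could answer the original question, here is a minimal sketch. It assumes the String[] arrays come from calls such as getAllSynonyms and getAllHypernyms in the question's code, and that such a call may return null when nothing is found (which is what the NullPointerException in the question suggests, though the RiTa documentation is the authority on that). The class GeneralityCheck, the Relation enum, and the method names are made up for this example and are not part of RiTa.

```java
import java.util.Arrays;

/** Hypothetical helper; not part of RiTa. Works on arrays returned by WordNet lookups. */
final class GeneralityCheck {

    enum Relation { EQUIVALENT, MORE_GENERAL, LESS_GENERAL, UNRELATED }

    /** True if the array is non-null and contains the word; a null array counts as empty. */
    static boolean contains(String[] words, String word) {
        return words != null && Arrays.asList(words).contains(word);
    }

    /**
     * Relation of word a to word b, given lookup results for both.
     * Any of the arrays may be null (assumed to mean "nothing found").
     */
    static Relation relate(String a, String b,
                           String[] synonymsOfA,
                           String[] hypernymsOfA,
                           String[] hypernymsOfB) {
        if (a.equals(b) || contains(synonymsOfA, b)) {
            return Relation.EQUIVALENT;
        }
        if (contains(hypernymsOfB, a)) {
            return Relation.MORE_GENERAL; // a appears among b's hypernyms
        }
        if (contains(hypernymsOfA, b)) {
            return Relation.LESS_GENERAL; // b appears among a's hypernyms
        }
        return Relation.UNRELATED;
    }

    public static void main(String[] args) {
        // Hand-written stand-ins for lookup results, not actual RiTa output.
        String[] hypernymsOfDog = {"canine", "domestic animal"};
        System.out.println(relate("canine", "dog", null, null, hypernymsOfDog)); // MORE_GENERAL
        System.out.println(relate("dog", "canine", null, hypernymsOfDog, null)); // LESS_GENERAL
    }
}
```

Whether the arrays hold only direct parents or the whole hypernym chain depends on which lookup produced them, so a word several levels up may be reported as UNRELATED unless the full chain is passed in. Treating null as an empty array is also what keeps loops like the ones in the question from throwing.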
Answered 2010-10-21 23:34:05
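You could try parsing the database yourself. It is not that hard. 1) Find the word in the following files: index.noun, index.verb, index.adj and index.adv; 2) extract the ids of its synsets (its "senses"), and for each synset go to data.noun, data.verb, data.adj or data.adv and extract the synset ids of its hypernyms or hyponyms. Then look up those synset ids to get their synonyms and glosses. This is fairly easy if you use regular expressions.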
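The two steps above can be sketched roughly as follows. This is a minimal sketch, not a full reader: it assumes the line layout described in the WordNet database format documentation (wndb), uses whitespace splitting in place of regular expressions, and handles one line at a time rather than whole files. The class WordNetLines and its methods are hypothetical names, and the sample lines in main are made up for illustration rather than copied from WordNet. In the data files, "@" marks a hypernym pointer and "~" a hyponym pointer.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for single lines of WordNet index.* and data.* files. */
final class WordNetLines {

    /** Synset offsets listed at the end of one index.* line. */
    static List<String> synsetOffsets(String indexLine) {
        // Fields: lemma pos synset_cnt p_cnt [ptr_symbol...] sense_cnt tagsense_cnt synset_offset...
        String[] f = indexLine.trim().split("\\s+");
        int synsetCount = Integer.parseInt(f[2]);
        int symbolCount = Integer.parseInt(f[3]);
        int first = 4 + symbolCount + 2; // skip the symbols, sense_cnt and tagsense_cnt
        List<String> offsets = new ArrayList<>();
        for (int i = 0; i < synsetCount; i++) {
            offsets.add(f[first + i]);
        }
        return offsets;
    }

    /** Fields of a data.* line that come before the "|" introducing the gloss. */
    private static String[] fields(String dataLine) {
        int bar = dataLine.indexOf('|');
        String head = bar >= 0 ? dataLine.substring(0, bar) : dataLine;
        return head.trim().split("\\s+");
    }

    /** Words (synonyms) of the synset described by one data.* line. */
    static List<String> words(String dataLine) {
        // Fields: synset_offset lex_filenum ss_type w_cnt (word lex_id)... p_cnt (pointer)...
        String[] f = fields(dataLine);
        int wordCount = Integer.parseInt(f[3], 16); // w_cnt is hexadecimal
        List<String> words = new ArrayList<>();
        for (int i = 0; i < wordCount; i++) {
            words.add(f[4 + 2 * i]); // every word is followed by its lex_id
        }
        return words;
    }

    /** Target synset offsets of all pointers with the given symbol, e.g. "@" or "~". */
    static List<String> targets(String dataLine, String symbol) {
        String[] f = fields(dataLine);
        int wordCount = Integer.parseInt(f[3], 16);
        int i = 4 + 2 * wordCount;               // position of p_cnt
        int pointerCount = Integer.parseInt(f[i]); // p_cnt is decimal
        i++;
        List<String> result = new ArrayList<>();
        // Each pointer has four fields: symbol, synset_offset, pos, source/target
        for (int p = 0; p < pointerCount; p++, i += 4) {
            if (f[i].equals(symbol)) {
                result.add(f[i + 1]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample lines in the documented layout, not real WordNet entries.
        String index = "dog n 2 2 @ ~ 2 1 00000002 00000005";
        String data = "00000002 05 n 02 dog 0 domestic_dog 0 002 "
                + "@ 00000001 n 0000 ~ 00000003 n 0000 | made-up gloss";
        System.out.println(synsetOffsets(index)); // [00000002, 00000005]
        System.out.println(words(data));          // [dog, domestic_dog]
        System.out.println(targets(data, "@"));   // [00000001]
        System.out.println(targets(data, "~"));   // [00000003]
    }
}
```

When reading the real files, the license header lines at the top (they begin with spaces) need to be skipped, and each offset returned by targets has to be looked up again in the matching data file to get the words and gloss of the related synset.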
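The database files (for example index.verb) are in one of the directories of the WordNet distribution, which you can download from here. If you are on Linux, there is also a nice command-line program that does this for you, but if you want to integrate it into your Java code, I am afraid you will have to do all the parsing yourself. You might also find this link interesting. Hope this helps :)
PS: You could also try NLTK (written in Python).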
https://stackoverflow.com/questions/3977100