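I have a lot of words at hand. I need to store them and count how many times each distinct word occurs. The raw data may contain duplicate words. My first thought was to use a Set, which would guarantee that I only keep the distinct words. But how can I count how often each one occurs? Does anyone have a "clever" idea?
Posted on 2013-03-14 10:25:06
You can use Multiset from the Guava library.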
http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/collect/Multiset.html
Posted on 2013-03-14 10:40:51
You can solve this with a Map.
import java.util.HashMap;
import java.util.Map;

String sample = " I have a problem here. I have a lot of words at hand. What I need to do is to save them and count every different word. The original data may contains duplicate words.Firstly, I want to use Set, then I can guarantee that I only get the different wrods. But how can I count their times? Is there someone having any clever idea?";
// Split on runs of whitespace, periods, commas and question marks.
String[] array = sample.split("[\\s.,?]+");
Map<String, Integer> statistic = new HashMap<String, Integer>();
for (String elem : array) {
    // The leading space in the sample yields one empty token; skip it.
    if (elem.isEmpty()) {
        continue;
    }
    // A word not seen before has no entry yet, so its count starts at 1.
    Integer count = statistic.get(elem);
    statistic.put(elem, count == null ? 1 : count + 1);
}
Posted on 2013-03-14 10:57:12
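Maybe you can use hashing; in Java that would be HashMap (or HashSet?). You hash each word, and if the word has already been hashed, you add 1 to the value associated with it. That's the idea.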
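A minimal sketch of that idea, assuming Java 8 or later so that Map.merge is available. The class name WordCount and the method countWords are hypothetical names chosen for illustration, not something from the thread. A HashMap is used rather than a HashSet, because a set stores only the words themselves and has nowhere to keep a count.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class; the names here are illustrative only.
class WordCount {

    // Returns a map from each distinct word in text to its number of occurrences.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new HashMap<>();
        // Split on runs of whitespace, periods, commas and question marks.
        for (String word : text.split("[\\s.,?]+")) {
            if (word.isEmpty()) {
                continue; // a leading delimiter yields one empty token
            }
            // merge stores 1 for a new word, otherwise adds 1 to the existing count.
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // HashMap does not guarantee iteration order, so the printed order may vary.
        System.out.println(countWords("to be or not to be"));
    }
}
```

For the input "to be or not to be", the returned map associates "to" and "be" with 2, and "or" and "not" with 1. The comparison is case-sensitive, so "The" and "the" count as different words unless the text is lowercased first.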
https://stackoverflow.com/questions/15400067