I have data like this:
String[] a = {"a", "b", "c", "d"};
String[] b = {"c", "d"};
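String[] c = {"b", "c"};
Now I need a graphical representation of every intersection of these lists, which would typically be a Venn diagram like this one: http://manuals.bioinformatics.ucr.edu/_/rsrc/1353282523430/home/R_BioCondManual/r-bioc-temp/venn1.png?height=291&width=400
In my implementation each list will contain more than 1000 entries and I will have 10+ lists, so a good representation would build sets of strings and intersect them. In my very simple case this would result in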
set_a = {"c"}; // in all three lists
set_b = {"b", "d"}; // in two of three lists
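set_c = {"a"}; // in one of three lists
Another requirement is that the size of each intersection should be proportional to the number of lists its elements occur in. So set_a should be 3 times the size of set_c.
Is there a library that meets these requirements?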
Answered 2013-02-08 22:59:31
I think this program performs the transformation you want:
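// Needs: import java.util.*;  (the statements below belong inside a method such as main)
// The input
String[][] a = {
{"a", "b", "c", "d"},
{"c", "d"},
{"b", "c"}
};
System.out.println("Input: "+ Arrays.deepToString(a));
// Convert each input list to a Set so that duplicates within one list count once.
// A List of Sets is used (not a Set of Sets) so that two identical lists are both counted.
List<Set<String>> input = new ArrayList<Set<String>>();
for (String[] s : a) {
input.add(new HashSet<String>(Arrays.asList(s)));
}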
// The map is used for counting how many times each element appears
Map<String, Integer> count = new HashMap<String, Integer>();
for (Set<String> s : input) {
for (String i : s) {
if (!count.containsKey(i)) {
count.put(i, 1);
} else {
count.put(i, count.get(i) + 1);
}
}
}
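// Create the output structure: output[n] holds the elements found in exactly n lists (index 0 stays unused)
@SuppressWarnings("unchecked")
Set<String>[] output = new HashSet[a.length + 1];
for (int i = 1; i < output.length; i++) {
output[i] = new HashSet<String>();
}
// Fill the output structure according to the map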
for (String key : count.keySet()) {
output[count.get(key)].add(key);
}
// And print the output
for (int i = output.length - 1; i > 0; i--) {
System.out.println("Set_" + i + " = " + Arrays.toString(output[i].toArray()));
}
Output:
Input: [[a, b, c, d], [c, d], [b, c]]
Set_3 = [c]
Set_2 = [d, b]
Set_1 = [a]
https://stackoverflow.com/questions/14774190
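The counting approach in the answer can also be written more compactly, and extended with the relative size factor the question asks for. The following is only a sketch under a few assumptions: Java 8 or later is available, the class name `OccurrenceGroups` and the method `groupByOccurrence` are made up for this illustration, and "size" is taken to mean a plain scale factor (occurrence count divided by the smallest occurrence count). It does not draw anything; it only prepares the numbers a drawing library would need.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper class, not part of the original answer.
class OccurrenceGroups {

    // Groups every element by the number of input lists that contain it.
    // Duplicates inside a single list are counted only once.
    static TreeMap<Integer, Set<String>> groupByOccurrence(List<String[]> lists) {
        Map<String, Integer> count = new HashMap<>();
        for (String[] list : lists) {
            for (String item : new HashSet<>(Arrays.asList(list))) {
                count.merge(item, 1, Integer::sum);
            }
        }
        // Key: occurrence count, value: sorted set of elements with that count.
        TreeMap<Integer, Set<String>> groups = new TreeMap<>();
        for (Map.Entry<String, Integer> e : count.entrySet()) {
            groups.computeIfAbsent(e.getValue(), k -> new TreeSet<>()).add(e.getKey());
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String[]> lists = Arrays.asList(
                new String[] {"a", "b", "c", "d"},
                new String[] {"c", "d"},
                new String[] {"b", "c"});

        TreeMap<Integer, Set<String>> groups = groupByOccurrence(lists);

        // Assumed meaning of "proportional size": count relative to the smallest count.
        int smallest = groups.firstKey();
        for (Map.Entry<Integer, Set<String>> e : groups.descendingMap().entrySet()) {
            double scale = (double) e.getKey() / smallest;
            System.out.println("Set_" + e.getKey() + " = " + e.getValue()
                    + " (relative size " + scale + ")");
        }
    }
}
```

For the three example lists this yields the groups [c], [b, d] and [a] with relative sizes 3.0, 2.0 and 1.0. Using a `TreeSet` for each group keeps the printed order stable, which a plain `HashSet` does not guarantee.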