I have a data.table with three columns and need to do a complex aggregation:
> c
sample group symbol
1: APPL/S Up CBEbrown Icosl
2: APPL/S Up CBEbrown Ampd3
3: APPL/S Up CBEbrown Thbs2
4: APPL/S Up CBEbrown Map4k4
5: APPL/S Up CBEbrown Slc45a3
---
1724: APPL/S_BD10-2 Up TCXyellow Nfxl1
1725: APPL/S_BD10-2 Up TCXyellow Rhog
1726: APPL/S_BD10-2 Up TCXyellow Wipf1
1727: APPL/S_BD10-2 Up TCXyellow Selenos
1728: APPL/S_BD10-2 Up TCXyellow Kdelr2
So sample has two values, each with 7 groups. Basically, I need the set difference (fsetdiff) of the symbols that are in "APPL/S_BD10-2 Up" but not in "APPL/S Up":
setdiff(c[group == "TCXyellow" & sample == "APPL/S_BD10-2 Up", symbol],
        c[group == "TCXyellow" & sample == "APPL/S Up", symbol])
But I want to count, for each symbol, in how many groups this set difference occurs (anywhere from 0 to 7). The output should look like this:
> out = c[, N_diff := fsetdiff(?????), by="symbol"]
> out
symbol N_diff
1: Icosl 4
2: Ampd3 5
3: Thbs2 7
4: Map4k4 4
5: Slc45a3 4
---
503: Unc13d 1
504: Rpl30 1
505: Tpt1 1
506: Garre1 1
507: Selenos 1
In 4 of the 7 groups, Icosl is in "APPL/S_BD10-2 Up" but not in "APPL/S Up".
Posted on 2022-09-25 01:57:26
I think you can do it like this:
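# symbols that occur in the other sample but not in "APPL/S Up"
# (the sample label is case-sensitive and must match the data exactly)
f <- function(sam, sym) setdiff(sym[sam != "APPL/S Up"], sym[sam == "APPL/S Up"])
# df is the question's data.table (named c there):
# apply f within each group, then count the distinct groups per symbol
df[, .(symbol = f(sample, symbol)), by = group][, .(N_diff = uniqueN(group)), by = symbol]
https://stackoverflow.com/questions/73841314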
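To check the idea on something small, here is a minimal sketch of the same two-step approach on a made-up table. The toy rows and the names dt, base_sample, target_sample and only_in_target are my own assumptions for illustration, not part of the question's data.

```r
library(data.table)

# Hypothetical toy data: two samples, two groups (not the question's real table)
dt <- data.table(
  sample = c("APPL/S Up", "APPL/S Up", "APPL/S_BD10-2 Up", "APPL/S_BD10-2 Up",
             "APPL/S Up", "APPL/S_BD10-2 Up", "APPL/S_BD10-2 Up"),
  group  = c("CBEbrown", "CBEbrown", "CBEbrown", "CBEbrown",
             "TCXyellow", "TCXyellow", "TCXyellow"),
  symbol = c("Icosl", "Ampd3", "Icosl", "Rhog",
             "Wipf1", "Icosl", "Rhog")
)

base_sample   <- "APPL/S Up"
target_sample <- "APPL/S_BD10-2 Up"

# Symbols present in the target sample but absent from the base sample
only_in_target <- function(sam, sym) {
  setdiff(sym[sam == target_sample], sym[sam == base_sample])
}

# Step 1: within each group, keep the symbols for which the set difference holds
per_group <- dt[, .(symbol = only_in_target(sample, symbol)), by = group]

# Step 2: count in how many distinct groups each symbol survived step 1
out <- per_group[, .(N_diff = uniqueN(group)), by = symbol]
print(out)
```

In this toy table Rhog is target-only in both groups and Icosl only in TCXyellow, so they get counts of 2 and 1. Note that a symbol that never falls in the set difference is simply missing from out rather than listed with 0, so if the zeros matter they would have to be added back separately.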