library(data.table)
DT1 <- data.table(num = 1:6, group = c("A", "B", "B", "B", "A", "C"))
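DT2 <- data.table(group = c("A", "B", "C"))
I would like to add a column popular to DT2 that is TRUE whenever the value of DT2$group occurs at least twice in DT1$group. So, in the example above, DT2 should be
   group popular
1:     A    TRUE
2:     B    TRUE
3:     C   FALSE
How can this be done efficiently?
Updated example: DT2 may actually contain more groups than DT1. Here is an updated example: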
DT1 <- data.table(num = 1:6, group = c("A", "B", "B", "B", "A", "C"))
DT2 <- data.table(group = c("A", "B", "C", "D"))
The desired output would be
   group popular
1:     A    TRUE
2:     B    TRUE
3:     C   FALSE
4:     D   FALSE
Answered 2014-10-19 18:13:07
I would simply do this:
## 1.9.4+
setkey(DT1, group)
DT1[J(DT2$group), list(popular = .N >= 2L), by = .EACHI]
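#    group popular
# 1:     A    TRUE
# 2:     B    TRUE
# 3:     C   FALSE
# 4:     D   FALSE   ## on the updated example
data.table's join syntax is quite powerful: while joining, you can also aggregate, select, or update columns in j. Here we perform a join. For each row of DT2$group, we evaluate the j-expression .N >= 2L on the matching rows in DT1; specifying by = .EACHI (see the 1.9.4 NEWS) makes j run once for each row of i.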
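To see what by = .EACHI is doing, it can help to look at the raw counts before comparing them with 2. The following is only a sketch on the updated example data; the name counts is introduced here for illustration and is not part of the original answer, and writing the flag back into DT2 with := is one possible way to get the column the question asks for.

```r
library(data.table)

DT1 <- data.table(num = 1:6, group = c("A", "B", "B", "B", "A", "C"))
DT2 <- data.table(group = c("A", "B", "C", "D"))
setkey(DT1, group)

# One result row per row of i; .N is the number of matching rows in DT1
# (0 for a group with no match, such as "D").
counts <- DT1[.(DT2$group), .N, by = .EACHI]

# The result rows follow the order of DT2$group, so the flag can be
# added to DT2 by reference (counts is a hypothetical helper name).
DT2[, popular := counts$N >= 2L]
```

Because the join returns exactly one row per element of DT2$group, the assignment lines up row by row with DT2 and no further merge is needed.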
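In 1.9.4, .() was introduced as an alias in all of i, j and by. So you can also write:
DT1[.(DT2$group), .(popular = .N >= 2L), by = .EACHI]
When you join on a single character column, you can drop the .() / J() syntax altogether (for convenience). So this can also be written as:
DT1[DT2$group, .(popular = .N >= 2L), by = .EACHI]
Answered 2014-10-19 17:49:15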
This is how I would do it: first count how many times each group appears in DT1, then simply join DT2 with DT1.
require(data.table)
DT1 <- data.table(num = 1:6, group = c("A", "B", "B", "B", "A", "C"))
DT2 <- data.table(group = c("A", "B", "C"))
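# Solution:
# Count the rows in each group of DT1 (this adds num_counts to DT1 by reference)
DT1[, num_counts := .N, by = group]
setkey(DT1, group)
setkey(DT2, group)
# Keep one DT1 row per DT2 group; a group absent from DT1 gets NA and is treated as not popular
DT2 <- DT1[DT2, mult = "last"][, list(group, popular = !is.na(num_counts) & num_counts >= 2L)]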
#> DT2
#    group popular
# 1:     A    TRUE
# 2:     B    TRUE
# 3:     C   FALSE
https://stackoverflow.com/questions/26453288