This may be trivial, but I seem to be confusing myself.
I have something like this:
library(data.table)
set.seed(1234)
dt <- data.table(day = sample(c("day1", "day2", "day3"), 20, replace = TRUE),
                 store = sample(c("store1", "store2", "store3"), 20, replace = TRUE),
                 x = rnorm(20, 33, 6), y = rnorm(20, 12, 10))

I am interested in aggregating by day and store:
dt[,.(sumx=sum(x),sumy=sum(y)),by=c("day","store")]
    day  store      sumx     sumy
1: day1 store2  56.33890 44.52312
2: day2 store1 164.72854 61.37866
3: day3 store3 144.52483 53.74347
4: day1 store3  56.25504 34.00066
5: day3 store1  70.61311 30.85589
6: day2 store3 123.34534 74.67024
7: day2 store2  35.72952 21.19009

And, at a coarser level, by day only:
dt[,.(sumx=sum(x),sumy=sum(y)),by=day]
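    day     sumx      sumy
1: day1 112.5939  78.52378
2: day2 323.8034 157.23899
3: day3 215.1379  84.59936

In practice, I would like to end up with one dataset that holds the aggregation per day and store, plus additional columns holding the day-only aggregation: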
    day  store      sumx     sumy sumx_daylevel sumy_daylevel
1: day1 store2  56.33890 44.52312      112.5939      78.52378
2: day2 store1 164.72854 61.37866      323.8034     157.23899
3: day3 store3 144.52483 53.74347      215.1379      84.59936
4: day1 store3  56.25504 34.00066      112.5939      78.52378
5: day3 store1  70.61311 30.85589      215.1379      84.59936
6: day2 store3 123.34534 74.67024      323.8034     157.23899
7: day2 store2  35.72952 21.19009      323.8034     157.23899

I would like to achieve this by wrapping everything in one function rather than by merging. Any help would be much appreciated. Thanks.

Answered on 2017-07-07 18:03:08
We can use := to create the new columns, grouping the already aggregated table by day:
dt[, .(sumx = sum(x), sumy = sum(y)), by = c("day", "store")
  ][, c("sumx_daylevel", "sumy_daylevel") := .(sum(sumx), sum(sumy)), by = day][]

https://stackoverflow.com/questions/44967975
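
Since the question asks for everything wrapped in one function, the chained call above could be packaged roughly as follows. This is only a minimal sketch: the function name agg_with_day_totals is hypothetical (it is not part of data.table or of the original answer), and it assumes the input has columns named day, store, x and y, as in the example data. The exact numbers it prints depend on the R version, because the default behaviour of sample() changed in R 3.6.0.

```r
library(data.table)

# Hypothetical helper (not from the original answer): aggregate x and y by
# day and store, then append the day-level totals as extra columns.
agg_with_day_totals <- function(dt) {
  # First aggregation: one row per (day, store) combination.
  # This creates a new table, so the input dt is left unchanged.
  out <- dt[, .(sumx = sum(x), sumy = sum(y)), by = c("day", "store")]
  # := adds columns by reference; by = "day" makes each sum cover all
  # stores of that day and recycles it across the rows of the group.
  out[, c("sumx_daylevel", "sumy_daylevel") := .(sum(sumx), sum(sumy)),
      by = "day"]
  # The trailing [] makes the result print when returned at the console.
  out[]
}

# Same example data as in the question.
set.seed(1234)
dt <- data.table(day = sample(c("day1", "day2", "day3"), 20, replace = TRUE),
                 store = sample(c("store1", "store2", "store3"), 20, replace = TRUE),
                 x = rnorm(20, 33, 6), y = rnorm(20, 12, 10))

res <- agg_with_day_totals(dt)
print(res)
```

Summing the store-level sums within each day gives the same totals as aggregating the raw rows by day, so no merge is needed.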