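I'm having a problem with the RcppRoll package. I want to use it to sum values over the past 3 months, but sometimes the data for one or more months is missing. n = 3 considers the last three observations, not the last three months. I couldn't find a reliable solution, so I'm trying my luck here. Thanks for any suggestions.
I'd prefer to stick with data.table and RcppRoll, because my dataset is large and I'm familiar with these packages.
Code: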
library("data.table")
library("RcppRoll")
test = data.table(id = rep(1, 8),date = c("2015-01","2015-02","2015-03","2015-04","2015-08","2015-09","2015-10","2015-11"), value = 1:8)
test = test[, var:= roll_sumr(value, n = 3, na.rm = TRUE), by = id]
id date value var
1: 1 2015-01 1 NA
2: 1 2015-02 2 NA
3: 1 2015-03 3 6
4: 1 2015-04 4 9
5: 1 2015-08 5 12
6: 1 2015-09 6 15
7: 1 2015-10 7 18
8: 1 2015-11 8 21
Expected output:
prefered_outcome = data.table(id = rep(1, 8),date = c("2015-01","2015-02","2015-03","2015-04","2015-08","2015-09","2015-10","2015-11"), value = 1:8,var = c(NA, NA, 6, 9, NA, NA, 18, 21))
id date value var
1: 1 2015-01 1 NA
2: 1 2015-02 2 NA
3: 1 2015-03 3 6
4: 1 2015-04 4 9
5: 1 2015-08 5 NA
6: 1 2015-09 6 NA
7: 1 2015-10 7 18
8: 1 2015-11 8 21
Answered on 2018-11-27 14:08:50
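Convert the dates to zoo's yearmon class (ym) and check whether the previous and the second-previous ym are exactly one and two months earlier. If they are, use roll_sumr; otherwise use NA.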
library(zoo)
# Compute ym inside j so the month check is evaluated separately for each id
test[, roll := {
  ym <- as.yearmon(date)
  ifelse(ym - 1/12 == shift(ym) & ym - 2/12 == shift(ym, 2),
         roll_sumr(value, 3, na.rm = TRUE), NA)
}, by = id]
giving:
> test
id date value roll
1: 1 2015-01 1 NA
2: 1 2015-02 2 NA
3: 1 2015-03 3 6
4: 1 2015-04 4 9
5: 1 2015-08 5 NA
6: 1 2015-09 6 NA
7: 1 2015-10 7 18
8: 1 2015-11 8 21
Answered on 2018-11-27 13:11:38
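You can add the missing months first, then apply the function, and afterwards remove the added months again. The padded rows need an id so that they land in the right group, and the rolling sum has to use na.rm = FALSE so that any window touching a padded month comes out as NA.
library(data.table)
library(RcppRoll)
library(zoo)
test = data.table(id = rep(1, 8), date = c("2015-01","2015-02","2015-03","2015-04","2015-08","2015-09","2015-10","2015-11"), value = 1:8)
test[, date := as.yearmon(date)]
# every month between the first and the last observed month
allMonths <- seq.Date(from = as.Date(test$date[1]), to = as.Date(test$date[length(test$date)]), by = "month")
# one row per id and month, so the padded rows carry an id
df2 <- CJ(id = unique(test$id), date = allMonths)[, date := as.yearmon(date)]
df3 <- merge(test, df2, by = c("id", "date"), all = TRUE)
# na.rm = FALSE: a window containing a padded (NA) month yields NA
df3[, var := roll_sumr(value, n = 3, na.rm = FALSE), by = id]
# drop the padded months again (assumes value has no NA in the original data)
df3 <- df3[!is.na(value)]
df3
https://stackoverflow.com/questions/53499834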
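For comparison, the consecutive-month check from the first answer can also be sketched without zoo. This is only a minimal sketch, assuming that date is always a "YYYY-MM" string and that each id has at most one row per month; the helper column idx is a name made up for this example. It turns each month into an integer month index and keeps the rolling sum only where the row two positions back is exactly two months earlier.

```r
library(data.table)
library(RcppRoll)

test <- data.table(id = rep(1, 8),
                   date = c("2015-01", "2015-02", "2015-03", "2015-04",
                            "2015-08", "2015-09", "2015-10", "2015-11"),
                   value = 1:8)

# idx (hypothetical helper column): months counted as year * 12 + month
test[, idx := as.integer(substr(date, 1, 4)) * 12L +
              as.integer(substr(date, 6, 7))]
setorder(test, id, idx)

# With sorted, unique months per id, a gap of exactly 2 between the current
# row and the row two positions back means the three months are consecutive.
test[, roll := ifelse(idx - shift(idx, 2L) == 2L,
                      roll_sumr(value, n = 3), NA),
     by = id]
test[, idx := NULL]
test
```

Because the comparison is on integers, there is no floating-point month arithmetic involved, and the first two rows of each id come out as NA because shift() pads them with NA.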