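The code below finds dry periods (runs of zero rain) in a time series that are preceded by rainfall above a threshold (for example, 10 mm within 2 days, k = 48).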
significant_drydowns <- function(rain, threshold=10, k=48) {
  require(zoo)
  require(dplyr)
  # Find drydowns: sequences of 0 rain; give them a group
  # (lag() here is dplyr's, which shifts the vector by one step)
  starts <- (lag(rain, default=0) > 0) & (rain == 0)
  groups <- cumsum(starts)
  groups[rain > 0] <- NaN
  groups[groups == 0] <- NaN
  # remove drydowns where previous rain is below threshold
  past_rain <- rollsum(rain, k, fill=0, align='right')
  for (t in which(starts)) {
    if (past_rain[t-1] < threshold) {
      groups[groups == groups[t]] <- NaN
    }
  }
  return(groups)
}
The for loop is very slow, partly because of the == comparison. Is there any way to speed this code up?
Example data:
rain <- c(0,0,0,2,3,0,0,0,1,5,0,0,0,0,0,0,6,1,1,1,0,0,0)
names(rain) <- rain
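Example output:
R> significant_drydowns(rain, 5, 5)
  0   0   0   2   3   0   0   0   1   5   0   0   0   0   0   0   6   1   1   1   0   0   0
NaN NaN NaN NaN NaN   1   1   1 NaN NaN   2   2   2   2   2   2 NaN NaN NaN NaN   3   3   3
R> significant_drydowns(rain, 7, 4)
  0   0   0   2   3   0   0   0   1   5   0   0   0   0   0   0   6   1   1   1   0   0   0
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN   3   3   3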
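So the group labels do not matter, as long as they are unique. A group is assigned only to drydowns where the rain summed over the previous k steps is at least the threshold.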
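To make the "sum over the previous k steps" criterion concrete, here is a minimal base-R sketch. The helper name trailing_sum is hypothetical (it is not part of the post), and unlike rollsum(..., fill=0) it sums partial windows at the start of the series instead of returning 0 there.

```r
# Hypothetical helper (not from the post): sum of x over the window
# that ends at each position and reaches back k steps.
trailing_sum <- function(x, k) {
  cs <- cumsum(x)
  # cumulative sum as it stood k steps earlier (0 before the series starts)
  lagged <- c(rep(0, k), cs)[seq_along(cs)]
  cs - lagged
}

rain <- c(0,0,0,2,3,0,0,0,1,5,0,0,0,0,0,0,6,1,1,1,0,0,0)

# Last rainy step before each drydown: rain > 0 now, rain == 0 at the next step
last_wet <- which(head(rain, -1) > 0 & rain[-1] == 0)
last_wet
# 5 10 20

trailing_sum(rain, 5)[last_wet]
# 5 6 9
trailing_sum(rain, 4)[last_wet]
# 5 6 9
```

With threshold 5 and k = 5 all three sums pass, so three groups survive; with threshold 7 and k = 4 only the last sum (9) passes, which is why only the final drydown keeps its group in the second example.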
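Posted on 2014-10-03 12:49:29
The following function works without a for loop or any function from the *apply family. It also needs no additional packages and uses only base functions. See the code comments for details.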
significant_drydowns <- function(rain, threshold = 10, k = 48) {
  # all values except the first one
  rain_tail <- rain[-1L]
  # logical index of start of no-rain period
  starts <- head(rain, -1L) > 0 & !rain_tail
  # no-rain groups
  groups <- cumsum(starts)
  # sum of the amount of rain in the last k elements
  # (for e.g., k = 5 the filter is c(1,1,1,1,1), therefore
  # the sum of 5 preceding elements is calculated);
  # stats:: is spelled out because dplyr masks filter() when attached
  past_rain <- stats::filter(rain, rep.int(1L, k), sides = 1L)
  # valid groups (previous amount of rain reaches the threshold)
  valid <- past_rain[starts] >= threshold
  # replace `groups` values with NA if
  # (a) there is rain or
  # (b) this is the first rain or no-rain period
  is.na(groups) <- rain_tail | !groups
  # replace `groups` values with NA if amount of rain is below threshold
  # (here `groups` is used as a numeric index for `valid`)
  is.na(groups) <- !valid[groups]
  # add NA to match length of original vector and set names
  setNames(c(NA_integer_, groups), rain)
}
Some results:
> significant_drydowns(rain, 5, 5)
 0  0  0  2  3  0  0  0  1  5  0  0  0  0  0  0  6  1  1  1  0  0  0
NA NA NA NA NA  1  1  1 NA NA  2  2  2  2  2  2 NA NA NA NA  3  3  3
> significant_drydowns(rain, 7, 4)
 0  0  0  2  3  0  0  0  1  5  0  0  0  0  0  0  6  1  1  1  0  0  0
NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA  3  3  3
https://codereview.stackexchange.com/questions/64608
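As a side note on the filter() step in the answer, the sketch below shows what a one-sided convolution filter of ones returns for the example data. It assumes k = 5 and calls stats::filter explicitly, because filter() would resolve to dplyr::filter if dplyr were attached (as it is in the question's code).

```r
rain <- c(0,0,0,2,3,0,0,0,1,5,0,0,0,0,0,0,6,1,1,1,0,0,0)
k <- 5

# One-sided filter of ones: each value is the sum of the current
# element and the k - 1 elements before it
past_rain <- stats::filter(rain, rep(1, k), sides = 1)

# The first k - 1 positions have no complete window, so they are NA
head(as.numeric(past_rain), k)
# NA NA NA NA 5

# Sums at the last rainy step before each drydown (positions 5, 10, 20)
as.numeric(past_rain)[c(5, 10, 20)]
# 5 6 9
```

The leading NA values differ from rollsum(..., fill=0) in the question, which returns 0 there, so a drydown that begins within the first k steps is compared against NA rather than 0. Whether that matters depends on the data; it does not affect the example series.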