I have the following data:
> str(eth)
'data.frame': 11029 obs. of 10 variables:
$ Date : Date, format: "2017-07-01" "2017-07-01" "2017-07-01" "2017-07-01" ...
$ Symbol : Factor w/ 1 level "ETHUSD": 1 1 1 1 1 1 1 1 1 1 ...
$ Open : num 264 257 258 258 260 ...
$ High : num 265 264 261 262 261 ...
$ Low : num 260 256 254 257 253 ...
$ Close : num 263 264 257 258 258 ...
$ Volume.From : num 7221 4975 10747 9118 15402 ...
$ Volume.To : num 1902503 1290128 2765561 2366698 3962669 ...
$ future_12h_high: num NA NA NA NA NA NA NA NA NA NA ...
$ past_12h_high : num NA NA NA NA NA NA NA NA NA NA ...
and:
> head(eth)
Date Symbol Open High Low Close Volume.From Volume.To future_12h_high past_12h_high
1 2017-07-01 ETHUSD 263.84 264.97 260.31 263.12 7221.08 1902503 NA NA
2 2017-07-01 ETHUSD 257.13 264.36 256.03 263.84 4975.12 1290128 NA NA
3 2017-07-01 ETHUSD 258.17 260.56 254.15 257.13 10746.60 2765561 NA NA
4 2017-07-01 ETHUSD 258.49 262.00 257.12 258.17 9118.43 2366698 NA NA
5 2017-07-01 ETHUSD 259.50 260.88 253.23 258.49 15402.48 3962669 NA NA
6 2017-07-01 ETHUSD 263.51 266.73 255.27 259.50 20821.39 5396852 NA NA
and:
> dput(head(eth,10))
structure(list(Date = structure(c(17348, 17348, 17348, 17348,
17348, 17348, 17348, 17348, 17348, 17348), class = "Date"), Symbol = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "ETHUSD", class = "factor"),
Open = c(263.84, 257.13, 258.17, 258.49, 259.5, 263.51, 268,
272.57, 265.74, 268.79), High = c(264.97, 264.36, 260.56,
262, 260.88, 266.73, 268.44, 272.57, 272.74, 269.9), Low = c(260.31,
256.03, 254.15, 257.12, 253.23, 255.27, 262.39, 267.6, 265,
265), Close = c(263.12, 263.84, 257.13, 258.17, 258.49, 259.5,
263.51, 268, 272.57, 265.74), Volume.From = c(7221.08, 4975.12,
10746.6, 9118.43, 15402.48, 20821.39, 7142.36, 4776.58, 5581.66,
6367.05), Volume.To = c(1902503.11, 1290127.76, 2765560.88,
2366698.5, 3962669, 5396852.35, 1894983.33, 1287300.75, 1500282.55,
1702536.85), future_12h_high = c(NA_real_, NA_real_, NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_,
NA_real_), past_12h_high = c(NA_real_, NA_real_, NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_,
NA_real_)), row.names = c(NA, 10L), class = "data.frame")
I'm trying to calculate the past and future highs within a window of 12 data points.
eth <- read.csv("/Users/micahsmith/Downloads/Gdax_ETHUSD_1h.csv") %>%
mutate(Date = as.Date(Date, "%Y-%m-%d %I-%p")) %>%
arrange(Date) %>%
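mutate(future_12h_high = max(lead(High,12)), past_12h_high = max(lag(High,12)))
The code above is incorrect: it calculates the max of the current item and the item 12 indices away. I want to use all of the last 12 items and all of the next 12 items.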
Summary
How can I window over a range of future and past items, rather than just a single value?
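A small check may help show why the attempt above leaves both columns as NA, as in the str() output. This is only a sketch in base R: shift_lead() is a hypothetical stand-in for dplyr's lead(), and the window is shortened to 4 because the sample has only 10 rows.

```r
# First ten High values from the question's dput() output.
High <- c(264.97, 264.36, 260.56, 262, 260.88,
          266.73, 268.44, 272.57, 272.74, 269.90)

# Hypothetical base-R stand-in for dplyr::lead(): shift the vector k places
# towards the start and pad the end with NA.
shift_lead <- function(x, k) c(x[-seq_len(k)], rep(NA, k))

shifted <- shift_lead(High, 4)  # one shifted value per row, NA at the tail

# max() reduces the whole shifted vector to ONE number, and any NA makes
# that number NA; mutate() then recycles it down the entire column.
single_value <- max(shifted)

# Even with na.rm = TRUE the result is a single global maximum,
# not a per-row window maximum.
global_max <- max(shifted, na.rm = TRUE)

print(single_value)
print(global_max)
```

In other words, lead() and lag() only move the series; they do not define a window, so a rolling function is still needed.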
Answered 2018-10-07 15:09:40
I've reduced the window from 12 to 4 because the sample data doesn't have 12 observations, but here is an approach using RcppRoll:
# install.packages("RcppRoll")
library(RcppRoll)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
eth %>% mutate(Date = as.Date(Date, "%Y-%m-%d %I-%p")) %>%
arrange(Date) %>%
mutate(future_12h_high = roll_maxl(High, 4),
past_12h_high = roll_maxr(High, 4))
#> Date Symbol Open High Low Close Volume.From Volume.To
#> 1 2017-07-01 ETHUSD 263.84 264.97 260.31 263.12 7221.08 1902503
#> 2 2017-07-01 ETHUSD 257.13 264.36 256.03 263.84 4975.12 1290128
#> 3 2017-07-01 ETHUSD 258.17 260.56 254.15 257.13 10746.60 2765561
#> 4 2017-07-01 ETHUSD 258.49 262.00 257.12 258.17 9118.43 2366698
#> 5 2017-07-01 ETHUSD 259.50 260.88 253.23 258.49 15402.48 3962669
#> 6 2017-07-01 ETHUSD 263.51 266.73 255.27 259.50 20821.39 5396852
#> 7 2017-07-01 ETHUSD 268.00 268.44 262.39 263.51 7142.36 1894983
#> 8 2017-07-01 ETHUSD 272.57 272.57 267.60 268.00 4776.58 1287301
#> 9 2017-07-01 ETHUSD 265.74 272.74 265.00 272.57 5581.66 1500283
#> 10 2017-07-01 ETHUSD 268.79 269.90 265.00 265.74 6367.05 1702537
#> future_12h_high past_12h_high
#> 1 264.97 NA
#> 2 264.36 NA
#> 3 266.73 NA
#> 4 268.44 264.97
#> 5 272.57 264.36
#> 6 272.74 266.73
#> 7 272.74 268.44
#> 8 NA 272.57
#> 9 NA 272.74
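#> 10 NA 272.74
As the Window functions vignette says, "Rolling aggregates operate in a fixed width window. You won't find them in base R or in dplyr, but there are many implementations in other packages, such as RcppRoll."
roll_maxr(High, 4) takes the maximum of High over a rolling window of length 4, right-aligned (so the current observation and the three observations before it). roll_maxl(High, 4) does the same thing with left alignment (so the current observation and the three observations after it).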
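To make the two alignments concrete, here is a minimal base R sketch of the same computation. roll_max_sketch() is a hypothetical helper written for illustration, not part of RcppRoll; it assumes a full window is required and returns NA otherwise.

```r
# First ten High values from the question's dput() output.
High <- c(264.97, 264.36, 260.56, 262, 260.88,
          266.73, 268.44, 272.57, 272.74, 269.90)

# Hypothetical helper: rolling maximum of width n, NA where the
# window would run past either end of the series.
roll_max_sketch <- function(x, n, align = c("right", "left")) {
  align <- match.arg(align)
  vapply(seq_along(x), function(i) {
    idx <- if (align == "right") (i - n + 1):i else i:(i + n - 1)
    if (idx[1] < 1 || idx[n] > length(x)) NA_real_ else max(x[idx])
  }, numeric(1))
}

past_high   <- roll_max_sketch(High, 4, "right")  # current row and the 3 before it
future_high <- roll_max_sketch(High, 4, "left")   # current row and the 3 after it

# If the current row should NOT count (strictly the previous / next n rows),
# shift each result by one position.
past_excl   <- c(NA, head(past_high, -1))
future_excl <- c(tail(future_high, -1), NA)

print(data.frame(High, past_high, future_high, past_excl, future_excl))
```

Whether the current observation belongs in the window is an assumption about what the question wants; with RcppRoll the same shift could be done by wrapping the rolling call in dplyr's lag() or lead().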
Answered 2018-10-07 15:35:47
Similar to @duckmayr's answer, you can also use rollapplyr from the zoo package:
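library(dplyr)
library(zoo)
# df is the data frame from the question (called eth there)
df %>%
  mutate(Date = as.Date(Date, "%Y-%m-%d %I-%p")) %>%
  arrange(Date) %>%
  # rollapplyr() defaults to align = "right"; passing align = "left" overrides it
  mutate(past_12h_high = rollapplyr(High, 4, max, align = "right", partial = FALSE, fill = NA),
         future_12h_high = rollapplyr(High, 4, max, align = "left", partial = FALSE, fill = NA))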
Date Symbol Open High Low Close Volume.From Volume.To
1 2017-07-01 ETHUSD 263.84 264.97 260.31 263.12 7221.08 1902503
2 2017-07-01 ETHUSD 257.13 264.36 256.03 263.84 4975.12 1290128
3 2017-07-01 ETHUSD 258.17 260.56 254.15 257.13 10746.60 2765561
4 2017-07-01 ETHUSD 258.49 262.00 257.12 258.17 9118.43 2366698
5 2017-07-01 ETHUSD 259.50 260.88 253.23 258.49 15402.48 3962669
6 2017-07-01 ETHUSD 263.51 266.73 255.27 259.50 20821.39 5396852
7 2017-07-01 ETHUSD 268.00 268.44 262.39 263.51 7142.36 1894983
8 2017-07-01 ETHUSD 272.57 272.57 267.60 268.00 4776.58 1287301
9 2017-07-01 ETHUSD 265.74 272.74 265.00 272.57 5581.66 1500283
10 2017-07-01 ETHUSD 268.79 269.90 265.00 265.74 6367.05 1702537
future_12h_high past_12h_high
1 264.97 NA
2 264.36 NA
3 266.73 NA
4 268.44 264.97
5 272.57 264.36
6 272.74 266.73
7 272.74 268.44
8 NA 272.57
9 NA 272.74
10 NA 272.74
In addition, with partial = TRUE, rollapplyr returns a value (instead of NA) even when the window contains fewer observations than the specified width:
df %>%
mutate(Date = as.Date(Date, "%Y-%m-%d %I-%p")) %>%
arrange(Date) %>%
mutate(past_12h_high = rollapplyr(High, 4, max, align = "right", partial = TRUE, fill = NA),
future_12h_high = rollapplyr(High, 4, max, align = "left", partial = TRUE, fill = NA))
Date Symbol Open High Low Close Volume.From Volume.To
1 2017-07-01 ETHUSD 263.84 264.97 260.31 263.12 7221.08 1902503
2 2017-07-01 ETHUSD 257.13 264.36 256.03 263.84 4975.12 1290128
3 2017-07-01 ETHUSD 258.17 260.56 254.15 257.13 10746.60 2765561
4 2017-07-01 ETHUSD 258.49 262.00 257.12 258.17 9118.43 2366698
5 2017-07-01 ETHUSD 259.50 260.88 253.23 258.49 15402.48 3962669
6 2017-07-01 ETHUSD 263.51 266.73 255.27 259.50 20821.39 5396852
7 2017-07-01 ETHUSD 268.00 268.44 262.39 263.51 7142.36 1894983
8 2017-07-01 ETHUSD 272.57 272.57 267.60 268.00 4776.58 1287301
9 2017-07-01 ETHUSD 265.74 272.74 265.00 272.57 5581.66 1500283
10 2017-07-01 ETHUSD 268.79 269.90 265.00 265.74 6367.05 1702537
future_12h_high past_12h_high
1 264.97 264.97
2 264.36 264.97
3 266.73 264.97
4 268.44 264.97
5 272.57 264.36
6 272.74 266.73
7 272.74 268.44
8 272.74 272.57
9 272.74 272.74
10 269.90 272.74
https://stackoverflow.com/questions/52689594
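As a rough illustration of what partial = TRUE does in the zoo answer, the sketch below truncates the window at the ends of the series instead of returning NA. roll_max_partial() is a hypothetical base R helper, not zoo's implementation.

```r
# First ten High values from the question's dput() output.
High <- c(264.97, 264.36, 260.56, 262, 260.88,
          266.73, 268.44, 272.57, 272.74, 269.90)

# Hypothetical helper: rolling maximum of width n whose window is
# clipped to the available rows near the start or end of the series.
roll_max_partial <- function(x, n, align = c("right", "left")) {
  align <- match.arg(align)
  vapply(seq_along(x), function(i) {
    lo <- if (align == "right") max(1, i - n + 1) else i
    hi <- if (align == "right") i else min(length(x), i + n - 1)
    max(x[lo:hi])
  }, numeric(1))
}

past_partial   <- roll_max_partial(High, 4, "right")
future_partial <- roll_max_partial(High, 4, "left")

print(data.frame(High, future_partial, past_partial))
```

The trade-off is that the first and last few rows are then based on fewer than n observations, which may or may not be acceptable depending on how the highs are used downstream.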