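Using the FANG data, suppose I know that there is a smooth relationship between volume and opening price. I also know that the length of the most predictive rolling mean varies by stock: for some it is short, 1 or 2 days; for others it is 10. I want to create multiple rolling means, with window lengths between 2 and 10 days, for each stock.
So far I have tried the tibbletime package and made a start: I can compute multiple rolling means for one of the stocks.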
library(tibbletime)
library(tidyverse)
data("FANG")
FB <- FANG %>% filter(symbol == "FB")
meanstep <- seq(2, 10, 1)
col_names <- map_chr(meanstep, ~paste0("rollmean_", .x))
rollers <- map(meanstep, ~rollify(mean, window = .x)) %>% set_names(nm = col_names)
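FB_multiroll <- bind_cols(FB, invoke_map(rollers, x = FB$volume))
However, I can't figure out how to do this when grouping by multiple stocks.
I tried adding:
FANG_with_multiroll <- FANG %>% group_by(symbol) %>% bind_cols(FANG, invoke_map(rollers, x = FANG$volume))
but that doesn't work. It creates the rolling means, but not by group. Instead, it takes the whole data frame and ignores `symbol`. Any ideas would be greatly appreciated. Once I have it working, I plan to find the highest correlation or R-squared for each symbol. If you have a better way to do that as well, I'm interested.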
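Posted on 2019-03-04 01:59:09
The OP has tagged this question with dplyr and purrr, but the question has gone unanswered for 6 months.
With version 1.12.0 (13 January 2019), the data.table package gained the frollmean() function, which can be used to create multiple rolling means of different lengths by group.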
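As a small illustration of why one call can fill several columns at once: when `frollmean()` is given a vector of window lengths, it returns a list with one element per window. The vector `x` and the name `res` below are made up for the sketch and are not part of the FANG data.

```r
library(data.table)  # frollmean() needs data.table 1.12.0 or later

# Hypothetical toy input, not taken from FANG
x <- c(1, 2, 3, 4, 5)

# Two window lengths in one call: the result is a list of two vectors
res <- frollmean(x, c(2, 3))

res[[1]]  # window of 2: NA 1.5 2.5 3.5 4.5
res[[2]]  # window of 3: NA NA 2 3 4
```

Each list element can then be assigned to its own column with `:=`, and `by =` restarts the windows for every group.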
data(FANG, package = "tibbletime")
library(data.table) # version 1.12.0 +
meanstep <- 2:10
FANG_with_multiroll <- as.data.table(FANG)[
, sprintf("rollmean_%02i", meanstep) := frollmean(volume, meanstep), by = symbol][]
FANG_with_multiroll
      symbol       date   open   high     low  close   volume adjusted rollmean_02 rollmean_03
   1:     FB 2013-01-02  27.44  28.18  27.420  28.00 69846400    28.00          NA          NA
   2:     FB 2013-01-03  27.88  28.47  27.590  27.77 63140600    27.77    66493500          NA
   3:     FB 2013-01-04  28.01  28.93  27.830  28.76 72715400    28.76    67928000  68567466.7
   4:     FB 2013-01-07  28.69  29.79  28.650  29.42 83781800    29.42    78248600  73212600.0
   5:     FB 2013-01-08  29.51  29.60  28.860  29.06 45871300    29.06    64826550  67456166.7
  ---
4028:   GOOG 2016-12-23 790.90 792.74 787.280 789.91   623400   789.91      796250    933733.3
4029:   GOOG 2016-12-27 790.68 797.86 787.657 791.55   789100   791.55      706250    793866.7
4030:   GOOG 2016-12-28 793.70 794.23 783.200 785.05  1132700   785.05      960900    848400.0
4031:   GOOG 2016-12-29 783.33 785.93 778.920 782.79   742200   782.79      937450    888000.0
4032:   GOOG 2016-12-30 782.75 782.78 770.410 771.82  1760200   771.82     1251200   1211700.0
      rollmean_04 rollmean_05 rollmean_06 rollmean_07 rollmean_08 rollmean_09 rollmean_10
   1:          NA          NA          NA          NA          NA          NA          NA
   2:          NA          NA          NA          NA          NA          NA          NA
   3:          NA          NA          NA          NA          NA          NA          NA
   4:    72371050          NA          NA          NA          NA          NA          NA
   5:    66377275    67071100          NA          NA          NA          NA          NA
  ---
4028:      931575      990440   1230083.3     1286314     1333588     1420944     1488560
4029:      897575      903080    956883.3     1167086     1224163     1273089     1357760
4030:      878575      944600    941350.0      982000     1162788     1214000     1259050
4031:      821850      851300    910866.7      912900      952025     1116056     1166820
4032:     1106050     1009520   1002783.3     1032200     1018813     1041822     1180470
To demonstrate that this works for each group, we can print the first few rows of each group (and only the first 10 columns):
FANG_with_multiroll[, head(.SD, 3), .SDcols = 1:10, by = symbol]
    symbol symbol       date     open     high      low    close   volume  adjusted rollmean_02 rollmean_03
 1:     FB     FB 2013-01-02  27.4400  28.1800  27.4200  28.0000 69846400  28.00000          NA          NA
 2:     FB     FB 2013-01-03  27.8800  28.4700  27.5900  27.7700 63140600  27.77000    66493500          NA
 3:     FB     FB 2013-01-04  28.0100  28.9300  27.8300  28.7600 72715400  28.76000    67928000    68567467
 4:   AMZN   AMZN 2013-01-02 256.0800 258.1000 253.2600 257.3100  3271000 257.31000          NA          NA
 5:   AMZN   AMZN 2013-01-03 257.2700 260.8800 256.3700 258.4800  2750900 258.48001     3010950          NA
 6:   AMZN   AMZN 2013-01-04 257.5800 259.8000 256.6500 259.1500  1874200 259.14999     2312550     2632033
 7:   NFLX   NFLX 2013-01-02  95.2100  95.8100  90.6900  92.0100 19431300  13.14429          NA          NA
 8:   NFLX   NFLX 2013-01-03  91.9700  97.9200  91.5300  96.5900 27912500  13.79857    23671900          NA
 9:   NFLX   NFLX 2013-01-04  96.5400  97.7100  95.5400  95.9800 17761100  13.71143    22836800    21701633
10:   GOOG   GOOG 2013-01-02 719.4212 727.0013 716.5512 723.2512  5101500 361.26435          NA          NA
11:   GOOG   GOOG 2013-01-03 724.9313 731.9312 720.7212 723.6713  4653700 361.47415     4877600          NA
12:   GOOG   GOOG 2013-01-04 729.3412 741.4713 727.6812 737.9713  5547600 368.61701     5100650     5100933
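The question also mentions wanting, for each symbol, the window whose rolling mean correlates most strongly with the opening price. The following is only a rough sketch of one way that might be done, run on hypothetical toy data rather than FANG. The names `dt`, `roll_cols`, `cors`, `cors_long` and `best` are illustrative, and choosing the window by absolute Pearson correlation is an assumption about what "highest correlation" should mean here.

```r
library(data.table)  # frollmean() needs data.table 1.12.0 or later

# Hypothetical toy data: two made-up symbols with 30 rows each
set.seed(1)
dt <- data.table(
  symbol = rep(c("AAA", "BBB"), each = 30),
  open   = runif(60, 10, 20),
  volume = runif(60, 1e6, 2e6)
)

meanstep  <- 2:10
roll_cols <- sprintf("rollmean_%02i", meanstep)

# One rolling-mean column per window length, computed within each symbol
dt[, (roll_cols) := frollmean(volume, meanstep), by = symbol]

# Correlation of every rolling mean with `open`, per symbol;
# rows where the rolling mean is still NA are dropped
cors <- dt[, lapply(.SD, function(x) cor(x, open, use = "complete.obs")),
           by = symbol, .SDcols = roll_cols]

# Long format, then keep the window with the largest absolute correlation
cors_long <- melt(cors, id.vars = "symbol",
                  variable.name = "window", value.name = "correlation")
best <- cors_long[, .SD[which.max(abs(correlation))], by = symbol]
best
```

With the real data, `dt` would be replaced by `as.data.table(FANG)`. Note that the correlations for longer windows are computed on slightly fewer rows, because each window leaves its own run of leading `NA` values.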
https://stackoverflow.com/questions/52187359