如果我两个月的回报中有任何一个滞后小于-0.7,我想将下一个closeret观察值设置为NA。
我试过:
x <- x %>%
group_by(seriesid) %>%
mutate(first = { twomonthreturn <= (-0.7) } %>% { . * !duplicated(.) } ) %>%
mutate(first = ifelse(first==1,datem,NA)) #%>%
mutate(closeret = ifelse(datem<=sum(first,na.rm = TRUE),closeret,NA))其中datem =年份(日期)*12+month(日期)
date seriesid price closeret twomonthreturn
2018-07-25 50005 3.100 NA NA
2018-08-14 50005 2.500-0.19354839 NA
2018-09-28 50005 2.350 -0.06000000 -0.24193548
2018-10-27 50005 0.800 -0.65957447 -0.68000000
2018-11-27 50005 0.500 -0.37500000 -0.7872340
2018-12-31 50005 0.300 -0.40000000 -0.62500000
2019-01-26 50005 0.360 0.20000000 -0.28000000
2019-02-23 50005 0.300 -0.16666667 0.00000000
2017-01-21 50006 7.000 NA NA
2017-03-28 50006 9.750 NA NA
2017-04-14 50006 8.875 -0.08974359 NA
2017-05-20 50006 9.000 0.01408451 -0.07692308
2017-06-22 50006 9.000 0.00000000 0.01408451我正在寻找每个组的解决方案,我的解决方案适用于第一组,但当然,如果没有第一组,因为它是NA,这个解决方案不起作用。
date seriesid price closeret twomonthreturn
2018-07-25 50005 3.100 NA NA
2018-08-14 50005 2.500-0.19354839 NA
2018-09-28 50005 2.350 -0.06000000 -0.24193548
2018-10-27 50005 0.800 -0.65957447 -0.68000000
2018-11-27 50005 0.500 -0.37500000 -0.7872340
2018-12-31 50005 0.300 NA -0.62500000
2019-01-26 50005 0.360 NA -0.28000000
2019-02-23 50005 0.300 NA 0.00000000
2017-01-21 50006 7.000 NA NA
2017-03-28 50006 9.750 NA NA
2017-04-14 50006 8.875 -0.08974359 NA
2017-05-20 50006 9.000 0.01408451 -0.07692308
2017-06-22 50006 9.000 0.00000000 0.01408451发布于 2019-05-03 17:29:30
希望这次能成功:
解决方案
将replace添加到cumsum(lag(replace_na(twomonthreturn, Inf), , Inf) <= -.7) > 0 (在twomonthreturn中的- 0.7之后用作组中所有值的索引),因此它将closered中的所有此类值替换为NA。
library(tidyverse)
dat %>%
arrange(seriesid, date) %>%
group_by(seriesid) %>%
mutate(
closeret = replace(
closeret,
cumsum(lag(replace_na(twomonthreturn, Inf), , Inf) <= -.7) > 0,
NA_real_
)
) %>%
ungroup()输出
# A tibble: 13 x 5
date seriesid price closeret twomonthreturn
<date> <dbl> <dbl> <dbl> <dbl>
1 2018-07-25 50005 3.1 NA NA
2 2018-08-14 50005 2.5 -0.194 NA
3 2018-09-28 50005 2.35 -0.06 -0.242
4 2018-10-27 50005 0.8 -0.660 -0.68
5 2018-11-27 50005 0.5 -0.375 -0.787
6 2018-12-31 50005 0.3 NA -0.625
7 2019-01-26 50005 0.36 NA -0.28
8 2019-02-23 50005 0.3 NA 0
9 2017-01-21 50006 7 NA NA
10 2017-03-28 50006 9.75 NA NA
11 2017-04-14 50006 8.88 -0.0897 NA
12 2017-05-20 50006 9 0.0141 -0.0769
13 2017-06-22 50006 9 0 0.0141数据
dat <- read.table(
text = "date seriesid price closeret twomonthreturn
2018-07-25 50005 3.100 NA NA
2018-08-14 50005 2.500 -0.19354839 NA
2018-09-28 50005 2.350 -0.06000000 -0.24193548
2018-10-27 50005 0.800 -0.65957447 -0.68000000
2018-11-27 50005 0.500 -0.37500000 -0.7872340
2018-12-31 50005 0.300 -0.40000000 -0.62500000
2019-01-26 50005 0.360 0.20000000 -0.28000000
2019-02-23 50005 0.300 -0.16666667 0.00000000
2017-01-21 50006 7.000 NA NA
2017-03-28 50006 9.750 NA NA
2017-04-14 50006 8.875 -0.08974359 NA
2017-05-20 50006 9.000 0.01408451 -0.07692308
2017-06-22 50006 9.000 0.00000000 0.01408451",
colClasses = c("Date", "numeric", "numeric", "numeric", "numeric"),
header = TRUE
)https://stackoverflow.com/questions/55964659
复制相似问题