Using lagged results within the same dplyr mutate() call

Stack Overflow user
Asked 2021-06-17 09:30:13
Answers: 2 · Views: 537 · Followers: 0 · Votes: 5

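I want to replicate the formula below in R using dplyr and lag(). The code works up to the second row of each group and then keeps returning 0.

forecast = lag(Value, 1) * (1 - lag(Attrition) / 52)

Conditions:

  1. The first forecast value should be empty (NA), because we already have that value.
  2. The second row is computed from the previous row's Attrition and Value.
  3. From the third row on, the previous values should be taken from the forecast column (not the Value column) and the Attrition column.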
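
Read together, the three conditions describe a recurrence within each Patch group. The following is one way to write it down, using symbols introduced here only for illustration: V_t for Value, A_t for Attrition, and f_t for forecast in row t of a group.

```latex
f_1 = \text{NA}, \qquad
f_2 = V_1 \left(1 - \frac{A_1}{52}\right), \qquad
f_t = f_{t-1} \left(1 - \frac{A_{t-1}}{52}\right) \quad \text{for } t \ge 3
```

The dependence of f_t on f_{t-1} is what a single vectorised lag() over an existing column cannot express.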

I get 0 from the third row onward. Here is my reproducible code.

data <- data %>% group_by(Patch) %>% mutate(id = row_number())
data <- data %>% group_by(Patch) %>% mutate(forecast = lag(Value,1)*(1-lag(Attrition,1)/52))

tbl_df(data)
# A tibble: 12 x 6
   Patch Week       Value Attrition    id forecast
   <chr> <date>     <dbl>     <dbl> <int>    <dbl>
 1 11P11 2021-06-14     2     0.075     1   NA    
 2 11P11 2021-06-21     0     0.075     2    2.00 
 3 11P11 2021-06-28     0     0.075     3    0    
 4 11P12 2021-06-14     3     0.075     1   NA    
 5 11P12 2021-06-21     0     0.075     2    3.00 
 6 11P12 2021-06-28     0     0.075     3    0    
 7 11P12 2021-07-05     0     0.075     4    0    
 8 11P13 2021-06-14     1     0.075     1   NA    
 9 11P13 2021-06-21     0     0.075     2    0.999
10 11P13 2021-06-28     0     0.075     3    0    
11 11P13 2021-07-05     0     0.075     4    0    
12 11P13 2021-07-12     0     0.075     5    0   


> dput(data)
structure(list(Patch = c("11P11", "11P11", "11P11", "11P12", 
"11P12", "11P12", "11P12", "11P13", "11P13", "11P13", "11P13", 
"11P13"), Week = structure(c(18792, 18799, 18806, 18792, 18799, 
18806, 18813, 18792, 18799, 18806, 18813, 18820), class = "Date"), 
    Value = c(2, 0, 0, 3, 0, 0, 0, 1, 0, 0, 0, 0), Attrition = c(0.075, 
    0.075, 0.075, 0.075, 0.075, 0.075, 0.075, 0.075, 0.075, 0.075, 
    0.075, 0.075), id = c(1L, 2L, 3L, 1L, 2L, 3L, 4L, 1L, 2L, 
    3L, 4L, 5L), forecast = c(NA, 1.99711538461538, 0, NA, 2.99567307692308, 
    0, 0, NA, 0.998557692307692, 0, 0, 0)), row.names = c(NA, 
-12L), groups = structure(list(Patch = c("11P11", "11P12", "11P13"
), .rows = structure(list(1:3, 4:7, 8:12), ptype = integer(0), class = c("vctrs_list_of", 
"vctrs_vctr", "list"))), row.names = c(NA, -3L), class = c("tbl_df", 
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame")) 

2 Answers

Stack Overflow user

Answered 2021-06-17 16:05:11

Updated solution

Here is a simple solution using base::Reduce:

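# df is the question's data frame; each Patch is processed separately.
# Note: 2:(nrow(x)-1) assumes every group has at least 3 rows.
do.call(rbind, lapply(split(df, df$Patch), function(x) {
  # Row 2 receives Value[1] unchanged (the init value); each later row
  # multiplies the previous result by (1 - Attrition[b]/52).
  # accumulate = TRUE keeps every intermediate result.
  x$forecast <- c(NA, Reduce(function(a, b) {
    a * (1 - (x$Attrition[b]/52))
  }, 2:(nrow(x)-1), init = x$Value[1], accumulate = TRUE))
  x
}))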

   Patch       Week Value Attrition id  forecast
1  11P11 2021-06-14     2     0.075  1        NA
2  11P11 2021-06-21     0     0.075  2 2.0000000
3  11P11 2021-06-28     0     0.075  3 1.9971154
4  11P12 2021-06-14     3     0.075  1        NA
5  11P12 2021-06-21     0     0.075  2 3.0000000
6  11P12 2021-06-28     0     0.075  3 2.9956731
7  11P12 2021-07-05     0     0.075  4 2.9913524
8  11P13 2021-06-14     1     0.075  1        NA
9  11P13 2021-06-21     0     0.075  2 1.0000000
10 11P13 2021-06-28     0     0.075  3 0.9985577
11 11P13 2021-07-05     0     0.075  4 0.9971175
12 11P13 2021-07-12     0     0.075  5 0.9956793
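
Because the recurrence only ever multiplies, it can also be written in closed form inside one grouped mutate(). The sketch below is not from the original answers. It assumes that only the first Value of each group matters, as in the question's data where all later values are 0, and it uses a made-up `toy` data frame. It follows the question's formula literally, so row 2 already includes one attrition step (about 1.997 rather than the 2.000 shown in the table above).

```r
library(dplyr)

# Toy data shaped like the question's (two groups only, for brevity)
toy <- data.frame(
  Patch     = c("11P11", "11P11", "11P11", "11P12", "11P12", "11P12", "11P12"),
  Value     = c(2, 0, 0, 3, 0, 0, 0),
  Attrition = 0.075
)

out <- toy %>%
  group_by(Patch) %>%
  # cumprod() gives the running product of the weekly retention factors;
  # scaling by the first Value and shifting with lag() leaves row 1 as NA.
  # Assumption: later non-zero Values, if any, are ignored.
  mutate(forecast = lag(first(Value) * cumprod(1 - Attrition / 52))) %>%
  ungroup()

print(out)
```

This avoids the explicit fold, at the cost of only working when each step is a pure multiplication of the previous forecast.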

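Earlier approach

You can also use the following approach. I first apply the mutate formula to the data set to get the first value of the forecast series. I then filter out the first row of each group, which contains the NA value. After that I use the accumulate function to compute the series you want, passing the first forecast value as the .init argument. Finally, the result is bound back to the rows that contain the NA values: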

library(dplyr)
library(purrr)

df %>%
  group_by(Patch) %>%
  mutate(forecast = lag(Value)*(1-(lag(Attrition)/52))) %>%
  filter(between(row_number(), 2, n())) %>%
  mutate(forecast = accumulate(Attrition[-1], .init = forecast[1], ~ ..1 * (1-(..2/52)))) %>%
  bind_rows(df %>% group_by(Patch) %>%
              mutate(forecast = lag(Value)*(1-(lag(Attrition)/52))) %>%
              slice_head()) %>%
  ungroup() %>%
  arrange(Patch, Week)

# A tibble: 12 x 6
   Patch Week       Value Attrition    id forecast
   <chr> <date>     <dbl>     <dbl> <int>    <dbl>
 1 11P11 2021-06-14     2     0.075     1   NA    
 2 11P11 2021-06-21     0     0.075     2    2.00 
 3 11P11 2021-06-28     0     0.075     3    1.99 
 4 11P12 2021-06-14     3     0.075     1   NA    
 5 11P12 2021-06-21     0     0.075     2    3.00 
 6 11P12 2021-06-28     0     0.075     3    2.99 
 7 11P12 2021-07-05     0     0.075     4    2.99 
 8 11P13 2021-06-14     1     0.075     1   NA    
 9 11P13 2021-06-21     0     0.075     2    0.999
10 11P13 2021-06-28     0     0.075     3    0.997
11 11P13 2021-07-05     0     0.075     4    0.996
12 11P13 2021-07-12     0     0.075     5    0.994
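
To see what the accumulate() step contributes on its own, here is a minimal sketch on a plain vector. The inputs `start` and `rates` are made-up illustration values, not taken from the answer.

```r
library(purrr)

# Toy inputs (made up for illustration): a starting forecast and the
# attrition rates of the rows that follow it.
start <- 2
rates <- c(0.075, 0.075)

# In the lambda, .x is the running result and .y is the next element of
# `rates`. .init seeds the running result and is returned as the first
# element, so the output is one longer than `rates`.
steps <- accumulate(rates, ~ .x * (1 - .y / 52), .init = start)

print(steps)
```

This is why the answer passes Attrition[-1] rather than the whole column: with .init supplied, the result already has one extra element.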
Votes: 2

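Stack Overflow user

Answered 2021-06-17 15:00:38

The tricky part here is that the forecast variable has to be built sequentially, which is why it does not work in a standard mutate() call. Within a single mutate(), lag(Value) only shifts the existing Value column, so rows after the second see 0 rather than the previous forecast.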
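
To make the sequential dependency concrete, here is a minimal sketch with an explicit loop. `forecast_group` is a hypothetical helper name introduced for illustration, not something from the original answers.

```r
# Hypothetical helper (not from the original answers): builds the forecast
# for one group row by row, so each step can read the previous forecast.
forecast_group <- function(value, attrition) {
  n <- length(value)
  forecast <- rep(NA_real_, n)  # row 1 stays NA
  if (n >= 2) {
    # Row 2 starts from the observed Value of row 1
    forecast[2] <- value[1] * (1 - attrition[1] / 52)
  }
  if (n >= 3) {
    for (i in 3:n) {
      # Later rows start from the previous forecast, not from Value
      forecast[i] <- forecast[i - 1] * (1 - attrition[i - 1] / 52)
    }
  }
  forecast
}

# One group from the question's data: Value = 2, 0, 0 and Attrition = 0.075
res <- forecast_group(c(2, 0, 0), rep(0.075, 3))
print(res)
```

Since the helper takes and returns plain vectors, it could be called per group, for example as `mutate(forecast = forecast_group(Value, Attrition))` after `group_by(Patch)`.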

Here is my approach, relying on purrr's map() and reduce() for the data wrangling:

library(tidyverse)

data %>%
  mutate(forecast = NA) %>%
  split(~ Patch) %>%
  map(~ .x %>%
          # turn each row into its own one-row tibble
          pmap(~ tibble(...)) %>%
          # fold the rows back together, filling in forecast as we go
          reduce(\(.x, .y) {
            prev <- slice_tail(.x)
            # use Value on the first step, the running forecast afterwards
            base_value <- ifelse(prev$Value != 0, prev$Value, prev$forecast)
            bind_rows(.x,
                      mutate(.y,
                             forecast = base_value * (1 - prev$Attrition / 52)))
          })) %>%
  reduce(bind_rows)

This returns:

# A tibble: 12 x 6
   Patch Week       Value Attrition    id forecast
   <chr> <date>     <dbl>     <dbl> <int>    <dbl>
 1 11P11 2021-06-14     2     0.075     1   NA
 2 11P11 2021-06-21     0     0.075     2    2.00
 3 11P11 2021-06-28     0     0.075     3    1.99
 4 11P12 2021-06-14     3     0.075     1   NA
 5 11P12 2021-06-21     0     0.075     2    3.00
 6 11P12 2021-06-28     0     0.075     3    2.99
 7 11P12 2021-07-05     0     0.075     4    2.99
 8 11P13 2021-06-14     1     0.075     1   NA
 9 11P13 2021-06-21     0     0.075     2    0.999
10 11P13 2021-06-28     0     0.075     3    0.997
11 11P13 2021-07-05     0     0.075     4    0.996
12 11P13 2021-07-12     0     0.075     5    0.994

Data used:

data <- structure(list(Patch = c("11P11", "11P11", "11P11", "11P12", "11P12", "11P12", "11P12", "11P13", "11P13", "11P13", "11P13", "11P13"), Week = structure(c(18792, 18799, 18806, 18792, 18799, 18806, 18813, 18792, 18799, 18806, 18813, 18820), class = "Date"), Value = c(2, 0, 0, 3, 0, 0, 0, 1, 0, 0, 0, 0), Attrition = c(0.075, 0.075, 0.075, 0.075, 0.075, 0.075, 0.075, 0.075, 0.075, 0.075, 0.075, 0.075), id = c(1L, 2L, 3L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 5L), forecast = c(NA, 1.99711538461538, 0, NA, 2.99567307692308, 0, 0, NA, 0.998557692307692, 0, 0, 0)), row.names = c(NA, -12L), groups = structure(list(Patch = c("11P11", "11P12", "11P13"), .rows = structure(list(1:3, 4:7, 8:12), ptype = integer(0), class = c("vctrs_list_of", "vctrs_vctr", "list"))), row.names = c(NA, -3L), class = c("tbl_df", "tbl", "data.frame"), .drop = TRUE), class = c("grouped_df", "tbl_df", "tbl", "data.frame")) 
Votes: 1

Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/68016611
