
"factor season has new levels 4" when running Arima by group in R

Stack Overflow user
Asked 2018-12-24 12:33:18
Answers: 1 · Views: 243 · Following: 0 · Votes: 2

Here is a sample of my dataset:

ts=structure(list(Data = structure(c(10L, 14L, 18L, 22L, 26L, 29L, 
32L, 35L, 38L, 1L, 4L, 7L, 11L, 15L, 19L, 23L, 27L, 30L, 33L, 
36L, 39L, 2L, 5L, 8L, 12L, 16L, 20L, 24L, 28L, 31L, 34L, 37L, 
40L, 3L, 6L, 9L, 13L, 17L, 21L, 25L), .Label = c("01.01.2018", 
"01.01.2019", "01.01.2020", "01.02.2018", "01.02.2019", "01.02.2020", 
"01.03.2018", "01.03.2019", "01.03.2020", "01.04.2017", "01.04.2018", 
"01.04.2019", "01.04.2020", "01.05.2017", "01.05.2018", "01.05.2019", 
"01.05.2020", "01.06.2017", "01.06.2018", "01.06.2019", "01.06.2020", 
"01.07.2017", "01.07.2018", "01.07.2019", "01.07.2020", "01.08.2017", 
"01.08.2018", "01.08.2019", "01.09.2017", "01.09.2018", "01.09.2019", 
"01.10.2017", "01.10.2018", "01.10.2019", "01.11.2017", "01.11.2018", 
"01.11.2019", "01.12.2017", "01.12.2018", "01.12.2019"), class = "factor"), 
    client = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L), .Label = c("Horns", "Kornev"), class = "factor"), stuff = structure(c(1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 
    3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("chickens", 
    "hooves", "Oysters"), class = "factor"), Sales = c(374L, 
    12L, 120L, 242L, 227L, 268L, 280L, 419L, 12L, 172L, 336L, 
    117L, 108L, 150L, 90L, 117L, 116L, 146L, 120L, 211L, 213L, 
    67L, 146L, 118L, 152L, 122L, 201L, 497L, 522L, 65L, 268L, 
    441L, 247L, 348L, 445L, 477L, 62L, 226L, 476L, 306L)), .Names = c("Data", 
"client", "stuff", "Sales"), class = "data.frame", row.names = c(NA, 
-40L))

I want to fit ARIMA time series models by group.

library(forecast)   # tslm(), auto.arima(), forecast()
library(lubridate)  # decimal_date()

#if using dummy
fun_tslm <- function(x, start = "2017-01-04", freq = 12){
  tsw <- ts(x[["Sales"]], start = decimal_date(as.Date(start)), frequency = freq)
  #View(tsw)
  mytslm <- tslm(tsw ~ trend + season)
  mytslm
}

fun_forecast <- function(x, h = 14){
  residarima1 <- auto.arima(x[["residuals"]])
  residualsArimaForecast <- forecast(residarima1, h = h)
  residualsF <- as.numeric(residualsArimaForecast$mean)
  regressionForecast <- forecast(x, h = h)
  regressionF <- as.numeric(regressionForecast$mean)
  forecastR <- regressionF + residualsF
  forecastR
}

tslm_list <- lapply(group_list, fun_tslm)
fore_list <- lapply(tslm_list, fun_forecast)

When I run this script, I get the error:

Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) : factor season has new levels 4

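What I actually want is ARIMA-based output in which I can see:

1. the forecasts for the initial (observed) values

2. the forecast for 14 months with confidence intervals

The initial values and the forecast values should end up in two different data.frames. How can I do that?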
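
A note on the error message: R's model-frame machinery raises it whenever a prediction needs a factor level that the fitted model never saw. One plausible reading, and it is only an assumption because `group_list` is not shown, is that some group's series is too short to cover every `season` level the 14-step forecast needs. The base R sketch below reproduces the same kind of error with made-up data and a plain `lm()`; it illustrates the mechanism and is not a diagnosis of the actual data.

```r
# Made-up training data: the factor `season` only has levels 1, 2 and 3
train <- data.frame(
  y      = c(10, 12, 11, 13, 9, 14),
  season = factor(c(1, 2, 3, 1, 2, 3))
)
fit <- lm(y ~ season, data = train)

# Predicting for a level the model never saw fails inside model.frame.default()
new_data <- data.frame(season = factor(4))
msg <- tryCatch(
  {
    predict(fit, newdata = new_data)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

If this is what happens here, giving each series at least one full year of observations (or dropping the `season` term for short groups) would be the first thing to try.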


1 Answer

Stack Overflow user

Accepted answer

Answered 2018-12-24 13:56:29

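Some parts of your script and data are not entirely clear, so I can only try to give you a partial answer that shows how to get toward the result you want: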

# I renamed your dataset to `timeseries`, because `ts` is also a base R function
timeseries <- ts  # `ts` here is the data frame from the question

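Now, the idea is to convert your data frame into a list in which each component is one group, i.e. one time series. I assumed that each group is client + stuff, but this can be handled differently: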

# first the grouping variable
timeseries$group <- paste0(timeseries$client,timeseries$stuff)

# EDIT here you convert the Data class as class(date)
library(lubridate)
timeseries$Data <- dmy(timeseries$Data)

# now the list
listed <- split(timeseries,timeseries$group)
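
As a variation on the grouping step, `split()` also accepts a list of factors, so the pasted `group` column is optional. A minimal sketch with a made-up data frame (the rows are illustrative; only the column names follow the question). `drop = TRUE` discards client/stuff combinations that never occur:

```r
# Made-up rows using the same column names as the question's data
toy <- data.frame(
  client = c("Horns", "Horns", "Kornev", "Horns"),
  stuff  = c("chickens", "chickens", "Oysters", "hooves"),
  Sales  = c(374L, 12L, 117L, 497L),
  stringsAsFactors = TRUE
)

# Split on both grouping columns at once; list names take the form "client.stuff"
listed_toy <- split(toy, list(toy$client, toy$stuff), drop = TRUE)

names(listed_toy)
sapply(listed_toy, nrow)
```

The pasted-key version in the answer gives the same groups; the list form mainly avoids adding a helper column.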

Now we have to define each component of the list as a time series, using lapply and the ts function:

 # I do not understand why all your ts start with "2017-01-04", when in the example they are not (probably because it's an example)

 # EDIT: convert the start date
 listed_ts <- lapply(listed,
                     function(x) ts(x[["Sales"]], start = ymd("2017-01-04"), frequency = 12)  ) 

    listed_ts
$`Hornschickens`
      Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov
17170 374  12 120 242 227 268 280 419  12 172 336

$Hornshooves
      Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
17170 497 522  65 268 441 247 348 445 477  62 226 476
17171 306                                            

$KornevOysters
      Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
17170 117 108 150  90 117 116 146 120 211 213  67 146
17171 118 152 122 201       
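
Note that the year column in the output above reads 17170. `ts()` treats `start` as a plain number, and a `Date` is stored as the number of days since 1970-01-01, so `ymd("2017-01-04")` ends up as "year" 17170. A possible alternative, sketched below under the assumption that each group is monthly, sorted by date, and has no gaps, derives `start = c(year, month)` from the group's first date. The helper name `to_monthly_ts` is hypothetical.

```r
# Hypothetical helper: build a monthly ts whose start comes from the first date.
# Assumes `dates` is a Date vector sorted ascending with no missing months.
to_monthly_ts <- function(sales, dates) {
  first <- min(dates)
  ts(sales,
     start = c(as.integer(format(first, "%Y")), as.integer(format(first, "%m"))),
     frequency = 12)
}

# A Date is stored as days since 1970-01-01, which is where 17170 comes from
as.numeric(as.Date("2017-01-04"))

# First rows of the Horns/chickens group from the question (dd.mm.yyyy dates)
dates <- as.Date(c("01.04.2017", "01.05.2017", "01.06.2017"), format = "%d.%m.%Y")
sales <- c(374L, 12L, 120L)

chickens_ts <- to_monthly_ts(sales, dates)
start(chickens_ts)  # year 2017, period 4 (April)
```

With the list from the answer this could be applied as `lapply(listed, function(x) to_monthly_ts(x[["Sales"]], x[["Data"]]))`, provided `Data` has already been converted with `dmy()`.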

The next step is to lapply auto.arima over each time series:

library(forecast)
listed_arima <- lapply(listed_ts,function(x) auto.arima(x) )
# partial result
> listed_arima
$`Hornschickens`
Series: x 
ARIMA(0,0,0) with non-zero mean 

Coefficients:
          mean
      223.8182
s.e.   38.7707

sigma^2 estimated as 18188:  log likelihood=-69.03
AIC=142.06   AICc=143.56   BIC=142.86
...

Now, the forecast for each arima:

listed_forecast <- lapply(listed_arima,function(x) forecast(x,1) )

If you need to flatten the result into a data.frame, do.call and rbind help:

do.call(rbind,listed_forecast)

              method                            model   level     mean     lower     upper     x          series fitted     residuals 
Hornschickens "ARIMA(0,0,0) with non-zero mean" List,18 Numeric,2 223.8182 Numeric,2 Numeric,2 Integer,11 "x"    Numeric,11 Numeric,11
Hornshooves   "ARIMA(0,0,0) with non-zero mean" List,18 Numeric,2 336.9231 Numeric,2 Numeric,2 Integer,13 "x"    Numeric,13 Numeric,13
KornevOysters "ARIMA(0,0,0) with non-zero mean" List,18 Numeric,2 137.125  Numeric,2 Numeric,2 Integer,16 "x"    Numeric,16 Numeric,16
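
The `rbind` result above keeps most fields as nested list cells. If the goal is the two data frames described in the question, one possible approach is to pull the numeric pieces out of each forecast object explicitly. The sketch below uses simulated series as a stand-in for `listed_ts`; the column names and the 95% level are illustrative choices, not something from the original answer.

```r
library(forecast)

set.seed(1)
# Simulated monthly series standing in for listed_ts (made-up data)
listed_ts <- list(
  a = ts(rnorm(24, mean = 200, sd = 30), start = c(2017, 4), frequency = 12),
  b = ts(rnorm(18, mean = 100, sd = 10), start = c(2018, 3), frequency = 12)
)

listed_arima <- lapply(listed_ts, auto.arima)

# Data frame 1: 14-step forecasts with 95% interval bounds, one row per step
forecast_df <- do.call(rbind, lapply(names(listed_arima), function(g) {
  fc <- forecast(listed_arima[[g]], h = 14, level = 95)
  data.frame(group    = g,
             step     = seq_along(fc$mean),
             forecast = as.numeric(fc$mean),
             lower95  = as.numeric(fc$lower[, 1]),
             upper95  = as.numeric(fc$upper[, 1]))
}))

# Data frame 2: observed values next to the in-sample fitted values
fitted_df <- do.call(rbind, lapply(names(listed_arima), function(g) {
  data.frame(group    = g,
             observed = as.numeric(listed_ts[[g]]),
             fitted   = as.numeric(fitted(listed_arima[[g]])))
}))

head(forecast_df)
head(fitted_df)
```

Keeping `group` as a column makes both results easy to filter or plot per group afterwards.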

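I think you can tweak this a bit further to get better results. If you are wondering why, in this example, asking forecast() for more than 1 step ahead still returns a constant, the answer (linked in the original post) is what the method column of the output indicates: the selected model is "ARIMA(0,0,0) with non-zero mean".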
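
To see why such a forecast is flat: an ARIMA(0,0,0) model with a non-zero mean has no autoregressive or moving-average terms, so its prediction at every horizon is simply the estimated mean. A small check using the Horns/chickens sales from the question; base `arima()` with a fixed order is used here in place of `auto.arima()` so that the sketch has no package dependency:

```r
# Horns/chickens sales from the question's data
x <- c(374, 12, 120, 242, 227, 268, 280, 419, 12, 172, 336)

# ARIMA(0,0,0) with a mean term: the same model form auto.arima() selected
fit <- arima(x, order = c(0, 0, 0), include.mean = TRUE)

# Every step of a 5-step forecast equals the estimated mean
pred <- predict(fit, n.ahead = 5)$pred
as.numeric(pred)
mean(x)  # 223.8182, the value reported in the answer's output
```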

Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/53913526
