es() - Exponential Smoothing

Ivan Svetunkov

2022-03-29

es() is a part of the smooth package. It constructs Exponential Smoothing (also known as ETS) models, selects the most appropriate one among the 30 possible ones, allows including exogenous variables and much more.

We will use some of the functions of the greybox package.

Let’s load the necessary packages:

require(smooth)
require(greybox)

The simplest call of this function is:

es(BJsales, h=12, holdout=TRUE, silent=FALSE)
## Forming the pool of models based on... ANN, AAN, Estimation progress:    33%44%56%67%78%89%100%... Done!
## Time elapsed: 0.57 seconds
## Model estimated: ETS(AMdN)
## Persistence vector g:
##  alpha   beta 
## 1.0000 0.2629 
## Damping parameter: 0.8857
## Initial values were optimised.
## 
## Loss function type: likelihood; Loss function value: 237.6755
## Error standard deviation: 1.3796
## Sample size: 138
## Number of estimated parameters: 5
## Number of provided parameters: 1
## Number of degrees of freedom: 133
## Information criteria:
##      AIC     AICc      BIC     BICc 
## 485.3509 485.8055 499.9872 501.1070 
## 
## Forecast errors:
## MPE: 1.1%; sCE: 14.8%; Asymmetry: 87.2%; MAPE: 1.1%
## MASE: 2.488; sMAE: 1.3%; sMSE: 0%; rMAE: 0.956; rRMSE: 0.951

In this case the function uses a branch-and-bound algorithm to form a pool of models to check and then constructs the model with the lowest information criterion. As we can see, it also produces an output with brief information about the model, which contains:

  1. How much time elapsed during the model construction;
  2. What type of ETS was selected;
  3. The values of the persistence vector (smoothing parameters);
  4. What type of initialisation was used;
  5. How many parameters were estimated (the standard deviation of the error term is included in this number);
  6. The standard deviation of the residuals (which in this case is small relative to the level of the series);
  7. The loss function type and its value;
  8. Information criteria for this model;
  9. Forecast errors (because we have set holdout=TRUE).

The function has also produced a graph with actual values, fitted values and point forecasts.
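
Most of these quantities can also be extracted from a saved model object. Below is a minimal sketch, assuming that the standard AIC() and BIC() generics work on smooth objects (they rely on the logLik() method) and using the persistence and initial elements that appear later in this vignette; ourFit is just an illustrative name:

# Estimate the model quietly and inspect some of its components
ourFit <- es(BJsales, h=12, holdout=TRUE, silent=TRUE)
ourFit$persistence   # smoothing parameters (alpha, beta, ...)
ourFit$initial       # initial values of the states
AIC(ourFit)          # information criteria via the standard generics
BIC(ourFit)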

If we need a prediction interval, then we run:

es(BJsales, h=12, holdout=TRUE, interval=TRUE, silent=FALSE)
## Forming the pool of models based on... ANN, AAN, Estimation progress:    33%44%56%67%78%89%100%... Done!
## Time elapsed: 1.08 seconds
## Model estimated: ETS(AMdN)
## Persistence vector g:
##  alpha   beta 
## 1.0000 0.2629 
## Damping parameter: 0.8857
## Initial values were optimised.
## 
## Loss function type: likelihood; Loss function value: 237.6755
## Error standard deviation: 1.3796
## Sample size: 138
## Number of estimated parameters: 5
## Number of provided parameters: 1
## Number of degrees of freedom: 133
## Information criteria:
##      AIC     AICc      BIC     BICc 
## 485.3509 485.8055 499.9872 501.1070 
## 
## 95% parametric prediction interval was constructed
## 100% of values are in the prediction interval
## Forecast errors:
## MPE: 1.1%; sCE: 14.8%; Asymmetry: 87.2%; MAPE: 1.1%
## MASE: 2.488; sMAE: 1.3%; sMSE: 0%; rMAE: 0.956; rRMSE: 0.951

Due to the multiplicative components in the selected model, the prediction interval is asymmetric. This is the expected behaviour. The other thing to note is that the output now also reports the nominal level of the prediction interval and its actual coverage in the holdout.

If we save the model (and let’s say we want it to work silently):

ourModel <- es(BJsales, h=12, holdout=TRUE, silent="all")

we can then reuse it for different purposes:

es(BJsales, model=ourModel, h=12, holdout=FALSE, interval="np", level=0.93)
## Time elapsed: 0.01 seconds
## Model estimated: ETS(AMdN)
## Persistence vector g:
##  alpha   beta 
## 1.0000 0.2629 
## Damping parameter: 0.8857
## Initial values were provided by user.
## 
## Loss function type: likelihood; Loss function value: 255.3083
## Error standard deviation: 1.3317
## Sample size: 150
## Number of estimated parameters: 6
## Number of provided parameters: 5
## Number of degrees of freedom: 144
## Information criteria:
##      AIC     AICc      BIC     BICc 
## 512.6165 512.6435 515.6272 515.6949 
## 
## 93% nonparametric prediction interval was constructed
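
Alternatively, once the model is saved, the forecast() method for smooth objects can produce forecasts from it without calling es() again. A minimal sketch, assuming that the interval and level arguments of this method mirror those of es():

# Produce a 12-step ahead forecast with a parametric interval from the saved model
forecast(ourModel, h=12, interval="parametric", level=0.95)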

We can also extract the type of model in order to reuse it later:

modelType(ourModel)
## [1] "AMdN"

This handy function, by the way, also works with ets() from the forecast package.
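
For example, a minimal sketch, assuming that the forecast package is installed:

# modelType() also understands models estimated with forecast::ets()
etsModel <- forecast::ets(BJsales)
modelType(etsModel)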

If we need the actual values from the model, we can use the actuals() method from the greybox package:

actuals(ourModel)
## Time Series:
## Start = 1 
## End = 138 
## Frequency = 1 
##   [1] 200.1 199.5 199.4 198.9 199.0 200.2 198.6 200.0 200.3 201.2 201.6 201.5
##  [13] 201.5 203.5 204.9 207.1 210.5 210.5 209.8 208.8 209.5 213.2 213.7 215.1
##  [25] 218.7 219.8 220.5 223.8 222.8 223.8 221.7 222.3 220.8 219.4 220.1 220.6
##  [37] 218.9 217.8 217.7 215.0 215.3 215.9 216.7 216.7 217.7 218.7 222.9 224.9
##  [49] 222.2 220.7 220.0 218.7 217.0 215.9 215.8 214.1 212.3 213.9 214.6 213.6
##  [61] 212.1 211.4 213.1 212.9 213.3 211.5 212.3 213.0 211.0 210.7 210.1 211.4
##  [73] 210.0 209.7 208.8 208.8 208.8 210.6 211.9 212.8 212.5 214.8 215.3 217.5
##  [85] 218.8 220.7 222.2 226.7 228.4 233.2 235.7 237.1 240.6 243.8 245.3 246.0
##  [97] 246.3 247.7 247.6 247.8 249.4 249.0 249.9 250.5 251.5 249.0 247.6 248.8
## [109] 250.4 250.7 253.0 253.7 255.0 256.2 256.0 257.4 260.4 260.0 261.3 260.4
## [121] 261.6 260.8 259.8 259.0 258.9 257.4 257.7 257.9 257.4 257.3 257.6 258.9
## [133] 257.8 257.7 257.2 257.5 256.8 257.5
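
Other standard extractors can be used in a similar way. A minimal sketch, assuming the usual fitted() and residuals() generics implemented for smooth objects:

# Fitted values and residuals of the saved model
fitted(ourModel)
residuals(ourModel)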

We can then use only the persistence or the initial values from the model to construct another one:

es(BJsales, model=modelType(ourModel), h=12, holdout=FALSE, initial=ourModel$initial, silent="graph")
## Time elapsed: 0.05 seconds
## Model estimated: ETS(AMdN)
## Persistence vector g:
##  alpha   beta 
## 0.9647 0.2934 
## Damping parameter: 0.8691
## Initial values were provided by user.
## 
## Loss function type: likelihood; Loss function value: 255.2422
## Error standard deviation: 1.3447
## Sample size: 150
## Number of estimated parameters: 4
## Number of provided parameters: 2
## Number of degrees of freedom: 146
## Information criteria:
##      AIC     AICc      BIC     BICc 
## 518.4844 518.7602 530.5269 531.2181
es(BJsales, model=modelType(ourModel), h=12, holdout=FALSE, persistence=ourModel$persistence, silent="graph")
## Time elapsed: 0.03 seconds
## Model estimated: ETS(AMdN)
## Persistence vector g:
##  alpha   beta 
## 1.0000 0.2629 
## Damping parameter: 0.8816
## Initial values were optimised.
## 
## Loss function type: likelihood; Loss function value: 255.3053
## Error standard deviation: 1.3453
## Sample size: 150
## Number of estimated parameters: 4
## Number of provided parameters: 2
## Number of degrees of freedom: 146
## Information criteria:
##      AIC     AICc      BIC     BICc 
## 518.6106 518.8864 530.6531 531.3442

or provide some arbitrary values:

es(BJsales, model=modelType(ourModel), h=12, holdout=FALSE, initial=1500, silent="graph")
## Time elapsed: 0.08 seconds
## Model estimated: ETS(AMdN)
## Persistence vector g:
##  alpha   beta 
## 0.9641 0.2944 
## Damping parameter: 0.8684
## Initial values were optimised.
## 
## Loss function type: likelihood; Loss function value: 255.2405
## Error standard deviation: 1.354
## Sample size: 150
## Number of estimated parameters: 6
## Number of degrees of freedom: 144
## Information criteria:
##      AIC     AICc      BIC     BICc 
## 522.4811 523.0685 540.5449 542.0165

Using some other parameters may lead to a completely different model and forecasts:

es(BJsales, h=12, holdout=TRUE, loss="aTMSE", bounds="a", ic="BIC", interval=TRUE)
## Time elapsed: 0.59 seconds
## Model estimated: ETS(AAdN)
## Persistence vector g:
##   alpha    beta 
##  0.5602 -0.0065 
## Damping parameter: 0.9884
## Initial values were optimised.
## 
## Loss function type: aTMSE; Loss function value: 121.6905
## Error standard deviation: 2.0481
## Sample size: 138
## Number of estimated parameters: 6
## Number of degrees of freedom: 132
## Information criteria:
##      AIC     AICc      BIC     BICc 
## 1066.231 1066.873 1083.795 1085.375 
## 
## 95% parametric prediction interval was constructed
## 100% of values are in the prediction interval
## Forecast errors:
## MPE: 1%; sCE: 13.6%; Asymmetry: 86.3%; MAPE: 1%
## MASE: 2.308; sMAE: 1.2%; sMSE: 0%; rMAE: 0.887; rRMSE: 0.88

You can play around with all the available parameters to see what effect they have on the final model.
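
For instance, here is a sketch of a couple of further variations; loss, bounds and ic accept several other options, which are documented in the es() help page:

# MAE loss with the usual bounds on the smoothing parameters and AICc for selection
es(BJsales, h=12, holdout=TRUE, loss="MAE", bounds="u", ic="AICc")
# Trace multistep MSE as the loss function
es(BJsales, h=12, holdout=TRUE, loss="TMSE")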

In order to combine forecasts we need to use the letter “C”:

es(BJsales, model="CCN", h=12, holdout=TRUE, silent="graph")
## Estimation progress:    10%20%30%40%50%60%70%80%90%100%... Done!
## Time elapsed: 0.61 seconds
## Model estimated: ETS(CCN)
## Initial values were optimised.
## 
## Loss function type: likelihood
## Error standard deviation: 1.3546
## Sample size: 138
## Information criteria:
## (combined values)
##      AIC     AICc      BIC     BICc 
## 486.7651 487.2553 501.2497 502.2196 
## 
## Forecast errors:
## MPE: 1.1%; sCE: 14.9%; Asymmetry: 87.7%; MAPE: 1.1%
## MASE: 2.499; sMAE: 1.3%; sMSE: 0%; rMAE: 0.96; rRMSE: 0.956

Model selection from a specified pool and forecast combination are called, respectively, using:

es(BJsales, model=c("ANN","AAN","AAdN","ANA","AAA","AAdA"), h=12, holdout=TRUE, silent="graph")
## Estimation progress:    33%67%100%... Done!
## Time elapsed: 0.14 seconds
## Model estimated: ETS(AAdN)
## Persistence vector g:
##  alpha   beta 
## 0.9437 0.2991 
## Damping parameter: 0.8777
## Initial values were optimised.
## 
## Loss function type: likelihood; Loss function value: 237.5994
## Error standard deviation: 1.3841
## Sample size: 138
## Number of estimated parameters: 6
## Number of degrees of freedom: 132
## Information criteria:
##      AIC     AICc      BIC     BICc 
## 487.1988 487.8400 500.9180 501.6590 
## 
## Forecast errors:
## MPE: 1.1%; sCE: 14.9%; Asymmetry: 88%; MAPE: 1.1%
## MASE: 2.49; sMAE: 1.3%; sMSE: 0%; rMAE: 0.957; rRMSE: 0.953
es(BJsales, model=c("CCC","ANN","AAN","AAdN","ANA","AAA","AAdA"), h=12, holdout=TRUE, silent="graph")
## Estimation progress:    33%67%100%... Done!
## Time elapsed: 0.15 seconds
## Model estimated: ETS(CCC)
## Initial values were optimised.
## 
## Loss function type: likelihood
## Error standard deviation: 1.3566
## Sample size: 138
## Information criteria:
## (combined values)
##      AIC     AICc      BIC     BICc 
## 487.7374 488.3454 501.4108 502.0727 
## 
## Forecast errors:
## MPE: 1.1%; sCE: 15.1%; Asymmetry: 88.6%; MAPE: 1.1%
## MASE: 2.52; sMAE: 1.3%; sMSE: 0%; rMAE: 0.969; rRMSE: 0.966

Now let’s introduce some artificial exogenous variables:

x <- cbind(rnorm(length(BJsales),50,3),rnorm(length(BJsales),100,7))
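
Since these variables are randomly generated, one could fix the seed beforehand to make the example reproducible (a sketch; the seed value is arbitrary):

# Run before the cbind() call above so that the generated variables are reproducible
set.seed(41)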

and fit a model with all the exogenous variables first:

es(BJsales, model="ZZZ", h=12, holdout=TRUE, xreg=x)
## Time elapsed: 0.85 seconds
## Model estimated: ETSX(AAdN)
## Persistence vector g:
##  alpha   beta 
## 0.8953 0.3364 
## Damping parameter: 0.8664
## Initial values were optimised.
## Xreg coefficients were estimated in a normal style
## 
## Loss function type: likelihood; Loss function value: 236.1315
## Error standard deviation: 1.3799
## Sample size: 138
## Number of estimated parameters: 10
## Number of degrees of freedom: 128
## Information criteria:
##      AIC     AICc      BIC     BICc 
## 488.2629 489.3792 511.2062 513.3287 
## 
## Forecast errors:
## MPE: 1%; sCE: 13.5%; Asymmetry: 80.8%; MAPE: 1.1%
## MASE: 2.363; sMAE: 1.2%; sMSE: 0%; rMAE: 0.908; rRMSE: 0.889

or construct a model with the exogenous variables selected based on IC:

es(BJsales, model="ZZZ", h=12, holdout=TRUE, xreg=x, xregDo="select")
## Time elapsed: 1.32 seconds
## Model estimated: ETS(AMdN)
## Persistence vector g:
##  alpha   beta 
## 1.0000 0.2629 
## Damping parameter: 0.8857
## Initial values were optimised.
## 
## Loss function type: likelihood; Loss function value: 237.6755
## Error standard deviation: 1.3796
## Sample size: 138
## Number of estimated parameters: 5
## Number of provided parameters: 1
## Number of degrees of freedom: 133
## Information criteria:
##      AIC     AICc      BIC     BICc 
## 485.3509 485.8055 499.9872 501.1070 
## 
## Forecast errors:
## MPE: 1.1%; sCE: 14.8%; Asymmetry: 87.2%; MAPE: 1.1%
## MASE: 2.488; sMAE: 1.3%; sMSE: 0%; rMAE: 0.956; rRMSE: 0.951

If we want to check whether lagged values of x can be used for forecasting purposes, we can use the xregExpander() function from the greybox package:

es(BJsales, model="ZZZ", h=12, holdout=TRUE, xreg=xregExpander(x), xregDo="select")
## Time elapsed: 2.41 seconds
## Model estimated: ETSX(AAN)
## Persistence vector g:
## alpha  beta 
## 1.000 0.244 
## Initial values were optimised.
## Xreg coefficients were estimated in a normal style
## 
## Loss function type: likelihood; Loss function value: 240.5525
## Error standard deviation: 1.4087
## Sample size: 138
## Number of estimated parameters: 6
## Number of provided parameters: 1
## Number of degrees of freedom: 132
## Information criteria:
##      AIC     AICc      BIC     BICc 
## 490.4731 491.5595 505.7412 506.8610 
## 
## Forecast errors:
## MPE: 1.1%; sCE: 15.5%; Asymmetry: 90.2%; MAPE: 1.2%
## MASE: 2.565; sMAE: 1.3%; sMSE: 0%; rMAE: 0.986; rRMSE: 0.984
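
xregExpander() creates a matrix containing the original variables together with their lags and leads. A minimal sketch, assuming its lags argument, in which negative values denote lags and positive ones denote leads:

# Expand x with lag 1 and lead 1 only and check which columns are produced
xExpanded <- xregExpander(x, lags=c(-1,1))
colnames(xExpanded)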

If we are confused about the type of the estimated model, the formula() method will help us:

formula(ourModel)
## [1] "y[t] = l[t-1] * b[t-1] + e[t]"

Finally, if you work with the M or M3 competition data and need to test a function on a specific time series, you can use the following simplified call:

es(M3$N2457, interval=TRUE, silent=FALSE)

This command takes the data, splits it into the in-sample part and the holdout, and produces a forecast of the appropriate length for the holdout.
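
Note that the M3 object comes from a separate package, so the full call would look like this (a sketch, assuming that the Mcomp package is installed):

# M3 is provided by the Mcomp package
require(Mcomp)
es(M3$N2457, interval=TRUE, silent=FALSE)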