We have already met forecasting in one form or another. First, we extrapolated the deterministic components of regression models into the future, for trend, seasonal and cyclical models. Thus, in the model $x_t = \mu_t + e_t$, where the $e_t$ are uncorrelated with zero mean, the forecast of $x_{n+k}$ based on $x_1, \dots, x_n$ is $\hat{\mu}_{n+k}$, where $\hat{\mu}_{n+k}$ is the estimate of $\mu_{n+k}$ based on $x_1, \dots, x_n$.
This is still appropriate if such components are used in a model for a time series that includes an ARMA error structure. But now the ARMA part also needs to be extrapolated. For uncorrelated errors the extrapolations would simply be zero, but when the errors are correlated the expected future values are not all zero.
We also came across the concept of prediction when introducing autoregressive models. These again exploit regression, in the form of regression on lagged past values to predict the future.
However, the approach in this chapter is model based. Forecasting simply asks where the process that has been modelled is going in the future, given that it has been observed up to the present. We investigate this now for models that can be expressed in the ARIMA form
$$\phi(B)(1-B)^d x_t = \theta(B)\, e_t .$$
To understand prediction of future values of the process we now investigate how a future value $x_{n+k}$ depends in part upon the past (and present) quantities $x_n, x_{n-1}, \dots$ and $e_n, e_{n-1}, \dots$ that are known at time $n$, and in part upon the future innovations $e_{n+1}, e_{n+2}, \dots$. We first look at particular simple examples. We omit the mean $\mu$ for simplicity. The method we use is to write the model down with $t$ replaced by $n+k$. If the model then has terms on its right-hand side which include any of the intermediate future values $x_{n+1}, \dots, x_{n+k-1}$, we successively substitute for each of these so that no such terms remain; only terms in the future innovations $e_{n+1}, \dots, e_{n+k}$ remain on the right, besides those in $x_n, x_{n-1}, \dots$ and $e_n, e_{n-1}, \dots$.
5.2.1 Example: the MA(2) model: $x_t = e_t + \theta_1 e_{t-1} + \theta_2 e_{t-2}$

Replacing $t$ by $n+k$ gives
$$x_{n+k} = e_{n+k} + \theta_1 e_{n+k-1} + \theta_2 e_{n+k-2}.$$
So for $k > 2$, $x_{n+k}$ depends only on future innovations; not at all on the past.
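As a small numerical sketch (coefficient and innovation values below are illustrative, not from the text), the MA(2) forecasts replace unknown future innovations by their mean, zero:

```python
# Sketch: forecasting an MA(2) model x_t = e_t + t1*e_{t-1} + t2*e_{t-2}.
t1, t2 = 0.6, 0.3           # hypothetical MA coefficients
e_n, e_nm1 = 1.2, -0.5      # innovations at times n and n-1, assumed known

def ma2_forecast(k):
    """Forecast of x_{n+k}: future innovations are replaced by zero."""
    if k == 1:
        return t1 * e_n + t2 * e_nm1
    if k == 2:
        return t2 * e_n
    return 0.0               # for k > 2 the forecast is just the mean, zero

forecasts = [ma2_forecast(k) for k in range(1, 5)]
```

Only the first two forecasts use the known innovations; beyond lead 2 the forecast function is flat at the mean.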
5.2.2 Example: the AR(1) model: $x_t = \phi x_{t-1} + e_t$

Replacing $t$ by $n+k$ and substituting successively for $x_{n+k-1}, \dots, x_{n+1}$ gives
$$x_{n+k} = \phi^k x_n + e_{n+k} + \phi\, e_{n+k-1} + \cdots + \phi^{k-1} e_{n+1}.$$
Thus the dependence of $x_{n+k}$ on the past is given by the coefficient $\phi^k$ of $x_n$, decaying geometrically.
Now recall the infinite moving average representation of the AR(1) model, $x_t = \sum_{j=0}^{\infty} \phi^j e_{t-j}$, which after replacing $t$ by $n+k$ becomes
$$x_{n+k} = e_{n+k} + \phi\, e_{n+k-1} + \cdots + \phi^{k-1} e_{n+1} + \phi^k e_n + \phi^{k+1} e_{n-1} + \cdots$$
and note that this gives the same dependence of $x_{n+k}$ upon the innovations that we observed in the previous equation.
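A minimal numerical sketch of the geometric decay (parameter and observation values are illustrative):

```python
# AR(1) forecasts from origin n: the zero-mean forecast of x_{n+k} is phi^k * x_n.
phi = 0.8        # hypothetical AR(1) coefficient, |phi| < 1
x_n = 2.5        # last observed value

# Forecast function for leads k = 1, ..., 10: decays geometrically toward zero.
forecasts = [phi**k * x_n for k in range(1, 11)]
```

Each forecast is $\phi$ times the previous one, so the forecast function decays geometrically to the mean.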
5.2.3 Example: the IMA(1,1) model: $x_t = x_{t-1} + e_t - \theta e_{t-1}$

Replacing $t$ by $n+k$ and substituting successively down to $x_n$ gives
$$x_{n+k} = x_n - \theta e_n + e_{n+k} + (1-\theta)(e_{n+k-1} + \cdots + e_{n+1}).$$
Note now that the infinite moving average representation of $x_t$ for the IMA(1,1) model is formally
$$x_t = e_t + (1-\theta)(e_{t-1} + e_{t-2} + \cdots).$$
We say 'formally' because this sum does not converge, but it does supply the correct coefficients by which the previous equation shows how the innovations affect the future value $x_{n+k}$. In this sense we may still write
$$x_t = \sum_{j=0}^{\infty} \psi_j e_{t-j}, \qquad \psi_0 = 1,$$
where for an ARIMA model the $\psi_j$ are the coefficients in the formal expansion
$$\psi(B) = \frac{\theta(B)}{\phi(B)(1-B)^d}.$$

5.2.4 Forecasts and forecast errors
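As a sketch (not from the text), the $\psi$-weights can be computed numerically from the recursion implied by $\phi(B)(1-B)^d\,\psi(B) = \theta(B)$. The sign conventions below follow the models above, e.g. the IMA(1,1) $x_t = x_{t-1} + e_t - \theta e_{t-1}$ has $\theta(B) = 1 - \theta B$; all parameter values are illustrative.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (constant term first)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def psi_weights(phi, theta, d, n):
    """First n psi-weights of an ARIMA model, from a(B) psi(B) = theta(B)
    where a(B) = phi(B)(1-B)^d, phi(B) = 1 - phi_1 B - ..., theta(B) = 1 + theta_1 B + ..."""
    a = [1.0] + [-c for c in phi]
    for _ in range(d):
        a = poly_mul(a, [1.0, -1.0])     # multiply by (1 - B) for each difference
    th = [1.0] + list(theta)
    psi = []
    for j in range(n):
        t = th[j] if j < len(th) else 0.0
        s = sum(a[i] * psi[j - i] for i in range(1, min(j, len(a) - 1) + 1))
        psi.append(t - s)                # psi_j = theta_j - sum_i a_i psi_{j-i}
    return psi
```

For the IMA(1,1) with $\theta = 0.4$, `psi_weights([], [-0.4], 1, 5)` reproduces the weights $1, 0.6, 0.6, \dots$ seen in the formal expansion above, and `psi_weights([0.8], [], 0, 4)` gives the geometric weights of an AR(1).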
Suppose we have observations of $x_t$ up to and including time $n$ and, through the process of estimating the model, values of $e_t$ also up to and including time $n$. We want to forecast the future values $x_{n+k}$ for $k = 1, 2, \dots$, and we call these forecasts $\hat{x}_n(k)$: the forecast of $x_{n+k}$ made at origin $n$.
The general form of the infinite MA for $x_{n+k}$ may be written
$$x_{n+k} = \sum_{j=0}^{k-1} \psi_j e_{n+k-j} + \sum_{j=k}^{\infty} \psi_j e_{n+k-j}.$$
This shows explicitly that the part of $x_{n+k}$ which depends on future, unknown innovations is
$$e_{n+k} + \psi_1 e_{n+k-1} + \cdots + \psi_{k-1} e_{n+1},$$
and the remaining part of $x_{n+k}$ is known at time $n$. In other words the forecast is
$$\hat{x}_n(k) = \sum_{j=k}^{\infty} \psi_j e_{n+k-j} = \psi_k e_n + \psi_{k+1} e_{n-1} + \cdots .$$
The forecast error variance is therefore
$$\operatorname{Var}\{x_{n+k} - \hat{x}_n(k)\} = \sigma_e^2 \sum_{j=0}^{k-1} \psi_j^2 .$$
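A short sketch of the error-variance formula (the $\psi$-weights and $\sigma_e$ below are illustrative, here those of an IMA(1,1) with $\theta = 0.4$):

```python
import math

sigma_e = 1.0                     # innovation standard deviation (assumed)
psi = [1.0, 0.6, 0.6, 0.6, 0.6]   # psi-weights, truncated for illustration

def forecast_sd(k):
    """Standard deviation of the k-step forecast error:
    sigma_e * sqrt(psi_0^2 + ... + psi_{k-1}^2)."""
    return sigma_e * math.sqrt(sum(p * p for p in psi[:k]))

# Approximate 95% limits at lead k would be: forecast +/- 2 * forecast_sd(k).
sds = [forecast_sd(k) for k in range(1, 6)]
```

The standard deviations grow with lead time, which is why the limits around a forecast function widen.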
When forecasts made using ARIMA models are graphed for increasing lead time $k = 1, 2, \dots$ from a fixed forecast origin $n$, i.e.
$$\hat{x}_n(1),\ \hat{x}_n(2),\ \hat{x}_n(3),\ \dots,$$
the resulting graph is called a forecast function. It is usual to graph the forecast function with limits of $\pm 2$ forecast error standard deviations to indicate the range of possible future paths of the series.
Any two such functions are usually different because they depend on the latest values of the series. However, for a given model a similar pattern is usually evident in all forecast functions. To see why, consider the ARMA($p,q$) model with mean $\mu$, written at time $n+k$ where $k > q$, and set the future innovations $e_{n+1}, \dots, e_{n+k}$ to zero to obtain the forecasts:
$$\hat{x}_n(k) - \mu = \phi_1\{\hat{x}_n(k-1) - \mu\} + \cdots + \phi_p\{\hat{x}_n(k-p) - \mu\},$$
because for $k > q$ all the moving average terms have been set to zero (here $\hat{x}_n(k-i)$ is to be read as the observed value $x_{n+k-i}$ when $k - i \le 0$). This may also be expressed: the deviations $f_k = \hat{x}_n(k) - \mu$ satisfy the difference equation $\phi(B) f_k = 0$, with $B$ operating on the lead time $k$.
This is the same recurrence relationship which generated successive autocorrelations for the ARMA model. If $p = 1$, a geometric decay to the mean of the series results. If $p = 2$ and $\phi(B)$ has complex factors, the values will follow a damped cycle as they decay to $\mu$.
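The recurrence is easy to try numerically. A minimal sketch with hypothetical AR(2) coefficients chosen so that $\phi(B)$ has complex factors (all values illustrative, not from the text; mean taken as zero):

```python
# Forecast recurrence for a zero-mean AR(2): xhat(k) = phi1*xhat(k-1) + phi2*xhat(k-2),
# seeded with the last two observed values.
phi1, phi2 = 1.5, -0.75   # hypothetical coefficients; complex roots -> damped cycle
x_nm1, x_n = 2.0, 3.0     # last two observations

fc = [x_nm1, x_n]         # seed the recurrence with observed values
for k in range(1, 21):
    fc.append(phi1 * fc[-1] + phi2 * fc[-2])
forecasts = fc[2:]        # forecasts for leads 1..20
```

The forecast function oscillates through negative values and decays toward the mean, the damped cycle described above.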
Now consider ARIMA models. For example the IMA(1,1) model with drift $c$:
$$x_t = x_{t-1} + c + e_t - \theta e_{t-1}.$$
Setting future innovations to zero gives
$$\hat{x}_n(k) = x_n - \theta e_n + ck,$$
so that the forecast function follows a trend line with slope $c$. For this model we can show that $\psi_0 = 1$ and $\psi_j = 1 - \theta$ for $j \ge 1$, so that the forecast error variance is $\sigma_e^2\{1 + (k-1)(1-\theta)^2\}$ and the limits gradually widen out around the forecast function.
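A small sketch of this forecast function and its widening limits (all parameter values below are illustrative):

```python
import math

# IMA(1,1) with drift: x_t = x_{t-1} + c + e_t - theta*e_{t-1}
theta, c = 0.4, 0.2        # hypothetical smoothing parameter and drift
x_n, e_n = 10.0, 0.5       # last observation and last estimated innovation
sigma_e = 1.0              # innovation standard deviation (assumed)

def forecast(k):
    """k-step forecast: a straight line with slope c."""
    return x_n - theta * e_n + c * k

def forecast_sd(k):
    """psi-weights are 1, (1-theta), (1-theta), ..., so the k-step error
    variance is sigma_e^2 * (1 + (k-1)*(1-theta)^2)."""
    return sigma_e * math.sqrt(1.0 + (k - 1) * (1.0 - theta) ** 2)
```

Successive forecasts differ by exactly $c$, while the error standard deviation, and hence the limits, keeps growing with the lead time.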
When new observations are made, for example $x_{n+1}$, the new innovations can be regenerated, $e_{n+1} = x_{n+1} - \hat{x}_n(1)$, and at each time point in succession forecasts can be generated by the means described. It is, however, possible to develop equations to update the forecast function in a convenient way. The EWMA updating equation is a good example. The idea is to update from $\hat{x}_n(k+1)$ to $\hat{x}_{n+1}(k)$. From the expression given earlier for the forecasts in terms of innovations we can get
$$\hat{x}_{n+1}(k) = \hat{x}_n(k+1) + \psi_k e_{n+1}.$$
By expressing $\hat{x}_{n+1}(k)$ and $\hat{x}_n(k+1)$ in terms of the observations, this can be re-arranged to give the updating formula. For the IMA(1,1) model we start with
$$\hat{x}_{n+1}(1) = \hat{x}_n(2) + (1-\theta) e_{n+1},$$
and using $\hat{x}_n(2) = \hat{x}_n(1)$ and $e_{n+1} = x_{n+1} - \hat{x}_n(1)$ gives
$$\hat{x}_{n+1}(1) = (1-\theta) x_{n+1} + \theta\, \hat{x}_n(1),$$
the exponentially weighted moving average.
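The updating formula is one line of code. A sketch, with an illustrative $\theta$ and made-up observations:

```python
# EWMA update for the IMA(1,1) one-step forecast:
# new forecast = (1-theta)*new observation + theta*old forecast.
theta = 0.4                 # hypothetical IMA(1,1) parameter
alpha = 1.0 - theta         # the EWMA smoothing constant

def ewma_update(prev_forecast, new_obs):
    """Equivalent to prev_forecast + alpha*(new_obs - prev_forecast)."""
    return alpha * new_obs + theta * prev_forecast

f = 10.0                          # one-step forecast made at time n
for x in [10.8, 9.9, 10.4]:       # successive new observations
    f = ewma_update(f, x)         # roll the forecast forward one step at a time
```

Each new observation pulls the forecast a fraction $\alpha = 1-\theta$ of the way toward itself, which is exactly the error-correction form of the update.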
We conclude with some illustrations of forecast functions in Figure 26. In all of these the model is fitted to the data points before the forecasts start. The first of the four panels in Figure 26 shows forecasts of the weekly Financial Transaction series. The logarithms of the series are modelled by an IMA(1,1) model, but the forecasts and limits are transformed back to the original scale (which is why the forecast function is not constant). The second shows forecasts of the daily temperatures, using an AR(1) model. The reversion of the forecast function to the mean level can be seen as the error limits widen.
The third and fourth are forecasts of the annual sunspot series. An AR(2) model is used in the third, and an AR(8) in the fourth. The forecasts from the AR(2) model are seen to damp down more quickly than those from the AR(8) model, which better captures the dynamics of the series. The error limits here are actually only 90% limits, i.e. plus or minus 1.65 standard deviations. The errors for increasing lead time are generally strongly correlated, so if the actual series value lies outside the limits at some point, it is quite likely that values close in time will also do so.
The following four figures show forecasts of the selected series using, respectively, IMA(1,1), AR(1), AR(2) and AR(8) models.
Forecasting is one of the most challenging applications of univariate time series analysis. There are many other applications, most of them involving smoothing or extracting components of a series, for which model-based approaches have many advantages. Classification of series is also important, with applications in medical diagnosis and seismic analysis. Multivariate time series opens up the new field of modelling the dependence between time series, which also has widespread applications.