by Julian Faraway and Chris Chatfield
ABSTRACT
Using the famous airline data as the main example, a variety of neural
network (NN) models are fitted and the resulting forecasts are
compared with those obtained from the (Box-Jenkins) seasonal ARIMA
model called the airline model. The results suggest that there is
plenty of scope for going badly wrong with NN models and that it is
unwise to apply them blindly in `black-box' mode. Rather, the wise
analyst needs to use traditional modelling skills to select a good NN
model, for example in making a careful choice of input variables. The
BIC criterion is recommended for comparing different NN models. Great
care is also needed when fitting a NN model and using it to produce
forecasts. Methods of examining the response surface implied by a NN
model are presented, as well as alternative procedures using Generalized
Additive Models and Projection Pursuit Regression.
SOFTWARE
At the time the paper was written in 1995, S-PLUS was the dominant
implementation of the S language. R has now taken over. I have
modified the original code to reproduce the results described in
Appendix A of the paper:
> source("http://people.bath.ac.uk/jjf23/papers/neural/nnts.R")
> air <- scan("http://people.bath.ac.uk/jjf23/papers/neural/air.data")/100
> nmod <- nnts(air[1:132],c(1,12),2,retry=50)
> summary(nmod)
a 2-2-1 network with 9 weights
Unit 0 is constant one input
Input units: Lag 1=1, Lag 12=2,
Hidden units are 3 4
Output unit is 5
0->3 1->3 2->3 0->4 1->4 2->4 0->5 3->5 4->5
0.08 0.64 -0.57 -0.01 1.06 -1.10 -11.64 49.62 -28.34
Sum of squares is 2.3019
AIC : -456.45 , BIC : -422.37 , residual se : 0.14401
> predict(nmod,12)
[1] 3.9943 3.8001 4.3880 4.3703 4.6560 5.1996 5.9289 6.0780 4.9708 4.4144
[11] 3.9926 4.4695
The numerical results are not identical to those shown in Appendix A
because the optimisation used in the NN fitting employs random
restarts, so the results will differ slightly each time the fit is
repeated. Here are more details on how to call the function and the
output it produces.
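If reproducible results are wanted, the run-to-run variation from the random restarts can be removed by fixing R's random seed before fitting. This is a minimal sketch; it assumes `nnts()` draws its randomness from R's generator (via nnet), and the seed value 123 is arbitrary:

```r
## Fix the seed so the random restarts, and hence the fitted weights
## and forecasts, are the same on every run (seed value is arbitrary)
set.seed(123)
nmod <- nnts(air[1:132], c(1,12), 2, retry = 50)
```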
The R code used here is just a wrapper around the nnet R package.
Alternatively, you can lag the appropriate input variables yourself
and feed them to the nnet() function directly.
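A minimal sketch of that direct approach, constructing the lag-1 and lag-12 input matrix by hand and fitting a 2-2-1 network with nnet(); the variable names and the choice of `maxit` are illustrative, and the forecast here is a single one-step-ahead prediction rather than the recursive 12-step forecasts produced by the wrapper:

```r
library(nnet)

air <- scan("http://people.bath.ac.uk/jjf23/papers/neural/air.data")/100

## Response: observations 13..132; inputs: the series lagged 1 and 12 steps
y <- air[13:132]
X <- cbind(lag1  = air[12:131],   # value one month earlier
           lag12 = air[1:120])    # value twelve months earlier

## size = 2 hidden units, linear output unit as appropriate for regression
fit <- nnet(X, y, size = 2, linout = TRUE, maxit = 500)

## One-step-ahead forecast for observation 133
predict(fit, cbind(lag1 = air[132], lag12 = air[121]))
```

Forecasting further ahead would require feeding each prediction back in as the lag-1 input, which is what the nnts() wrapper's predict method automates.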
Last modified: Sat Aug 18 11:38:12 PDT 2012