Statistics for dynamic modelling
Please note that this course will be quite substantially re-written for 2012/13, when it will become
'Applied Statistical Inference'.
The aim is to show you how to use Maximum Likelihood Estimation and Bayesian
methods in practice, to do statistics with any statistical model you can write down.
Of course in practice there are limits to how far we will get towards this aim, but
a surprising amount of progress is possible.
The re-written course will focus on
- using maximum likelihood estimation and associated general large sample
theory in practice, in particular via numerical optimization.
- using the Bayesian approach to statistics via Markov Chain Monte Carlo methods.
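As a rough illustration of the first item, here is a minimal sketch of maximum likelihood estimation by numerical optimization, with a large sample standard error taken from the curvature of the log-likelihood. It is written in Python using only the standard library purely for illustration (the reading list for the course points to R). The exponential model, the made-up data and the function names are all assumptions chosen to keep the example small; they are not taken from the course material.

```python
import math

# Made-up data, assumed to be i.i.d. exponential waiting times (illustration only).
data = [0.8, 1.9, 0.3, 2.4, 1.1, 0.6, 3.2, 0.9]

def neg_log_lik(theta, x):
    """Negative log-likelihood of exponential data, with theta = log(rate)."""
    return -(len(x) * theta - math.exp(theta) * sum(x))

def golden_section_min(f, lo, hi, tol=1e-10):
    """Minimise a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while abs(b - a) > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return (a + b) / 2

def objective(theta):
    return neg_log_lik(theta, data)

# Numerical maximisation of the likelihood (minimisation of its negative log).
theta_hat = golden_section_min(objective, -10.0, 10.0)

# Observed information: second derivative of the negative log-likelihood,
# approximated here by a central finite difference.
h = 1e-4
info = (objective(theta_hat + h) - 2 * objective(theta_hat)
        + objective(theta_hat - h)) / h ** 2
se = 1 / math.sqrt(info)

# Back-transform the estimate and an approximate 95% interval to the rate scale.
rate_hat = math.exp(theta_hat)
ci = (math.exp(theta_hat - 1.96 * se), math.exp(theta_hat + 1.96 * se))
print(rate_hat, se, ci)
```

For this particular model the answer is available analytically (the estimated rate is the reciprocal of the sample mean), which gives a check on the numerical result. The point of the numerical route is that the same recipe carries over to models with no closed form, given a suitable optimizer.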
The course will no longer focus on dynamic models, and will take the MLE and Bayesian
material at a much gentler pace (and in a bit more depth) than the current course, at
the expense of the very specialized methods for highly non-linear dynamics covered at
the end of the current course.
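To give a flavour of the Markov Chain Monte Carlo approach mentioned above, here is a minimal sketch of a random walk Metropolis sampler, again in standard-library Python purely for illustration. The exponential model, the Gamma(1, 1) prior on the rate, the made-up data, the step size and the burn-in length are all assumptions made for the example, not choices taken from the course.

```python
import math
import random

# Made-up data, assumed to be i.i.d. exponential waiting times (illustration only).
data = [0.8, 1.9, 0.3, 2.4, 1.1, 0.6, 3.2, 0.9]
n, total = len(data), sum(data)

# Assumed Gamma(A, B) prior on the rate (shape A, rate B).
A, B = 1.0, 1.0

def log_post(theta):
    """Unnormalised log posterior density of theta = log(rate).

    Prior times likelihood on the rate scale, plus the log Jacobian term
    theta that arises from working with log(rate).
    """
    return (A + n) * theta - (B + total) * math.exp(theta)

def random_walk_metropolis(log_target, start, n_iter, step, seed=1):
    """Random walk Metropolis with Gaussian proposals of standard deviation step."""
    rng = random.Random(seed)
    theta, lp = start, log_target(start)
    chain, n_accept = [], 0
    for _ in range(n_iter):
        prop = theta + rng.gauss(0.0, step)
        lp_prop = log_target(prop)
        # Accept with probability min(1, target(prop) / target(theta)).
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            theta, lp = prop, lp_prop
            n_accept += 1
        chain.append(theta)
    return chain, n_accept / n_iter

chain, accept_rate = random_walk_metropolis(log_post, 0.0, 20000, 0.5)

# Discard an initial burn-in, then summarise the posterior on the rate scale.
rates = [math.exp(t) for t in chain[2000:]]
post_mean = sum(rates) / len(rates)
print(post_mean, accept_rate)
```

With this conjugate prior the posterior is known exactly to be Gamma(A + n, B + total), with mean (A + n) / (B + total), so the simulated posterior mean can be checked against it. The sampler itself only needs the log posterior up to an additive constant, which is what makes the method usable for models where no such closed form exists.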
There is an assessed practical with this unit. If
you are enrolled on this course for credit it is your responsibility to
make sure that you have organised a group (of 3) with which to complete
this assessment. The assessment counts for 40% of the course mark.
Marking scheme for the practical
- First: 14-20 A project that could be given to the epidemiologists who collected the data with little or no revision,
without the risk of misleading. Analysis should be soundly done so that
conclusions are well supported statistically. Interpretation should be reasonably mature. The project should demonstrate
a clear overview of the work, without getting lost in details, and be free of any except minor statistical errors.
- 2.1: 12-13 A project that could be given to the epidemiologists after a round of revision, but without having to re-do much
of the actual analysis. Some substantial flaws in the analysis or presentation (or more minor flaws in both), but basically
sound. A good grasp of the statistics and context, so that interpretation is reasonable.
- 2.2: 10-11 Major re-working required before the project could be presented to an epidemiologist, but containing some
sound statistics demonstrating understanding. Reasonable presentation and organisation.
- Third: 8-9 Major flaws in analysis and presentation, but demonstrating some understanding of statistics, and a
reasonable attempt to present the results.
- Fail: 0-7 Flawed analysis demonstrating little or no statistical understanding, and/or incomprehensible or very
badly organised presentation.
Hints:
- Concise and precise is better than long and vague.
- I'm not interested in how fancy you can make the analysis. Indeed, properly answering the questions with simple analysis
is better than answering them with large quantities of complicated analysis. The point is to select carefully from the methods
you have met in order to best answer the questions, not to try to showcase everything covered in the module.
- Do make sure that you justify both your conclusions and the approach taken to arriving at them.
Here is a list of background reading:
- An Introduction to R. Follow the Documentation links at CRAN for further
introductory material on R.
- For background on the theory of maximum likelihood estimation you
might like to look at pages 102-113 of Wood (2006) "Generalized Additive
Models: An Introduction with R", CRC Press, which covers all the theory we
will use, and gives further references.
- For a slower, and more detailed, look at likelihood based inference
from a practical perspective, take a look at this course on inference.
- The prerequisite for the course is MA20226 or equivalent. The notes
for the course are available online.
- If you don't have the prerequisites for the course and need to catch
up, you might want to try these introductory statistics notes.
- For the Bayesian MCMC material Section 15.8 of Press, Teukolsky,
Vetterling and Flannery (2007) "Numerical Recipes" (3rd Edition) is quite
a nice short introduction, while Gamerman's "Markov Chain Monte Carlo"
CRC (there are a couple of editions) is more detailed, but also very
clear.
- Chatfield "The Analysis of Time Series" CRC (various editions)
provides an excellent introduction to its subject.
- Gurney and Nisbet (1998) "Ecological Dynamics" Oxford, is a
very nice book on biological modelling aimed squarely at real systems.
- Britton (2003) "Essential Mathematical Biology" gives a good overview
of somewhat more abstract biological models.
- Turchin (2003) "Complex Population Dynamics", Princeton, is an
impressive example of how statistics and dynamic modelling can be
combined. One can argue with the technical approach, but the
basic philosophy is almost surely right.
Past papers...
Data...
- The urchin data. These are digitized from figure 4.12 p107 of Gurney and
Nisbet (1998) Ecological Dynamics, Oxford. Units are 1000 mm^3 and years.
- The algae in chemostat data. Algal cell
counts versus hour.
- lbm.dat.
Notes...
Practicals (it's a really bad idea to look at the solutions before you
have got code to work for the labs, even if you struggle with them - the
struggling is part of the learning for this sort of work.) The solutions
are deliberately intended to be a bit cryptic if you have not attempted
the lab.