MA40189: Topics in Bayesian statistics

MA40189: Topics in Bayesian statistics

Quick links

Lecture notes and summaries

Printed lecture notes: pdf version and html version
Last lecture material (overview, visualiser notes, Panopto recording, references to printed notes)

Question sheets and solutions
Past exam papers and solutions
Panopto Re:view folder of recordings for the course

Lectures and timetable information

	Lecturer:	Simon Shaw; s.shaw at bath.ac.uk
Timetable:	Lectures:	Tuesday 15:15-16:05 (8W2.30) and Thursday 11:15-12:05 (8W1.1).
	Problems classes:	Thursday 17:15-18:05 (8W2.10).
	Office hours:	I am happy to discuss any matters relating to the course at any time, either via email or one-to-one. If you would like to meet then just send me an email, with a list of proposed times and whether you wish to meet in-person or on Teams.

A schedule for the course is available in either pdf or html.

Syllabus

Credits:	6
Level:	Masters
Period:	Semester 2
Assessment:	EX 100%
Other work:	There will be weekly question sheets. These will be set and handed in during problems classes. Any work submitted by the hand-in deadline will be marked and returned to you. Full solutions to all exercises and general feedback sheets will be made available.
Requisites:	Before taking this unit you must take MA40092 (home-page).
Description:	Aims & Learning Objectives: Aims: To introduce students to the ideas and techniques that underpin the theory and practice of the Bayesian approach to statistics. Objectives: Students should be able to formulate the Bayesian treatment and analysis of many familiar statistical problems. Content: Bayesian methods provide an alternative approach to data analysis, which has the ability to incorporate prior knowledge about a parameter of interest into the statistical model. The prior knowledge takes the form of a prior (to sampling) distribution on the parameter space, which is updated to a posterior distribution via Bayes' Theorem, using the data. Summaries about the parameter are described using the posterior distribution. The Bayesian Paradigm; decision theory; utility theory; exchangeability; Representation Theorem; prior, posterior and predictive distributions; conjugate priors. Tools to undertake a Bayesian statistical analysis will also be introduced. Simulation based methods such as Markov Chain Monte Carlo and importance sampling for use when analytical methods fail.

Some useful books

We won't follow a book as such but useful references, in ascending order of difficulty, include:

Peter M. Lee, Bayesian Statistics: an introduction, Fourth Edition, 2012.
Very readable, introductory text. The full text is available as an e-book here. The Third Edition is also available in the library. Further details about the book can be found here. This includes all the exercises in the book and their solutions.
Andrew Gelman, John B. Carlin, Hal S. Stern, David B. Dunson, Aki Vehtari, and Donald B. Rubin, Bayesian Data Analysis, Third Edition, 2014. 512.795 GEL
A slightly more advanced introductory text with a focus upon practical applications. The full text is available as an e-book, either by following the link from the library here or directly here. The library holds earlier editions of the book, the first edition can be found here and the second edition here. Further details about the book can be found on Andrew Gelman's pages here. This includes some of the solutions to exercises in the book. Andrew Gelman also has a blog which often raises some interesting statistical topics, frequently related to current news topics.
Christian P. Robert, The Bayesian Choice From Decision-Theoretic Foundations to Computational Implementation, Second Edition, 2007.
This book is not in the library but the full text is available here. A really nice book; Christian Robert also has a wide-ranging blog.
Anthony O'Hagan, Kendall's Advanced Theory of Statistics Volume 2B Bayesian Inference, 1994. 512.795 KEN
A harder, more advanced, book than the previous two but a rewarding and insightful one. It has a good mix of theory and foundations: a personal favourite.
Jose M. Bernardo and Adrian F.M. Smith, Bayesian Theory, 1994. 512.795 BER
The classic graduate text. Develops the Bayesian view from a foundational standpoint. The full text is available as an e-book, either by following the link from the library here or directly here. A very readable short overview to Bayesian statistics written by Jose Bernardo can be downloaded from here.

A very readable account of the historical development and use of Bayesian statistics aimed at a general audience is given in the following book.

Sharon Bertsch Mcgrayne, The Theory That Would Not Die: How Bayesâ€™ Rule Cracked the Enigma Code, Hunted Down Russian Submarines, and Emerged Triumphant from Two Centuries of Controversy, 2011.

here

The International Society for Bayesian Analysis (ISBA) is a good starting point for a number of Bayesian resources.

Lecture notes and summaries

Lecture notes: pdf and html.
Table of useful distributions: pdf (Handed out in Problems Class of 17 Feb 22)

Material covered:

Lecture 1 (08 Feb 22):	Introduction: working definitions of classical and Bayesian approaches to inference about parameters.
	Lecture overview: pdf. Visualiser notes: pdf. Panopto recording: here. Online notes: p4-5 (start of Example 3).
Lecture 2 (10 Feb 22):	Â§1 The Bayesian method: Bayes' theorem, using Bayes' theorem for parametric inference.
	Lecture overview: pdf. Visualiser notes: pdf. Panopto recording: here. Online notes: p5 (start of Example 3)- 7 (end of the page).
Lecture 3 (15 Feb 22):	Sequential data updates, conjugate Bayesian updates, Beta-Binomial example.
	Lecture overview: pdf. Visualiser notes: pdf. Panopto recording: here. Online notes: p7 (end of the page)- 9 (after equation (1.12)).
Lecture 4 (17 Feb 22):	Definition of conjugate family, role of prior (weak and strong) and likelihood in the posterior. Handout of beta distributions: pdf.
	Lecture overview: pdf. Visualiser notes: pdf. Panopto recording: here. Online notes: p9 (after equation (1.12))- 10 (end of page).
Lecture 5 (22 Feb 22):	Example of weak/strong prior finished, kernel of a density, conjugate Normal example. Handout of weak/strong prior example: pdf.
	Lecture overview: pdf. Visualiser notes: pdf. Panopto recording: here. Online notes: p10 (end of page)- 13 (after equation (1.19)).
Lecture 6 (24 Feb 22):	Conjugate Normal example concluded. Using the posterior for inference, credible interval.
	Lecture overview: pdf. Visualiser notes: pdf. Panopto recording: here. Online notes: p13 (after equation (1.19))- 15 (after Example 9).
Lecture 7 (01 Mar 22):	Highest density regions, Â§2 Modelling: predictive distribution, Binomial-Beta example.
	Lecture overview: pdf. Visualiser notes: pdf. Panopto recording: here. Online notes: p15 (after Example 9)- 19 (after Example 12).
Lecture 8 (03 Mar 22):	Predictive summaries, finite exchangeability.
	Lecture overview: pdf. Visualiser notes: pdf. Panopto recording: here. Online notes: p19 (after Example 12)- 21 (after Example 15).
Lecture 9 (08 Mar 22):	Infinite exchangeability, example of non-extendability of finitely exchangeable sequence, general representation theorem for infinitely exchangeable events and random variables.
	Lecture overview: pdf. Visualiser notes: pdf. Panopto recording: here (2020/21 full lecture); here (2021/22 partial lecture). Online notes: p21 (after Example 15)-24 (end of Theorem 2).
Lecture 10 (10 Mar 22):	Example of exchangeable Normal random variables, sufficiency, k-parameter exponential family.
	Lecture overview: pdf. Visualiser notes: pdf. Panopto recording: here. Online notes: p24 (end of Theorem 2)- 26 (end of Definition 8).
Lecture 11 (15 Mar 22):	Sufficient statistics, conjugate priors for exchangeable k-parameter exponential family random variables.
	Lecture overview: pdf. Visualiser notes: pdf. Panopto recording: here. Online notes: p26 (end of Definition 8)-28 (end of Example 21).
Lecture 12 (17 Mar 22):	Hyperparameters, usefulness of conjugate priors, improper priors, Fisher information matrix, Jeffreys' prior
	Lecture overview: pdf. Visualiser notes: pdf. Panopto recording: here. Online notes: p28 (end of Example 21)- 31 (end of Example 24).
Lecture 13 (22 Mar 22):	Invariance property under transformation of the Jeffreys prior, final remarks about noninformative priors.
	Lecture overview: pdf. Visualiser notes: pdf. Panopto recording: here. Online notes: p31 (end of Example 24)- 33 (prior to Example 26).
Lecture 14 (24 Mar 22):	Â§3 Computation: normal approximation, expansion about the mode.
	Lecture overview: pdf. Visualiser notes: pdf. Panopto recording: here. Online notes: p33 (prior to Example 26)-37 (prior to Section 3.2).
Lecture 15 (29 Mar 22):	Monte Carlo integration, importance sampling. Basic idea of Markov chain Monte Carlo (MCMC).
	Lecture overview: pdf. Visualiser notes: pdf. Panopto recording: here. Online notes: p37 (prior to Section 3.2)-40 (prior to Section 3.3.1).
Lecture 16 (31 Mar 22):	Transition kernel. Basic definitions (irreducible, periodic, recurrent, ergodic, stationary) and theorems (existence/uniqueness, convergence, ergodic) of Markov chains and their consequences for MCMC techniques. The Metropolis-Hastings algorithm. Handout: pdf.
	Lecture overview: pdf. Visualiser notes: pdf. Panopto recording: here. Online notes: p40 (Prior to Section 3.3.1)- 42 (prior to Algorithm 1).
Lecture 17 (05 Apr 22):	Example of the Metropolis-Hastings algorithm.
	Lecture overview: pdf. Visualiser notes: pdf. Panopto recording: here. Online notes: p42 (prior to Algorithm 1)- 44 (prior to Example 30).
Lecture 18 (07 Apr 22):	Conclusion of the Metropolis-Hastings algorithm example, the Gibbs sampler algorithm and example. Handout of example: pdf.
	Lecture overview: pdf. Visualiser notes: pdf. Panopto recording: here. Online notes: p44 (prior to Example 30)- 52 (prior to Example 31).
Lecture 19 (26 Apr 22):	The Gibbs sampler example concluded. Handout of example: pdf.
	Lecture overview: pdf. Visualiser notes: pdf. Panopto recording: here. Online notes: p52 (prior to Example 31)- 59 (prior to Section 3.3.4).
Lecture 20 (28 Apr 22):	Overview of why the Metropolis-Hastings algorithm works, efficiency of MCMC algorithms. §4 Decision theory: introduction.
	Lecture overview: pdf. Visualiser notes: pdf. Panopto recording: here. Online notes: p59 (prior to Section 3.3.4)- 61 (top of the page).
Lecture 21 (28 Apr 22):	Â§4 Decision theory: Introduction, Statistical decision theory: loss, risk, Bayes risk and Bayes rule, Quadratic loss.
	Lecture overview: pdf. Visualiser notes: pdf. Panopto recording: here. Online notes: p61 (top of the page)- 65 (start of Example 36). Note Section 4.1 was omitted and is not examinable.
Lecture 22 (03 May 22):	Bayes risk of the sampling procedure, worked example.
	Lecture overview: pdf. Visualiser notes: pdf. Panopto recording: here. Online notes: p65 (start of Example 36)- 67 (start of Example 37).
Lecture 23 (05 May 22):	Example concluded.
	Lecture overview: pdf. Visualiser notes: pdf. Panopto recording: here. Online notes: p67 (start of Example 37)-69 (end of course).
Revision (05 May 22):	A revision lecture: discussing the structure of the exam paper.
	Panopto recording: here.

Forthcoming material (as per 2020/21 schedule):
This course is now completed.

R functions

gibbs2

Gibbs sampler for Î¸1|Î¸2 ~ Bin(n, Î¸2), Î¸2|Î¸1 ~ Beta(Î¸1+Î±, n-Î¸1+Î²), illustration of Example 32 (p54) of the lecture notes. Sample plots for n=10, Î±=Î²=1 and n=10, Î±=2, Î²=3: pdf.

gibbs.update	Step by step illustration of the Gibbs sampler for bivariate normal, X, Y standard normal with Cov(X, Y) = rho; press return to advance.
gibbs	Long run version of the Gibbs sampler for bivariate normal, X, Y standard normal with Cov(X, Y) = rho.
metropolis.update	Step by step illustration of Metropolis-Hastings for sampling from N(mu.p,sig.p^2) with proposal N(theta[t-1],sig.q^2); press return to advance.
metropolis	Long run version of Metropolis-Hastings for sampling from N(mu.p,sig.p^2) with proposal N(theta[t-1],sig.q^2).
	Illustration of Example 30 (p45) of the lecture notes using metropolis.update and metropolis with mu.p=0, sig.p=1 and firstly sig.q=1 and secondly sig.q=0.6: pdf.
plot.mcmc	Plot time series summaries of output from a Markov chain. Allows you to specify burn-in and thinning.
f	Function for plotting bivariate Normal distribution in gibbs.update.
All above	All of the above functions in one file for easy reading into R; thanks to Ruth Salway for these functions.

The following functions are for sampling from bivariate normals, with thanks to Merrilee Hurn

gibbs1	Gibbs sampler (arguments: n the number of iterations, rho the correlation coefficient of the bivariate normal, start1 and start2 the initial values for the sampler).
metropolis1	Metropolis-Hastings (arguments: n the number of iterations, rho the correlation coefficient of the bivariate normal, start1 and start2 the initial values for the sampler, tau the standard deviation of the Normal proposal).
metropolis2	Metropolis-Hastings for sampling from a mixture of bivariate normals (arguments: n the number of iterations, rho the correlation coefficient of the bivariate normal, start1 and start2 the initial values for the sampler, tau the standard deviation of the Normal proposal, sigma2 the variance of the normal mixtures).

Question sheets and solutions

Question sheets are set in the Thursday problems class. Full worked solutions are available shortly after the submission date.

Question Sheet Zero: pdf or html.	Solution Sheet Zero: pdf or html.
Problems Class Zero: Visualiser notes: pdf. Panopto recording: here.
Question Sheet One: pdf or html.	Solution Sheet One: pdf or html.
Problems Class One: Visualiser notes: pdf. Panopto recording: here.
Question Sheet Two: pdf or html.	Solution Sheet Two: pdf or html.
Problems Class Two: Visualiser notes: pdf. Panopto recording: here.
Question Sheet Three: pdf or html.	Solution Sheet Three: pdf or html.
Problems Class Three: Visualiser notes: pdf. Panopto recording: here.
Question Sheet Four: pdf or html.	Solution Sheet Four: pdf or html.
Problems Class Four: Visualiser notes: pdf. Panopto recording: here.
Question Sheet Five: pdf or html.	Solution Sheet Five: pdf or html.
Problems Class Five: Visualiser notes: pdf. Panopto recording: here.
Question Sheet Six: pdf or html.	Solution Sheet Six: pdf or html.
Problems Class Six: Visualiser notes: pdf. Panopto recording: here.
Question Sheet Seven: pdf or html.	Solution Sheet Seven: pdf or html.
Problems Class Seven: Visualiser notes: pdf. Panopto recording: here.
Question Sheet Eight: pdf or html.	Solution Sheet Eight: pdf or html
Problems Class Eight: Visualiser notes: pdf. Panopto recording: here.
Question Sheet Nine: pdf or html.	Solution Sheet Nine: Available on 06 May 22.

Past exam papers and solutions

The exam will be open book and is designed to take 3 hours (including up to 1 hour for upload) to complete.
There will be four questions, each worth 20 marks, of which you should answer three. Only the best three answers will contribute to the exam which is thus marked out of 60.
For the exam, an additional resource I recommend is the table of useful distributions: pdf.

The past papers remain applicable for the exam this year. Questions that involve straightforward use of bookwork should be considered under an appropriate adaption for the open book setting. Insight into this can be found be comparing the original 2019/20 paper with the adapted 2019/20 paper and discussed in the Office Hour Three Panopto recording: here.

The original 2019/20 exam paper designed to be taken in a closed book world is available here: pdf.
The actual paper taken in the open book world, with the additional essay style question, is available here: pdf.
The solutions for questions 1-4 is available here: pdf.
An adapted version of the 2019/20 exam paper, with adaptions shown in blue, appropriate as a mock paper for this year is available here: pdf. Adapted solutions are available here: pdf.

Exams:	2020/21	2019/20	2018/19	2017/18	2016/17
Paper:	pdf	pdf	pdf	pdf	pdf
Solutions:	pdf	pdf	pdf	pdf	pdf

Last revision:
05/05/22