## Question Sheet Eight

Set: Problems class, Thursday 7th April 2022.

Due in: Lecture 21, Thursday 28th April 2022. Paper copies may be submitted in the problems class or directly to me in lectures or my office, 4W4.10. PDF copies may be submitted to the portal available on the Moodle page.

You should submit your work to the portal available on the Moodle page.

Task: Attempt questions 1-3; questions 4-5 are extra questions which may be discussed in the problems class.

### Question 1

Let $$X_{1}, \ldots, X_{n}$$ be exchangeable so that the $$X_{i}$$ are conditionally independent given a parameter $$\theta$$. Suppose that $$X_{i} \, | \, \theta \sim \mbox{Inv-gamma}(\alpha, \theta)$$, where $$\alpha$$ is known, and we judge that $$\theta \sim \mbox{Gamma}(\alpha_{0}, \beta_{0})$$, where $$\alpha_{0}$$ and $$\beta_{0}$$ are known.

1. Show that $$\theta \, | \, x \sim \mbox{Gamma}(\alpha_{n}, \beta_{n})$$ where $$\alpha_{n} = \alpha_{0} + n \alpha$$, $$\beta_{n} = \beta_{0} + \sum_{i=1}^{n} \frac{1}{x_{i}}$$, and $$x = (x_{1}, \ldots, x_{n})$$.

2. We wish to use the Metropolis-Hastings algorithm to sample from the posterior distribution $$\theta \, | \, x$$ using a normal distribution with mean $$\theta$$ and chosen variance $$\sigma^{2}$$ as the symmetric proposal distribution.

    1. Suppose that, at time $$t$$, the proposed value $$\theta^{*} \leq 0$$. Briefly explain why the corresponding acceptance probability is zero for such a $$\theta^{*}$$ and hence why the values generated by the algorithm are never less than zero.

    2. Describe how the Metropolis-Hastings algorithm works for this example, giving the acceptance probability in its simplest form.
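As a way of checking a hand-derived answer numerically, the following is a minimal Python sketch of a random-walk Metropolis-Hastings chain targeting $$\mbox{Gamma}(\alpha_{n}, \beta_{n})$$. The function names, the starting value `theta0`, the seed and the use of logarithms are assumptions of this sketch, not part of the question; `alpha_n` and `beta_n` are taken as already computed from part 1.

```python
import math
import random


def log_target(theta, alpha_n, beta_n):
    """Log of the unnormalised Gamma(alpha_n, beta_n) density, beta_n a rate."""
    if theta <= 0:
        return -math.inf  # the posterior has no mass at theta <= 0
    return (alpha_n - 1) * math.log(theta) - beta_n * theta


def metropolis_hastings(alpha_n, beta_n, sigma, theta0, n_iter, seed=0):
    """Random-walk Metropolis chain; theta0 must be positive."""
    if theta0 <= 0:
        raise ValueError("theta0 must lie in the support (theta0 > 0)")
    rng = random.Random(seed)
    theta = theta0
    chain = []
    for _ in range(n_iter):
        # Symmetric N(theta, sigma^2) proposal centred at the current value.
        proposal = rng.gauss(theta, sigma)
        log_ratio = (log_target(proposal, alpha_n, beta_n)
                     - log_target(theta, alpha_n, beta_n))
        # Accept with probability min(1, ratio); this is 0 when proposal <= 0.
        if rng.random() < math.exp(min(0.0, log_ratio)):
            theta = proposal
        chain.append(theta)  # a rejected proposal repeats the current value
    return chain
```

For example, `metropolis_hastings(5.0, 2.0, 1.0, 1.0, 20000)` should give a chain whose mean, after discarding an initial burn-in, is close to $$\alpha_{n}/\beta_{n} = 2.5$$.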

### Question 2

Suppose that $$X \, | \, \theta \sim N(\theta, \sigma^{2})$$ and $$Y \, | \, \theta, \delta \sim N(\theta - \delta, \sigma^{2})$$, where $$\sigma^{2}$$ is a known constant and $$X$$ and $$Y$$ are conditionally independent given $$\theta$$ and $$\delta$$. It is judged that the improper noninformative joint prior distribution $$f(\theta, \delta) \propto 1$$ is appropriate.

1. Show that the joint posterior distribution of $$\theta$$ and $$\delta$$ given $$x$$ and $$y$$ is bivariate normal with mean vector $$\mu$$ and variance matrix $$\Sigma$$ where $\begin{eqnarray*} \mu \ = \ \left(\begin{array}{c} E(\theta \, | \, X, Y) \\ E(\delta \, | \, X, Y) \end{array} \right) \ = \ \left(\begin{array}{c} x \\ x-y \end{array} \right); \ \ \Sigma \ = \ \left(\begin{array}{cc} \sigma^{2} & \sigma^{2} \\ \sigma^{2} & 2\sigma^{2} \end{array} \right). \end{eqnarray*}$

2. Describe how the Gibbs sampler may be used to sample from the posterior distribution $$\theta, \delta \, | \, x , y$$, deriving all required conditional distributions.

3. Suppose that $$x=2$$, $$y=1$$ and $$\sigma^{2} = 1$$. Sketch the contours of the joint posterior distribution. Starting from the origin, add to your sketch the first four steps of a typical Gibbs sampler path.

4. Suppose, instead, that we consider sampling from the posterior distribution using the Metropolis-Hastings algorithm where the proposal distribution is the bivariate normal with mean vector $$\tilde{\mu}^{(t-1)} = (\theta^{(t-1)}, \delta^{(t-1)})^{T}$$ and known variance matrix $$\tilde{\Sigma}$$. Explain the Metropolis-Hastings algorithm for this case, explicitly stating the acceptance probability.
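The Gibbs sampler of part 2 can be sketched in Python as below. The two conditional distributions in the comments are those implied by the bivariate normal posterior stated in part 1 and should be checked against your own derivation; the function name, starting values and seed are assumptions of this sketch.

```python
import math
import random


def gibbs_bivariate(x, y, sigma2, theta0, delta0, n_iter, seed=0):
    """Gibbs sampler for (theta, delta) | x, y, updating theta then delta."""
    rng = random.Random(seed)
    theta, delta = theta0, delta0
    path = []
    for _ in range(n_iter):
        # theta | delta, x, y ~ N((x + y + delta) / 2, sigma2 / 2)
        theta = rng.gauss((x + y + delta) / 2, math.sqrt(sigma2 / 2))
        # delta | theta, x, y ~ N(theta - y, sigma2)
        delta = rng.gauss(theta - y, math.sqrt(sigma2))
        path.append((theta, delta))
    return path
```

With $$x=2$$, $$y=1$$ and $$\sigma^{2}=1$$, calling `gibbs_bivariate(2.0, 1.0, 1.0, 0.0, 0.0, 4)` starts from the origin and returns the positions after each of the first four iterations. Each iteration moves first parallel to the $$\theta$$ axis and then parallel to the $$\delta$$ axis.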

### Question 3

Let $$X_{1}, \ldots, X_{n}$$ be exchangeable so that the $$X_{i}$$ are conditionally independent given a parameter $$\theta = (\mu, \lambda)$$. Suppose that $$X_{i} \, | \, \theta \sim N(\mu, 1/\lambda)$$ so that $$\mu$$ is the mean and $$\lambda$$ the precision of the distribution. Suppose that we judge that $$\mu$$ and $$\lambda$$ are independent with $$\mu \sim N(\mu_{0}, 1/\tau)$$, where $$\mu_{0}$$ and $$\tau$$ are known, and $$\lambda \sim \mbox{Gamma}(\alpha, \beta)$$, where $$\alpha$$ and $$\beta$$ are known.

1. Show that the posterior density $$f(\mu, \lambda \, | \, x)$$, where $$x = (x_{1}, \ldots, x_{n})$$, can be expressed as $\begin{eqnarray*} f(\mu, \lambda \, | \, x) & \propto & \lambda^{\alpha + \frac{n}{2} -1}\exp\left\{-\frac{\lambda}{2}\sum_{i=1}^{n}(x_{i}-\mu)^{2} - \frac{\tau}{2}\mu^{2} + \tau \mu_{0} \mu - \beta \lambda\right\}. \end{eqnarray*}$

2. Hence show that $\begin{eqnarray*} \lambda \, | \, \mu, x \sim \mbox{Gamma}\left(\alpha + \frac{n}{2}, \beta + \frac{1}{2} \sum_{i=1}^{n} (x_{i} - \mu)^{2}\right). \end{eqnarray*}$

3. Given that $$\mu \, | \, \lambda, x \sim N(\frac{\tau\mu_{0} + n\lambda \bar{x}}{\tau + n \lambda}, \frac{1}{\tau + n \lambda})$$, where $$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_{i}$$, describe how the Gibbs sampler may be used to sample from the posterior distribution $$\mu, \lambda \, | \, x$$. Give a sensible estimate of $$Var(\lambda \, | \, x)$$.
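A minimal Python sketch of this Gibbs sampler, using the two conditional distributions given in parts 2 and 3, is shown below. The function names, the choice of $$\bar{x}$$ as starting value, the burn-in length and the seed are assumptions of this sketch. Note that `random.gammavariate` is parameterised by shape and scale, so the rate is inverted.

```python
import random
import statistics


def gibbs_normal(xs, mu0, tau, alpha, beta, n_iter, burn_in, seed=0):
    """Alternate draws from lambda | mu, x and mu | lambda, x."""
    rng = random.Random(seed)
    n = len(xs)
    xbar = sum(xs) / n
    mu = xbar  # arbitrary starting value (an assumption of this sketch)
    mu_draws, lam_draws = [], []
    for t in range(n_iter):
        # lambda | mu, x ~ Gamma(alpha + n/2, rate = beta + sum((x_i - mu)^2) / 2)
        rate = beta + 0.5 * sum((xi - mu) ** 2 for xi in xs)
        lam = rng.gammavariate(alpha + n / 2, 1.0 / rate)  # second argument is a scale
        # mu | lambda, x ~ N((tau*mu0 + n*lam*xbar) / (tau + n*lam), 1 / (tau + n*lam))
        precision = tau + n * lam
        mu = rng.gauss((tau * mu0 + n * lam * xbar) / precision, precision ** -0.5)
        if t >= burn_in:  # keep only draws made after the burn-in period
            mu_draws.append(mu)
            lam_draws.append(lam)
    return mu_draws, lam_draws


def estimate_var_lambda(lam_draws):
    """Sample variance of the retained lambda draws, estimating Var(lambda | x)."""
    return statistics.variance(lam_draws)
```

The sample variance of the retained $$\lambda$$ draws is one sensible estimate of $$Var(\lambda \, | \, x)$$, provided the chain has been run long enough for the retained draws to be treated as a sample from the posterior.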

### Question 4

Consider a Poisson hierarchical model. At the first stage we have observations $$s_{j}$$ which are Poisson with mean $$t_{j}\lambda_{j}$$ for $$j = 1, \ldots, p$$ where each $$t_{j}$$ is known. We assume that the $$\lambda_{j}$$ are independent and identically distributed with $$\mbox{Gamma}(\alpha, \beta)$$ prior distributions. The parameter $$\alpha$$ is known but $$\beta$$ is unknown and is given a $$\mbox{Gamma}(\gamma, \delta)$$ distribution where $$\gamma$$ and $$\delta$$ are known. The $$s_{j}$$ are assumed to be conditionally independent given the unknown parameters.

1. Find, up to a constant of proportionality, the joint posterior distribution of the unknown parameters given $$s = (s_{1}, \ldots, s_{p})$$.

2. Describe how the Gibbs sampler may be used to sample from the posterior distribution, deriving all required conditional distributions.

3. Let $$\{\lambda_{1}^{(t)}, \ldots, \lambda_{p}^{(t)}, \beta^{(t)}; t = 1, \ldots, N\}$$, with $$N$$ large, be a realisation of the Gibbs sampler described above. Give sensible estimates of $$E(\lambda_{j} \, | \, s)$$, $$Var(\beta \, | \, s)$$ and $$E(\lambda_{j} \, | \, a \leq \beta \leq b, s)$$ where $$0 < a < b$$ are given constants.
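The estimates asked for in part 3 can be illustrated with the Python sketch below. It assumes the full conditionals $$\lambda_{j} \, | \, \beta, s \sim \mbox{Gamma}(\alpha + s_{j}, \beta + t_{j})$$ and $$\beta \, | \, \lambda, s \sim \mbox{Gamma}(\gamma + p\alpha, \delta + \sum_{j=1}^{p} \lambda_{j})$$, which should be checked against your own derivation in part 2. All function names, the starting value for $$\beta$$ and the seed are hypothetical choices made for this sketch.

```python
import random
import statistics


def gibbs_poisson(s, t, alpha, gamma, delta, n_iter, seed=0):
    """Gibbs sampler returning a list of (lambda vector, beta) draws."""
    rng = random.Random(seed)
    p = len(s)
    beta = gamma / delta  # start at the prior mean of beta (an assumption)
    draws = []
    for _ in range(n_iter):
        # lambda_j | beta, s ~ Gamma(alpha + s_j, rate = beta + t_j)
        lam = [rng.gammavariate(alpha + s[j], 1.0 / (beta + t[j])) for j in range(p)]
        # beta | lambda, s ~ Gamma(gamma + p * alpha, rate = delta + sum(lambda))
        beta = rng.gammavariate(gamma + p * alpha, 1.0 / (delta + sum(lam)))
        draws.append((lam, beta))
    return draws


def posterior_mean_lambda(draws, j):
    """Estimate E(lambda_j | s) by the average of the sampled lambda_j."""
    return statistics.mean([lam[j] for lam, _ in draws])


def posterior_var_beta(draws):
    """Estimate Var(beta | s) by the sample variance of the sampled beta."""
    return statistics.variance([beta for _, beta in draws])


def conditional_mean_lambda(draws, j, a, b):
    """Estimate E(lambda_j | a <= beta <= b, s) from draws with beta in [a, b]."""
    kept = [lam[j] for lam, beta in draws if a <= beta <= b]
    return statistics.mean(kept)  # raises StatisticsError if no draw qualifies
```

The conditional expectation is estimated by averaging $$\lambda_{j}^{(t)}$$ over only those iterations $$t$$ with $$a \leq \beta^{(t)} \leq b$$, so the estimate is only reliable when a reasonable number of draws fall in that interval.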

### Question 5

Show that the Gibbs sampler for sampling from a distribution $$\pi(\theta)$$ where $$\theta = (\theta_{1}, \ldots, \theta_{d})$$ can be viewed as a special case of the Metropolis-Hastings algorithm where each iteration $$t$$ consists of $$d$$ Metropolis-Hastings steps each with an acceptance probability of 1.
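
One way to organise the argument, offered as a sketch rather than a model answer: for the update of component $$i$$, take the proposal to be the full conditional, $$q(\theta^{*} \, | \, \theta) = \pi(\theta_{i}^{*} \, | \, \theta_{-i})$$ with $$\theta_{-i}^{*} = \theta_{-i}$$, where $$\theta_{-i}$$ (notation assumed here) denotes $$\theta$$ with its $$i$$th component removed. Writing $$\pi(\theta) = \pi(\theta_{i} \, | \, \theta_{-i}) \pi(\theta_{-i})$$, the Metropolis-Hastings ratio for this step is $\begin{eqnarray*} \frac{\pi(\theta^{*}) \, q(\theta \, | \, \theta^{*})}{\pi(\theta) \, q(\theta^{*} \, | \, \theta)} & = & \frac{\pi(\theta_{i}^{*} \, | \, \theta_{-i}) \, \pi(\theta_{-i}) \, \pi(\theta_{i} \, | \, \theta_{-i})}{\pi(\theta_{i} \, | \, \theta_{-i}) \, \pi(\theta_{-i}) \, \pi(\theta_{i}^{*} \, | \, \theta_{-i})} \ = \ 1, \end{eqnarray*}$ so the acceptance probability is $$\min(1, 1) = 1$$ for each of the $$d$$ component updates.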