Martingales and martingale convergence

You will get more from the exercises by trying them before looking at the solutions!

  1. Let \(X\) be a martingale. Use the tower property for conditional expectation to deduce that \[\operatorname{\mathbb{E}}\left[X_{n+k}|\mathcal{F}_{n}\right]= X_n\,, \quad k=0,1,2,\dots\,.\]

Hint: Prove by induction on \(k\), using the martingale property for the base case and both the martingale and tower properties for the inductive step.

Solution: For \(k=1\), \(\operatorname{\mathbb{E}}\left[X_{n+1}|\mathcal{F}_{n}\right]= X_n\) is exactly the martingale property. For \(k > 1\), suppose \(\operatorname{\mathbb{E}}\left[X_{n+k-1}|\mathcal{F}_{n}\right]= X_n\) holds. The martingale and tower properties imply \(\operatorname{\mathbb{E}}\left[X_{n+k}|\mathcal{F}_{n}\right]=\operatorname{\mathbb{E}}\left[\operatorname{\mathbb{E}}\left[X_{n+k}|\mathcal{F}_{n+k-1}\right]|\mathcal{F}_n\right] = \operatorname{\mathbb{E}}\left[X_{n+k-1}|\mathcal{F}_{n}\right]\), hence \(\operatorname{\mathbb{E}}\left[X_{n+k}|\mathcal{F}_{n}\right] = X_n\) also holds.
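
As a supplementary numerical illustration (not part of the original exercise), the sketch below checks the identity for a simple symmetric random walk, a basic example of a martingale: it fixes one path up to time \(n\) and then averages many independent continuations of length \(k\). The choice of walk, the values \(n=10\), \(k=5\) and the sample size are all illustrative assumptions.

```python
import numpy as np

# Monte Carlo check of E[X_{n+k} | F_n] = X_n for a simple symmetric
# random walk (a martingale).  All parameter choices are illustrative.
rng = np.random.default_rng(0)
n, k, trials = 10, 5, 100_000

# Fix one realisation of the walk up to time n; this plays the role of F_n.
X_n = int(np.sum(rng.choice([-1, 1], size=n)))

# Average many independent continuations of length k started from X_n.
continuations = X_n + np.sum(rng.choice([-1, 1], size=(trials, k)), axis=1)
print("X_n =", X_n, "  estimate of E[X_{n+k} | F_n] =", continuations.mean())
```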

  2. Recall Thackeray’s martingale: let \(Y_1,Y_2,\dots\) be a sequence of independent and identically distributed random variables, with \(\operatorname{\mathbb{P}}\left(Y_1 = 1\right) = \operatorname{\mathbb{P}}\left(Y_1 = -1\right) = 1/2\). Define the Markov chain \(M\) by \[ M_0 = 0; \qquad M_n = \begin{cases} 1-2^{n} & \quad\text{if $Y_1=Y_2 = \dots = Y_n = -1$,} \\ 1 &\quad\text{otherwise.}\end{cases}\]
    1. Compute \(\operatorname{\mathbb{E}}\left[M_n\right]\) from first principles.
    2. What should be the value of \(\operatorname{\mathbb{E}}\left[\widetilde{M}_n\right]\) if \(\widetilde{M}\) is computed as for \(M\) but stopping play if \(M\) hits level \(1-2^N\)?

Hint:

  1. Just calculate the expectation!
  2. Consider \(n< N\) and \(n \geq N\) separately.

Solution:

  1. By definition \(M_n\) takes value \(1-2^n\) with probability \(2^{-n}\) and value 1 with probability \(1-2^{-n}\); hence \(\operatorname{\mathbb{E}}\left[M_n\right]=0\).
  2. Check that if \(n < N\) we cannot have hit level \(1-2^N\) yet, so \(\widetilde{M}_n \equiv M_n\); in contrast, if \(n \geq N\) then either we have hit level \(1-2^{N}\) (because \(Y_1 = Y_2 = \dots = Y_N = -1\)) or one of \(Y_1, Y_2, \dots, Y_N\) equals 1 and play has already stopped at level 1; either way \(\widetilde{M}_n = M_N\). (Formally speaking, \(\widetilde{M}_n\) is the stopped martingale \(M_{\min\{n,N\}}\).) In both cases \(\operatorname{\mathbb{E}}\left[\widetilde{M}_n\right] = 0\) using (a). (A quick simulation check of both expectations appears below.)
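
As a quick sanity check (not part of the exercise), here is a simulation sketch of both expectations; the horizon \(n = 8\), the stopping level \(N = 5\) and the sample size are illustrative choices.

```python
import numpy as np

# Simulation check of parts (a) and (b): E[M_n] = 0 and E[M~_n] = 0.
# The horizon n = 8, the stopping level N = 5 and the sample size are
# illustrative choices, not part of the exercise.
rng = np.random.default_rng(1)
n, N, trials = 8, 5, 200_000

Y = rng.choice([-1, 1], size=(trials, n))
streak = np.argmax(Y == 1, axis=1)        # length of the initial run of -1's ...
streak[np.all(Y == -1, axis=1)] = n       # ... which is n if no +1 appears at all

M_n = np.where(streak == n, 1 - 2.0**n, 1.0)        # Thackeray's martingale at time n
M_tilde = np.where(streak >= N, 1 - 2.0**N, 1.0)    # version stopped at level 1 - 2^N

print("estimate of E[M_n]  :", M_n.mean())
print("estimate of E[M~_n] :", M_tilde.mean())
```
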
  3. Consider a branching process \(Y\), where \(Y_0=1\) and \(Y_{n+1}\) is the sum \(Z_{n+1,1}+\ldots+Z_{n+1,Y_n}\) of \(Y_n\) independent copies of a non-negative integer-valued family-size r.v. \(Z\).
    1. Suppose \(\operatorname{\mathbb{E}}\left[Z\right]=\mu<\infty\). Show that \(X_n=Y_n/\mu^n\) is a martingale.
    2. Show that \(Y\) is itself a supermartingale if \(\mu < 1\) and a submartingale if \(\mu > 1\).
    3. Suppose \(\operatorname{\mathbb{E}}\left[s^{Z}\right]=G(s)\). Let \(\eta\) be the smallest non-negative root of the equation \(G(s)=s\). Show that \(\eta^{Y_n}\) defines a martingale.
    4. Let \(H_n=Y_0+\ldots+Y_n\) be the total of all populations up to time \(n\). Show that \(s^{H_n}/(G(s)^{H_{n-1}})\) is a martingale.
    5. How should these three expressions be altered if \(Y_0 = k \geq 1\)?
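
The martingales in parts (a) and (c) are easy to examine numerically. The sketch below assumes Poisson(\(\mu\)) family sizes, an illustrative choice for which \(G(s) = e^{\mu(s-1)}\), and checks that \(\operatorname{\mathbb{E}}\left[Y_n/\mu^n\right]\) stays near 1 and \(\operatorname{\mathbb{E}}\left[\eta^{Y_n}\right]\) stays near \(\eta\); the values of \(\mu\), the number of generations and the sample size are arbitrary.

```python
import numpy as np

# Simulation sketch for parts (a) and (c), assuming Poisson(mu) family sizes
# (an illustrative choice; then G(s) = exp(mu*(s-1))).
rng = np.random.default_rng(2)
mu, n_gen, trials = 1.5, 10, 20_000

# Extinction probability: the smallest non-negative root of G(s) = s,
# found by iterating s -> G(s) starting from 0.
eta = 0.0
for _ in range(1_000):
    eta = np.exp(mu * (eta - 1.0))

Y = np.ones(trials, dtype=np.int64)          # Y_0 = 1 in every trial
for _ in range(n_gen):
    # The sum of Y independent Poisson(mu) variables is Poisson(mu * Y).
    Y = rng.poisson(mu * Y)

print("eta =", eta)
print("estimate of E[Y_n / mu^n] :", (Y / mu**n_gen).mean(), "(should be ~1)")
print("estimate of E[eta^Y_n]    :", (eta**Y).mean(), "(should be ~eta)")
```
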
  4. Consider asymmetric simple random walk, stopped when it first returns to \(0\). Show that this is a supermartingale if jumps have non-positive expectation, a submartingale if jumps have non-negative expectation (and therefore a martingale if jumps have zero expectation).

Hint: The martingale property trivially holds if the walk has stopped (why?). Otherwise, consider the conditional expectation \(\operatorname{\mathbb{E}}\left[X_{n+1}-X_n|X_n\right]\) (where \(X_n\) is the position of the walk at time \(n\)).

Solution: If the walk \(X\) has not yet returned to \(0\) by time \(n\), then \(\operatorname{\mathbb{E}}\left[X_{n+1}-X_n | X_n \right]\) equals the expectation of the next jump. Hence, if this expectation is non-positive, then \(\operatorname{\mathbb{E}}\left[X_{n+1} | X_n\right] \leq X_n\), and if it is non-negative, then \(\operatorname{\mathbb{E}}\left[X_{n+1} | X_n\right] \geq X_n\). If the walk has already returned to \(0\) by time \(n\) (so \(X_n = 0\)), then \(X_{n+1}=0\) also, so \(\operatorname{\mathbb{E}}\left[X_{n+1}-X_n | X_n \right] = 0\). Hence the sub-/supermartingale condition is satisfied for all \(n\).
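
A numerical illustration (not part of the exercise): a supermartingale has non-increasing means and a submartingale non-decreasing means, so for the stopped walk the estimated \(\operatorname{\mathbb{E}}\left[X_n\right]\) should decrease when \(p < 1/2\) and increase when \(p > 1/2\). The jump probabilities, the horizon and the sample size below are illustrative choices.

```python
import numpy as np

# Asymmetric simple random walk started at 0, frozen when it first returns
# to 0.  The printed estimates of E[X_n] should be non-increasing for
# p < 1/2 and non-decreasing for p > 1/2.
rng = np.random.default_rng(3)
trials, horizon = 200_000, 30

for p in (0.45, 0.55):
    X = np.zeros(trials, dtype=int)
    means = [0.0]
    for step in range(horizon):
        jump = np.where(rng.random(trials) < p, 1, -1)
        if step == 0:
            X = X + jump                        # the first step always moves the walk
        else:
            X = np.where(X != 0, X + jump, 0)   # frozen once it has returned to 0
        means.append(X.mean())
    print(f"p = {p}: estimated E[X_n] for n = 0, 5, ..., 30:",
          [round(m, 3) for m in means[::5]])
```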

  5. Consider Thackeray’s martingale based on asymmetric random walk. Show that this is a supermartingale or submartingale depending on whether jumps have negative or positive expectation.

Hint: Again, consider \(\operatorname{\mathbb{E}}\left[M_{n+1}-M_n | M_n \right]\) (where \(M_n\) is your fortune after \(n\) steps), and separate the cases having already hit level 1 or not.

Solution: If at time \(n\) the Thackeray “martingale” \(M\) has hit level 1 (i.e., the asymmetric walk has made at least one jump to the right in the first \(n\) steps), then play has stopped and \(\operatorname{\mathbb{E}}\left[M_{n+1}-M_n | M_n \right] = 0\). Otherwise the asymmetric walk has made \(n\) left jumps (\(n\) losing bets), the wager is now \(2^n\), and \(\operatorname{\mathbb{E}}\left[M_{n+1} - M_n| M_n\right] = 2^n p - 2^n (1-p) = 2^n(2p-1)\), where \(p\) is the probability that the asymmetric walk jumps to the right. This is negative when \(p < 1/2\) and positive when \(p > 1/2\), so \(M\) is a supermartingale or a submartingale accordingly.
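
The conditional computation above can be checked by simulation, conditioning on the first \(n\) bets all being lost; the values \(n = 3\), \(p = 0.45\) and the sample size below are illustrative choices.

```python
import numpy as np

# Estimate E[M_{n+1} - M_n | the first n bets were lost] and compare it
# with the formula 2^n (2p - 1).  Parameters are illustrative.
rng = np.random.default_rng(4)
n, p, trials = 3, 0.45, 400_000

wins = rng.random((trials, n + 1)) < p          # True = the walk jumps right (a winning bet)
lost_first_n = ~wins[:, :n].any(axis=1)         # paths still on a losing streak after n bets
# On those paths the stake at bet n+1 is 2^n, so the increment is +/- 2^n.
increment = np.where(wins[lost_first_n, n], 2.0**n, -2.0**n)

print("estimated E[M_{n+1} - M_n | n losses]:", increment.mean())
print("formula 2^n (2p - 1)                 :", 2.0**n * (2 * p - 1))
```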

  6. Show, using the conditional form of Jensen’s inequality, that if \(X\) is a martingale then \(|X|\) is a submartingale.

Hint: What is the convex function?

Solution: The function \(\phi(x) = |x|\) is convex, so \(\operatorname{\mathbb{E}}\left[\phi(X_{n+1})|\mathcal{F}_n\right] \geq \phi(\operatorname{\mathbb{E}}\left[X_{n+1}|\mathcal{F}_n\right]) = \phi(X_n)\). (Integrability of \(|X_{n+1}|\) is automatic, since \(X_{n+1}\) is integrable.)
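
For a concrete instance (an illustrative example, not part of the exercise), take \(X\) to be a simple symmetric random walk; then \(\operatorname{\mathbb{E}}\left[|X_{n+1}|\,|\,X_n = x\right] = \tfrac12(|x-1|+|x+1|) \geq |x|\), with strict inequality only at \(x = 0\). The short sketch below tabulates this for a few (arbitrarily chosen) values of \(x\).

```python
# Exact check of the submartingale inequality for a simple symmetric walk:
# E[|X_(n+1)| given X_n = x] = (|x-1| + |x+1|) / 2, which is >= |x|.
for x in range(-3, 4):
    cond_mean = (abs(x - 1) + abs(x + 1)) / 2
    print(f"x = {x:+d}:  E[|X_(n+1)| | X_n = x] = {cond_mean:.1f}  >=  |x| = {abs(x)}")
```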

  7. A shuffled pack of cards contains \(b\) black and \(r\) red cards. The pack is placed face down, and cards are turned over one at a time. Let \(B_n\) denote the number of black cards left just before the \(n^{th}\) card is turned over. Let \[ Y_n = \frac{B_n}{r+b-(n-1)}\,. \] (So \(Y_n\) equals the proportion of black cards left just before the \(n^{th}\) card is revealed.) Show that \(Y\) is a martingale.
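
The constant-mean property asserted here is easy to check by simulation; in the sketch below the pack sizes \(b = 6\), \(r = 4\) and the number of shuffles are illustrative choices, and each estimated \(\operatorname{\mathbb{E}}\left[Y_n\right]\) should be close to \(b/(b+r)\).

```python
import numpy as np

# Simulation check that E[Y_n] = b/(b+r) for every n, with illustrative
# pack sizes b = 6, r = 4.
rng = np.random.default_rng(5)
b, r, trials = 6, 4, 50_000
deck = np.array([1] * b + [0] * r)                 # 1 = black card, 0 = red card
cards_left = b + r - np.arange(b + r)              # r + b - (n - 1) for n = 1, ..., b + r

Y = np.empty((trials, b + r))
for t in range(trials):
    shuffled = rng.permutation(deck)
    black_seen = np.concatenate(([0], np.cumsum(shuffled)[:-1]))
    Y[t] = (b - black_seen) / cards_left           # proportion of black cards still unseen

print("b/(b+r) =", b / (b + r))
print("estimated E[Y_n], n = 1, ..., b+r:", np.round(Y.mean(axis=0), 3))
```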

  8. Suppose \(N_1, N_2, \dots\) are independent identically distributed normal random variables of mean \(0\) and variance \(\sigma^2\), and put \(S_n=N_1+\ldots+N_n\).

    1. Show that \(S\) is a martingale.
    2. Show that \(Y_n= \exp\left(S_n - \tfrac{n}{2}\sigma^2\right)\) is a martingale.
    3. How should these expressions be altered if \(\operatorname{\mathbb{E}}\left[N_i\right] = \mu\neq 0\)?
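
As a numerical illustration of part (b) (with the arbitrary choice \(\sigma = 0.5\)), the sketch below estimates \(\operatorname{\mathbb{E}}\left[Y_n\right]\) for several values of \(n\); each estimate should be close to 1.

```python
import numpy as np

# Simulation check of part (b): E[exp(S_n - n*sigma^2/2)] = 1 for every n.
# The value of sigma, the horizon and the sample size are illustrative.
rng = np.random.default_rng(6)
sigma, n_max, trials = 0.5, 20, 200_000

N = rng.normal(0.0, sigma, size=(trials, n_max))
S = np.cumsum(N, axis=1)                                   # S_1, ..., S_{n_max}
steps = np.arange(1, n_max + 1)
Y = np.exp(S - 0.5 * steps * sigma**2)                     # the exponential martingale

print("estimated E[Y_n] for n = 1, 5, 10, 20:",
      np.round(Y.mean(axis=0)[[0, 4, 9, 19]], 3))
```
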
  9. Let \(X\) be a discrete-time Markov chain on a countable state-space \(S\) with transition probabilities \(p_{x,y}\). Let \(f: S \to \mathbb{R}\) be a bounded function. Let \(\mathcal{F}_n\) contain all the information about \(X_0, X_1, \ldots, X_n\). Show that \[M_n = f(X_n) - f(X_0) - \sum_{i=0}^{n-1} \sum_{y \in S} (f(y) - f(X_i)) p_{X_i,y}\] defines a martingale. (Hint: first note that \(\operatorname{\mathbb{E}}\left[f(X_{i+1}) - f(X_i) | X_i\right] = \sum_{y \in S} (f(y) - f(X_i)) p_{X_i,y}\). Using this and the Markov property of \(X\), check that \(\operatorname{\mathbb{E}}\left[M_{n+1} - M_n | \mathcal{F}_n\right] = 0\).)

Solution:

Guided by the hint, we observe that \(f(X_{n+1})\) is an integrable random variable, and \(\operatorname{\mathbb{E}}\left[f(X_{n+1})|X_n=x\right] = \sum_y f(y)p_{x,y}\); equivalently, \(\operatorname{\mathbb{E}}\left[ f(X_{n+1})-f(X_n)|X_n\right] = \sum_y (f(y)-f(X_n))p_{X_n,y}\). Therefore \(M_{n+1}-M_n = f(X_{n+1})-f(X_n) - \operatorname{\mathbb{E}}\left[ f(X_{n+1})-f(X_n) | X_n\right]\). But, since \(X\) is a Markov chain, the Markov property says that \(\operatorname{\mathbb{E}}\left[ f(X_{n+1})-f(X_n) | X_n\right] = \operatorname{\mathbb{E}}\left[ f(X_{n+1})-f(X_n) | \mathcal{F}_n \right]\), and therefore \(\operatorname{\mathbb{E}}\left[M_{n+1}-M_n|\mathcal{F}_n\right]= 0\) (recall the property of conditional expectation that \(\operatorname{\mathbb{E}}\left[Z|\mathcal{F}_n\right]=\operatorname{\mathbb{E}}\left[\operatorname{\mathbb{E}}\left[Z|\mathcal{F}_n\right]|\mathcal{F}_n\right]\)).
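
The construction can be illustrated numerically. The sketch below uses a small, arbitrarily chosen three-state chain and bounded function \(f\) (both are assumptions for illustration only); since \(M_0 = 0\), the estimated \(\operatorname{\mathbb{E}}\left[M_n\right]\) should be close to 0.

```python
import numpy as np

# Simulation sketch of the martingale in this exercise for a small
# illustrative Markov chain: 3 states, an arbitrary transition matrix P
# and a bounded function f.  Since M_0 = 0, E[M_n] should be 0.
rng = np.random.default_rng(7)
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.4, 0.4, 0.2]])
f = np.array([1.0, -2.0, 3.0])
drift = P @ f - f                      # drift[x] = sum_y (f(y) - f(x)) p_{x,y}
cumP = P.cumsum(axis=1)
cumP[:, -1] = 1.0                      # guard against floating-point round-off

n, trials = 15, 200_000
X = np.zeros(trials, dtype=int)        # start every path in state 0
M = np.zeros(trials)
for _ in range(n):
    u = rng.random(trials)
    X_next = (u[:, None] > cumP[X]).sum(axis=1)   # sample the next state of each path
    M += f[X_next] - f[X] - drift[X]              # increment of the martingale M
    X = X_next

print("estimated E[M_n]:", M.mean(), "(should be ~0)")
```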

  10. Let \(Y\) be a discrete-time birth-death process absorbed at zero: \[ p_{k,k+1} = \frac{\lambda}{\lambda+\mu},\quad p_{k,k-1} = \frac{\mu}{\lambda+\mu}\,, \qquad \text{for $k>0$, with $0<\lambda<\mu$.} \]
    1. Show that \(Y\) is a supermartingale.
    2. Let \(T=\inf\{n:Y_n=0\}\) (so \(T<\infty\) a.s.), and define \[X_n = Y_{\min\{n, T\}} + \left(\frac{\mu-\lambda}{\mu+\lambda}\right)\min\{ n, T\} \,.\] Show that \(X\) is a non-negative supermartingale, converging to \[ Z=\left(\frac{\mu-\lambda}{\mu+\lambda}\right)T \,. \]
    3. Deduce that \[ \operatorname{\mathbb{E}}\left[T|Y_0 = y\right] \leq \left(\frac{\mu+\lambda}{\mu-\lambda}\right)y\,. \]

Hint: Remember that \(p_{0,0} = 1\).

Solution:

  1. If \(Y_n = 0\) then \(Y_{n+1} = 0\) and \(\operatorname{\mathbb{E}}\left[Y_{n+1}|Y_n\right] = Y_n\). Otherwise, if \(Y_n = k \neq 0\), then \(\operatorname{\mathbb{E}}\left[Y_{n+1}-Y_n|Y_n\right] = \frac{\lambda}{\lambda+\mu}-\frac{\mu}{\lambda+\mu} < 0\).
  2. Observe that if \(T \leq n\) then \(T \leq n+1\), so \(\min\{n,T\} = \min\{n+1,T\}\) and \(X_{n+1}-X_n = 0\). Otherwise \(T > n\), hence \(T \geq n+1\), so \(X_{n+1}-X_n = Y_{n+1}-Y_n+\frac{\mu-\lambda}{\mu+\lambda}\), and part (a) implies \(\operatorname{\mathbb{E}}\left[X_{n+1}-X_n|\mathcal{F}_n\right] \leq 0\). (Formally, we use here that \(T\) is a stopping time, so \(\{ T \leq n \} \in \mathcal{F}_n\).) The convergence theorem for non-negative supermartingales then implies that \(X\) converges almost surely. Since \(T < \infty\) a.s., \(\min\{n,T\} \to T\) a.s. as \(n\to\infty\), and therefore \(X_n\) converges to \(Y_T + \left(\frac{\mu-\lambda}{\mu+\lambda}\right)T = Z\) (because \(Y_T = 0\)), with \(Z\) as defined in the question.
  3. The convergence theorem for non-negative supermartingales also gives \(\operatorname{\mathbb{E}}\left[Z | \mathcal{F}_0\right] \leq X_0 = Y_0\). In other words, \(\operatorname{\mathbb{E}}\left[ \left(\frac{\mu-\lambda}{\mu+\lambda}\right) T | Y_0 = y \right] \leq y\), and rearranging gives \(\operatorname{\mathbb{E}}\left[T|Y_0 = y\right] \leq \left(\frac{\mu+\lambda}{\mu-\lambda}\right)y\), as required. (A simulation check of this bound appears below.)
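
Here is a simulation sketch of the bound in part (c), with illustrative values \(\lambda = 1\), \(\mu = 2\) and \(Y_0 = 5\). (Until absorption this chain moves exactly like an asymmetric simple random walk, so for this example the bound is in fact an equality.)

```python
import numpy as np

# Estimate E[T | Y_0 = y] for the absorbed birth-death chain and compare
# it with the bound ((mu + lam) / (mu - lam)) * y.  The rates lam = 1,
# mu = 2, the start y = 5 and the sample size are illustrative choices.
rng = np.random.default_rng(8)
lam, mu, y, trials, max_steps = 1.0, 2.0, 5, 50_000, 5_000
p_up = lam / (lam + mu)

Y = np.full(trials, y)
T = np.zeros(trials)
for n in range(max_steps):
    alive = Y > 0
    if not alive.any():
        break
    jump = np.where(rng.random(trials) < p_up, 1, -1)
    Y = np.where(alive, Y + jump, 0)
    T[alive] = n + 1                   # records the step at which each path last moved

print("estimated E[T | Y_0 = y]         :", T.mean())
print("bound (mu + lam)/(mu - lam) * y  :", (mu + lam) / (mu - lam) * y)
```
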
  11. Let \(L(\theta; X_1, X_2, \ldots, X_n)\) be the likelihood of parameter \(\theta\) given a sample of independent and identically distributed random variables, \(X_1, X_2, \ldots, X_n\).
    1. Check that if the “true” value of \(\theta\) is \(\theta_0\) then the likelihood ratio \[M_n = \frac{L(\theta_1; X_1, X_2, \ldots, X_n)}{L(\theta_0; X_1, X_2, \ldots, X_n)}\] defines a martingale with \(\operatorname{\mathbb{E}}\left[M_n\right] = 1\) for all \(n \ge 1\).
    2. Using the strong law of large numbers and Jensen’s inequality, show that \[\frac{1}{n} \log M_n \to -c \text{ as $n \to \infty$,}\] for some constant \(c \geq 0\).

Hint:

  1. Use independence to write the likelihood ratio \(M_n\) as a product of independent, identically distributed terms, each having mean 1 (computing expectations under \(\theta_0\)).

Solution:

  1. Suppose that \(f(\theta; x)\) is the common density of the \(X_i\). Then \(M_n = \prod_{i=1}^n \frac{f(\theta_1; X_i)}{f(\theta_0;X_i)}\). Noting that \(\operatorname{\mathbb{E}}\left[ \frac{f(\theta_1; X_i)}{f(\theta_0;X_i)}\right] = \int f(\theta_0;x) \frac{f(\theta_1; x)}{f(\theta_0;x)}\, dx = 1\) (expectations are computed under \(\theta = \theta_0\)), independence yields \(\operatorname{\mathbb{E}}\left[M_n | \mathcal{F}_{n-1}\right] = \operatorname{\mathbb{E}}\left[ M_{n-1} \frac{f(\theta_1; X_n)}{f(\theta_0;X_n)} | \mathcal{F}_{n-1}\right] = M_{n-1}\). Since \(M\) is a martingale, \(\operatorname{\mathbb{E}}\left[M_n\right] = \operatorname{\mathbb{E}}\left[M_1\right] = 1\).
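
Both parts can be illustrated by simulation under an assumed normal model \(X_i \sim N(\theta,1)\) with \(\theta_0 = 0\) and \(\theta_1 = 1\) (an illustrative choice): by the strong law of large numbers, \(\frac1n \log M_n \to \operatorname{\mathbb{E}}\left[\log\frac{f(\theta_1;X_1)}{f(\theta_0;X_1)}\right]\) a.s., and by Jensen’s inequality this limit is at most \(\log \operatorname{\mathbb{E}}\left[\frac{f(\theta_1;X_1)}{f(\theta_0;X_1)}\right] = 0\); for this particular model the limit is \(-c = -\tfrac12\).

```python
import numpy as np

# Illustrative normal model: X_i ~ N(theta, 1), true theta_0 = 0,
# alternative theta_1 = 1, so log(f(theta_1;x)/f(theta_0;x)) = x - 1/2.
rng = np.random.default_rng(9)
theta0, theta1 = 0.0, 1.0

# Part (a): E[M_n] should be 1 (checked at n = 5, a small illustrative n).
X = rng.normal(theta0, 1.0, size=(200_000, 5))             # data generated under theta_0
log_M5 = np.sum(theta1 * X - theta1**2 / 2, axis=1)
print("estimated E[M_5]:", np.exp(log_M5).mean(), "(should be ~1)")

# Part (b): (1/n) log M_n should settle near -c = -1/2 for this model.
n_big = 5_000
X_long = rng.normal(theta0, 1.0, size=n_big)               # one long sample path
log_M = np.cumsum(theta1 * X_long - theta1**2 / 2)
print("(1/n) log M_n at n = 5000:", log_M[-1] / n_big, "(should be ~ -0.5)")
```
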
  12. Let \(X\) be a simple symmetric random walk absorbed at boundaries \(a<b\).
    1. Show that \[ f(x)=\frac{x-a}{b-a} \qquad x\in[a,b]\] is a bounded harmonic function.
    2. Use the martingale convergence theorem and optional stopping theorem to show that \[f(x) = \operatorname{\mathbb{P}}\left(X \text{ hits }b\text{ before }a|X_0=x\right)\,.\]

Hint:

  1. Find the non-zero values of \(p_{x,y}\).

  2. This is the ‘same’ example as in the lectures, but on the interval \([a,b]\) rather than \([-a,b]\).

Solution:

  1. \(f(x)\) is increasing on \([a,b]\), hence \(0 = f(a) \leq f(x) \leq f(b) = 1\). If \(x \in \{ a, b \}\) then \(p_{x,y} = 1\) if and only if \(x=y\) (since walk is absorbed); hence \(\sum_y f(y)p_{x,y} = f(x)\). Otherwise, \(p_{x,x-1} = p_{x,x+1} = 1/2\) and \(p_{x,y} = 0\) for all \(y \not\in \{x-1,x+1\}\); hence \[\sum_y f(y)p_{x,y} = \frac{x-1-a}{b-a}\frac12 + \frac{x+1-a}{b-a}\frac12 = \frac{x-a}{b-a} = f(x).\]

  2. Since \(f\) is a bounded harmonic function, \(f(X_n)\) is a bounded martingale. Hence, by the martingale convergence theorem, \(f(X_n)\) converges a.s. to a limit \(Z\) with \(\operatorname{\mathbb{E}}\left[Z | \mathcal{F}_0\right] = f(X_0)\). But \(Z\) equals \(f(b) = 1\) if \(X\) hits \(b\) before \(a\) and equals \(f(a) = 0\) otherwise; hence \(\operatorname{\mathbb{E}}\left[Z | \mathcal{F}_0 \right] = \operatorname{\mathbb{P}}\left(X \text{ hits }b\text{ before }a|X_0\right)\), which gives the result. This can also be deduced from the optional stopping theorem, using the stopping time \(T = \min \{n \geq 0: X_n \in \{a,b\}\}\), which is a.s. finite. On the time interval \([0,T]\) the martingale \(f(X_n)\) is bounded, hence \(\operatorname{\mathbb{E}}\left[f(X_T) |\mathcal{F}_0\right] = f(X_0)\).
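
A final simulation sketch: estimate the hitting probability directly and compare it with \(f(x) = (x-a)/(b-a)\). The boundaries \(a = 0\), \(b = 10\), the start \(x = 3\) and the sample size are illustrative choices.

```python
import numpy as np

# Estimate P(X hits b before a | X_0 = x) for a simple symmetric random
# walk and compare it with f(x) = (x - a)/(b - a).
rng = np.random.default_rng(10)
a, b, x, trials = 0, 10, 3, 100_000

hits_b = 0
for _ in range(trials):
    pos = x
    while a < pos < b:                       # run until the walk is absorbed at a or b
        pos += 1 if rng.random() < 0.5 else -1
    hits_b += (pos == b)

print("estimated hitting probability:", hits_b / trials)
print("f(x) = (x - a)/(b - a)       :", (x - a) / (b - a))
```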