\(\newcommand{\footnotename}{footnote}\) \(\def \LWRfootnote {1}\) \(\newcommand {\footnote }[2][\LWRfootnote ]{{}^{\mathrm {#1}}}\) \(\newcommand {\footnotemark }[1][\LWRfootnote ]{{}^{\mathrm {#1}}}\) \(\let \LWRorighspace \hspace \) \(\renewcommand {\hspace }{\ifstar \LWRorighspace \LWRorighspace }\) \(\newcommand {\TextOrMath }[2]{#2}\) \(\newcommand {\mathnormal }[1]{{#1}}\) \(\newcommand \ensuremath [1]{#1}\) \(\newcommand {\LWRframebox }[2][]{\fbox {#2}} \newcommand {\framebox }[1][]{\LWRframebox } \) \(\newcommand {\setlength }[2]{}\) \(\newcommand {\addtolength }[2]{}\) \(\newcommand {\setcounter }[2]{}\) \(\newcommand {\addtocounter }[2]{}\) \(\newcommand {\arabic }[1]{}\) \(\newcommand {\number }[1]{}\) \(\newcommand {\noalign }[1]{\text {#1}\notag \\}\) \(\newcommand {\cline }[1]{}\) \(\newcommand {\directlua }[1]{\text {(directlua)}}\) \(\newcommand {\luatexdirectlua }[1]{\text {(directlua)}}\) \(\newcommand {\protect }{}\) \(\def \LWRabsorbnumber #1 {}\) \(\def \LWRabsorbquotenumber "#1 {}\) \(\newcommand {\LWRabsorboption }[1][]{}\) \(\newcommand {\LWRabsorbtwooptions }[1][]{\LWRabsorboption }\) \(\def \mathchar {\ifnextchar "\LWRabsorbquotenumber \LWRabsorbnumber }\) \(\def \mathcode #1={\mathchar }\) \(\let \delcode \mathcode \) \(\let \delimiter \mathchar \) \(\def \oe {\unicode {x0153}}\) \(\def \OE {\unicode {x0152}}\) \(\def \ae {\unicode {x00E6}}\) \(\def \AE {\unicode {x00C6}}\) \(\def \aa {\unicode {x00E5}}\) \(\def \AA {\unicode {x00C5}}\) \(\def \o {\unicode {x00F8}}\) \(\def \O {\unicode {x00D8}}\) \(\def \l {\unicode {x0142}}\) \(\def \L {\unicode {x0141}}\) \(\def \ss {\unicode {x00DF}}\) \(\def \SS {\unicode {x1E9E}}\) \(\def \dag {\unicode {x2020}}\) \(\def \ddag {\unicode {x2021}}\) \(\def \P {\unicode {x00B6}}\) \(\def \copyright {\unicode {x00A9}}\) \(\def \pounds {\unicode {x00A3}}\) \(\let \LWRref \ref \) \(\renewcommand {\ref }{\ifstar \LWRref \LWRref }\) \( \newcommand {\multicolumn }[3]{#3}\) \(\require {textcomp}\) \(\newcommand {\intertext }[1]{\text {#1}\notag \\}\) \(\let \Hat \hat \) \(\let \Check \check \) \(\let \Tilde \tilde \) \(\let \Acute \acute \) \(\let \Grave \grave \) \(\let \Dot \dot \) \(\let \Ddot \ddot \) \(\let \Breve \breve \) \(\let \Bar \bar \) \(\let \Vec \vec \) \(\require {mathtools}\) \(\newcommand {\vcentcolon }{\mathrel {\unicode {x2236}}}\) \(\newcommand {\approxcolon }{\approx \vcentcolon }\) \(\newcommand {\Approxcolon }{\approx \dblcolon }\) \(\newcommand {\simcolon }{\sim \vcentcolon }\) \(\newcommand {\Simcolon }{\sim \dblcolon }\) \(\newcommand {\dashcolon }{\mathrel {-}\vcentcolon }\) \(\newcommand {\Dashcolon }{\mathrel {-}\dblcolon }\) \(\newcommand {\colondash }{\vcentcolon \mathrel {-}}\) \(\newcommand {\Colondash }{\dblcolon \mathrel {-}}\) \(\newenvironment {crampedsubarray}[1]{}{}\) \(\newcommand {\smashoperator }[2][]{#2\limits }\) \(\newcommand {\SwapAboveDisplaySkip }{}\) \(\newcommand {\LaTeXunderbrace }[1]{\underbrace {#1}}\) \(\newcommand {\LaTeXoverbrace }[1]{\overbrace {#1}}\) \(\Newextarrow \xLongleftarrow {10,10}{0x21D0}\) \(\Newextarrow \xLongrightarrow {10,10}{0x21D2}\) \(\let \xlongleftarrow \xleftarrow \) \(\let \xlongrightarrow \xrightarrow \) \(\newcommand {\LWRmultlined }[1][]{\begin {multline*}}\) \(\newenvironment {multlined}[1][]{\LWRmultlined }{\end {multline*}}\) \(\let \LWRorigshoveleft \shoveleft \) \(\renewcommand {\shoveleft }[1][]{\LWRorigshoveleft }\) \(\let \LWRorigshoveright \shoveright \) \(\renewcommand {\shoveright }[1][]{\LWRorigshoveright }\) \(\newcommand {\shortintertext 
}[1]{\text {#1}\notag \\}\) \(\newcommand {\bigzero }{\smash {\text {\huge 0}}}\) \(\newcommand {\am }{\mathrm {am}}\) \(\newcommand {\gm }{\mathrm {gm}}\) \(\newcommand {\id }{\operatorname {id}}\) \(\newcommand {\GL }{\operatorname {GL}}\) \(\newcommand {\im }{\operatorname {im}}\) \(\newcommand {\rank }{\operatorname {rank}}\) \(\newcommand {\sol }{\operatorname {sol}}\) \(\newcommand {\ann }{\operatorname {ann}}\) \(\newcommand {\rO }{\operatorname {O}}\) \(\newcommand {\rU }{\operatorname {U}}\) \(\newcommand {\rSU }{\operatorname {SU}}\) \(\newcommand {\ev }{\operatorname {ev}}\) \(\newcommand {\bil }{\operatorname {Bil}}\) \(\newcommand {\rad }{\operatorname {rad}}\) \(\newcommand {\Span }[1]{\operatorname {span}\{#1\}}\) \(\newcommand {\R }{\mathbb {R}}\) \(\newcommand {\C }{\mathbb {C}}\) \(\newcommand {\Z }{\mathbb {Z}}\) \(\newcommand {\F }{\mathbb {F}}\) \(\newcommand {\Q }{\mathbb {Q}}\) \(\newcommand {\N }{\mathbb {N}}\) \(\renewcommand {\P }{\mathbb {P}}\) \(\newcommand {\I }{\mathrm {I}}\) \(\newcommand {\half }{\tfrac 12}\) \(\newcommand {\rel }{\mathrel {\mathrm {rel}}}\) \(\renewcommand {\vec }[3]{(#1_{#2},\dots ,#1_{#3})}\) \(\newcommand {\lst }[3]{#1_{#2},\dots ,#1_{#3}}\) \(\newcommand {\plst }[3]{#1_{#2}+\dots +#1_{#3}}\) \(\newcommand {\oplst }[3]{#1_{#2}\oplus \dots \oplus #1_{#3}}\) \(\newcommand {\pplst }[3]{#1_{#2}\times \dots \times #1_{#3}}\) \(\newcommand {\dlst }[3]{\lst {#1^{*}}{#2}{#3}}\) \(\newcommand {\dlc }[4]{\lc #1{#2^{*}}#3#4}\) \(\newcommand {\hmg }[3]{[#1_{#2},\dots ,#1_{#3}]}\) \(\newcommand {\rng }[2]{#1,\dots ,#2}\) \(\newcommand {\lc }[4]{#1_{#3}#2_{#3}+\dots +#1_{#4}#2_{#4}}\) \(\newcommand {\plus }[2]{#1+\dots +#2}\) \(\newcommand {\set }[1]{\{#1\}}\) \(\newcommand {\abs }[1]{\lvert #1\rvert }\) \(\newcommand {\ip }[1]{\langle #1\rangle }\) \(\newcommand {\norm }[1]{\|#1\|}\) \(\newcommand {\bx }{\mathbf {x}}\) \(\newcommand {\be }{\mathbf {e}}\) \(\newcommand {\bq }{\mathbf {q}}\) \(\newcommand {\bu }{\mathbf {u}}\) \(\newcommand {\by }{\mathbf {y}}\) \(\newcommand {\bv }{\mathbf {v}}\) \(\newcommand {\E }{\mathbb {E}}\) \(\newcommand {\cI }{\mathcal {I}}\) \(\newcommand {\cB }{\mathcal {B}}\) \(\newcommand {\sub }{\subseteq }\) \(\newcommand {\st }{\mathrel {|}}\) \(\newcommand {\bw }[3]{#1\leq #2\leq #3}\) \(\newcommand {\col }[3]{(#1_{#2})_{#2\in #3}}\) \(\newcommand {\supp }{\mathrm {supp}}\) \(\newcommand {\restr }[1]{_{|#1}}\)

4.4 Jordan normal form

We complete our analysis of linear operators by improving on Theorem 4.12.

First we introduce the key ingredient.

4.4.1 Jordan blocks
  • Definition. The Jordan block of size \(n\in \Z _+\) and eigenvalue \(\lambda \in \F \) is \(J(\lambda ,n)\in M_n(\F )\) with \(\lambda \)’s on the diagonal, \(1\)’s on the super-diagonal and zeros elsewhere. Thus

    \begin{equation*} J(\lambda ,n)= \begin{pmatrix} \lambda &1&0&\dots &0\\ &\ddots &\ddots &\ddots &\vdots \\ &&\ddots &\ddots &0\\ &&&\ddots &1\\ 0&&&&\lambda \end {pmatrix} \end{equation*}

  • Notation. Set \(J_n:=J(0,n)\) so that \(J(\lambda ,n)=\lambda I_n+J_n\).

We have:

  • Exercises.9

    • (1) \(\ker J_n^k=\Span {\lst {e}1k}\). In particular, \(J_n\) is nilpotent: \(J_n^n=0\).

    • (2) \(\im J_n^k=\Span {\lst {e}1{n-k}}\).

    • (3) \(\lambda \) is the only eigenvalue of \(J(\lambda ,n)\) and \(E_{J(\lambda ,n)}(\lambda )=\Span {e_1}\), \(G_{J(\lambda ,n)}(\lambda )=\F ^{n}\).

    • (4) \(m_{J(\lambda ,n)}=\pm \Delta _{J(\lambda ,n)}=(x-\lambda )^n\).

9 Exercise sheet 5, question 1.
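For example, with \(n=3\),

\begin{equation*} J_3= \begin{pmatrix} 0&1&0\\ 0&0&1\\ 0&0&0 \end {pmatrix},\qquad J_3^2= \begin{pmatrix} 0&0&1\\ 0&0&0\\ 0&0&0 \end {pmatrix},\qquad J_3^3=0, \end{equation*}

so that \(\ker J_3^2=\Span {e_1,e_2}\) and \(\im J_3^2=\Span {e_1}\), as (1) and (2) predict.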

We are going to prove that any nilpotent operator \(\phi \in L(V)\) on a finite-dimensional vector space has a basis for which the matrix of \(\phi \) is a direct sum of Jordan blocks: \(J_{n_1}\oplus \dots \oplus J_{n_k}\) with \(\plst {n}1k=\dim V\).

We start by spelling out what it means for an operator to have a Jordan block as matrix:

  • Lemma 4.16. Let \(\lst {v}1n\) be a basis for a vector space \(V\) and \(\phi \in L(V)\).

    Then the following are equivalent:

    • (1) \(\phi \) has matrix \(J_n\) with respect to \(\lst {v}1n\).

    • (2) \(\phi (v_1)=0\) and \(\phi (v_i)=v_{i-1}\), for \(\bw 2in\).

    • (3) \(v_i=\phi ^{n-i}(v_n)\), for \(\bw 1in\), and \(\phi ^{n}(v_n)=0\).

  • Proof. The equivalence of (1) and (2) comes straight from the definitions: for \(\bw 2in\), \((J_n)_{i-1,i}=1\) and all other entries in the \(i\)-th column vanish, while the first column of \(J_n\) is zero.

    The equivalence of (2) and (3) is an easy exercise10.

10 Question 2 on sheet 5.

We will work with characterisation (3) and prove:

  • Theorem 4.17. Let \(\phi \in L(V)\) be a nilpotent operator on a finite-dimensional vector space over \(\F \). Then there are \(\lst {v}1k\in V\) and \(\lst {n}1k\in \Z _+\) such that

    \begin{equation*} \phi ^{n_1-1}(v_1),\dots ,\phi (v_1),v_1,\dots ,\phi ^{n_k-1}(v_k),\dots ,\phi (v_k),v_k \end{equation*}

    is a basis of \(V\) and \(\phi ^{n_i}(v_i)=0\), for \(\bw 1ik\).

Using this basis and Lemma 4.16 we immediately conclude:

  • Corollary 4.18. Let \(\phi \in L(V)\) be a nilpotent operator on a finite-dimensional vector space over \(\F \). Then there is a basis for which \(\phi \) has matrix \(J_{n_1}\oplus \dots \oplus J_{n_k}\).

  • Remark. Note that direct sums of the \(J_{n_i}\) are characterised by having \(1\)’s and zeros (at the joins of successive blocks) on the super-diagonal and zeros elsewhere.
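    For example,

    \begin{equation*} J_3\oplus J_2= \begin{pmatrix} 0&1&0&0&0\\ 0&0&1&0&0\\ 0&0&0&0&0\\ 0&0&0&0&1\\ 0&0&0&0&0 \end {pmatrix}, \end{equation*}

    where the zero in super-diagonal entry \((3,4)\) marks the join of the two blocks.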

  • Proof of Theorem 4.17. Once again we induct on \(\dim V\).

    If \(\dim V=1\), the only nilpotent operator is the zero operator and any basis \(v_1\) will do.

    For the induction step, suppose that the theorem is true when \(\dim V<n\) and suppose that \(\dim V=n\). We prove the theorem for \(V\) in three steps.

    Step 1: apply the induction hypothesis to \(\im \phi \). We let \(r=\rank \phi \) and \(k=n-r=\dim \ker \phi \). Since \(\phi \) is nilpotent, \(k>0\) so that \(r=\dim \im \phi <n\). We therefore apply the induction hypothesis to \(\phi \restr {\im \phi }\) to get \(\lst {w}1\ell \in \im \phi \), \(\lst {m}1\ell \in \Z _+\) such that

    \begin{equation*} \lst {u}1r:=\phi ^{m_1-1}(w_1),\dots ,\phi (w_1),w_1,\dots , \phi ^{m_{\ell }-1}(w_{\ell }),\dots ,\phi (w_{\ell }),w_{\ell } \end{equation*}

    is a basis of \(\im \phi \) and \(\phi ^{m_i}(w_i)=0\), for \(\bw 1i\ell \). Observe that each \(\phi (u_i)\) is either \(u_{i-1}\) or zero.

    Step 2: Find the first \(\ell \) of the \(v_i\). Each \(w_i\in \im \phi \) so choose \(\lst {v}1\ell \) such that \(\phi (v_i)=w_i\), for \(\bw 1i\ell \).

    We claim that \(\lst {u}1r,\lst {v}1\ell \) are linearly independent. For this, suppose that we have a linear relation

    \begin{equation} \label {eq:16} \sum _{j=1}^r\lambda _ju_j+\sum _{i=1}^{\ell }\mu _iv_i=0 \end{equation}

    and take \(\phi \) of this to get

    \begin{equation*} \sum _{j=1}^r\lambda _j\phi (u_j)+\sum _{i=1}^{\ell }\mu _i\phi (v_i)=0 \end{equation*}

    which reads

    \begin{equation} \label {eq:17} \sum _{j\st \phi (u_j)\neq 0}\lambda _ju_{j-1}+\sum _{i=1}^{\ell }\mu _iw_i=0. \end{equation}

    Since these \(u_{j-1}\) and \(w_i\) are distinct, (4.5) is still a linear relation on the linearly independent \(u_j\) and so, in particular, each \(\mu _i=0\). Now (4.4) becomes a linear relation on the \(u_j\) and so all \(\lambda _j=0\) also. This proves the claim.

    Step 3: extend \(\lst {u}1r,\lst {v}1\ell \) to a basis of \(V\) by adding elements of \(\ker \phi \). Define \(U\leq V\) by

    \begin{equation*} U=\Span {\lst {u}1r,\lst {v}1\ell }\geq \im \phi \end{equation*}

    and note that \(\im \phi =\phi (U)\) since any \(u_i=\phi ^m(v_j)\), for some \(\bw 1j\ell \) and \(\bw 1m{m_j}\). We extend to get a basis

    \begin{equation*} \lst {u}1r,\lst {v}1\ell ,\lst {x}{\ell +1}k \end{equation*}

    of \(V\). Now, for \(\bw {\ell +1}{j}k\), there is some \(y_j\in U\) such that \(\phi (y_j)=\phi (x_j)\) whence \(v_j:=x_j-y_j\in \ker \phi \).

    By construction

    \begin{equation*} \Span {\lst {u}1r,\lst {v}1k}=\Span { \lst {u}1r,\lst {v}1\ell ,\lst {x}{\ell +1}k}=V \end{equation*}

    so that \(\lst {u}1r,\lst {v}1k\) is a basis of \(V\). Moreover, setting

    \begin{equation*} n_i= \begin{cases} m_i+1&\bw 1i\ell \\ 1&\bw {\ell +1}ik \end {cases} \end{equation*}

    we have \(\phi ^{n_i}(v_i)=0\), for all \(\bw 1ik\) and our basis, reordered to slot the first \(\ell \) \(v_i\) into the right places, is

    \begin{equation*} \phi ^{n_1-1}(v_1),\dots ,\phi (v_1),v_1,\dots , \phi ^{n_\ell -1}(v_\ell ),\dots ,\phi (v_\ell ),v_\ell ,v_{\ell +1},\dots ,v_k, \end{equation*}

    which is of the required form.
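Here is a minimal computational sanity check of this result (a sketch only, assuming the Python library sympy; the matrix N below is just a small nilpotent matrix chosen for illustration). Rather than carrying out the recursive construction from the proof, it asks sympy for a Jordan basis and confirms that the resulting matrix is a direct sum of Jordan blocks \(J_{n_i}\), as Corollary 4.18 asserts.

# A sketch, assuming sympy: check Corollary 4.18 on a small nilpotent matrix.
from sympy import Matrix, zeros

# N^2 = 0 and rank N = 1, so we expect Jordan blocks of sizes 2 and 1.
N = Matrix([[0, 1, 1],
            [0, 0, 0],
            [0, 0, 0]])
assert N**2 == zeros(3, 3)

P, J = N.jordan_form()       # the columns of P form a Jordan basis for phi_N
assert P.inv() * N * P == J  # J is the matrix of phi_N in that basis
print(J)                     # a direct sum of J_2 and J_1 (in some order)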

The only question left is: how unique are the \(n_i\)? We already know from the proof of Theorem 4.17 that there are \(k=\dim \ker \phi \) of them11 but we can do better. For this, set \(A=J_{n_1}\oplus \dots \oplus J_{n_k}\) so that, for \(s\in \N \), \(A^s=J_{n_1}^s\oplus \dots \oplus J_{n_k}^s\). Now

\begin{equation*} \dim \ker J_{n_i}^s=\min \set {s,n_i}, \end{equation*}

for all \(s\in \N \) (by the exercises above, \(\ker J_{n_i}^s=\Span {\lst {e}1s}\) for \(s\leq n_i\), while \(J_{n_i}^s=0\) for \(s\geq n_i\)), so that

\begin{equation} \label {eq:18} \dim \ker J_{n_i}^s-\dim \ker J_{n_i}^{s-1}= \begin{cases} 1&\bw 1s{n_i}\\0&s>n_i. \end {cases} \end{equation}

Now \(\ker A^s=\bigoplus _{i=1}^k\ker J_{n_i}^s\) so summing (4.6) over \(i\) yields:

\begin{equation*} \#\set {i\st n_i\geq s}=\dim \ker A^s-\dim \ker A^{s-1}. \end{equation*}

This proves:

11 Alternatively, if you have not read the proof: if there are \(k\) Jordan blocks \(J_{n_{i}}\), we have \(\dim \ker \phi =\sum _{i=1}^k\dim \ker J_{n_i}=k\) since \(\dim \ker J_{n_i}=1\).

  • Proposition 4.19. Let \(\phi \in L(V)\) be nilpotent with matrix \(J_{n_1}\oplus \dots \oplus J_{n_k}\) for some basis of \(V\). Then \(\lst {n}1k\) are unique up to order. Indeed,

    \begin{equation*} \#\set {i\st n_i\geq s}=\dim \ker \phi ^s-\dim \ker \phi ^{s-1}, \end{equation*}

    for each \(s\geq 1\).

  • Exercise.12 In the situation of Proposition 4.19, show that

    \begin{equation*} \#\set {i\st n_i= s}=2\dim \ker \phi ^s-\dim \ker \phi ^{s-1}-\dim \ker \phi ^{s+1}. \end{equation*}

12 Question 3 on sheet 5.
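Concretely, these formulas let us read off the block sizes from kernel dimensions alone. Here is a minimal sketch, assuming sympy (the helper block_sizes and the matrices J and Q below are illustrative choices, not part of the notes):

# A sketch, assuming sympy: recover the Jordan block sizes of a nilpotent
# matrix from dim ker N^s, using the formula in the exercise above.
from sympy import Matrix

def block_sizes(N):
    n = N.shape[0]
    # dims[s] = dim ker N^s, with dim ker N^0 = 0 and dim ker N^s = n for s >= n
    dims = [0] + [n - (N**s).rank() for s in range(1, n + 1)]
    sizes = []
    for s in range(1, n + 1):
        # #{i : n_i = s} = 2 dim ker N^s - dim ker N^{s-1} - dim ker N^{s+1}
        count = 2 * dims[s] - dims[s - 1] - (dims[s + 1] if s < n else dims[n])
        sizes += [s] * count
    return sorted(sizes, reverse=True)

# J_2 (+) J_1 in disguise: conjugate by an invertible matrix Q.
J = Matrix([[0, 1, 0], [0, 0, 0], [0, 0, 0]])
Q = Matrix([[1, 1, 0], [0, 1, 1], [1, 0, 1]])
print(block_sizes(Q * J * Q.inv()))   # [2, 1]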

In another direction:

  • Proposition 4.20. In the situation of Proposition 4.19, we have

    \begin{equation*} m_{\phi }=x^s, \end{equation*}

    where \(s=\max \set {\lst {n}1k}\).

  • Proof. Exercise13!

13 Question 4 on sheet 5.
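For example, if \(\phi \) has matrix \(J_3\oplus J_1\) then \(\phi ^2\neq 0\) (since \(J_3^2\neq 0\)) while \(\phi ^3=0\), so that \(m_{\phi }=x^3=x^{\max \set {3,1}}\).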

4.4.2 Jordan normal form

We put §4.4.1 together with Theorem 4.11 to prove the ultimate structure theorem for linear operators on a finite-dimensional complex vector space.

  • Theorem 4.21. Let \(\phi \in L(V)\) be a linear operator on a finite-dimensional vector space \(V\) over \(\C \). Then there is a basis of \(V\) for which \(\phi \) has as matrix a direct sum of Jordan blocks which are unique up to order.

    Such a basis is called a Jordan basis and the direct sum of Jordan blocks is called the Jordan normal form (JNF) of \(\phi \).

  • Proof. Let \(\lst \lambda 1k\) be the distinct eigenvalues of \(\phi \). By Theorem 4.11, \(V=\bigoplus V_i\), for \(V_i=G_{\phi }(\lambda _i)\) and then \(\phi _i:=\phi \restr {V_i}\) can be written

    \begin{equation*} \phi _i=\lambda _i\id _{V_i}+N_i, \end{equation*}

    with \(N_i\) nilpotent. Apply Corollary 4.18 to get a basis of \(V_i\) for which \(N_i\) has matrix \(J_{n_1}\oplus \dots \oplus J_{n_{\ell }}\). By Proposition 4.19, the \(\lst {n}1\ell \) are unique up to order. Now \(\phi _i\) has matrix

    \begin{equation*} J(\lambda _i,n_1)\oplus \dots \oplus J(\lambda _i,n_{\ell }). \end{equation*}

    We then concatenate these bases to get the required Jordan basis of \(V\).

From this, Theorem 4.14 and Proposition 4.20, we get a complete account of the minimum polynomial:

  • Corollary 4.22. Let \(\phi \in L(V)\) be a linear operator on a finite-dimensional vector space \(V\) over \(\C \) with distinct eigenvalues \(\lst \lambda 1k\). Then

    \begin{equation*} m_{\phi }=\prod _{i=1}^k(x-\lambda _i)^{s_i} \end{equation*}

    where \(s_i\) is the size of the largest Jordan block of \(\phi \) with eigenvalue \(\lambda _{i}\).

  • Exercise.14 \(\phi \) is diagonalisable if and only if \(m_{\phi }=\prod _{i=1}^k(x-\lambda _i)\) (that is, all \(s_i=1\)).

14 Question 5 on sheet 5.
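For example, \(J(2,2)\) has minimum polynomial \((x-2)^2\) and so is not diagonalisable, whereas \(J(2,1)\oplus J(3,1)\) is already diagonal and has minimum polynomial \((x-2)(x-3)\).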

We can apply all this to matrices and solve the similarity problem.

  • Corollary 4.23. Any \(A\in M_n(\C )\) is similar to a direct sum of Jordan blocks, that is, there is an invertible matrix \(P\in M_n(\C )\) such that

    \begin{equation*} P^{-1}AP=\oplst {A}1r, \end{equation*}

    with each \(A_i\) a Jordan block.

    \(\oplst {A}1r\) is called the Jordan normal form (JNF) of \(A\) and is unique up to the order of the \(A_i\).

  • Proof. Apply Theorem 4.21 to \(\phi _A:\C ^n\to \C ^n\) and let \(P\) be the change of basis matrix from the standard basis to the Jordan basis of \(\phi _A\) (so that the columns of \(P\) are the Jordan basis).

This gives:

  • Theorem 4.24. Matrices \(A,B\in M_n(\C )\) are similar if and only if they have the same Jordan normal form, up to reordering the Jordan blocks.

4.4.3 Examples
  • Example. Let \(\phi =\phi _A:\C ^4\to \C ^4\) where

    \begin{equation*} A= \begin{pmatrix*}[r] 2&-4&2&2\\-2&0&1&3\\-2&-2&3&3\\-2&-6&3&7 \end {pmatrix*}. \end{equation*}

    Let us find the Jordan normal form of \(A\) and a Jordan basis of \(\phi \).

    Step 1: compute \(\Delta _A\). This turns out to be \((2-x)^2(4-x)^2\) so that we have eigenvalues \(2,4\) and Theorem 4.13 tells us that

    \begin{equation*} \dim G_\phi (2)=\dim G_\phi (4)=2. \end{equation*}

    Step 2: compute \(m_A\) by trial and error. It must be \((x-2)^{s_1}(x-4)^{s_2}\) with \(\bw 1{s_i}2\) so first try \((x-2)(x-4)\):

    \begin{equation*} (A-2I)(A-4I)= \begin{pmatrix*}[r] 0&0&0&0\\0&-4&2&2\\0&-4&2&2\\0&-4&2&2 \end {pmatrix*}\neq 0. \end{equation*}

    Next try \((x-2)(x-4)^{2}\):

    \begin{equation*} (A-2I)(A-4I)^{2}=0\in M_4(\C ) \end{equation*}

    so that \(m_A=(x-2)(x-4)^2\).

    Step 3: deduce the shape of the Jordan normal form using Corollary 4.22:

    Since \(s_1=1\), all Jordan blocks with eigenvalue \(2\) have size \(1\), so that \(E_{\phi }(2)=G_{\phi }(2)\) and there are two blocks \(J(2,1)\).

    Since \(s_2=2\), there is at least one Jordan block of size \(2\) with eigenvalue \(4\) and since \(\dim G_{\phi }(4)=2\) there is no room for any other block.

    We conclude that \(A\) has JNF \(J(2,1)\oplus J(2,1)\oplus J(4,2)\):

    \begin{equation*} \begin{pmatrix} 2&&&\\&2&&\\&&4&1\\&&&4 \end {pmatrix}. \end{equation*}

    We find a Jordan basis by finding one for each generalised eigenspace in turn. Any basis of \(E_{\phi }(2)\) will do for the \(2\)-generalised eigenspace so solve \((A-2I)\bv =0\) to find one. I found \((2,1,0,2)\), \((0,1,2,0)\).

    For the \(4\)-generalised eigenspace, we need a basis of the form \((\phi -4\id )v,v\) with \((\phi -4\id )^{2}(v)=0\). For this we work backwards:

    • (a) Find an eigenvector with eigenvalue \(4\) by solving \(A\mathbf {w}=4\mathbf {w}\). One solution is \(\mathbf {w}=(0,1,1,1)\).

    • (b) Find \(\bv \) by solving \((A-4I)\bv =\mathbf {w}\). One solution is \(\bv =(1,0,0,1)\).

    We therefore have a Jordan basis \((2,1,0,2)\), \((0,1,2,0)\), \((0,1,1,1)\), \((1,0,0,1)\).

    It follows that

    \begin{equation*} P= \begin{pmatrix} 2&0&0&1\\1&1&1&0\\0&2&1&0\\2&0&1&1 \end {pmatrix} \end{equation*}

    satisfies

    \begin{equation*} P^{-1}AP= \begin{pmatrix} 2&&&\\&2&&\\&&4&1\\&&&4 \end {pmatrix}. \end{equation*}
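The whole computation can be checked by machine. Here is a minimal sketch, assuming sympy is available:

# A sketch, assuming sympy: verify the example above.
from sympy import Matrix, eye, factor, symbols, zeros

x = symbols('x')
A = Matrix([[2, -4, 2, 2], [-2, 0, 1, 3], [-2, -2, 3, 3], [-2, -6, 3, 7]])

print(factor(A.charpoly(x).as_expr()))                    # (x - 2)**2*(x - 4)**2
assert (A - 2*eye(4)) * (A - 4*eye(4)) != zeros(4, 4)     # so m_A is not (x-2)(x-4)
assert (A - 2*eye(4)) * (A - 4*eye(4))**2 == zeros(4, 4)  # so m_A = (x-2)(x-4)^2

# The columns of P are the Jordan basis found above.
P = Matrix([[2, 0, 0, 1], [1, 1, 1, 0], [0, 2, 1, 0], [2, 0, 1, 1]])
print(P.inv() * A * P)   # the JNF: blocks J(2,1), J(2,1), J(4,2)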

  • Example. Let \(\phi \in L(V)\) with \(\Delta _{\phi }=(x-5)^4\) and \(m_{\phi }=(x-5)^2\). What can be said about the JNF of \(\phi \)?

    Solution: We see from \(\Delta _{\phi }\) that \(5\) is the only eigenvalue of \(\phi \) and that \(\dim V=\deg \Delta _{\phi }=4\).

    From \(m_{\phi }\), we see that there must be at least one Jordan block of size \(2\). This gives two possibilities:

    \begin{gather*} J(5,2)\oplus J(5,2)\\ J(5,2)\oplus J(5,1)\oplus J(5,1). \end{gather*} In the first case, \(\dim E_{\phi }(5)=2\) and, in the second, \(\dim E_{\phi }(5)=3\). Thus computing \(\dim E_{\phi }(5)=\dim \ker (\phi -5\id )\) tells us which of the two occurs.

  • Example. What is the JNF of \(A\) given by

    \begin{equation*} \begin{pmatrix*}[r] -1&1&0\\ -1&1&0\\ -1&1&0 \end {pmatrix*}? \end{equation*}

    Find a Jordan basis for \(A\).

    Solution: One readily checks that \(\Delta _A=x^3\) and \(A^2=0\), whence \(A\) is nilpotent with \(m_A=x^2\). Thus \(A\) has at least one block \(J_2=J(0,2)\) of size two and, since the block sizes sum to \(3\), the JNF must be \(J_2\oplus J_1\).

    A Jordan basis is \(v_1,v_2,v_3\) with \(A\bv _2=\bv _1\) and \(A\bv _1=A\bv _3=0\) so we seek \(v_1\in \im A\cap \ker A\) and work backwards from there.

    Solve linear equations to see that

    \begin{align*} \ker A&=\set {(x,x,y)\st x,y\in \F }\\ \im A&=\set {(x,x,x)\st x\in \F } \end{align*} so take \(v_1=(1,1,1)\) and solve \(A\bv _2=v_1\) to get, for example, \(v_2=(0,1,0)\). Finally take any \(v_3\in \ker A\) that is linearly independent of \(v_1\): \((0,0,1)\) will do.

    Thus we have arrived at the Jordan basis \((1,1,1)\), \((0,1,0)\), \((0,0,1)\).
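Again this can be checked with sympy (a sketch, with P assembled from the Jordan basis just found):

# A sketch, assuming sympy: check the 3x3 example.
from sympy import Matrix, zeros

A = Matrix([[-1, 1, 0], [-1, 1, 0], [-1, 1, 0]])
assert A**2 == zeros(3, 3)                       # A is nilpotent with m_A = x^2

P = Matrix([[1, 0, 0], [1, 1, 0], [1, 0, 1]])    # columns: the Jordan basis above
print(P.inv() * A * P)                           # J_2 (+) J_1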

  • Remark. We see from these computations that Jordan bases of \(\phi \) are far from unique: many choices are made when finding one.