4.4 Jordan normal form

We complete our analysis of linear operators by improving on Proposition 4.12.

First we introduce the key ingredient.

4.4.1 Jordan blocks
  • Definition. The Jordan block of size \(n\in \Z _+\) and eigenvalue \(\lambda \in \F \) is \(J(\lambda ,n)\in M_n(\F )\) with \(\lambda \)’s on the diagonal, \(1\)’s on the super-diagonal and zeros elsewhere. Thus

    \begin{equation*} J(\lambda ,n)= \begin{pmatrix} \lambda &1&0&\dots &0\\ &\ddots &\ddots &\ddots &\vdots \\ &&\ddots &\ddots &0\\ &&&\ddots &1\\ 0&&&&\lambda \end {pmatrix} \end{equation*}

  • Notation. Set \(J_n:=J(0,n)\) so that \(J(\lambda ,n)=\lambda I_n+J_n\).

We have:

  • Exercises.9

    • (1) \(\ker J_n^k=\Span {\lst {e}1k}\). In particular, \(J_n\) is nilpotent: \(J_n^n=0\).

    • (2) \(\im J_n^k=\Span {\lst {e}1{n-k}}\).

    • (3) \(\lambda \) is the only eigenvalue of \(J(\lambda ,n)\) and \(E_{J(\lambda ,n)}(\lambda )=\Span {e_1}\), \(G_{J(\lambda ,n)}(\lambda )=\F ^{n}\).

    • (4) \(m_{J(\lambda ,n)}=\pm \Delta _{J(\lambda ,n)}=(x-\lambda )^n\).

9 Exercise sheet 5, question 1.
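These facts are easy to confirm numerically. A minimal sketch in plain Python (no libraries; `jordan_block`, `matmul` and `matpow` are ad hoc helper names, not from the text) that checks \(J_n^n=0\) and reads off \(\ker J_n^k\) and \(\im J_n^k\) from the columns of \(J_n^k\):

```python
def jordan_block(lam, n):
    """Jordan block J(lam, n): lam on the diagonal, 1's on the super-diagonal."""
    return [[lam if i == j else 1 if j == i + 1 else 0 for j in range(n)]
            for i in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(A, s):
    n = len(A)
    P = [[1 if i == j else 0 for j in range(n)] for i in range(n)]  # identity
    for _ in range(s):
        P = matmul(P, A)
    return P

n = 4
J = jordan_block(0, n)
# J_n is nilpotent: J_n^n = 0
assert matpow(J, n) == [[0] * n for _ in range(n)]
# J_n^k sends e_j to e_{j-k} (and e_j to 0 for j <= k), so
# ker J_n^k = span(e_1, ..., e_k) and im J_n^k = span(e_1, ..., e_{n-k}).
for k in range(n + 1):
    Jk = matpow(J, k)
    for j in range(n):  # column j of J_n^k is the image of e_{j+1}
        assert [Jk[i][j] for i in range(n)] == [1 if i == j - k else 0
                                                for i in range(n)]
```

The column description makes exercises (1) and (2) immediate: exactly the first \(k\) columns of \(J_n^k\) vanish, and the nonzero columns are \(\lst {e}1{n-k}\).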

We are going to prove that any nilpotent operator \(\phi \in L(V)\) on a finite-dimensional vector space has a basis for which the matrix of \(\phi \) is a direct sum of Jordan blocks: \(J_{n_1}\oplus \dots \oplus J_{n_k}\) with \(\plst {n}1k=\dim V\).

We start by spelling out what it means for an operator to have a Jordan block as matrix:

  • Lemma 4.16. Let \(\lst {v}1n\) be a basis for a vector space \(V\) and \(\phi \in L(V)\).

    Then the following are equivalent:

    • (1) \(\phi \) has matrix \(J_n\) with respect to \(\lst {v}1n\).

    • (2) \(\phi (v_1)=0\) and \(\phi (v_i)=v_{i-1}\), for \(\bw 2in\).

    • (3) \(v_i=\phi ^{n-i}(v_n)\), for \(\bw 1in\), and \(\phi ^{n}(v_n)=0\).

  • Proof. The equivalence of (1) and (2) comes straight from the definitions: for \(\bw 2in\), \((J_n)_{i-1,i}=1\) and all other entries in the \(i\)-th column vanish, while the first column of \(J_n\) is zero.

    The equivalence of (2) and (3) is an easy exercise10.  □

10 Question 2 on sheet 5.

We will work with characterisation (3) and prove:

  • Theorem 4.17. Let \(\phi \in L(V)\) be a nilpotent operator on a finite-dimensional vector space over \(\F \). Then there are \(\lst {v}1k\in V\) and \(\lst {n}1k\in \Z _+\) such that

    \begin{equation*} \phi ^{n_1-1}(v_1),\dots ,\phi (v_1),v_1,\dots ,\phi ^{n_k-1}(v_k),\dots ,\phi (v_k),v_k \end{equation*}

    is a basis of \(V\) and \(\phi ^{n_i}(v_i)=0\), for \(\bw 1ik\).

Using this basis and Lemma 4.16 we immediately conclude:

  • Corollary 4.18. Let \(\phi \in L(V)\) be a nilpotent operator on a finite-dimensional vector space over \(\F \). Then there is a basis for which \(\phi \) has matrix \(J_{n_1}\oplus \dots \oplus J_{n_k}\).

  • Remark. Note that direct sums of the \(J_{n_i}\) are characterised as the matrices which vanish everywhere except on the super-diagonal, which consists of \(1\)’s interrupted by zeros at the joins of successive blocks.
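The pattern in the remark can be seen concretely. With an ad hoc `direct_sum` helper (an assumption of this sketch, not notation from the text), the super-diagonal of \(J_3\oplus J_2\) is \(1,1,0,1\), the \(0\) sitting at the join of the two blocks:

```python
def jordan_block(lam, n):
    return [[lam if i == j else 1 if j == i + 1 else 0 for j in range(n)]
            for i in range(n)]

def direct_sum(blocks):
    """Block-diagonal direct sum of square matrices."""
    n = sum(len(b) for b in blocks)
    M = [[0] * n for _ in range(n)]
    off = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, x in enumerate(row):
                M[off + i][off + j] = x
        off += len(b)
    return M

A = direct_sum([jordan_block(0, 3), jordan_block(0, 2)])
# Super-diagonal: 1's inside each block, a 0 at the join of the two blocks
superdiag = [A[i][i + 1] for i in range(len(A) - 1)]
assert superdiag == [1, 1, 0, 1]
```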

  • Proof of Theorem 4.17. Once again we induct on \(\dim V\).

    If \(\dim V=1\), the only nilpotent operator is the zero operator and any basis \(v_1\) will do.

    For the induction step, suppose that the theorem is true when \(\dim V<n\) and suppose that \(\dim V=n\). We prove the theorem for \(V\) in three steps.

    Step 1: apply the induction hypothesis to \(\im \phi \). We let \(r=\rank \phi \) and \(k=n-r=\dim \ker \phi \). Since \(\phi \) is nilpotent, \(k>0\) so that \(r=\dim \im \phi <n\). We therefore apply the induction hypothesis to \(\phi \restr {\im \phi }\) to get \(\lst {w}1\ell \in \im \phi \), \(\lst {m}1\ell \in \Z _+\) such that

    \begin{equation*} \lst {u}1r:=\phi ^{m_1-1}(w_1),\dots ,\phi (w_1),w_1,\dots , \phi ^{m_{\ell }-1}(w_{\ell }),\dots ,\phi (w_{\ell }),w_{\ell } \end{equation*}

    is a basis of \(\im \phi \) and \(\phi ^{m_i}(w_i)=0\), for \(\bw 1i\ell \). Observe that each \(\phi (u_i)\) is either \(u_{i-1}\) or zero.

    Step 2: Find the first \(\ell \) of the \(v_i\). Each \(w_i\in \im \phi \) so choose \(\lst {v}1\ell \) such that \(\phi (v_i)=w_i\), for \(\bw 1i\ell \).

    We claim that \(\lst {u}1r,\lst {v}1\ell \) are linearly independent. For this, suppose that we have a linear relation

    \begin{equation} \label {eq:16} \sum _{j=1}^r\lambda _ju_j+\sum _{i=1}^{\ell }\mu _iv_i=0 \end{equation}

    and take \(\phi \) of this to get

    \begin{equation*} \sum _{j=1}^r\lambda _j\phi (u_j)+\sum _{i=1}^{\ell }\mu _i\phi (v_i)=0 \end{equation*}

    which reads

    \begin{equation} \label {eq:17} \sum _{j\st \phi (u_j)\neq 0}\lambda _ju_{j-1}+\sum _{i=1}^{\ell }\mu _iw_i=0. \end{equation}

    Since these \(u_{j-1}\) and \(w_i\) are distinct, (4.5) is still a linear relation on the linearly independent \(u_j\) and so, in particular, each \(\mu _i=0\). Now (4.4) becomes a linear relation on the \(u_j\) and so all \(\lambda _j=0\) also. This proves the claim.

    Step 3: extend \(\lst {u}1r,\lst {v}1\ell \) to a basis of \(V\) by adding elements of \(\ker \phi \). Define \(U\leq V\) by

    \begin{equation*} U=\Span {\lst {u}1r,\lst {v}1\ell }\geq \im \phi \end{equation*}

    and note that \(\im \phi =\phi (U)\) since any \(u_i=\phi ^m(v_j)\), for some \(\bw 1j\ell \) and \(\bw 1m{m_j}\). We extend to get a basis

    \begin{equation*} \lst {u}1r,\lst {v}1\ell ,\lst {x}{\ell +1}k \end{equation*}

    of \(V\). Now, for \(\bw {\ell +1}{j}k\), there is some \(y_j\in U\) such that \(\phi (y_j)=\phi (x_j)\) whence \(v_j:=x_j-y_j\in \ker \phi \).

    By construction

    \begin{equation*} \Span {\lst {u}1r,\lst {v}1k}=\Span { \lst {u}1r,\lst {v}1\ell ,\lst {x}{\ell +1}k}=V \end{equation*}

    so that \(\lst {u}1r,\lst {v}1k\) is a basis of \(V\). Moreover, setting

    \begin{equation*} n_i= \begin{cases} m_i+1&\bw 1i\ell \\ 1&\bw {\ell +1}ik \end {cases} \end{equation*}

    we have \(\phi ^{n_i}(v_i)=0\), for all \(\bw 1ik\) and our basis, reordered to slot the first \(\ell \) \(v_i\) into the right places, is

    \begin{equation*} \phi ^{n_1-1}(v_1),\dots ,\phi (v_1),v_1,\dots , \phi ^{n_\ell -1}(v_\ell ),\dots ,\phi (v_\ell ),v_\ell ,v_{\ell +1},\dots ,v_k, \end{equation*}

    which is of the required form.  □

The only question left is to what extent the \(n_i\) are unique. We already know from the proof of Theorem 4.17 that there are \(k=\dim \ker \phi \) of them11 but we can do better. For this, set \(A=J_{n_1}\oplus \dots \oplus J_{n_k}\) so that, for \(s\in \N \), \(A^s=J_{n_1}^s\oplus \dots \oplus J_{n_k}^s\). Now

\begin{equation*} \dim \ker J_{n_i}^s=\min (s,n_i), \end{equation*}

since \(\ker J_{n_i}^s=\Span {\lst {e}1s}\) for \(s\leq n_i\) and \(\ker J_{n_i}^s=\F ^{n_i}\) for \(s\geq n_i\), so that

\begin{equation} \label {eq:18} \dim \ker J_{n_i}^s-\dim \ker J_{n_i}^{s-1}= \begin{cases} 1&\bw 1s{n_i}\\0&s>n_i. \end {cases} \end{equation}

Now \(\ker A^s=\bigoplus _{i=1}^k\ker J_{n_i}^s\) so summing (4.6) over \(i\) yields:

\begin{equation*} \#\set {i\st n_i\geq s}=\dim \ker A^s-\dim \ker A^{s-1}. \end{equation*}

This proves:

11 Alternatively, if you have not read the proof: if there are \(k\) Jordan blocks \(J_{n_{i}}\), we have \(\dim \ker \phi =\sum _{i=1}^k\dim \ker J_{n_i}=k\) since \(\dim \ker J_{n_i}=1\).

  • Proposition 4.19. Let \(\phi \in L(V)\) be nilpotent with matrix \(J_{n_1}\oplus \dots \oplus J_{n_k}\) for some basis of \(V\). Then \(\lst {n}1k\) are unique up to order. Indeed,

    \begin{equation*} \#\set {i\st n_i\geq s}=\dim \ker \phi ^s-\dim \ker \phi ^{s-1}, \end{equation*}

    for each \(s\geq 1\).

  • Exercise.12 In the situation of Proposition 4.19, show that

    \begin{equation*} \#\set {i\st n_i= s}=2\dim \ker \phi ^s-\dim \ker \phi ^{s-1}-\dim \ker \phi ^{s+1}. \end{equation*}

12 Question 3 on sheet 5.
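Proposition 4.19 and this exercise together give an algorithm: the multiset of block sizes of a nilpotent matrix is determined by the dimensions \(\dim \ker A^s\) alone. A hedged Python sketch (the `rank` helper, via Gaussian elimination over exact rationals, and the name `block_sizes` are mine, not from the text):

```python
from fractions import Fraction

def rank(A):
    """Rank via Gaussian elimination over exact rationals."""
    M = [[Fraction(x) for x in row] for row in A]
    rows, cols, r = len(M), len(M[0]), 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(rows):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def block_sizes(A):
    """Multiset of Jordan block sizes of a nilpotent matrix A (Prop 4.19)."""
    n = len(A)
    kers = [0]          # dim ker A^s for s = 0, 1, ... until it reaches n
    P = [[int(i == j) for j in range(n)] for i in range(n)]
    while kers[-1] < n:
        P = matmul(P, A)
        kers.append(n - rank(P))
    sizes = []
    for s in range(1, len(kers)):
        at_least_s = kers[s] - kers[s - 1]             # #{i : n_i >= s}
        at_least_s1 = kers[s + 1] - kers[s] if s + 1 < len(kers) else 0
        sizes += [s] * (at_least_s - at_least_s1)      # #{i : n_i == s}
    return sorted(sizes, reverse=True)

# The 3x3 example from section 4.4.3: A^2 = 0 and the JNF is J_2 (+) J_1
A = [[-1, 1, 0], [-1, 1, 0], [-1, 1, 0]]
print(block_sizes(A))  # [2, 1]
```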

In another direction:

  • Proposition 4.20. In the situation of Proposition 4.19, we have

    \begin{equation*} m_{\phi }=x^s, \end{equation*}

    where \(s=\max \set {\lst {n}1k}\).

  • Proof. Exercise13!  □

13 Question 4 on sheet 5.
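As a sanity check on Proposition 4.20: for \(A=J_3\oplus J_2\) one has \(A^2\neq 0\) but \(A^3=0\), so \(m_A=x^3\) with \(3=\max (3,2)\). A quick Python verification (plain-list matrices; the helper names are assumptions of this sketch):

```python
def jordan_block(lam, n):
    return [[lam if i == j else 1 if j == i + 1 else 0 for j in range(n)]
            for i in range(n)]

def direct_sum(blocks):
    n = sum(len(b) for b in blocks)
    M = [[0] * n for _ in range(n)]
    off = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, x in enumerate(row):
                M[off + i][off + j] = x
        off += len(b)
    return M

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = direct_sum([jordan_block(0, 3), jordan_block(0, 2)])
A2 = matmul(A, A)
A3 = matmul(A2, A)
zero = [[0] * 5 for _ in range(5)]
assert A2 != zero   # x^2 does not kill A, since the J_3 block survives
assert A3 == zero   # x^3 does, so m_A = x^3 = x^{max(3,2)}
```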

4.4.2 Jordan normal form

We put §4.4.1 together with Theorem 4.11 to prove the ultimate structure theorem for linear operators on a finite-dimensional complex vector space.

  • Theorem 4.21. Let \(\phi \in L(V)\) be a linear operator on a finite-dimensional vector space \(V\) over \(\C \). Then there is a basis of \(V\) for which \(\phi \) has as matrix a direct sum of Jordan blocks which are unique up to order.

    Such a basis is called a Jordan basis and the direct sum of Jordan blocks is called the Jordan normal form (JNF) of \(\phi \).

  • Proof. Let \(\lst \lambda 1k\) be the distinct eigenvalues of \(\phi \). By Theorem 4.11, \(V=\bigoplus V_i\), for \(V_i=G_{\phi }(\lambda _i)\) and then \(\phi _i:=\phi \restr {V_i}\) can be written

    \begin{equation*} \phi _i=\lambda _i\id _{V_i}+N_i, \end{equation*}

    with \(N_i\) nilpotent. Apply Corollary 4.18 to get a basis of \(V_i\) for which \(N_i\) has matrix \(J_{n_1}\oplus \dots \oplus J_{n_{\ell }}\). By Proposition 4.19, the \(\lst {n}1\ell \) are unique up to order. Now \(\phi _i\) has matrix

    \begin{equation*} J(\lambda _i,n_1)\oplus \dots \oplus J(\lambda _i,n_{\ell }). \end{equation*}

    We then concatenate these bases to get the required Jordan basis of \(V\).  □

From this, Proposition 4.14 and Proposition 4.20, we get a complete account of the minimum polynomial:

  • Corollary 4.22. Let \(\phi \in L(V)\) be a linear operator on a finite-dimensional vector space \(V\) over \(\C \) with distinct eigenvalues \(\lst \lambda 1k\). Then

    \begin{equation*} m_{\phi }=\prod _{i=1}^k(x-\lambda _i)^{s_i} \end{equation*}

    where \(s_i\) is the size of the largest Jordan block of \(\phi \) with eigenvalue \(\lambda _{i}\).

  • Exercise.14 \(\phi \) is diagonalisable if and only if \(m_{\phi }=\prod _{i=1}^k(x-\lambda _i)\) (that is, all \(s_i=1\)).

14 Question 5 on sheet 5.

We can apply all this to matrices and solve the similarity problem.

  • Corollary 4.23. Any \(A\in M_n(\C )\) is similar to a direct sum of Jordan blocks, that is, there is an invertible matrix \(P\in M_n(\C )\) such that

    \begin{equation*} P^{-1}AP=\oplst {A}1r, \end{equation*}

    with each \(A_i\) a Jordan block.

    \(\oplst {A}1r\) is called the Jordan normal form (JNF) of \(A\) and is unique up to the order of the \(A_i\).

  • Proof. Apply Theorem 4.21 to \(\phi _A:\C ^n\to \C ^n\) and let \(P\) be the change of basis matrix from the standard basis to the Jordan basis of \(\phi _A\) (so that the columns of \(P\) are the Jordan basis).  □

This gives:

  • Theorem 4.24. Matrices \(A,B\in M_n(\C )\) are similar if and only if they have the same Jordan normal form, up to reordering the Jordan blocks.

4.4.3 Examples
  • Example. Let \(\phi =\phi _A:\C ^4\to \C ^4\) where

    \begin{equation*} A= \begin{pmatrix*}[r] 2&-4&2&2\\-2&0&1&3\\-2&-2&3&3\\-2&-6&3&7 \end {pmatrix*}. \end{equation*}

    Let us find the Jordan normal form of \(A\) and a Jordan basis of \(\phi \).

    Step 1: compute \(\Delta _A\). This turns out to be \((2-x)^2(4-x)^2\) so that we have eigenvalues \(2,4\) and Proposition 4.13 tells us that

    \begin{equation*} \dim G_\phi (2)=\dim G_\phi (4)=2. \end{equation*}

    Step 2: compute \(m_A\) by trial and error. It must be \((x-2)^{s_1}(x-4)^{s_2}\) with \(\bw 1{s_i}2\) so first try \((x-2)(x-4)\):

    \begin{equation*} (A-2I)(A-4I)= \begin{pmatrix*}[r] 0&0&0&0\\0&-4&2&2\\0&-4&2&2\\0&-4&2&2 \end {pmatrix*}\neq 0. \end{equation*}

    Next try \((x-2)(x-4)^{2}\):

    \begin{equation*} (A-2I)(A-4I)^{2}=0\in M_4(\C ) \end{equation*}

    so that \(m_A=(x-2)(x-4)^2\).

    Step 3: deduce the shape of the Jordan normal form using Corollary 4.22:

    Since \(s_1=1\), all Jordan blocks with eigenvalue \(2\) have size \(1\), \(E_{\phi }(2)=G_{\phi }(2)\).

    Since \(s_2=2\), there is at least one Jordan block of size \(2\) with eigenvalue \(4\) and since \(\dim G_{\phi }(4)=2\) there is no room for any other block.

    We conclude that \(A\) has JNF \(J(2,1)\oplus J(2,1)\oplus J(4,2)\):

    \begin{equation*} \begin{pmatrix} 2&&&\\&2&&\\&&4&1\\&&&4 \end {pmatrix}. \end{equation*}

    We find a Jordan basis by finding one for each generalised eigenspace in turn. Any basis of \(E_{\phi }(2)\) will do for the \(2\)-generalised eigenspace so solve \((A-2I)\bv =0\) to find one. I found \((2,1,0,2)\), \((0,1,2,0)\).

    For the \(4\)-generalised eigenspace, we need a basis of the form \((\phi -4\id )v,v\) with \((\phi -4\id )^{2}(v)=0\). For this we work backwards:

    • (a) Find an eigenvector with eigenvalue \(4\) by solving \(A\mathbf {w}=4\mathbf {w}\). One solution is \(w=(0,1,1,1)\).

    • (b) Find \(v\) by solving \((A-4I)\bv =\mathbf {w}\). One solution is \((1,0,0,1)\).

    We therefore have a Jordan basis \((2,1,0,2)\), \((0,1,2,0)\), \((0,1,1,1)\), \((1,0,0,1)\).

    It follows that

    \begin{equation*} P= \begin{pmatrix} 2&0&0&1\\1&1&1&0\\0&2&1&0\\2&0&1&1 \end {pmatrix} \end{equation*}

    satisfies

    \begin{equation*} P^{-1}AP= \begin{pmatrix} 2&&&\\&2&&\\&&4&1\\&&&4 \end {pmatrix}. \end{equation*}
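Since \(P\) is invertible, the last equality is equivalent to \(AP=PJ\), which avoids computing \(P^{-1}\). A quick machine check (plain Python; `matmul` is an ad hoc helper):

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[ 2, -4, 2, 2],
     [-2,  0, 1, 3],
     [-2, -2, 3, 3],
     [-2, -6, 3, 7]]

# Columns of P are the Jordan basis (2,1,0,2), (0,1,2,0), (0,1,1,1), (1,0,0,1)
P = [[2, 0, 0, 1],
     [1, 1, 1, 0],
     [0, 2, 1, 0],
     [2, 0, 1, 1]]

# The JNF J(2,1) (+) J(2,1) (+) J(4,2)
J = [[2, 0, 0, 0],
     [0, 2, 0, 0],
     [0, 0, 4, 1],
     [0, 0, 0, 4]]

# P invertible, so P^{-1} A P = J is equivalent to A P = P J
assert matmul(A, P) == matmul(P, J)
```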

  • Example. Let \(\phi \in L(V)\) with \(\Delta _{\phi }=(x-5)^4\) and \(m_{\phi }=(x-5)^2\). What can be said about the JNF of \(\phi \)?

    Solution: We see from \(\Delta _{\phi }\) that \(5\) is the only eigenvalue of \(\phi \) and that \(\dim V=\deg \Delta _{\phi }=4\).

    From \(m_{\phi }\), we see that there must be at least one Jordan block of size \(2\). This gives two possibilities:

    \begin{gather*} J(5,2)\oplus J(5,2)\\ J(5,2)\oplus J(5,1)\oplus J(5,1). \end{gather*} In the first case, \(\dim E_{\phi }(5)=2\) and, in the second, \(\dim E_{\phi }(5)=3\).
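Both possibilities are consistent with \(\Delta _{\phi }=(x-5)^4\) and \(m_{\phi }=(x-5)^2\); only \(\dim E_{\phi }(5)\) tells them apart. A small Python check on the nilpotent parts \(N=J-5I\) (helper names are mine, not from the text):

```python
def jordan_block(lam, n):
    return [[lam if i == j else 1 if j == i + 1 else 0 for j in range(n)]
            for i in range(n)]

def direct_sum(blocks):
    n = sum(len(b) for b in blocks)
    M = [[0] * n for _ in range(n)]
    off = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, x in enumerate(row):
                M[off + i][off + j] = x
        off += len(b)
    return M

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# Nilpotent parts N = J - 5I of the two candidate JNFs
N1 = direct_sum([jordan_block(0, 2), jordan_block(0, 2)])
N2 = direct_sum([jordan_block(0, 2), jordan_block(0, 1), jordan_block(0, 1)])

for N, dim_E in [(N1, 2), (N2, 3)]:
    zero = [[0] * 4 for _ in range(4)]
    # m_phi = (x-5)^2 in both cases: N != 0 but N^2 = 0
    assert N != zero and matmul(N, N) == zero
    # dim E_phi(5) = dim ker N = number of zero columns of N here
    zero_cols = sum(all(N[i][j] == 0 for i in range(4)) for j in range(4))
    assert zero_cols == dim_E
```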

  • Example. What is the JNF of \(A\) given by

    \begin{equation*} \begin{pmatrix*}[r] -1&1&0\\ -1&1&0\\ -1&1&0 \end {pmatrix*}? \end{equation*}

    Find a Jordan basis for \(A\).

    Solution: One readily checks that \(\Delta _A=x^3\), and \(A^2=0\) whence \(A\) is nilpotent with \(m_A=x^2\). Thus \(A\) has at least one block \(J_2=J(0,2)\) of size two and, since the block sizes sum to \(3\), the JNF must be \(J_2\oplus J_1\).

    A Jordan basis is \(v_1,v_2,v_3\) with \(A\bv _2=\bv _1\) and \(A\bv _1=A\bv _3=0\) so we seek \(v_1\in \im A\cap \ker A\) and work backwards from there.

    Solve linear equations to see that

    \begin{align*} \ker A&=\set {(x,x,y)\st x,y\in \F }\\ \im A&=\set {(x,x,x)\st x\in \F } \end{align*} so take \(v_1=(1,1,1)\) and solve \(A\bv _2=v_1\) to get, for example, \(v_2=(0,1,0)\). Finally take any \(v_3\in \ker A\) that is linearly independent of \(v_1\): \((0,0,1)\) will do.

    Thus we have arrived at the Jordan basis \((1,1,1)\), \((0,1,0)\), \((0,0,1)\).
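As with the previous example, the basis can be verified via \(AP=PJ\), here with \(J=J_2\oplus J_1\). A quick Python check (plain lists, ad hoc helpers):

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[-1, 1, 0],
     [-1, 1, 0],
     [-1, 1, 0]]

assert matmul(A, A) == [[0, 0, 0]] * 3   # A is nilpotent with A^2 = 0

# Jordan basis (1,1,1), (0,1,0), (0,0,1) as columns of P; J = J_2 (+) J_1
P = [[1, 0, 0],
     [1, 1, 0],
     [1, 0, 1]]
J = [[0, 1, 0],
     [0, 0, 0],
     [0, 0, 0]]
assert matmul(A, P) == matmul(P, J)
```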

  • Remark. We see from these computations that Jordan bases of \(\phi \) are far from unique: many choices are made when finding one.