4.4 Jordan normal form
We complete our analyis of linear operators by improving on Proposition 4.12.
First we introduce the key ingredient.
4.4.1 Jordan blocks
-
Definition. The Jordan block of size \(n\in \Z _+\) and eigenvalue \(\lambda \in \F \) is \(J(\lambda ,n)\in M_n(\F )\) with \(\lambda \)’s on the diagonal, \(1\)’s on the super-diagonal and zeros elsewhere. Thus
\(\seteqnumber{0}{4.}{3}\)\begin{equation*} J(\lambda ,n)= \begin{pmatrix} \lambda &1&0&\dots &0\\ &\ddots &\ddots &\ddots &\vdots \\ &&\ddots &\ddots &0\\ &&&\ddots &1\\ 0&&&&\lambda \end {pmatrix} \end{equation*}
We have:
-
-
(1) \(\ker J_n^k=\Span {\lst {e}1k}\). In particular, \(J_n\) is nilpotent: \(J_n^n=0\).
-
(2) \(\im J_n^k=\Span {\lst {e}1{n-k}}\).
-
(3) \(\lambda \) is the only eigenvalue of \(J(\lambda ,n)\) and \(E_{J(\lambda ,n)}(\lambda )=\Span {e_1}\), \(G_{J(\lambda ,n)}(\lambda )=\F ^{n}\).
-
(4) \(m_{J(\lambda ,n)}=\pm \Delta _{J(\lambda ,n)}=(x-\lambda )^n\).
-
We are going to prove that any nilpotent operator \(\phi \in L(V)\) on a finite-dimensional vector space has a basis for which the matrix of \(\phi \) is a direct sum of Jordan blocks: \(J_{n_1}\oplus \dots \oplus J_{n_k}\) with \(\plst {n}1k=\dim V\).
We start by spelling out what it means for an operator to have a Jordan block as matrix:
-
Lemma 4.16. Let \(\lst {v}1n\) be a basis for a vector space \(V\) and \(\phi \in L(V)\).
Then the following are equivalent:
-
(1) \(\phi \) has matrix \(J_n\) with respect to \(\lst {v}1n\).
-
(2) \(\phi (v_1)=0\) and \(\phi (v_i)=v_{i-1}\), for \(\bw 2in\).
-
(3) \(v_i=\phi ^{n-i}(v_n)\), \(\bw 0i{n-1}\) and \(\phi ^{n}(v_n)=0\).
-
-
Proof. The equivalence of (1) and (2) comes straight from the definitions since \((J_n)_{i-1,i}=1\) and all other entries in the \(i\)-th column vanish.
The equivalence of (2) and (3) is an easy exercise10. □
We will work with characterisation (3) and prove:
-
Theorem 4.17. Let \(\phi \in L(V)\) be a nilpotent operator on a finite-dimensional vector space over \(\F \). Then there are \(\lst {v}1k\in V\) and \(\lst {n}1k\in \Z _+\) such that
\(\seteqnumber{0}{4.}{3}\)\begin{equation*} \phi ^{n_1-1}(v_1),\dots ,\phi (v_1),v_1,\dots ,\phi ^{n_k-1}(v_k),\dots ,\phi (v_k),v_k \end{equation*}
is a basis of \(V\) and \(\phi ^{n_i}(v_i)=0\), for \(\bw 1ik\).
Using this basis and Lemma 4.16 we immediately conclude:
-
Proof of Theorem 4.17. Once again we induct on \(\dim V\).
If \(\dim V=1\), the only nilpotent operator is the zero operator and any basis \(v_1\) will do.
For the induction step, suppose that the theorem is true when \(\dim V<n\) and suppose that \(\dim V=n\). We prove the theorem for \(V\) in three steps.
Step 1: apply the induction hypothesis to \(\im \phi \). We let \(r=\rank \phi \) and \(k=n-r=\dim \ker \phi \). Since \(\phi \) is nilpotent, \(k>0\) so that \(r=\dim \im \phi <n\). We therefore apply the induction hypothesis to \(\phi \restr {\im \phi }\) to get \(\lst {w}1\ell \in \im \phi \), \(\lst {m}1\ell \in \Z _+\) such that
\(\seteqnumber{0}{4.}{3}\)\begin{equation*} \lst {u}1r:=\phi ^{m_1-1}(w_1),\dots ,\phi (w_1),w_1,\dots , \phi ^{m_{\ell }-1}(w_{\ell }),\dots ,\phi (w_{\ell }),w_{\ell } \end{equation*}
is a basis of \(\im \phi \) and \(\phi ^{m_i}(w_i)=0\), for \(\bw 1i\ell \). Observe that each \(\phi (u_i)\) is either \(u_{i-1}\) or zero.
Step 2: Find the first \(\ell \) of the \(v_i\). Each \(w_i\in \im \phi \) so choose \(\lst {v}1\ell \) such that \(\phi (v_i)=w_i\), for \(\bw 1i\ell \).
We claim that \(\lst {u}1r,\lst {v}1\ell \) are linearly independent. For this, suppose that we have a linear relation
\(\seteqnumber{0}{4.}{3}\)\begin{equation} \label {eq:16} \sum _{j=1}^r\lambda _ju_j+\sum _{i=1}^{\ell }\mu _iv_i=0 \end{equation}
and take \(\phi \) of this to get
\(\seteqnumber{0}{4.}{4}\)\begin{equation*} \sum _{j=1}^r\lambda _j\phi (u_j)+\sum _{i=1}^{\ell }\mu _i\phi (v_i)=0 \end{equation*}
which reads
\(\seteqnumber{0}{4.}{4}\)\begin{equation} \label {eq:17} \sum _{j\st \phi (u_j)\neq 0}\lambda _ju_{j-1}+\sum _{i=1}^{\ell }\mu _iw_i=0. \end{equation}
Since these \(u_{j-1}\) and \(w_i\) are distinct, (4.5) is still a linear relation on the linearly independent \(u_j\) and so, in particular, each \(\mu _i=0\). Now (4.4) becomes a linear relation on the \(u_j\) and so all \(\lambda _j=0\) also. This proves the claim.
Step 3: extend \(\lst {u}1r,\lst {v}1\ell \) to a basis of \(V\) by adding elements of \(\ker \phi \). Define \(U\leq V\) by
\(\seteqnumber{0}{4.}{5}\)\begin{equation*} U=\Span {\lst {u}1r,\lst {v}1\ell }\geq \im \phi \end{equation*}
and note that \(\im \phi =\phi (U)\) since any \(u_i=\phi ^m(v_j)\), for some \(\bw 1j\ell \) and \(\bw 1m{m_j}\). We extend to get a basis
\(\seteqnumber{0}{4.}{5}\)\begin{equation*} \lst {u}1r,\lst {v}1\ell ,\lst {x}{\ell +1}k \end{equation*}
of \(V\). Now, for \(\bw {\ell +1}{j}k\), there is some \(y_j\in U\) such that \(\phi (y_j)=\phi (x_j)\) whence \(v_j:=x_j-y_j\in \ker \phi \).
By construction
\(\seteqnumber{0}{4.}{5}\)\begin{equation*} \Span {\lst {u}1r,\lst {v}1k}=\Span { \lst {u}1r,\lst {v}1\ell ,\lst {x}{\ell +1}k}=V \end{equation*}
so that \(\lst {u}1r,\lst {v}1k\) is a basis of \(V\). Moreover, setting
\(\seteqnumber{0}{4.}{5}\)\begin{equation*} n_i= \begin{cases} m_i+1&\bw 1i\ell \\ 1&\bw {\ell +1}ik \end {cases} \end{equation*}
we have \(\phi ^{n_i}(v_i)=0\), for all \(\bw 1ik\) and our basis, reordered to slot the first \(\ell \) \(v_i\) into the right places, is
\(\seteqnumber{0}{4.}{5}\)\begin{equation*} \phi ^{n_1-1}(v_1),\dots ,\phi (v_1),v_1,\dots , \phi ^{n_\ell -1}(v_\ell ),\dots ,\phi (v_\ell ),v_\ell ,v_{\ell +1},\dots ,v_k, \end{equation*}
which is of the required form. □
The only question left is how unique are the \(n_i\)? We already know from the proof of Theorem 4.17 that there are \(k=\dim \ker \phi \) of them11 but we can do better. For this, set \(A=J_{n_1}\oplus \dots \oplus J_{n_k}\) so that, for \(s\in \N \), \(A^s=J_{n_1}^s\oplus \dots \oplus J_{n_k}^s\). Now
\(\seteqnumber{0}{4.}{5}\)\begin{equation*} \dim \ker J_{n_i}^s=s, \end{equation*}
for \(s\leq n_i\) so that
\(\seteqnumber{0}{4.}{5}\)\begin{equation} \label {eq:18} \dim \ker J_{n_i}^s-\dim \ker J_{n_i}^{s-1}= \begin{cases} 1&\bw 1s{n_i}\\0&s>n_i. \end {cases} \end{equation}
Now \(\ker A^s=\bigoplus _{i=1}^k\ker J_{n_i}^s\) so summing (4.6) over \(i\) yields:
\(\seteqnumber{0}{4.}{6}\)\begin{equation*} \#\set {i\st n_i\geq s}=\dim \ker A^s-\dim \ker A^{s-1}. \end{equation*}
This proves:
11 Alternatively, if you have not read the proof: if there are \(k\) Jordan blocks \(J_{n_{i}}\), we have \(\dim \ker \phi =\sum _{i=1}^k\dim \ker J(n_i)=k\) since \(\dim \ker J(n_i)=1\).
-
Proposition 4.19. Let \(\phi \in L(V)\) be nilpotent with matrix \(J_{n_1}\oplus \dots \oplus J_{n_k}\) for some basis of \(V\). Then \(\lst {n}1k\) are unique up to order. Indeed,
\(\seteqnumber{0}{4.}{6}\)\begin{equation*} \#\set {i\st n_i\geq s}=\dim \ker \phi ^s-\dim \ker \phi ^{s-1}, \end{equation*}
for each \(s\geq 1\).
-
Exercise.12 In the situation of Proposition 4.19, show that
\(\seteqnumber{0}{4.}{6}\)\begin{equation*} \#\set {i\st n_i= s}=2\dim \ker \phi ^s-\dim \ker \phi ^{s-1}-\dim \ker \phi ^{s+1}. \end{equation*}
In another direction:
-
Proposition 4.20. In the situation of Proposition 4.19, we have
\(\seteqnumber{0}{4.}{6}\)\begin{equation*} m_{\phi }=x^s, \end{equation*}
where \(s=\max \set {\lst {n}1k}\).
4.4.2 Jordan normal form
We put §4.4.1 together with Theorem 4.11 to prove the ultimate structure theorem for linear operators on a finite-dimensional complex vector space.
-
Theorem 4.21. Let \(\phi \in L(V)\) be a linear operator on a finite-dimensional vector space \(V\) over \(\C \). Then there is a basis of \(V\) for which \(\phi \) has as matrix a direct sum of Jordan blocks which are unique up to order.
Such a basis is called a Jordan basis and the direct sum of Jordan blocks is called the Jordan normal form (JNF) of \(\phi \).
-
Proof. Let \(\lst \lambda 1k\) be the distinct eigenvalues of \(\phi \). By Theorem 4.11, \(V=\bigoplus V_i\), for \(V_i=G_{\phi }(\lambda _i)\) and then \(\phi _i:=\phi \restr {V_i}\) can be written
\(\seteqnumber{0}{4.}{6}\)\begin{equation*} \phi _i=\lambda _i\id _{V_i}+N_i, \end{equation*}
with \(N_i\) nilpotent. Apply Corollary 4.18 to get a basis of \(V_i\) for which \(N_i\) has matrix \(J_{n_1}\oplus \dots \oplus J_{n_{\ell }}\). By Proposition 4.19, the \(\lst {n}1\ell \) are unique up to order. Now \(\phi _i\) has matrix
\(\seteqnumber{0}{4.}{6}\)\begin{equation*} J(\lambda _i,n_1)\oplus \dots \oplus J(\lambda _i,n_{\ell }). \end{equation*}
We then concatenate these bases to get the required Jordan basis of \(V\). □
From this, Proposition 4.14 and Proposition 4.20, we get a complete account of the minimum polynomial:
-
Corollary 4.22. Let \(\phi \in L(V)\) be a linear operator on a finite-dimensional vector space \(V\) over \(\C \) with distinct eigenvalues \(\lst \lambda 1k\). Then
\(\seteqnumber{0}{4.}{6}\)\begin{equation*} m_{\phi }=\prod _{i=1}^k(x-\lambda _i)^{s_i} \end{equation*}
where \(s_i\) is the size of the largest Jordan block of \(\phi \) with eigenvalue \(\lambda _{i}\).
We can apply all this to matrices and solve the similarity problem.
-
Corollary 4.23. Any \(A\in M_n(\C )\) is similar to a direct sum of Jordan blocks, that is, there is an invertible matrix \(P\in M_n(\C )\) such that
\(\seteqnumber{0}{4.}{6}\)\begin{equation*} P^{-1}AP=\oplst {A}1r, \end{equation*}
with each \(A_i\) a Jordan block.
\(\oplst {A}1r\) is called the Jordan normal form (JNF) of \(A\) and is unique up to the order of the \(A_i\).
-
Proof. Apply Theorem 4.21 to \(\phi _A:\C ^n\to \C ^n\) and let \(P\) be the change of basis matrix from the standard basis to the Jordan basis of \(\phi _A\) (so that the columns of \(P\) are the Jordan basis). □
This gives:
4.4.3 Examples
-
Example. Let \(\phi =\phi _A:\C ^4\to \C ^4\) where
\(\seteqnumber{0}{4.}{6}\)\begin{equation*} A= \begin{pmatrix*}[r] 2&-4&2&2\\-2&0&1&3\\-2&-2&3&3\\-2&-6&3&7 \end {pmatrix*}. \end{equation*}
let us find the Jordan normal form of \(A\) and a Jordan basis of \(\phi \).
Step 1: compute \(\Delta _A\). This turns out to be \((2-x)^2(4-x)^2\) so that we have eigenvalues \(2,4\) and Proposition 4.13 tells us that
\(\seteqnumber{0}{4.}{6}\)\begin{equation*} \dim G_\phi (2)=\dim G_\phi (4)=2. \end{equation*}
Step 2: compute \(m_A\) by trial and error. It must be \((x-2)^{s_1}(x-4)^{s_2}\) with \(\bw 1{s_i}2\) so first try \((x-2)(x-4)\):
\(\seteqnumber{0}{4.}{6}\)\begin{equation*} (A-2I)(A-4I)= \begin{pmatrix*}[r] 0&0&0&0\\0&-4&2&2\\0&-4&2&2\\0&-4&2&2 \end {pmatrix*}\neq 0. \end{equation*}
Next try \((x-2)(x-4)^{2}\):
\(\seteqnumber{0}{4.}{6}\)\begin{equation*} (A-2I)(A-4I)^{2}=0\in M_4(\C ) \end{equation*}
so that \(m_A=(x-2)(x-4)^2\).
Step 3: deduce the shape of the Jordan normal form using Corollary 4.22:
Since \(s_1=1\), all Jordan blocks with eigenvalue \(2\) have size \(1\), \(E_{\phi }(2)=G_{\phi }(2)\).
Since \(s_2=2\), there is at least one Jordan block of size \(2\) with eigenvalue \(4\) and since \(\dim G_{\phi }(4)=2\) there is no room for any other block.
We conclude that \(A\) has JNF \(J(2,1)\oplus J(2,1)\oplus J(4,2)\):
\(\seteqnumber{0}{4.}{6}\)\begin{equation*} \begin{pmatrix} 2&&&\\&2&&\\&&4&1\\&&&4 \end {pmatrix}. \end{equation*}
We find a Jordan basis by finding one for each generalised eigenspace in turn. Any basis of \(E_{\phi }(2)\) will do for the \(2\)-generalised eigenspace so solve \((A-2I)\bv =0\) to find one. I found \((2,1,0,2)\), \((0,1,2,0)\).
For the \(4\)-generalised eigenspace, we need a basis of the form \((\phi -4\id )v,v\) with \((\phi -4\id )^{2}(v)=0\). For this we work backwards:
-
(a) Find an eigenvector with eigenvalue \(4\) by solving \(A\mathbf {w}=4\mathbf {w}\). One solution is \(w=(0,1,1,1)\).
-
(b) Find \(v\) by solving \((A-4I)\bv =\mathbf {w}\). One solution is \((1,0,0,1)\).
We therefore have a Jordan basis \((2,1,0,2)\), \((0,1,2,0)\), \((0,1,1,1)\), \((1,0,0,1)\).
It follows that
\(\seteqnumber{0}{4.}{6}\)\begin{equation*} P= \begin{pmatrix} 2&0&0&1\\1&1&1&0\\0&2&1&0\\2&0&1&1 \end {pmatrix} \end{equation*}
satisfies
\(\seteqnumber{0}{4.}{6}\)\begin{equation*} P^{-1}AP= \begin{pmatrix} 2&&&\\&2&&\\&&4&1\\&&&4 \end {pmatrix}. \end{equation*}
-
-
Example. Let \(\phi \in L(V)\) with \(\Delta _{\phi }=(x-5)^4\) and \(m_{\phi }=(x-5)^2\). What can be said about the JNF of \(\phi \)?
Solution: We see from \(\Delta _{\phi }\) that \(5\) is the only eigenvalue of \(\phi \) and that \(\dim V=\deg \Delta _{\phi }=4\).
From \(m_{\phi }\), we see that there must be at least one Jordan block of size \(2\). This gives two possibilities:
\(\seteqnumber{0}{4.}{6}\)\begin{gather*} J(5,2)\oplus J(5,2)\\ J(5,2)\oplus J(5,1)\oplus J(5,1). \end{gather*} In the first case, \(\dim E_{\phi }(5)=2\) and, in the second, \(\dim E_{\phi }(5)=3\).
-
Example. What is the JNF of \(A\) given by
\(\seteqnumber{0}{4.}{6}\)\begin{equation*} \begin{pmatrix*}[r] -1&1&0\\ -1&1&0\\ -1&1&0 \end {pmatrix*}? \end{equation*}
Find a Jordan basis for \(A\).
Solution: One readily checks that \(\Delta _A=x^3\), and \(A^2=0\) whence \(A\) is nilpotent with \(m_A=x^2\). Thus \(A\) has at least one \(J_2=J(0,2)\) block of size two so the JNF must be \(J_2\oplus J_1\).
A Jordan basis is \(v_1,v_2,v_3\) with \(A\bv _2=\bv _1\) and \(A\bv _1=A\bv _3=0\) so we seek \(v_1\in \im A\cap \ker A\) and work backwards from there.
Solve linear equations to see that
\(\seteqnumber{0}{4.}{6}\)\begin{align*} \ker A&=\set {(x,x,y)\st x,y\in \F }\\ \im A&=\set {(x,x,x)\st x\in \F } \end{align*} so take \(v_1=(1,1,1)\) and solve \(A\bv _2=v_1\) to get, for example, \(v_2=(0,1,0)\). Finally take any \(v_3\in \ker A\) that is linearly independent of \(v_1\): \((0,0,1)\) will do.
Thus we have arrived at the Jordan basis \((1,1,1)\), \((0,1,0)\), \((0,0,1)\).