4.3 Jordan decomposition

4.3.1 Powers of operators and Fitting’s Lemma
  • Proposition 4.6 (Increasing kernels, decreasing images). Let \(V\) be a vector space over a field \(\F \) and \(\phi \in L(V)\). Then

    • (1) \(\ker \phi ^k\leq \ker \phi ^{k+1}\), for all \(k\in \N \). That is,

      \begin{equation*} \set {0}=\ker \phi ^0\leq \ker \phi \leq \ker \phi ^2\leq \dots . \end{equation*}

      If \(\ker \phi ^k=\ker \phi ^{k+1}\) then \(\ker \phi ^k=\ker \phi ^{k+n}\), for all \(n\in \N \).

    • (2) \(\im \phi ^k\geq \im \phi ^{k+1}\), for all \(k\in \N \). That is,

      \begin{equation*} V=\im \phi ^0\geq \im \phi \geq \im \phi ^2\geq \dots . \end{equation*}

      If \(\im \phi ^k=\im \phi ^{k+1}\) then \(\im \phi ^k=\im \phi ^{k+n}\), for all \(n\in \N \).

  • Proof. We prove (1) and leave (2) as an exercise.5

    If \(v\in \ker \phi ^k\) then \(\phi ^k(v)=0\) so that \(\phi ^{k+1}(v)=\phi (\phi ^k(v))=\phi (0)=0\). Thus \(v\in \ker \phi ^{k+1}\) as required.

    Now suppose that \(\ker \phi ^k=\ker \phi ^{k+1}\) and induct to prove that \(\ker \phi ^k=\ker \phi ^{k+n}\), for \(n\in \N \). We already have the \(n=1\) case by assumption so suppose \(\ker \phi ^k=\ker \phi ^{k+n}\), for some \(n\) and let \(v\in \ker \phi ^{k+n+1}\). Then

    \begin{equation*} 0=\phi ^{k+n+1}(v)=\phi ^{k+1}(\phi ^n(v)) \end{equation*}

    so that \(\phi ^n(v)\in \ker \phi ^{k+1}=\ker \phi ^k\). Thus \(\phi ^{n+k}(v)=0\) and \(v\in \ker \phi ^{n+k}=\ker \phi ^k\) by the induction hypothesis. Induction now tells us that \(\ker \phi ^k=\ker \phi ^{k+n}\), for all \(n\in \N \).  □

5 Question 3 on exercise sheet 5.
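For a concrete finite-dimensional illustration of Proposition 4.6, one can watch the nullities of the powers of an operator grow and then stabilise. Here is a minimal Python sketch (using numpy; the single nilpotent Jordan block is my own illustrative choice, not an example from the text):

```python
import numpy as np

# Single 3x3 nilpotent Jordan block: ker A^k grows by one dimension per
# step until it fills the whole space, then stabilises (Proposition 4.6).
A = np.array([[0., 1., 0.],
              [0., 0., 1.],
              [0., 0., 0.]])

n = A.shape[0]
nullities = [n - np.linalg.matrix_rank(np.linalg.matrix_power(A, k))
             for k in range(5)]
print(nullities)  # nullity of A^0, A^1, ..., A^4 → [0, 1, 2, 3, 3]
```

Once two consecutive nullities agree (here at \(k=3\)), all later ones do too, exactly as the proposition predicts.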

  • Corollary 4.7. Let \(V\) be finite-dimensional with \(\dim V=n\) and \(\phi \in L(V)\). Then, for all \(k\in \N \),

    \begin{align*} \ker \phi ^n&=\ker \phi ^{n+k}\\ \im \phi ^n&=\im \phi ^{n+k}. \end{align*}

  • Proof. By Proposition 4.6, we need to prove \(\ker \phi ^n=\ker \phi ^{n+1}\) and \(\im \phi ^n=\im \phi ^{n+1}\).

    If \(\ker \phi ^n\neq \ker \phi ^{n+1}\) then no earlier inclusion in the chain can be an equality either, since Proposition 4.6 tells us that an equality at any stage propagates to all later powers. We therefore have subspaces

    \begin{equation*} \set {0}\lneqq \ker \phi \lneqq \dots \lneqq \ker \phi ^{n+1} \end{equation*}

    of strictly increasing dimension so that \(\dim \ker \phi ^{n+1}\geq n+1>\dim V\): a contradiction. Thus \(\ker \phi ^n=\ker \phi ^{n+1}\).

    Rank-nullity now tells us that \(\dim \im \phi ^{n}=\dim \im \phi ^{n+1}\) whence \(\im \phi ^n=\im \phi ^{n+1}\) also.  □

  • Theorem 4.8 (Fitting’s Lemma6). Let \(\phi \in L(V)\) be a linear operator on a finite-dimensional vector space over a field \(\F \). Then, with \(n=\dim V\), we have

    \begin{equation*} V=\ker \phi ^n\oplus \im \phi ^n. \end{equation*}

6 Hans Fitting, 1906–1938.
  • Proof. From Corollary 4.7, we know that \(\ker \phi ^n=\ker \phi ^{n+k}\), \(\im \phi ^n=\im \phi ^{n+k}\), for all \(k\in \N \).

    We start by proving that \(\ker \phi ^n\cap \im \phi ^n=\set 0\). For this, let \(v\in \ker \phi ^n\cap \im \phi ^n\) so that \(\phi ^n(v)=0\) and there is \(w\in V\) such that \(v=\phi ^n(w)\). Then \(0=\phi ^n(v)=\phi ^{2n}(w)\) so that \(w\in \ker \phi ^{2n}=\ker \phi ^n\). Thus \(v=\phi ^n(w)=0\) as required.

    It follows that \(V\geq \ker \phi ^n\oplus \im \phi ^n\) but, by rank-nullity, the dimensions of these spaces coincide whence \(V=\ker \phi ^n\oplus \im \phi ^n\).  □
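Fitting's Lemma can be checked by direct computation. The sketch below (using sympy; the example matrix, mixing a nilpotent \(2\times 2\) block with an invertible \(1\times 1\) block, is my own choice) exhibits bases of \(\ker \phi ^n\) and \(\im \phi ^n\) and confirms that together they span the whole space:

```python
from sympy import Matrix

# Fitting's Lemma check: V = ker(A^n) + im(A^n), and the sum is direct,
# for an assumed 3x3 example with a nilpotent block and an invertible block.
A = Matrix([[0, 1, 0],
            [0, 0, 0],
            [0, 0, 2]])
n = A.shape[0]
P = A**n

ker_basis = P.nullspace()    # basis of ker A^n
im_basis = P.columnspace()   # basis of im A^n

# The two bases together have full rank, so the sum is direct and fills V.
combined = Matrix.hstack(*(ker_basis + im_basis))
print(len(ker_basis), len(im_basis), combined.rank())  # → 2 1 3
```

The dimensions add up to \(\dim V\) and the combined list of basis vectors is linearly independent, which is exactly the content of the theorem.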

4.3.2 Generalised eigenspaces

Let us revisit the example of Section 4.1 of an operator with not enough eigenvectors: contemplate \(\phi :=\phi _A\in L(\C ^2)\) where

\begin{equation*} A= \begin{pmatrix} 0&1\\0&0 \end {pmatrix}. \end{equation*}

We know that \(\phi \) has only zero as eigenvalue and the corresponding eigenspace \(E_{\phi }(0)=\Span {(1,0)}\neq \C ^2\). However, \(A^2=0\) so that \(\ker (\phi -0\id )^2=\C ^2\).

This gives us a new idea: for \(\phi \in L(V)\) and \(\lambda \in \F \) look for non-zero \(v\in V\) such that

\begin{equation*} (\phi -\lambda \id )^k(v)=0, \end{equation*}

for some \(k\in \N \). Observe that this means that \((\phi -\lambda \id )^k\) is not injective (it has non-trivial kernel) so that \(\phi -\lambda \id \) is not injective either (and so has non-trivial kernel) and therefore \(\lambda \) is an eigenvalue of \(\phi \).

This prompts:

  • Definition. Let \(\phi \in L(V)\) be a linear operator on a vector space over a field \(\F \). A generalised eigenvector of \(\phi \) with eigenvalue \(\lambda \) is \(v\in V\) such that, for some \(k\in \N \),

    \begin{equation} \label {eq:6} (\phi -\lambda \id )^k(v)=0. \end{equation}

    The set of all such is called the generalised eigenspace of \(\phi \) with eigenvalue \(\lambda \) and denoted \(G_{\phi }(\lambda )\). Thus

    \begin{equation*} G_{\phi }(\lambda )=\set {v\in V\st (\phi -\lambda \id )^k(v)=0, \text {for some $k\in \N $}}= \bigcup _{k\in \N }\ker (\phi -\lambda \id _{V})^{k}. \end{equation*}

  • Lemma 4.9. \(E_{\phi }(\lambda )\leq G_{\phi }(\lambda )\leq V\) and \(G_{\phi }(\lambda )\) is \(\phi \)-invariant.

  • Proof. There are three things to prove:

    • (1) \(E_{\phi }(\lambda )\sub G_{\phi }(\lambda )\). This is clear: just take \(k=1\) in the definition of \(G_{\phi }(\lambda )\).

    • (2) \(G_{\phi }(\lambda )\) is a subspace. Let \(v,w\in G_{\phi }(\lambda )\) so that there are \(k_1,k_2\in \N \) with

      \begin{equation*} (\phi -\lambda \id )^{k_1}(v)=0=(\phi -\lambda \id )^{k_2}(w). \end{equation*}

      Thus, by Proposition 4.6, \(v,w\in \ker (\phi -\lambda \id )^k\), where \(k=\max \set {k_{1},k_2}\), whence \(v+\mu w\in \ker (\phi -\lambda \id )^k\sub G_{\phi }(\lambda )\), for all \(\mu \in \F \). Thus \(G_{\phi }(\lambda )\leq V\).

    • (3) \(G_{\phi }(\lambda )\) is \(\phi \)-invariant. For \(v\in G_{\phi }(\lambda )\) with \((\phi -\lambda \id )^k(v)=0\), we have

      \begin{equation*} (\phi -\lambda \id )^k(\phi (v))=\phi ((\phi -\lambda \id )^k(v))=\phi (0)=0. \end{equation*}

      Thus \(\phi (v)\in G_{\phi }(\lambda )\) also.

     □

  • Exercise.7 Let \(U_1\leq U_2\leq \dots \) be an increasing sequence of subspaces of \(V\): thus, \(U_m\leq U_n\) whenever \(m\leq n\). Use the argument of the second part of the proof of Lemma 4.9 to show that \(\bigcup _{n\in \N }U_n\leq V\).

7 Exercise sheet 6, question 2

  • Remark. When \(V\) is finite-dimensional with \(\dim V=n\), we can simplify somewhat by observing that, thanks to Corollary 4.7, we have

    \begin{equation*} \ker (\phi -\lambda \id )^k\leq \ker (\phi -\lambda \id )^n, \end{equation*}

    for all \(k\in \N \). It follows at once that

    \begin{equation*} G_{\phi }(\lambda )=\ker (\phi -\lambda \id )^n \end{equation*}

    which makes most of Lemma 4.9 almost obvious.
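The formula \(G_{\phi }(\lambda )=\ker (\phi -\lambda \id )^n\) also makes generalised eigenspaces easy to compute. A minimal sympy sketch for the running \(2\times 2\) example from the start of this subsection:

```python
from sympy import Matrix, eye

# The running example: A has 0 as its only eigenvalue, E(0) is
# 1-dimensional, but G(0) = ker((A - 0*I)^2) is all of C^2.
A = Matrix([[0, 1],
            [0, 0]])
n = A.shape[0]

E0 = (A - 0*eye(n)).nullspace()        # eigenspace E_phi(0)
G0 = ((A - 0*eye(n))**n).nullspace()   # generalised eigenspace G_phi(0)
print(len(E0), len(G0))  # → 1 2
```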

  • Lemma 4.10. Let \(\phi \in L(V)\) be a linear operator on a vector space over \(\F \) and \(\lambda _1,\lambda _2\in \F \) distinct eigenvalues of \(\phi \).

    Then \(G_{\phi }(\lambda _1)\cap G_{\phi }(\lambda _2)=\set 0\).

  • Proof. If \(v\in V\) is an eigenvector for \(\lambda _1\) and \(k\in \N \), we have from Proposition 3.10 that \((\phi -\lambda _2\id )^k(v)=(\lambda _1-\lambda _2)^kv\neq 0\), so that no eigenvector for \(\lambda _1\) lies in \(G_{\phi }(\lambda _2)\). Since any non-zero element of \(\ker (\phi -\lambda _1\id )\) is such an eigenvector, we conclude that \((\phi -\lambda _1\id )\restr {G_{\phi }(\lambda _2)}\) is injective so that \((\phi -\lambda _{1}\id )^k\restr {G_{\phi }(\lambda _2)}\) is injective also, for all \(k\in \N \), and therefore has trivial kernel. The result follows.  □

We now arrive at the promised generalisation of Proposition 4.5.

  • Theorem 4.11 (Jordan decomposition8). Let \(\phi \in L(V)\) be a linear operator on a finite-dimensional vector space over \(\C \) with distinct eigenvalues \(\lst \lambda 1k\). Then

    \begin{equation*} V=\bigoplus _{i=1}^kG_{\phi }(\lambda _i). \end{equation*}

8 Camille Jordan, 1838–1922.
  • Proof. We induct on \(n:=\dim V\).

    When \(n=1\), \(\phi =\lambda \id \), for some \(\lambda \in \C \), so that \(V=E_{\phi }(\lambda )=G_{\phi }(\lambda )\). This settles the base case.

    For the induction step, suppose that the theorem holds for spaces of dimension \(<n\) and that \(\dim V=n\). Now, by Theorem 3.9, \(\phi \) has an eigenvalue \(\lambda _1\), say (this is where we use \(\F =\C \)). Then \(G_{\phi }(\lambda _1)=\ker (\phi -\lambda _1\id )^n\) so that, by Theorem 4.8, we have

    \begin{equation*} V=G_{\phi }(\lambda _1)\oplus \im (\phi -\lambda _1\id )^n. \end{equation*}

    Set \(U:=\im (\phi -\lambda _1\id )^{n}\) and note that \(\dim U<n\) so that the theorem applies to \(\phi \restr {U}\):

    \begin{equation*} U=\bigoplus _{i=1}^{\ell }G_{\phi \restr {U}}(\mu _i), \end{equation*}

    where \(\lst \mu 1\ell \) are the distinct eigenvalues of \(\phi \restr {U}\).

    We are therefore done if we can show \(\set {\lst \mu 1\ell }=\set {\lst {\lambda }2k}\) and

    \begin{equation*} G_{\phi \restr {U}}(\lambda _j)=G_{\phi }(\lambda _j), \end{equation*}

    for \(\bw 2jk\). Now, thanks to Proposition 4.4, for \(j\neq 1\),

    \begin{equation*} G_{\phi }(\lambda _j)=(G_{\phi }(\lambda _j)\cap G_{\phi }(\lambda _1))\oplus (G_{\phi }(\lambda _j)\cap U). \end{equation*}

    However, the first summand is zero by Lemma 4.10 so that we conclude \(G_{\phi }(\lambda _j)\leq U\). In particular \(\set {\lst \lambda 2k}\sub \set {\lst \mu 1\ell }\).

    Meanwhile, \(E_{\phi }(\lambda _1)\leq G_{\phi }(\lambda _1)\) so that \(E_{\phi }(\lambda _1)\cap U=\set 0\) and \(\lambda _1\) is therefore not an eigenvalue of \(\phi \restr {U}\): that is, \(\lambda _1\notin \set {\lst \mu 1\ell }\).

    Putting this together, we conclude that \(\set {\lst \lambda 2k}=\set {\lst \mu 1\ell }\) and \(G_{\phi \restr {U}}(\lambda _j)=G_{\phi }(\lambda _j)\), for \(\bw 2jk\). Thus \(V=G_{\phi }(\lambda _1)\oplus (\bigoplus _{j=2}^kG_{\phi }(\lambda _j))\) and the theorem holds for \(V\) of dimension \(n\). Induction does the rest.  □
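The decomposition of Theorem 4.11 can be verified numerically: the bases of the generalised eigenspaces, taken together, form a basis of the whole space. A sympy sketch (the upper triangular matrix with eigenvalues 3 and 5 is my own illustrative choice):

```python
from sympy import Matrix, eye

# Jordan decomposition check: C^3 = G(3) ⊕ G(5) for an assumed upper
# triangular example with distinct eigenvalues 3 and 5.
A = Matrix([[3, 1, 0],
            [0, 3, 0],
            [0, 0, 5]])
n = A.shape[0]

G3 = ((A - 3*eye(n))**n).nullspace()  # basis of G_phi(3)
G5 = ((A - 5*eye(n))**n).nullspace()  # basis of G_phi(5)

# The bases of the generalised eigenspaces together span C^3.
combined = Matrix.hstack(*(G3 + G5))
print(len(G3), len(G5), combined.rank())  # → 2 1 3
```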

Let us summarise the situation. With \(V_i=G_{\phi }(\lambda _i)\) and \(\phi _i=\phi \restr {V_i}\), we have \(V=\oplst {V}1k\) and

\begin{equation*} \phi _i=\lambda _i\id _{V_i}+N_i, \end{equation*}

where we have set \(N_i=\phi _i-\lambda _i\id _{V_i}\in L(V_i)\). The key point is that \(N_i^n=0\) which prompts some terminology.

  • Definition. A linear operator \(\phi \) on a vector space \(V\) is nilpotent if \(\phi ^k=0\), for some \(k\in \N \), or, equivalently, if \(\ker \phi ^k=V\).

  • Remark. If \(V\) is finite-dimensional, we may take \(k=\dim V\) by Corollary 4.7.

Our remaining task is to understand nilpotent operators. As a useful first pass at this, we have:

  • Proposition 4.12. Let \(\phi \in L(V)\) be a linear operator on a finite-dimensional vector space \(V\) over \(\F \).

    Then \(\phi \) is nilpotent if and only if there is a basis with respect to which \(\phi \) has a strictly upper triangular matrix \(A\) (thus \(A_{ij}=0\) whenever \(i\geq j\)):

    \begin{equation*} A= \begin{pmatrix} 0&&*\\&\ddots &\\0&&0 \end {pmatrix}. \end{equation*}

  • Proof. Begin by observing that \(\phi \) has strictly upper triangular matrix with respect to \(\cB :\lst {v}1n\) if and only if \(\phi (v_{1})=0\) and \(\phi (v_j)\in \Span {\lst {v}1{j-1}}\), for \(j>1\).

    Thus, if \(\phi \) has strictly upper triangular matrix \(A\in M_n(\F )\) with respect \(\lst {v}1n\), we can iterate to see that \(\phi ^k\) vanishes on \(\lst {v}1k\) and \(\phi ^k(v_j)\in \Span {\lst {v}1{j-k}}\), for \(j>k\). In particular \(\phi ^{n}=0\). Alternatively, \(A^k\) has zeros on the first \(k-1\) super-diagonals:

    \begin{equation*} A^k= \begin{pmatrix} 0&\dots &0&&* \\ &\ddots &&\ddots &\\ &&\ddots &&0\\ &&&\ddots &\vdots \\ 0&&&&0 \end {pmatrix}. \end{equation*}

    In particular, \(A^n=0\) so that \(\phi ^n=0\) also.

    For the converse, if \(\phi \) is nilpotent, we consider the subspaces

    \begin{equation*} \set 0\leq \ker \phi \leq \ker \phi ^2\leq \dots \leq \ker \phi ^{\dim V}=V. \end{equation*}

    Note that, if \(v\in \ker \phi ^k\), \(0=\phi ^k(v)=\phi ^{k-1}(\phi (v))\) so that \(\phi (v)\in \ker \phi ^{k-1}\), for \(k\geq 1\).

    Now take a basis \(\lst {v}1\ell \) of \(\ker \phi \), extend it successively to one of \(\ker \phi ^k\), for each \(k\), until we arrive at a basis \(\lst {v}1n\) of \(V\) with the property that each \(\phi (v_{j})\in \Span {\lst {v}1{j-1}}\). This means precisely that the matrix of \(\phi \) with respect to \(\lst {v}1n\) is strictly upper triangular.  □
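The forward direction of Proposition 4.12 is easy to see in action: powers of a strictly upper triangular matrix push the non-zero entries further above the diagonal until nothing is left. A small numpy check (the \(4\times 4\) matrix is an arbitrary illustrative choice):

```python
import numpy as np

# Proposition 4.12, forward direction: a strictly upper triangular n x n
# matrix satisfies A^n = 0. We check this on an assumed 4x4 example.
A = np.triu(np.arange(1., 17.).reshape(4, 4), k=1)  # zeros on and below diagonal
powers = [np.linalg.matrix_power(A, k) for k in range(1, 5)]
print([bool(np.any(P)) for P in powers])  # → [True, True, True, False]
```

Each power pushes the first non-zero super-diagonal one step further out, and the fourth power vanishes entirely.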

Apply Proposition 4.12 to each \(N_i\) to get a basis of \(V_i\) for which \(\phi _i\) has a matrix of the form

\begin{equation*} \begin{pmatrix} \lambda _i&&*\\ &\ddots &\\ 0&&\lambda _i \end {pmatrix} \end{equation*}

so that, in particular, \(\Delta _{\phi _i}=(\lambda _i-x)^{\dim V_i}\). In view of Proposition 4.4(4), we conclude that

\begin{equation*} \Delta _\phi =\prod _{i=1}^k\Delta _{\phi _i}=\pm \prod _{i=1}^k(x-\lambda _i)^{\dim V_i}. \end{equation*}

Otherwise said, \(\am (\lambda _i)=\dim V_i\) and we have proved:

  • Proposition 4.13. Let \(\lambda \in \C \) be an eigenvalue of a linear operator \(\phi \) on a complex finite-dimensional vector space. Then

    \begin{equation*} \am (\lambda )=\dim G_{\phi }(\lambda ). \end{equation*}

  • Remark. Since \(E_{\phi }(\lambda )\leq G_{\phi }(\lambda )\), this explains the Algebra 1B result9 that \(\gm (\lambda )\leq \am (\lambda )\).

9 Proposition 3.4.6
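Proposition 4.13 is also easy to confirm by computation: sympy's eigenvalue routine reports algebraic multiplicities, which we can compare with the dimensions of the generalised eigenspaces. A sketch (the \(4\times 4\) example matrix is my own choice):

```python
from sympy import Matrix, eye

# Proposition 4.13 check: the algebraic multiplicity of each eigenvalue
# equals the dimension of its generalised eigenspace (assumed 4x4 example).
A = Matrix([[2, 1, 0, 0],
            [0, 2, 0, 0],
            [0, 0, 2, 0],
            [0, 0, 0, 7]])
n = A.shape[0]

# eigenvals() returns {eigenvalue: algebraic multiplicity}, here {2: 3, 7: 1}.
for lam, am in A.eigenvals().items():
    G = ((A - lam*eye(n))**n).nullspace()  # G_phi(lam) = ker (A - lam I)^n
    print(lam, am, len(G))                 # am(lam) == dim G_phi(lam)
```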

Finally, we can say something useful about the minimal polynomial of \(\phi \): it is the product of the minimal polynomials of the \(\phi _i\):

  • Proposition 4.14. Let \(\phi \in L(V)\) be a linear operator on a finite-dimensional vector space over \(\C \) with distinct eigenvalues \(\lst \lambda 1k\). Set \(\phi _i=\phi \restr {G_{\phi }(\lambda _i)}\). Then

    • (1) Each \(m_{\phi _i}=(x-\lambda _i)^{s_i}\), for some \(s_i\leq \dim G_{\phi }(\lambda _i)\).

    • (2) \(m_{\phi }=\prod _{i=1}^km_{\phi _i}=\prod _{i=1}^k(x-\lambda _i)^{s_i}\).

  • Proof. We know from Corollary 3.13(1) that \(m_{\phi _i}\) divides \(\Delta _{\phi _i}=(\lambda _i-x)^{\dim G_{\phi }(\lambda _i)}\) so (1) is immediate.

    For (2), let \(p=\prod _{i=1}^k(x-\lambda _i)^{s_i}\). Then \(p(\phi )=\bigoplus _{i=1}^kp(\phi _i)=0\) since each \(p(\phi _i)=0\). Thus \(m_{\phi }\) divides \(p\) and we conclude that

    \begin{equation*} m_{\phi }=\prod _{i=1}^k(x-\lambda _i)^{t_i}, \end{equation*}

    with each \(\bw 1{t_i}{s_i}\).

    On the other hand, each \(m_{\phi _i}=(x-\lambda _i)^{s_i}\) divides \(m_{\phi }\) since \(m_{\phi }(\phi _i)=m_{\phi }(\phi )\restr {V_i}=0\). Thus \(s_i\leq t_i\), for \(\bw 1ik\), and \(m_{\phi }=p\).  □

As a corollary, we get an efficient (in the sense of low powers of \((\phi -\lambda _i\id _V)\)) expression for \(G_{\phi }(\lambda _i)\):

  • Corollary 4.15. Let \(\phi \in L(V)\) be a linear operator with minimal polynomial \(\prod _{i=1}^k(x-\lambda _i)^{s_i}\). Then

    \begin{equation*} G_{\phi }(\lambda _i)=\ker (\phi -\lambda _i\id _V)^{s_i}. \end{equation*}

  • Proof. By definition, \(\ker (\phi -\lambda _i\id _V)^{s_i}\leq G_{\phi }(\lambda _i)\). On the other hand, with \(V_i=G_{\phi }(\lambda _i)\) and \(\phi _i=\phi \restr {V_i}\), we know that \(0=m_{\phi _i}(\phi _i)=(\phi _i-\lambda _i\id _{V_i})^{s_{i}}\). Otherwise said, \((\phi -\lambda _i\id _V)^{s_i}\restr {V_i}=0\) so that \(G_{\phi }(\lambda _i)\leq \ker (\phi -\lambda _{i}\id _V)^{s_i}\).  □

  • Example. Let \(\phi =\phi _A\in L(\C ^3)\) where

    \begin{equation*} A= \begin{pmatrix} 1&1&1\\0&1&1\\0&0&2 \end {pmatrix}. \end{equation*}

    Find \(m_{\phi }\), the eigenspaces and generalised eigenspaces of \(\phi \).

    Solution: \(A\) being upper triangular, we see at once that \(\Delta _\phi =\Delta _A=(1-x)^2(2-x)\) so that \(m_A\) is either \((x-1)(x-2)\) or \((x-1)^2(x-2)\) by Corollary 3.13. We check the first possibility:

    \begin{equation*} (A-I_3)(A-2I_3)= \begin{pmatrix} 0&1&1\\0&0&1\\0&0&1 \end {pmatrix} \begin{pmatrix*}[r] -1&1&1\\0&-1&1\\0&0&0 \end {pmatrix*} = \begin{pmatrix*}[r] 0&-1&1\\0&0&0\\0&0&0 \end {pmatrix*}\neq 0. \end{equation*}

    We conclude that \(m_{\phi }=(x-1)^{2}(x-2)\) and immediately deduce from Corollary 4.15 that \(G_{\phi }(1)=\ker (\phi -\id )^{2}\) while \(G_{\phi }(2)=\ker (\phi -2\id )=E_{\phi }(2)\).

    It remains to compute these:

    \begin{align*} E_{\phi }(1)&=\ker (\phi -\id )=\ker \begin{pmatrix} 0&1&1\\0&0&1\\0&0&1 \end {pmatrix}=\Span {(1,0,0)}\\ G_{\phi }(1)&=\ker (\phi -\id )^{2}=\ker \begin{pmatrix} 0&1&1\\0&0&1\\0&0&1 \end {pmatrix}^{2}= \ker \begin{pmatrix} 0&0&2\\0&0&1\\0&0&1 \end {pmatrix}=\Span {(1,0,0),(0,1,0)}\\ E_{\phi }(2)&=G_{\phi }(2)=\ker (\phi -2\id )=\ker \begin{pmatrix*}[r] -1&1&1\\0&-1&1\\0&0&0 \end {pmatrix*}=\Span {(2,1,1)}. \end{align*}
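The computations in this example are readily checked by machine. A sympy sketch verifying the minimal polynomial test and the dimensions of the (generalised) eigenspaces found above:

```python
from sympy import Matrix, eye

# Verifying the worked example: m_phi = (x-1)^2 (x-2), and the
# (generalised) eigenspaces of A.
A = Matrix([[1, 1, 1],
            [0, 1, 1],
            [0, 0, 2]])
I = eye(3)

# (A - I)(A - 2I) != 0, so m_phi is not (x-1)(x-2) ...
print(((A - I) * (A - 2*I)).is_zero_matrix)     # → False
# ... but (A - I)^2 (A - 2I) = 0, confirming m_phi = (x-1)^2 (x-2).
print(((A - I)**2 * (A - 2*I)).is_zero_matrix)  # → True

E1 = (A - I).nullspace()       # E_phi(1), dimension 1
G1 = ((A - I)**2).nullspace()  # G_phi(1), dimension 2
E2 = (A - 2*I).nullspace()     # E_phi(2) = G_phi(2), dimension 1
print(len(E1), len(G1), len(E2))  # → 1 2 1
```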