\(\newcommand{\footnotename}{footnote}\)
\(\def \LWRfootnote {1}\)
\(\newcommand {\footnote }[2][\LWRfootnote ]{{}^{\mathrm {#1}}}\)
\(\newcommand {\footnotemark }[1][\LWRfootnote ]{{}^{\mathrm {#1}}}\)
\(\let \LWRorighspace \hspace \)
\(\renewcommand {\hspace }{\ifstar \LWRorighspace \LWRorighspace }\)
\(\newcommand {\mathnormal }[1]{{#1}}\)
\(\newcommand \ensuremath [1]{#1}\)
\(\newcommand {\LWRframebox }[2][]{\fbox {#2}} \newcommand {\framebox }[1][]{\LWRframebox } \)
\(\newcommand {\setlength }[2]{}\)
\(\newcommand {\addtolength }[2]{}\)
\(\newcommand {\setcounter }[2]{}\)
\(\newcommand {\addtocounter }[2]{}\)
\(\newcommand {\arabic }[1]{}\)
\(\newcommand {\number }[1]{}\)
\(\newcommand {\noalign }[1]{\text {#1}\notag \\}\)
\(\newcommand {\cline }[1]{}\)
\(\newcommand {\directlua }[1]{\text {(directlua)}}\)
\(\newcommand {\luatexdirectlua }[1]{\text {(directlua)}}\)
\(\newcommand {\protect }{}\)
\(\def \LWRabsorbnumber #1 {}\)
\(\def \LWRabsorbquotenumber "#1 {}\)
\(\newcommand {\LWRabsorboption }[1][]{}\)
\(\newcommand {\LWRabsorbtwooptions }[1][]{\LWRabsorboption }\)
\(\def \mathchar {\ifnextchar "\LWRabsorbquotenumber \LWRabsorbnumber }\)
\(\def \mathcode #1={\mathchar }\)
\(\let \delcode \mathcode \)
\(\let \delimiter \mathchar \)
\(\let \LWRref \ref \)
\(\renewcommand {\ref }{\ifstar \LWRref \LWRref }\)
\(\newcommand {\intertext }[1]{\text {#1}\notag \\}\)
\(\let \Hat \hat \)
\(\let \Check \check \)
\(\let \Tilde \tilde \)
\(\let \Acute \acute \)
\(\let \Grave \grave \)
\(\let \Dot \dot \)
\(\let \Ddot \ddot \)
\(\let \Breve \breve \)
\(\let \Bar \bar \)
\(\let \Vec \vec \)
\( \newcommand {\multicolumn }[3]{#3}\)
\(\newcommand {\mathllap }[2][]{{#1#2}}\)
\(\newcommand {\mathrlap }[2][]{{#1#2}}\)
\(\newcommand {\mathclap }[2][]{{#1#2}}\)
\(\newcommand {\mathmbox }[1]{#1}\)
\(\newcommand {\clap }[1]{#1}\)
\(\newcommand {\LWRmathmakebox }[2][]{#2}\)
\(\newcommand {\mathmakebox }[1][]{\LWRmathmakebox }\)
\(\newcommand {\cramped }[2][]{{#1#2}}\)
\(\newcommand {\crampedllap }[2][]{{#1#2}}\)
\(\newcommand {\crampedrlap }[2][]{{#1#2}}\)
\(\newcommand {\crampedclap }[2][]{{#1#2}}\)
\(\newenvironment {crampedsubarray}[1]{}{}\)
\(\newcommand {\crampedsubstack }{}\)
\(\newcommand {\smashoperator }[2][]{#2\limits }\)
\(\newcommand {\adjustlimits }{}\)
\(\newcommand {\SwapAboveDisplaySkip }{}\)
\(\require {extpfeil}\)
\(\Newextarrow \xleftrightarrow {10,10}{0x2194}\)
\(\Newextarrow \xLeftarrow {10,10}{0x21d0}\)
\(\Newextarrow \xhookleftarrow {10,10}{0x21a9}\)
\(\Newextarrow \xmapsto {10,10}{0x21a6}\)
\(\Newextarrow \xRightarrow {10,10}{0x21d2}\)
\(\Newextarrow \xLeftrightarrow {10,10}{0x21d4}\)
\(\Newextarrow \xhookrightarrow {10,10}{0x21aa}\)
\(\Newextarrow \xrightharpoondown {10,10}{0x21c1}\)
\(\Newextarrow \xleftharpoondown {10,10}{0x21bd}\)
\(\Newextarrow \xrightleftharpoons {10,10}{0x21cc}\)
\(\Newextarrow \xrightharpoonup {10,10}{0x21c0}\)
\(\Newextarrow \xleftharpoonup {10,10}{0x21bc}\)
\(\Newextarrow \xleftrightharpoons {10,10}{0x21cb}\)
\(\newcommand {\LWRdounderbracket }[3]{\mathinner {\underset {#3}{\underline {\llcorner {#1}\lrcorner }}}}\)
\(\newcommand {\LWRunderbracket }[2][]{\LWRdounderbracket {#2}}\)
\(\newcommand {\underbracket }[1][]{\LWRunderbracket }\)
\(\newcommand {\LWRdooverbracket }[3]{\mathinner {\overset {#3}{\overline {\ulcorner {#1}\urcorner }}}}\)
\(\newcommand {\LWRoverbracket }[2][]{\LWRdooverbracket {#2}}\)
\(\newcommand {\overbracket }[1][]{\LWRoverbracket }\)
\(\newcommand {\LaTeXunderbrace }[1]{\underbrace {#1}}\)
\(\newcommand {\LaTeXoverbrace }[1]{\overbrace {#1}}\)
\(\newenvironment {matrix*}[1][]{\begin {matrix}}{\end {matrix}}\)
\(\newenvironment {pmatrix*}[1][]{\begin {pmatrix}}{\end {pmatrix}}\)
\(\newenvironment {bmatrix*}[1][]{\begin {bmatrix}}{\end {bmatrix}}\)
\(\newenvironment {Bmatrix*}[1][]{\begin {Bmatrix}}{\end {Bmatrix}}\)
\(\newenvironment {vmatrix*}[1][]{\begin {vmatrix}}{\end {vmatrix}}\)
\(\newenvironment {Vmatrix*}[1][]{\begin {Vmatrix}}{\end {Vmatrix}}\)
\(\newenvironment {smallmatrix*}[1][]{\begin {matrix}}{\end {matrix}}\)
\(\newenvironment {psmallmatrix*}[1][]{\begin {pmatrix}}{\end {pmatrix}}\)
\(\newenvironment {bsmallmatrix*}[1][]{\begin {bmatrix}}{\end {bmatrix}}\)
\(\newenvironment {Bsmallmatrix*}[1][]{\begin {Bmatrix}}{\end {Bmatrix}}\)
\(\newenvironment {vsmallmatrix*}[1][]{\begin {vmatrix}}{\end {vmatrix}}\)
\(\newenvironment {Vsmallmatrix*}[1][]{\begin {Vmatrix}}{\end {Vmatrix}}\)
\(\newenvironment {psmallmatrix}[1][]{\begin {pmatrix}}{\end {pmatrix}}\)
\(\newenvironment {bsmallmatrix}[1][]{\begin {bmatrix}}{\end {bmatrix}}\)
\(\newenvironment {Bsmallmatrix}[1][]{\begin {Bmatrix}}{\end {Bmatrix}}\)
\(\newenvironment {vsmallmatrix}[1][]{\begin {vmatrix}}{\end {vmatrix}}\)
\(\newenvironment {Vsmallmatrix}[1][]{\begin {Vmatrix}}{\end {Vmatrix}}\)
\(\newcommand {\LWRmultlined }[1][]{\begin {multline*}}\)
\(\newenvironment {multlined}[1][]{\LWRmultlined }{\end {multline*}}\)
\(\let \LWRorigshoveleft \shoveleft \)
\(\renewcommand {\shoveleft }[1][]{\LWRorigshoveleft }\)
\(\let \LWRorigshoveright \shoveright \)
\(\renewcommand {\shoveright }[1][]{\LWRorigshoveright }\)
\(\newenvironment {dcases}{\begin {cases}}{\end {cases}}\)
\(\newenvironment {dcases*}{\begin {cases}}{\end {cases}}\)
\(\newenvironment {rcases}{\begin {cases}}{\end {cases}}\)
\(\newenvironment {rcases*}{\begin {cases}}{\end {cases}}\)
\(\newenvironment {drcases}{\begin {cases}}{\end {cases}}\)
\(\newenvironment {drcases*}{\begin {cases}}{\end {cases}}\)
\(\newenvironment {cases*}{\begin {cases}}{\end {cases}}\)
\(\newcommand {\MoveEqLeft }[1][]{}\)
\(\def \LWRAboxed #1!|!{\fbox {\(#1\)}&\fbox {\(#2\)}} \newcommand {\Aboxed }[1]{\LWRAboxed #1&&!|!} \)
\( \newcommand {\LWRABLines }[1][\Updownarrow ]{#1 \notag \\}\newcommand {\ArrowBetweenLines }{\ifstar \LWRABLines \LWRABLines } \)
\(\newcommand {\shortintertext }[1]{\text {#1}\notag \\}\)
\(\newcommand {\vdotswithin }[1]{\hspace {.5em}\vdots }\)
\(\newcommand {\LWRshortvdotswithinstar }[1]{\vdots \hspace {.5em} & \\}\)
\(\newcommand {\LWRshortvdotswithinnostar }[1]{& \hspace {.5em}\vdots \\}\)
\(\newcommand {\shortvdotswithin }{\ifstar \LWRshortvdotswithinstar \LWRshortvdotswithinnostar }\)
\(\newcommand {\MTFlushSpaceAbove }{}\)
\(\newcommand {\MTFlushSpaceBelow }{\\}\)
\(\newcommand \lparen {(}\)
\(\newcommand \rparen {)}\)
\(\newcommand {\ordinarycolon }{:}\)
\(\newcommand {\vcentcolon }{\mathrel {\unicode {x2236}}}\)
\(\newcommand \dblcolon {\mathrel {\unicode {x2237}}}\)
\(\newcommand \coloneqq {\mathrel {\unicode {x2236}\!=}}\)
\(\newcommand \Coloneqq {\mathrel {\unicode {x2237}\!=}}\)
\(\newcommand \coloneq {\mathrel {\unicode {x2236}-}}\)
\(\newcommand \Coloneq {\mathrel {\unicode {x2237}-}}\)
\(\newcommand \eqqcolon {\mathrel {=\!\unicode {x2236}}}\)
\(\newcommand \Eqqcolon {\mathrel {=\!\unicode {x2237}}}\)
\(\newcommand \eqcolon {\mathrel {-\unicode {x2236}}}\)
\(\newcommand \Eqcolon {\mathrel {-\unicode {x2237}}}\)
\(\newcommand \colonapprox {\mathrel {\unicode {x2236}\!\approx }}\)
\(\newcommand \Colonapprox {\mathrel {\unicode {x2237}\!\approx }}\)
\(\newcommand \colonsim {\mathrel {\unicode {x2236}\!\sim }}\)
\(\newcommand \Colonsim {\mathrel {\unicode {x2237}\!\sim }}\)
\(\newcommand {\nuparrow }{\mathrel {\cancel {\uparrow }}}\)
\(\newcommand {\ndownarrow }{\mathrel {\cancel {\downarrow }}}\)
\(\newcommand {\bigtimes }{\mathop {\Large \times }\limits }\)
\(\newcommand {\prescript }[3]{{}^{#1}_{#2}#3}\)
\(\newenvironment {lgathered}{\begin {gathered}}{\end {gathered}}\)
\(\newenvironment {rgathered}{\begin {gathered}}{\end {gathered}}\)
\(\newcommand {\splitfrac }[2]{{}^{#1}_{#2}}\)
\(\let \splitdfrac \splitfrac \)
\(\newcommand {\bigzero }{\smash {\text {\huge 0}}}\)
\(\newcommand {\am }{\mathrm {am}}\)
\(\newcommand {\gm }{\mathrm {gm}}\)
\(\newcommand {\id }{\operatorname {id}}\)
\(\newcommand {\GL }{\operatorname {GL}}\)
\(\newcommand {\im }{\operatorname {im}}\)
\(\newcommand {\rank }{\operatorname {rank}}\)
\(\newcommand {\sol }{\operatorname {sol}}\)
\(\newcommand {\ann }{\operatorname {ann}}\)
\(\newcommand {\rO }{\operatorname {O}}\)
\(\newcommand {\rU }{\operatorname {U}}\)
\(\newcommand {\rSU }{\operatorname {SU}}\)
\(\newcommand {\ev }{\operatorname {ev}}\)
\(\newcommand {\bil }{\operatorname {Bil}}\)
\(\newcommand {\rad }{\operatorname {rad}}\)
\(\newcommand {\Span }[1]{\operatorname {span}\{#1\}}\)
\(\newcommand {\R }{\mathbb {R}}\)
\(\newcommand {\C }{\mathbb {C}}\)
\(\newcommand {\Z }{\mathbb {Z}}\)
\(\newcommand {\F }{\mathbb {F}}\)
\(\newcommand {\Q }{\mathbb {Q}}\)
\(\newcommand {\N }{\mathbb {N}}\)
\(\renewcommand {\P }{\mathbb {P}}\)
\(\newcommand {\I }{\mathrm {I}}\)
\(\newcommand {\half }{\tfrac 12}\)
\(\newcommand {\rel }{\mathrel {\mathrm {rel}}}\)
\(\renewcommand {\vec }[3]{(#1_{#2},\dots ,#1_{#3})}\)
\(\newcommand {\lst }[3]{#1_{#2},\dots ,#1_{#3}}\)
\(\newcommand {\plst }[3]{#1_{#2}+\dots +#1_{#3}}\)
\(\newcommand {\oplst }[3]{#1_{#2}\oplus \dots \oplus #1_{#3}}\)
\(\newcommand {\pplst }[3]{#1_{#2}\times \dots \times #1_{#3}}\)
\(\newcommand {\dlst }[3]{\lst {#1^{*}}{#2}{#3}}\)
\(\newcommand {\dlc }[4]{\lc #1{#2^{*}}#3#4}\)
\(\newcommand {\hmg }[3]{[#1_{#2},\dots ,#1_{#3}]}\)
\(\newcommand {\rng }[2]{#1,\dots ,#2}\)
\(\newcommand {\lc }[4]{#1_{#3}#2_{#3}+\dots +#1_{#4}#2_{#4}}\)
\(\newcommand {\plus }[2]{#1+\dots +#2}\)
\(\newcommand {\set }[1]{\{#1\}}\)
\(\newcommand {\abs }[1]{\lvert #1\rvert }\)
\(\newcommand {\ip }[1]{\langle #1\rangle }\)
\(\newcommand {\norm }[1]{\|#1\|}\)
\(\newcommand {\bx }{\mathbf {x}}\)
\(\newcommand {\be }{\mathbf {e}}\)
\(\newcommand {\bq }{\mathbf {q}}\)
\(\newcommand {\bu }{\mathbf {u}}\)
\(\newcommand {\by }{\mathbf {y}}\)
\(\newcommand {\bv }{\mathbf {v}}\)
\(\newcommand {\E }{\mathbb {E}}\)
\(\newcommand {\cI }{\mathcal {I}}\)
\(\newcommand {\cB }{\mathcal {B}}\)
\(\newcommand {\sub }{\subseteq }\)
\(\newcommand {\st }{\mathrel {|}}\)
\(\newcommand {\bw }[3]{#1\leq #2\leq #3}\)
\(\newcommand {\col }[3]{(#1_{#2})_{#2\in #3}}\)
\(\newcommand {\supp }{\mathrm {supp}}\)
\(\newcommand {\restr }[1]{_{|#1}}\)
3.5 The Cayley–Hamilton theorem
-
Theorem 3.12 (Cayley–Hamilton2 Theorem). Let \(\phi \in L(V)\) be a
linear operator on a finite-dimensional vector space over a field \(\F \).
Then \(\Delta _{\phi }(\phi )=0\).
Equivalently, for any \(A\in M_n(\F )\), \(\Delta _A(A)=0\).
Before proving this, let us see what it tells us. Let
\(\seteqnumber{0}{3.}{5}\)
\begin{equation*}
A= \begin{pmatrix} a&b\\c&d \end {pmatrix}\in M_2(\F ).
\end{equation*}
Then
\(\seteqnumber{0}{3.}{5}\)
\begin{equation*}
\Delta _A= \begin{vmatrix} a-x&b\\c&d-x \end {vmatrix}=x^2-(a+d)x+(ad-bc).
\end{equation*}
So the Cayley–Hamilton theorem is telling us that
\(\seteqnumber{0}{3.}{5}\)
\begin{equation*}
A^2-(a+d)A+(ad-bc)I_2=0,
\end{equation*}
that is,
\(\seteqnumber{0}{3.}{5}\)
\begin{equation*}
\begin{pmatrix} a^2+bc&ab+bd\\ca+dc&cb+d^2 \end {pmatrix} -(a+d) \begin{pmatrix} a&b\\c&d \end {pmatrix} + \begin{pmatrix} ad-bc&0\\0&ad-bc \end {pmatrix} = \begin{pmatrix} 0&0\\0&0 \end {pmatrix}.
\end{equation*}
This is certainly true (check it!) but is far from obvious! If you are not yet convinced, work out what the theorem says for \(A\in M_3(\F )\).
-
Proof of Theorem 3.12. We will prove the matrix version. So let \(A\in M_n(\F )\) and write
\(\seteqnumber{0}{3.}{5}\)
\begin{equation*}
\Delta _{A}=a_0+\dots +a_nx^n.
\end{equation*}
Thus, our mission is to show that
\(\seteqnumber{0}{3.}{5}\)
\begin{equation*}
a_0I_n+a_1A+\dots +a_nA^n=0.
\end{equation*}
The key is the adjugate formula from Algebra 1B3:
\(\seteqnumber{0}{3.}{5}\)
\begin{equation}
\label {eq:9} \mathrm {adj}(A-xI_{n})(A-xI_{n})=\det (A-xI_{n})I_{n}.
\end{equation}
Each entry of \(\mathrm {adj}(A-xI_n)\) is a polynomial in \(x\) of degree at most \(n-1\) so we write
\(\seteqnumber{0}{3.}{6}\)
\begin{equation*}
\mathrm {adj}(A-xI_n)=B_0+B_1x+\dots +B_{n-1}x^{n-1},
\end{equation*}
with each \(B_k\in M_n(\F )\). Substitute this into (3.6) to get
\(\seteqnumber{0}{3.}{6}\)
\begin{equation*}
(B_0+B_1x+\dots +B_{n-1}x^{n-1})(A-xI)=(a_0+\dots +a_nx^n)I_n
\end{equation*}
and compare coefficients of \(x^{k}\) to get
\(\seteqnumber{0}{3.}{6}\)
\begin{equation}
\label {eq:10} B_kA-B_{k-1}=a_kI_{n},
\end{equation}
for \(0\leq k\leq n\), where we have set \(B_{-1}=B_n=0\in M_n(\F )\).
Multiply (3.7) by \(A^k\) on the right to get
\(\seteqnumber{0}{3.}{7}\)
\begin{equation*}
B_kA^{k+1}-B_{k-1}A^k=a_kA^k
\end{equation*}
and sum:
\(\seteqnumber{0}{3.}{7}\)
\begin{equation*}
\Delta _{A}(A)=\sum _{k=0}^na_kA^k=\sum _{k=0}^n(B_kA^{k+1}-B_{k-1}A^k)=B_nA^{n+1}-B_{-1}=0
\end{equation*}
because nearly all terms in the penultimate sum cancel. □
-
Proof. By Theorem 3.12, \(\Delta _{\phi }(\phi )=0\) so \(m_{\phi }\) divides \(\Delta _{\phi }\) by Proposition 3.7. As a result, any root of
\(m_{\phi }\) is a root of \(\Delta _{\phi }\) and so an eigenvalue. Conversely, any eigenvalue is a root of \(m_{\phi }\) by Corollary 3.11. □
Let us summarise the situation when \(\F =\C \) so that any polynomial is a product of linear factors. So let \(\phi \in L(V)\) be a linear operator on a finite-dimensional complex vector space with distinct eigenvalues \(\lst \lambda 1k\). Then
\(\seteqnumber{0}{3.}{7}\)
\begin{align*}
\Delta _{\phi }&=\pm \prod _{i=1}^k(x-\lambda _i)^{r_i}\\ m_{\phi }&=\prod _{i=1}^k(x-\lambda _i)^{s_i},
\end{align*}
where \(r_i=\am (\lambda _i)\) and \(\bw 1{s_i}{r_i}\), for \(\bw 1ik\).
This gives us a new method for computing the minimum polynomial of an operator so long as we can factorise its characteristic polynomial. If \(\Delta _{\phi }=\prod _{i=1}^k(x-\lambda _i)^{r_i}\), we try \(p=\prod _{i=1}^k(x-\lambda _{i})^{a_i}\), for \(\bw
1{a_i}{r_i}\), in increasing degree, until we find one with \(p(\phi )=0\).
-
Examples.
-
(1) Find \(m_A\) where
\(\seteqnumber{0}{3.}{7}\)
\begin{equation*}
A= \begin{pmatrix} 1&1&2\\0&1&1\\0&0&2 \end {pmatrix}.
\end{equation*}
For any upper triangular matrix, the determinant is just the product of the diagonal entries so \(\Delta _A=-(x-1)^2(x-2)\). This gives just two possibilities for \(m_A\): it is either \((x-1)(x-2)\) or \((x-1)^2(x-2)\). We try the degree \(2\) candidate:
\(\seteqnumber{0}{3.}{7}\)
\begin{equation*}
(A-I_3)(A-2I_3)= \begin{pmatrix} 0&1&2\\0&0&1\\0&0&1 \end {pmatrix} \begin{pmatrix*}[r] -1&1&2\\0&-1&1\\0&0&0 \end {pmatrix*}= \begin{pmatrix*}[r]
0&-1&1\\0&0&0\\0&0&0 \end {pmatrix*}\neq 0.
\end{equation*}
Thus we must have \(m_A=(x-1)^2(x-2)\).
-
(2) Repeat the analysis when
\(\seteqnumber{0}{3.}{7}\)
\begin{equation*}
A= \begin{pmatrix} 1&0&3\\0&1&2\\0&0&2 \end {pmatrix}.
\end{equation*}
Again we have \(\Delta _A=-(x-1)^2(x-2)\) and so the same two candidates for \(m_A\). This time, however,
\(\seteqnumber{0}{3.}{7}\)
\begin{equation*}
(A-I_3)(A-2I_3)= \begin{pmatrix} 0&0&3\\0&0&2\\0&0&1 \end {pmatrix} \begin{pmatrix*}[r] -1&0&3\\0&-1&2\\0&0&0 \end {pmatrix*}=0.
\end{equation*}
Thus \(m_A=(x-1)(x-2)\).