\(\newcommand{\footnotename}{footnote}\)
\(\def \LWRfootnote {1}\)
\(\newcommand {\footnote }[2][\LWRfootnote ]{{}^{\mathrm {#1}}}\)
\(\newcommand {\footnotemark }[1][\LWRfootnote ]{{}^{\mathrm {#1}}}\)
\(\let \LWRorighspace \hspace \)
\(\renewcommand {\hspace }{\ifstar \LWRorighspace \LWRorighspace }\)
\(\newcommand {\mathnormal }[1]{{#1}}\)
\(\newcommand \ensuremath [1]{#1}\)
\(\newcommand {\LWRframebox }[2][]{\fbox {#2}} \newcommand {\framebox }[1][]{\LWRframebox } \)
\(\newcommand {\setlength }[2]{}\)
\(\newcommand {\addtolength }[2]{}\)
\(\newcommand {\setcounter }[2]{}\)
\(\newcommand {\addtocounter }[2]{}\)
\(\newcommand {\arabic }[1]{}\)
\(\newcommand {\number }[1]{}\)
\(\newcommand {\noalign }[1]{\text {#1}\notag \\}\)
\(\newcommand {\cline }[1]{}\)
\(\newcommand {\directlua }[1]{\text {(directlua)}}\)
\(\newcommand {\luatexdirectlua }[1]{\text {(directlua)}}\)
\(\newcommand {\protect }{}\)
\(\def \LWRabsorbnumber #1 {}\)
\(\def \LWRabsorbquotenumber "#1 {}\)
\(\newcommand {\LWRabsorboption }[1][]{}\)
\(\newcommand {\LWRabsorbtwooptions }[1][]{\LWRabsorboption }\)
\(\def \mathchar {\ifnextchar "\LWRabsorbquotenumber \LWRabsorbnumber }\)
\(\def \mathcode #1={\mathchar }\)
\(\let \delcode \mathcode \)
\(\let \delimiter \mathchar \)
\(\def \oe {\unicode {x0153}}\)
\(\def \OE {\unicode {x0152}}\)
\(\def \ae {\unicode {x00E6}}\)
\(\def \AE {\unicode {x00C6}}\)
\(\def \aa {\unicode {x00E5}}\)
\(\def \AA {\unicode {x00C5}}\)
\(\def \o {\unicode {x00F8}}\)
\(\def \O {\unicode {x00D8}}\)
\(\def \l {\unicode {x0142}}\)
\(\def \L {\unicode {x0141}}\)
\(\def \ss {\unicode {x00DF}}\)
\(\def \SS {\unicode {x1E9E}}\)
\(\def \dag {\unicode {x2020}}\)
\(\def \ddag {\unicode {x2021}}\)
\(\def \P {\unicode {x00B6}}\)
\(\def \copyright {\unicode {x00A9}}\)
\(\def \pounds {\unicode {x00A3}}\)
\(\let \LWRref \ref \)
\(\renewcommand {\ref }{\ifstar \LWRref \LWRref }\)
\( \newcommand {\multicolumn }[3]{#3}\)
\(\require {textcomp}\)
\(\newcommand {\intertext }[1]{\text {#1}\notag \\}\)
\(\let \Hat \hat \)
\(\let \Check \check \)
\(\let \Tilde \tilde \)
\(\let \Acute \acute \)
\(\let \Grave \grave \)
\(\let \Dot \dot \)
\(\let \Ddot \ddot \)
\(\let \Breve \breve \)
\(\let \Bar \bar \)
\(\let \Vec \vec \)
\(\require {mathtools}\)
\(\newenvironment {crampedsubarray}[1]{}{}\)
\(\newcommand {\smashoperator }[2][]{#2\limits }\)
\(\newcommand {\SwapAboveDisplaySkip }{}\)
\(\newcommand {\LaTeXunderbrace }[1]{\underbrace {#1}}\)
\(\newcommand {\LaTeXoverbrace }[1]{\overbrace {#1}}\)
\(\newcommand {\LWRmultlined }[1][]{\begin {multline*}}\)
\(\newenvironment {multlined}[1][]{\LWRmultlined }{\end {multline*}}\)
\(\let \LWRorigshoveleft \shoveleft \)
\(\renewcommand {\shoveleft }[1][]{\LWRorigshoveleft }\)
\(\let \LWRorigshoveright \shoveright \)
\(\renewcommand {\shoveright }[1][]{\LWRorigshoveright }\)
\(\newcommand {\shortintertext }[1]{\text {#1}\notag \\}\)
\(\newcommand {\vcentcolon }{\mathrel {\unicode {x2236}}}\)
\(\newcommand {\bigzero }{\smash {\text {\huge 0}}}\)
\(\newcommand {\am }{\mathrm {am}}\)
\(\newcommand {\gm }{\mathrm {gm}}\)
\(\newcommand {\id }{\operatorname {id}}\)
\(\newcommand {\GL }{\operatorname {GL}}\)
\(\newcommand {\im }{\operatorname {im}}\)
\(\newcommand {\rank }{\operatorname {rank}}\)
\(\newcommand {\sol }{\operatorname {sol}}\)
\(\newcommand {\ann }{\operatorname {ann}}\)
\(\newcommand {\rO }{\operatorname {O}}\)
\(\newcommand {\rU }{\operatorname {U}}\)
\(\newcommand {\rSU }{\operatorname {SU}}\)
\(\newcommand {\ev }{\operatorname {ev}}\)
\(\newcommand {\bil }{\operatorname {Bil}}\)
\(\newcommand {\rad }{\operatorname {rad}}\)
\(\newcommand {\Span }[1]{\operatorname {span}\{#1\}}\)
\(\newcommand {\R }{\mathbb {R}}\)
\(\newcommand {\C }{\mathbb {C}}\)
\(\newcommand {\Z }{\mathbb {Z}}\)
\(\newcommand {\F }{\mathbb {F}}\)
\(\newcommand {\Q }{\mathbb {Q}}\)
\(\newcommand {\N }{\mathbb {N}}\)
\(\renewcommand {\P }{\mathbb {P}}\)
\(\newcommand {\I }{\mathrm {I}}\)
\(\newcommand {\half }{\tfrac 12}\)
\(\newcommand {\rel }{\mathrel {\mathrm {rel}}}\)
\(\renewcommand {\vec }[3]{(#1_{#2},\dots ,#1_{#3})}\)
\(\newcommand {\lst }[3]{#1_{#2},\dots ,#1_{#3}}\)
\(\newcommand {\plst }[3]{#1_{#2}+\dots +#1_{#3}}\)
\(\newcommand {\oplst }[3]{#1_{#2}\oplus \dots \oplus #1_{#3}}\)
\(\newcommand {\pplst }[3]{#1_{#2}\times \dots \times #1_{#3}}\)
\(\newcommand {\dlst }[3]{\lst {#1^{*}}{#2}{#3}}\)
\(\newcommand {\dlc }[4]{\lc #1{#2^{*}}#3#4}\)
\(\newcommand {\hmg }[3]{[#1_{#2},\dots ,#1_{#3}]}\)
\(\newcommand {\rng }[2]{#1,\dots ,#2}\)
\(\newcommand {\lc }[4]{#1_{#3}#2_{#3}+\dots +#1_{#4}#2_{#4}}\)
\(\newcommand {\plus }[2]{#1+\dots +#2}\)
\(\newcommand {\set }[1]{\{#1\}}\)
\(\newcommand {\abs }[1]{\lvert #1\rvert }\)
\(\newcommand {\ip }[1]{\langle #1\rangle }\)
\(\newcommand {\norm }[1]{\|#1\|}\)
\(\newcommand {\bx }{\mathbf {x}}\)
\(\newcommand {\be }{\mathbf {e}}\)
\(\newcommand {\bq }{\mathbf {q}}\)
\(\newcommand {\bu }{\mathbf {u}}\)
\(\newcommand {\by }{\mathbf {y}}\)
\(\newcommand {\bv }{\mathbf {v}}\)
\(\newcommand {\E }{\mathbb {E}}\)
\(\newcommand {\cI }{\mathcal {I}}\)
\(\newcommand {\cB }{\mathcal {B}}\)
\(\newcommand {\sub }{\subseteq }\)
\(\newcommand {\st }{\mathrel {|}}\)
\(\newcommand {\bw }[3]{#1\leq #2\leq #3}\)
\(\newcommand {\col }[3]{(#1_{#2})_{#2\in #3}}\)
\(\newcommand {\supp }{\mathrm {supp}}\)
\(\newcommand {\restr }[1]{_{|#1}}\)
3.3 The minimum polynomial
-
Proposition 3.5. Let \(A\in M_n(\F )\). Then there is a monic polynomial \(p\in \F [x]\) such that \(p(A)=0\).
Similarly, if \(\phi \in L(V)\) is a linear operator on a finite-dimensional vector space over \(\F \) then there is a monic polynomial \(p\in \F [x]\) with \(p(\phi )=0\).
-
Proof. We prove the result for \(A\) and then deduce that for \(\phi \).
We know that \(\dim M_n(\F )=n^2\) so that the \(n^2+1\) elements \(I_n,A,\dots ,A^{n^2}\) of \(M_n(\F )\) must be linearly dependent. We therefore have a linear relation
\(\seteqnumber{0}{3.}{5}\)
\begin{equation*}
a_0I_n+\dots +a_{n^2}A^{n^2}=0
\end{equation*}
with not all \(a_k\) zero. Otherwise said, \(q(A)=0\), where
\(\seteqnumber{0}{3.}{5}\)
\begin{equation*}
q=a_0+\dots +a_{n^2}x^{n^2}\in \F [x].
\end{equation*}
Let \(a_{m}\) be the leading term of \(q\) (\(m\) could be less than \(n^2\)). Then \(p:=q/a_m\) is a monic polynomial with \(p(A)=0\).
Now let \(\phi \in L(V)\) and let \(A\) be its matrix with respect to some basis. Let \(p\in \F [x]\) be a monic polynomial with \(p(A)=0\). Then \(p(\phi )=0\) also. □
This prompts:
-
Definition. A minimum polynomial for \(\phi \in L(V)\), \(V\) a vector space over \(\F \) is a monic polynomial \(p\in \F [x]\) of minimum degree with \(p(\phi )=0\): thus, if \(r\in \F [x]\) has
\(r(\phi )=0\) and \(\deg r< \deg p\), then \(r=0\).
Similarly, a minimum polynomial for \(A\in M_n(\F )\) is a monic polynomial \(p\) of least degree with \(p(A)=0\).
Minimum polynomials exist and are unique:
-
Theorem 3.6. Let \(\phi \in L(V)\) be a linear operator on a finite-dimensional vector space over a field \(\F \). Then \(\phi
\) has a unique minimum polynomial.
Similarly, any \(A\in M_n(\F )\) has a unique minimum polynomial.
We denote these by \(m_{\phi }\) and \(m_A\) respectively.
-
Proof. We prove this for \(\phi \). The argument for \(A\) is the same.
By Proposition 3.5, the set of non-zero polynomials which vanish on \(\phi \) is non-empty. Choose one of smallest degree and divide by the leading term if necessary to get a monic one. This settles existence.
For uniqueness, suppose that we have \(p_1,p_2\) in the set, both monic and of smallest degree. Set \(r=p_1-p_2\). Then \(\deg r<\deg p_{i}\), since the leading terms of the \(p_{i}\) cancel, while \(r(\phi )=p_1(\phi )-p_2(\phi )=0\). Thus \(r=0\) and
\(p_1=p_2\). □
-
Examples.
-
-
(1) \(m_0=x\).
-
(2) \(m_{\id _V}=x-1\).
-
(3) More generally, for \(\lambda \in \F \), \(m_{\lambda \id _V}=x-\lambda \). Thus \(\deg m_{\phi }=1\) if and only if \(\phi =\lambda \id _{V}\), for some \(\lambda \in \F \).
-
(4) Let \(\pi \in L(V)\) be a projection2 with \(0<\dim \ker \pi <\dim V\). Then \(m_{\pi }=x^2-x\) (exercise!).
How can we compute \(m_A\)? One method is to find it by brute force: for each \(k\geq 1\) in turn, seek \(\lst {a}0{k-1}\) such that
\(\seteqnumber{0}{3.}{5}\)
\begin{equation*}
a_0I+\dots +a_{k-1}A^{k-1}+A^k=0.
\end{equation*}
This is \(n^2\) inhomogeneous linear equations in \(k\) unknowns. They are either inconsistent, in which case you move on to \(k+1\) or, the first time you find a solution, \(m_{A}=a_0+\dots +x^k\).
-
Examples.
-
-
(1) Find \(m_A\) where
\(\seteqnumber{0}{3.}{5}\)
\begin{equation*}
A= \begin{pmatrix} 1&2\\3&4 \end {pmatrix}.
\end{equation*}
Solution. \(A\neq \lambda I\) so \(\deg m_{A}\geq 2\). First try to find \(a_0,a_1\) with \(a_0I+a_1A+A^2=0\). This expands out to
\(\seteqnumber{0}{3.}{5}\)
\begin{equation*}
\begin{pmatrix} a_0+a_1+7&0+2a_1+10\\0+3a_1+15&a_0+4a_1+22 \end {pmatrix}=0
\end{equation*}
The equation in the \((1,2)\)-slot gives \(a_1=-5\) and then that in the \((1,1)\)-slot gives \(a_0=-2\). These also satisfy the other two equations and so \(m_A=-2-5x+x^2\).
-
(2) Find \(m_A\) where
\(\seteqnumber{0}{3.}{5}\)
\begin{equation*}
A= \begin{pmatrix} 0&1&0\\0&0&1\\1&0&0 \end {pmatrix}.
\end{equation*}
Solution. We have
\(\seteqnumber{0}{3.}{5}\)
\begin{equation*}
A^{2}= \begin{pmatrix} 0&0&1\\1&0&0\\0&1&0 \end {pmatrix}
\end{equation*}
so that the \((1,3)\)-slot of \(a_0I_3+a_1A+A^2=0\) gives the inconsistent equation \(a_00+a_10+1=0\) and we conclude that \(\deg m_A\) is at least three. Carrying on, we compute \(A^3\) and find that \(A^3=I_3\) which short-circuits the whole story: \(A^3-I_3=0\) so
that \(m_A=x^3-1\).
We will see other ways to compute the minimum polynomial later.
One reason the minimum polynomial is important:
-
Proposition 3.7. Let \(\phi \in L(V)\) be a linear operator on a finite-dimensional vector space over \(\F \) and \(p\in \F
[x]\).
Then \(p(\phi )=0\) if and only if \(m_{\phi }\) divides \(p\), that is, there is \(s\in \F [x]\) such that \(p=sm_{\phi }\).
-
Proof. If \(p(\phi )=0\) then, by Theorem 3.1, there are \(s,r\in \F [x]\) with \(\deg r<\deg m_{\phi }\) such that \(p=sm_{\phi }+r\). But then
\(\seteqnumber{0}{3.}{5}\)
\begin{equation*}
0=p(\phi )=s(\phi )m_{\phi }(\phi )+r(\phi )=r(\phi )
\end{equation*}
so that \(r=0\) and \(p=sm_{\phi }\) by the smallest degree property of \(m_{\phi }\).
Conversely, if \(p=sm_{\phi }\) then \(p(\phi )=s(\phi )m_{\phi }(\phi )=0\). □
Of course, the same statement (and proof!) holds for the minimum polynomial of a matrix \(A\in M_n(\F )\).