4.2 Invariant subspaces
The next lemma gives us lots of examples:
Lemma 4.1. Let \(\phi ,\psi \in L(V)\) with \(\phi \psi =\psi \phi \). Then \(\ker \psi \) and \(\im \psi \) are \(\phi \)-invariant.
Proof. Let \(v\in \ker \psi \) so that \(\psi (v)=0\). Then
\(\seteqnumber{0}{4.}{1}\)\begin{equation*} \psi (\phi (v))=\phi (\psi (v))=\phi (0)=0 \end{equation*}
so that \(\phi (v)\in \ker \psi \) also.
Again, if \(v\in \im \psi \), there is \(w\in V\) with \(\psi (w)=v\) and now
\(\seteqnumber{0}{4.}{1}\)\begin{equation*} \phi (v)=\phi (\psi (w))=\psi (\phi (w))\in \im \psi , \end{equation*}
as required. □
As a consequence, the following are \(\phi \)-invariant:
• \(\ker \phi \) and \(\im \phi \) (since \(\phi \) commutes with itself!).
• \(\ker p(\phi )\), \(\im p(\phi )\), for any \(p\in \F [x]\) (since \(xp=px\) so that \(\phi p(\phi )=p(\phi )\phi \)).
Also, we have
• \(\Span {v}\), for any eigenvector \(v\) of \(\phi \), since \(\phi (v)=\lambda v\in \Span {v}\). Thus:
• Any \(U\leq E_{\phi }(\lambda )\) is \(\phi \)-invariant.
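By way of illustration (a small example of our own, not one from the text): let \(\phi \in L(\R ^2)\) have matrix \(\begin{pmatrix} 2&1\\0&3 \end {pmatrix}\) with respect to the standard basis \(e_1,e_2\). Then \(\phi (e_1)=2e_1\), so \(e_1\) is an eigenvector and \(\Span {e_1}\) is \(\phi \)-invariant, while \(\phi (e_2)=e_1+3e_2\notin \Span {e_2}\), so \(\Span {e_2}\) is not \(\phi \)-invariant.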
Definition. Let \(\lst {V}1k\leq V\) with \(V=\oplst {V}1k\) and let \(\phi _i\in L(V_i)\), for \(\bw 1ik\).
Define \(\phi :V\to V\) by
\(\seteqnumber{0}{4.}{1}\)\begin{equation*} \phi (v)=\phi _1(v_1)+\dots +\phi _k(v_k), \end{equation*}
where \(v=\plst {v}1k\) with \(v_i\in V_i\), for \(\bw 1ik\).
Call \(\phi \) the direct sum of the \(\phi _i\) and write \(\phi =\oplst \phi 1k\).
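To illustrate the definition with a simple case: take \(V=\R ^2=\Span {e_1}\oplus \Span {e_2}\) and define \(\phi _1\in L(\Span {e_1})\), \(\phi _2\in L(\Span {e_2})\) by \(\phi _1(e_1)=2e_1\) and \(\phi _2(e_2)=3e_2\). Then \(\phi =\phi _1\oplus \phi _2\) is given by
\begin{equation*} \phi (xe_1+ye_2)=2xe_1+3ye_2, \end{equation*}
that is, \(\phi \) has matrix \(\begin{pmatrix} 2&0\\0&3 \end {pmatrix}\) with respect to \(e_1,e_2\).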
There is a related notion for matrices:
Definition. Let \(\lst {A}1k\) be square matrices with \(A_i\in M_{n_i}(\F )\). The direct sum of the \(A_i\) is
\(\seteqnumber{0}{4.}{1}\)\begin{equation*} \oplst {A}1k:= \begin{pmatrix} A_1&&0\\&\ddots &\\0&&A_k \end {pmatrix}\in M_n(\F ), \end{equation*}
where \(n=\plst {n}1k\).
A matrix of this type is said to be block diagonal.
Example.
\(\seteqnumber{0}{4.}{1}\)
\begin{equation*} \begin{pmatrix} 1&2\\3&4 \end {pmatrix}\oplus \begin{pmatrix} 5 \end {pmatrix}\oplus \begin{pmatrix} 1&1\\1&1 \end {pmatrix} =\left ( \begin{array}{cc|c|cc} 1&2&0&0&0\\ 3&4&0&0&0 \\\hline 0&0&5&0&0 \\\hline 0&0&0&1&1 \\ 0&0&0&1&1 \end {array} \right )\in M_5(\R ). \end{equation*}
Proposition 4.2. Let \(\lst {V}1k\leq V\) with \(V=\oplst {V}1k\) and let \(\phi _i\in L(V_i)\), for \(\bw 1ik\). Let \(\phi =\oplst \phi 1k\). Then
(1) \(\phi \) is linear so that \(\phi \in L(V)\).
(2) Each \(V_i\) is \(\phi \)-invariant and \(\phi \restr {V_i}=\phi _i\), \(\bw 1ik\).
(3) Let \(\cB _i\) be a basis of \(V_i\) and \(\phi _i\) have matrix \(A_i\) with respect to \(\cB _i\), \(\bw 1ik\). Then \(\phi \) has matrix \(\oplst {A}1k\) with respect to the concatenated basis \(\cB =\cB _1\dots \cB _k\).
Proof. For (1), let \(v,w\in V\) and write
\(\seteqnumber{0}{4.}{1}\)\begin{equation*} v=\plst {v}1k\qquad w=\plst {w}1k, \end{equation*}
with each \(v_i,w_i\in V_i\). Then
\(\seteqnumber{0}{4.}{1}\)\begin{equation*} v+\lambda w=(v_1+\lambda w_1)+\dots +(v_k+\lambda w_k) \end{equation*}
with each \(v_i+\lambda w_i\in V_i\).
Then
\(\seteqnumber{0}{4.}{1}\)\begin{equation*} \phi (v+\lambda w)=\sum _{i=1}^{k}\phi _i(v_i+\lambda w_i) =\sum _{i=1}^{k}\bigl (\phi _i(v_i)+\lambda \phi _i(w_i)\bigr ) =\sum _{i=1}^{k}\phi _i(v_i)+\lambda \sum _{i=1}^{k}\phi _i(w_i)=\phi (v)+\lambda \phi (w), \end{equation*}
where we used the linearity of \(\phi _i\) in the second equality.
For (2), let \(v\in V_i\) so that we can write \(v=\plst {v}1k\) with \(v_i=v\) and \(v_j=0\), for \(i\neq j\). Then
\(\seteqnumber{0}{4.}{1}\)\begin{equation*} \phi (v)=\phi _1(0)+\dots +\phi _i(v)+\dots +\phi _k(0)=\phi _i(v)\in V_i \end{equation*}
so that \(V_i\) is \(\phi \)-invariant and \(\phi \restr {V_i}=\phi _i\).
Finally, for (3), let \(\cB =\cB _1\dots \cB _k=\lst {v}1n\) and fix \(\bw 1ik\), so that \(\cB _i=\lst {v}{a+1}{a+r}\), where \(a=\dim V_1+\dots +\dim V_{i-1}\) and \(r=\dim V_i\). Let \(\phi \) have matrix \(A\) with respect to \(\cB \). Then, for \(\bw 1jr\),
\(\seteqnumber{0}{4.}{1}\)\begin{equation*} \phi (v_{a+j})=\sum _{b=1}^nA_{b,a+j}v_b. \end{equation*}
On the other hand,
\(\seteqnumber{0}{4.}{1}\)\begin{equation*} \phi (v_{a+j})=\phi _i(v_{a+j})=\sum _{c=1}^r(A_i)_{cj}v_{a+c}. \end{equation*}
Now compare coefficients to see that
\(\seteqnumber{0}{4.}{1}\)\begin{align*} A_{a+c,a+j}&=(A_i)_{cj},\quad \bw 1{c}r\\ A_{b,a+j}&=0\quad \text {otherwise}. \end{align*} Otherwise said, the \((a+j)\)-th column of \(A\) has the \(j\)-th column of the \(r\times r\) matrix \(A_i\) in rows \(a+1,\dots ,a+r\) and zeros elsewhere. This settles (3). □
Conversely, any direct sum decomposition into \(\phi \)-invariant subspaces arises this way:
Proposition 4.3. Let \(\lst {V}1k\leq V\) be \(\phi \)-invariant subspaces, for some \(\phi \in L(V)\), with \(V=\oplst {V}1k\), and set \(\phi _i:=\phi \restr {V_i}\in L(V_i)\), \(\bw 1ik\). Then \(\phi =\oplst \phi 1k\).
Proof. This is almost obvious: write \(v\in V\) as \(v=\plst {v}1k\) with each \(v_i\in V_i\). Then
\(\seteqnumber{0}{4.}{1}\)\begin{equation*} \phi (v)=\phi (v_1)+\dots +\phi (v_k)=\phi _1(v_1)+\dots +\phi _k(v_k)= \oplst \phi 1k(v), \end{equation*}
where the first equality comes from linearity of \(\phi \) and the last from the definition of \(\oplst \phi 1k\). □
The usefulness of such a decomposition comes from the fact that nearly all properties of \(\phi \) reduce to properties of the simpler \(\phi _i\):
Proposition 4.4. Let \(\lst {V}1k\leq V\) with \(V=\oplst {V}1k\), \(\phi _i\in L(V_i)\), \(\bw 1ik\) and \(\phi =\oplst \phi 1k\).
Then:
(1) \(\ker \phi =\oplst {\ker \phi }1k\).
(2) \(\im \phi =\oplst {\im \phi }1k\).
(3) \(p(\phi )=p(\phi _1)\oplus \dots \oplus p(\phi _k)\), for any \(p\in \F [x]\).
(4) \(\Delta _{\phi }=\prod _{i=1}^k\Delta _{\phi _i}\).
Note that the sums in (1) and (2) are direct since \(\ker \phi _i\leq V_i\) and \(\im \phi _i\leq V_i\), while the sum \(V=\oplst {V}1k\) is direct.
Proof of Proposition 4.4. For (1), write \(v\in \ker \phi \) as \(v=\plst {v}1k\) with each \(v_i\in V_i\). Then
\(\seteqnumber{0}{4.}{1}\)\begin{equation*} \phi (v)=\phi _1(v_1)+\dots +\phi _k(v_k)=0=0+\dots +0, \end{equation*}
with \(\phi _i(v_i),0\in V_i\). The direct sum property tells us that each \(\phi _i(v_i)=0\) so that \(v\in \oplst {\ker \phi }1k\). Thus \(\ker \phi \leq \oplst {\ker \phi }1k\).
Conversely, if \(v=\plst {v}1k\in \oplst {\ker \phi }1k\) then each \(\phi _i(v_i)=0\) and
\(\seteqnumber{0}{4.}{1}\)\begin{equation*} \phi (v)=\phi _1(v_1)+\dots +\phi _k(v_k)=0. \end{equation*}
The argument for item (2) is very similar and so is left as an exercise.
For item (3), note that, for \(v_i\in V_i\), \(\phi (v_i)=\phi _i(v_i)\in V_i\) so that
\(\seteqnumber{0}{4.}{1}\)\begin{equation*} \phi ^2(v_i)=\phi (\phi _i(v_i))=\phi _i(\phi _i(v_i))=\phi _i^2(v_i) \end{equation*}
and so on: by induction, \(\phi ^m(v_i)=\phi _i^m(v_i)\), for all \(m\geq 0\), and taking linear combinations of these gives \(p(\phi )(v_i)=p(\phi _i)(v_i)\), whence \(p(\phi )=p(\phi _1)\oplus \dots \oplus p(\phi _k)\).
Finally, for item (4), let \(A_i\) be the matrix of \(\phi _i\) with respect to some basis \(\cB _i\) of \(V_i\). Then \(\phi \) has matrix \(A:=\oplst {A}1k\) with respect to \(\cB _1\dots \cB _k\) by Proposition 4.2(3). Now Theorem 2.1.4 of Algebra 1B tells us
\(\seteqnumber{0}{4.}{1}\)\begin{equation*} \Delta _{\phi }=\det (A-xI)= \begin{vmatrix} A_1-xI&&0\\&\ddots &\\0&&A_k-xI \end {vmatrix}= \prod _{i=1}^k\det (A_i-xI)=\prod _{i=1}^k\Delta _{\phi _i}. \end{equation*}
□
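As an illustration of item (4), consider the block diagonal matrix of the example above: with \(A_1=\begin{pmatrix} 1&2\\3&4 \end {pmatrix}\), \(A_2=\begin{pmatrix} 5 \end {pmatrix}\) and \(A_3=\begin{pmatrix} 1&1\\1&1 \end {pmatrix}\), the characteristic polynomial of \(A_1\oplus A_2\oplus A_3\) factors as
\begin{equation*} \det (A_1-xI)\det (A_2-xI)\det (A_3-xI)=(x^2-5x-2)(5-x)(x^2-2x), \end{equation*}
with no need to expand a \(5\times 5\) determinant.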
Here is a first example of these ideas in action:
Proposition 4.5. Let \(\phi \in L(V)\) be a linear operator on a finite-dimensional vector space over a field \(\F \) and let \(\lst \lambda 1k\) be the distinct eigenvalues of \(\phi \).
Then \(\phi \) is diagonalisable if and only if
\(\seteqnumber{0}{4.}{1}\)\begin{equation} \label {eq:14} V=\bigoplus _{i=1}^kE_{\phi }(\lambda _i). \end{equation}
Proof. Suppose that (4.2) holds and let \(\cB _i\) be a basis of \(E_{\phi }(\lambda _i)\). Then, by Corollary 2.7, \(\cB _1\dots \cB _k\) is a basis of \(V\) which consists of eigenvectors and so is an eigenbasis. Thus \(\phi \) is diagonalisable.
Conversely, suppose that \(\cB =\lst {v}1n\) is an eigenbasis for \(\phi \) so that each \(\phi (v_j)=\mu _{j}v_j\), for some \(\mu _j\in \set {\lst \lambda 1k}\).
We claim: for \(\lambda \) an eigenvalue,
\(\seteqnumber{0}{4.}{2}\)\begin{equation*} U_{\lambda }:=\Span {v_j\st \mu _j=\lambda }=E_{\phi }(\lambda ). \end{equation*}
Given this, \(\cB _i:=\set {v_j\st \mu _j=\lambda _i}\) is a basis for \(E_{\phi }(\lambda _i)\) and then \(\cB =\cB _1\dots \cB _k\) so that (4.2) holds, again by Corollary 2.7.
It remains to prove the claim. Clearly \(U_{\lambda }\leq E_{\phi }(\lambda )\). Conversely, if \(v\in E_{\phi }(\lambda )\), write \(v=\sum _{j=1}^na_jv_j\). Then
\(\seteqnumber{0}{4.}{2}\)\begin{equation*} 0=(\phi -\lambda \id )(v)= \sum _{j\st \mu _j=\lambda }(\mu _j-\lambda )a_jv_j+ \sum _{j\st \mu _j\neq \lambda }(\mu _j-\lambda )a_jv_j= \sum _{j\st \mu _j\neq \lambda }(\mu _j-\lambda )a_jv_j. \end{equation*}
Since the \(v_j\) are linearly independent, we see that \((\mu _{j}-\lambda )a_j=0\), for all \(j\) with \(\mu _j\neq \lambda \), and so all such \(a_j\) vanish. Thus
\(\seteqnumber{0}{4.}{2}\)\begin{equation*} v=\sum _{j\st \mu _j=\lambda }a_jv_j\in U_{\lambda }. \end{equation*}
□
To summarise the situation: when \(\phi \) is diagonalisable, then with \(V_i:=E_{\phi }(\lambda _i)\) and \(\phi _i:=\phi \restr {V_i}\), we have \(V=\oplst {V}1k\), \(\phi =\oplst {\phi }1k\) and
\(\seteqnumber{0}{4.}{2}\)\begin{equation*} \phi _i=\lambda _i\id _{V_i}. \end{equation*}
Thus the \(\phi _i\) are as simple as they possibly can be!
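For a concrete instance: let \(\phi \in L(\R ^2)\) have matrix \(\begin{pmatrix} 0&1\\1&0 \end {pmatrix}\) with respect to the standard basis. The eigenvalues are \(\pm 1\), with \(E_{\phi }(1)=\Span {(1,1)}\) and \(E_{\phi }(-1)=\Span {(1,-1)}\), so that
\begin{equation*} \R ^2=E_{\phi }(1)\oplus E_{\phi }(-1),\qquad \phi =\id _{E_{\phi }(1)}\oplus \bigl (-\id _{E_{\phi }(-1)}\bigr ), \end{equation*}
and \(\phi \) is diagonalisable.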
We now turn to what we can say about general \(\phi \).