6.A The Cayley-Hamilton theorem

Assume we have a square matrix $A\in\operatorname{M}_{{n}}({F})$ and a polynomial

p(x)=a_{0}+a_{1}x+\cdots+a_{r}x^{r}\in\mathcal{P}_{r}(F).

Then we define the evaluation of the polynomial $p$ at the matrix $A$ by:

p(A)=a_{0}I_{n}+a_{1}A+\cdots+a_{r}A^{r}\in\operatorname{M}_{{n}}({F}).

The result $p(A)$ is a square matrix whose entries are all in the field $F$. We have replaced every $x$ with $A$, and also multiplied the constant term $a_{0}$ by the identity matrix $I_{n}$.
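The definition of $p(A)$ can be tried out on a computer. The following is a minimal sketch in plain Python, assuming matrices are stored as lists of rows; the helper names `mat_mul` and `poly_at_matrix` and the matrix `M` are illustrative choices, not part of the module. It uses Horner's rule rather than computing each power of $A$ separately.

```python
def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def poly_at_matrix(coeffs, A):
    """Evaluate a_0*I + a_1*A + ... + a_r*A^r, where coeffs = [a_0, ..., a_r].

    Horner's rule: p(A) = a_0*I + A*(a_1*I + A*(... + A*(a_r*I))).
    """
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for a in reversed(coeffs):
        result = mat_mul(result, A)   # multiply the running value by A
        for i in range(n):
            result[i][i] += a         # add a * I_n
    return result


# Illustrative matrix (an assumption, not one of the exercise matrices).
M = [[1, 1],
     [0, 1]]
print(poly_at_matrix([-1, 2, 1], M))  # p(x) = -1 + 2x + x^2, prints [[2, 4], [0, 2]]
```

Note that the constant term only ever touches the diagonal entries, which is exactly the effect of multiplying $a_{0}$ by $I_{n}$.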

Exercise 6.1:

Evaluate $x^{2}+2x-1\in\mathcal{P}_{2}({\mathbb{R}})$ at the following matrices:

$A=\begin{bmatrix}-1&-1\\ 1&0\end{bmatrix}$
$B=\begin{bmatrix}1&2&3\\ 0&2&1\\ 0&0&-1\end{bmatrix}$
$C=\begin{bmatrix}-1&2\\ 1&-1\end{bmatrix}$

[End of Exercise]

Recall that for a square matrix $A\in\operatorname{M}_{{n}}({F})$, where $F$ is a field, its characteristic polynomial is the polynomial in a single variable (usually denoted by $x$ or $\lambda$):

c_{A}(x):=\det(A-xI_{n}).

So $c_{A}\in\mathcal{P}_{n}(F)$, since it is a polynomial of degree less than or equal to $n$; in fact, its degree is always exactly $n$, with leading coefficient $(-1)^{n}$. [Aside: Some authors define the characteristic polynomial slightly differently, as $\det(xI_{n}-A)$, because then the coefficient of $x^{n}$ is always 1.]

The characteristic polynomial could be expanded, and written in the following form:

c_{A}(x)=c_{0}+c_{1}x+c_{2}x^{2}+\cdots+c_{n}x^{n}\in\mathcal{P}_{n}(F),

for some scalars $c_{i}\in F$.
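For checking hand computations, the coefficients $c_{0},\dots,c_{n}$ can also be obtained by machine. The sketch below is not the method used in this module: it swaps in the Faddeev-LeVerrier recursion, which produces the coefficients of $\det(xI_{n}-A)$, and then multiplies by $(-1)^{n}$ to match the convention $\det(A-xI_{n})$ used here. It assumes rational entries, since the recursion divides by $1,\dots,n$; the names `mat_mul` and `char_poly` and the test matrix are hypothetical.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def char_poly(A):
    """Return [c_0, ..., c_n] with det(A - x*I) = c_0 + c_1*x + ... + c_n*x^n."""
    n = len(A)
    A = [[Fraction(entry) for entry in row] for row in A]
    monic = [Fraction(0)] * n + [Fraction(1)]     # coefficients of det(x*I - A)
    M = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        M = mat_mul(A, M)                         # M_k = A*M_{k-1} + (previous coeff)*I
        for i in range(n):
            M[i][i] += monic[n - k + 1]
        AM = mat_mul(A, M)
        trace = sum(AM[i][i] for i in range(n))
        monic[n - k] = -trace / k                 # next coefficient of det(x*I - A)
    sign = (-1) ** n                              # switch to the det(A - x*I) convention
    return [sign * coeff for coeff in monic]


# Illustrative matrix (an assumption): det(A - x*I) = (2 - x)^2 - 1 = 3 - 4x + x^2.
coeffs = char_poly([[2, 1], [1, 2]])
print([int(c) for c in coeffs])                   # [3, -4, 1]
```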

The main result of this subsection is a statement about evaluating this polynomial at the original matrix $A$:

c_{A}(A):=c_{0}\cdot I_{n}+c_{1}A+c_{2}A^{2}+\cdots+c_{n}A^{n}\in\operatorname{M}_{{n}}({F}).

Using this notation, we can state the theorem.

Theorem 6.2 (Cayley-Hamilton).

If $A\in\operatorname{M}_{{n}}({F})$, then $c_{A}(A)=\vec{0}$.

In other words, the theorem says that if you replace each instance of $x$ in the expanded characteristic polynomial with the matrix $A$, and multiply the constant term by $I_{n}$, then the result is the zero matrix $\vec{0}$. Yet another way of stating this result is: any square matrix satisfies its own characteristic equation.

For several different proofs, see the Wikipedia article on the Cayley-Hamilton theorem (see also Exercise 6.70 for an invalid proof). We will omit the proof of Theorem 6.2 from this module.
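As a sanity check rather than a proof, the $2\times 2$ case can be verified by direct computation. The entries $p,q,r,s\in F$ below are generic symbols introduced only for this illustration.

```latex
A=\begin{bmatrix}p&q\\ r&s\end{bmatrix},\qquad
c_{A}(x)=\det\begin{bmatrix}p-x&q\\ r&s-x\end{bmatrix}=(ps-qr)-(p+s)x+x^{2},

c_{A}(A)=(ps-qr)I_{2}-(p+s)\begin{bmatrix}p&q\\ r&s\end{bmatrix}
+\begin{bmatrix}p^{2}+qr&pq+qs\\ rp+sr&rq+s^{2}\end{bmatrix}
=\begin{bmatrix}0&0\\ 0&0\end{bmatrix}.
```

Each of the four entries cancels term by term; for instance, the top-left entry is $(ps-qr)-(p^{2}+ps)+(p^{2}+qr)=0$.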

Example 6.3.

Let $A=\begin{bmatrix}1&2&0\\ 3&4&0\\ 0&0&5\end{bmatrix}\in\operatorname{M}_{{3}}({{\mathbb{R}}})$. Then:

c_{A}(x)=\det\begin{bmatrix}1-x&2&0\\ 3&4-x&0\\ 0&0&5-x\end{bmatrix}=(x^{2}-5x-2)(5-x)=-10-23x+10x^{2}-x^{3}.

To verify that the Cayley-Hamilton theorem is true in this case, compute:

\begin{aligned}
c_{A}(A)&=-10\cdot I_{3}-23A+10A^{2}-A^{3}\\
&=-10I_{3}-23\begin{bmatrix}1&2&0\\ 3&4&0\\ 0&0&5\end{bmatrix}+10\begin{bmatrix}7&10&0\\ 15&22&0\\ 0&0&25\end{bmatrix}-\begin{bmatrix}37&54&0\\ 81&118&0\\ 0&0&125\end{bmatrix}=\begin{bmatrix}0&0&0\\ 0&0&0\\ 0&0&0\end{bmatrix}.
\end{aligned}
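The arithmetic in Example 6.3 can be double-checked numerically. This is a small sketch in plain Python, assuming matrices are stored as lists of rows; the function names `mat_mul` and `evaluate` are hypothetical. It accumulates $c_{k}A^{k}$ term by term, mirroring the hand computation.

```python
def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def evaluate(c, A):
    """Return c[0]*I + c[1]*A + ... + c[n]*A^n, summing one power at a time."""
    n = len(A)
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # A^0 = I_n
    total = [[0] * n for _ in range(n)]
    for coeff in c:
        for i in range(n):
            for j in range(n):
                total[i][j] += coeff * power[i][j]
        power = mat_mul(power, A)                                # next power of A
    return total


A = [[1, 2, 0],
     [3, 4, 0],
     [0, 0, 5]]
c = [-10, -23, 10, -1]     # c_0, c_1, c_2, c_3 from Example 6.3

print(evaluate(c, A))      # [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
```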

Exercise 6.4:

Verify that the Cayley-Hamilton theorem holds for the following matrices $A\in\operatorname{M}_{{n}}({{\mathbb{R}}})$:

i. $\begin{bmatrix}1&-1\\ 2&0\end{bmatrix}$,

ii. $\begin{bmatrix}-1&2&0\\ 0&2&1\\ 1&0&0\end{bmatrix}$,

iii. $\begin{bmatrix}0&1&-1\\ 1&0&-1\\ 1&-1&0\end{bmatrix}$.

[End of Exercise]

The Cayley-Hamilton theorem gives a new way of computing powers of the matrix $A$: any power $A^{k}$ with $k\geq n$ can be rewritten as a linear combination of $I_{n},A,\ldots,A^{n-1}$. As an example of this method, consider the following.

Example 6.5.

Let $A=\begin{bmatrix}1&2&0\\ 3&4&0\\ 0&0&5\end{bmatrix}$ be the matrix from the previous example. Write $A^{4}$ and $A^{-1}$ as linear combinations of $I_{3},A,A^{2}$.

(Solution:) The Cayley-Hamilton theorem tells us that

c_{A}(A)=-10\cdot I_{3}-23A+10A^{2}-A^{3}=\vec{0}.

Rearranging this equation gives

A^{3}=10A^{2}-23A-10I_{3}.

Now we multiply this by the matrix $A$ (either on the left, or the right):

\begin{aligned}
A^{4}=10A^{3}-23A^{2}-10A&=10(10A^{2}-23A-10I_{3})-23A^{2}-10A\\
&=100A^{2}-230A-100I_{3}-23A^{2}-10A\\
&=77A^{2}-240A-100I_{3}.
\end{aligned}

One could check that both the left and the right hand sides are equal to $\begin{bmatrix}199&290&0\\ 435&634&0\\ 0&0&625\end{bmatrix}$. So we have expressed $A^{4}$ as a linear combination of $I_{3}$, $A$, and $A^{2}$.
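The substitution step used for $A^{4}$ can be repeated mechanically for higher powers. Here is a minimal sketch in Python, assuming a power of $A$ is stored as its list of coefficients with respect to $I_{3},A,A^{2}$; the function name `next_power` is hypothetical.

```python
def next_power(coords, relation):
    """Coordinates of A^(k+1) in the basis I, A, ..., A^(n-1).

    coords   : coefficients of A^k in that basis
    relation : coefficients expressing A^n in that basis
    """
    n = len(coords)
    top = coords[-1]                 # coefficient of A^(n-1); times A it becomes A^n
    shifted = [0] + coords[:-1]      # multiplying by A shifts the lower terms up
    return [shifted[i] + top * relation[i] for i in range(n)]


relation = [-10, -23, 10]            # A^3 = -10*I - 23*A + 10*A^2
a3 = relation[:]                     # coordinates of A^3 itself
a4 = next_power(a3, relation)
print(a4)                            # [-100, -240, 77], i.e. 77*A^2 - 240*A - 100*I
```

Calling `next_power` again on the result gives the coordinates of $A^{5}$ in the same basis, without ever multiplying $3\times 3$ matrices.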

For $A^{-1}$, note that the constant term of $c_{A}$ is $c_{A}(0)=\det(A)=-10\neq 0$, so $A$ is invertible. Rearrange the Cayley-Hamilton equation as follows:

A\left(A^{2}-10A+23I_{3}\right)=-10I_{3},

which implies

A\left(\frac{-1}{10}A^{2}+A-\frac{23}{10}I_{3}\right)=I_{3}.

This proves that

A^{-1}=-\frac{1}{10}A^{2}+A-\frac{23}{10}\cdot I_{3}.

One could also check that both the left and right hand sides of this equation are equal to $\begin{bmatrix}-2&1&0\\ 3/2&-1/2&0\\ 0&0&1/5\end{bmatrix}$. So we have expressed $A^{-1}$ as a linear combination of $I_{3}$, $A$, and $A^{2}$.
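The same rearrangement works for any matrix whose characteristic polynomial has nonzero constant term $c_{0}=c_{A}(0)=\det(A)$: from $c_{A}(A)=\vec{0}$ we get $A^{-1}=-\frac{1}{c_{0}}\left(c_{1}I_{n}+c_{2}A+\cdots+c_{n}A^{n-1}\right)$. The sketch below, in plain Python with exact fractions, applies this formula to the matrix of this example; the function names `mat_mul` and `inverse_from_charpoly` are hypothetical, and the coefficient list is assumed to have been computed beforehand.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inverse_from_charpoly(c, A):
    """Return A^(-1) = -(1/c_0) * (c_1*I + c_2*A + ... + c_n*A^(n-1)).

    c = [c_0, ..., c_n] are the coefficients of det(A - x*I); needs c_0 != 0.
    """
    if c[0] == 0:
        raise ValueError("c_0 = det(A) = 0, so A is not invertible")
    n = len(A)
    power = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]  # A^0
    total = [[Fraction(0)] * n for _ in range(n)]
    for coeff in c[1:]:
        for i in range(n):
            for j in range(n):
                total[i][j] += coeff * power[i][j]
        power = mat_mul(power, A)
    scale = Fraction(-1) / c[0]
    return [[scale * entry for entry in row] for row in total]


A = [[1, 2, 0],
     [3, 4, 0],
     [0, 0, 5]]
A_inv = inverse_from_charpoly([-10, -23, 10, -1], A)
print([[str(entry) for entry in row] for row in A_inv])
# [['-2', '1', '0'], ['3/2', '-1/2', '0'], ['0', '0', '1/5']]
```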

Exercise 6.6:

For each of the matrices $A\in\operatorname{M}_{{n}}({{\mathbb{R}}})$ from Exercise 6.4, express $A^{4}$ and, whenever $A$ is invertible, $A^{-1}$ as a linear combination of $I_{n},A,\ldots,A^{n-1}$.

[End of Exercise]