6 Jordan normal form

6.A The Cayley-Hamilton theorem

Assume we have a square matrix $A \in M_n(F)$ and a polynomial

$$p(x) = a_0 + a_1 x + \cdots + a_r x^r \in \mathcal{P}_r(F).$$

Then we will allow ourselves to evaluate the polynomial $p$ at the matrix $A$:

$$p(A) = a_0 I_n + a_1 A + \cdots + a_r A^r \in M_n(F).$$

The result $p(A)$ is a square matrix whose entries are all in the field $F$. We have replaced all the $x$'s with $A$'s, and also multiplied the constant term $a_0$ by the identity matrix $I_n$.
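
The evaluation rule above can be sketched in a few lines of Python. This is only an illustration, assuming matrices are stored as lists of rows with integer (or rational) entries; the names `mat_mul` and `poly_at_matrix`, and the matrix `M`, are hypothetical and do not come from the notes. The sketch uses Horner's rule, which avoids computing each power of $A$ separately.

```python
def mat_mul(X, Y):
    """Product of two n x n matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def poly_at_matrix(coeffs, A):
    """Evaluate a_0 I + a_1 A + ... + a_r A^r, where coeffs = [a_0, ..., a_r].

    Horner's rule: (...((a_r I) A + a_{r-1} I) A + ...) A + a_0 I.
    """
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for a in reversed(coeffs):
        result = mat_mul(result, A)
        for i in range(n):
            result[i][i] += a  # adding a * I_n only changes the diagonal
    return result


# Illustration with p(x) = 1 + x^2 and a made-up 2 x 2 matrix.
M = [[0, 1], [1, 1]]
print(poly_at_matrix([1, 0, 1], M))  # M^2 + I_2 = [[2, 1], [1, 3]]
```

Note that the constant term is added as a multiple of the identity, exactly as in the definition of $p(A)$.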

Exercise 6.1:

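Evaluate $x^2 + 2x - 1 \in \mathcal{P}_2(\mathbb{R})$ at the following matrices:

  1. $A = \begin{bmatrix} -1 & -1 \\ 1 & 0 \end{bmatrix}$

  2. $B = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 2 & 1 \\ 0 & 0 & -1 \end{bmatrix}$

  3. $C = \begin{bmatrix} -1 & 2 \\ 1 & -1 \end{bmatrix}$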

[End of Exercise]

Recall that for a square matrix $A \in M_n(F)$, where $F$ is a field, its characteristic polynomial is the polynomial in a single variable (usually denoted by $x$ or $\lambda$):

$$c_A(x) := \det(A - x I_n).$$

So $c_A \in \mathcal{P}_n(F)$, since it is a polynomial of degree less than or equal to $n$; in fact, its degree is always equal to $n$. [Aside: Some authors define the characteristic polynomial slightly differently, as $\det(x I_n - A)$, because then the coefficient of $x^n$ is always 1.]

The characteristic polynomial can be expanded and written in the following form:

$$c_A(x) = c_0 + c_1 x + c_2 x^2 + \cdots + c_n x^n \in \mathcal{P}_n(F),$$

for some numbers $c_i \in F$.
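
For readers who want to compute the coefficients $c_0, \ldots, c_n$ by machine, here is a minimal Python sketch. It does not expand the determinant directly; instead it uses the Faddeev-LeVerrier recursion, which is a swapped-in technique not covered in these notes. It assumes rational entries (the recursion divides by $1, \ldots, n$), and the names `mat_mul` and `char_poly` are hypothetical. The recursion produces the monic polynomial $\det(x I_n - A)$, so the result is multiplied by $(-1)^n$ to match the convention $\det(A - x I_n)$ used here.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Product of two n x n matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def char_poly(A):
    """Return [c_0, ..., c_n] with det(A - x I) = c_0 + c_1 x + ... + c_n x^n."""
    n = len(A)
    A = [[Fraction(v) for v in row] for row in A]
    # monic[k] is the coefficient of x^k in det(x I - A); the leading one is 1.
    monic = [Fraction(0)] * n + [Fraction(1)]
    M = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]  # M_1 = I
    for k in range(1, n + 1):
        AM = mat_mul(A, M)
        monic[n - k] = -sum(AM[i][i] for i in range(n)) / k  # -trace(A M_k) / k
        # M_{k+1} = A M_k + (new coefficient) * I
        M = [[AM[i][j] + (monic[n - k] if i == j else 0) for j in range(n)]
             for i in range(n)]
    sign = (-1) ** n  # converts det(x I - A) into det(A - x I)
    return [sign * m for m in monic]


# Made-up 2 x 2 matrix: det(A - x I) = (2 - x)^2 - 1 = 3 - 4x + x^2.
print([str(c) for c in char_poly([[2, 1], [1, 2]])])  # ['3', '-4', '1']
```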

The main result of this subsection is a statement about evaluating this polynomial at the original matrix A:

$$c_A(A) := c_0 I_n + c_1 A + c_2 A^2 + \cdots + c_n A^n \in M_n(F).$$

Using this notation, we can state the theorem.

Theorem 6.2 (Cayley-Hamilton).

If $A \in M_n(F)$, then $c_A(A) = 0$.

In other words, the theorem says that if you replace each instance of $x$ in the expanded characteristic polynomial with the matrix $A$, and multiply the constant term by $I_n$, then the result is the zero matrix $0$. Yet another way of stating this result is: any square matrix satisfies its own characteristic equation.

For several different proofs, see the Wikipedia article on the Cayley-Hamilton theorem (see also Exercise 6.70 for an invalid proof). We will omit the proof of Theorem 6.2 from this module.

Example 6.3.

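Let $A = \begin{bmatrix} 1 & 2 & 0 \\ 3 & 4 & 0 \\ 0 & 0 & 5 \end{bmatrix} \in M_3(\mathbb{R})$. Then:

$$c_A(x) = \det \begin{bmatrix} 1-x & 2 & 0 \\ 3 & 4-x & 0 \\ 0 & 0 & 5-x \end{bmatrix} = (x^2 - 5x - 2)(5 - x) = -10 - 23x + 10x^2 - x^3.$$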

To verify that the Cayley-Hamilton theorem is true in this case, compute:

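$$\begin{aligned}
c_A(A) &= -10 I_3 - 23 A + 10 A^2 - A^3 \\
&= -10 I_3 - 23 \begin{bmatrix} 1 & 2 & 0 \\ 3 & 4 & 0 \\ 0 & 0 & 5 \end{bmatrix} + 10 \begin{bmatrix} 7 & 10 & 0 \\ 15 & 22 & 0 \\ 0 & 0 & 25 \end{bmatrix} - \begin{bmatrix} 37 & 54 & 0 \\ 81 & 118 & 0 \\ 0 & 0 & 125 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.
\end{aligned}$$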
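
The arithmetic in Example 6.3 can also be checked by machine. The following is a minimal Python sketch using exact integer arithmetic; the helper name `mat_mul` and the variable names are ours, not part of the notes.

```python
def mat_mul(X, Y):
    """Product of two n x n matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


A = [[1, 2, 0], [3, 4, 0], [0, 0, 5]]
n = len(A)
I3 = [[int(i == j) for j in range(n)] for i in range(n)]
A2 = mat_mul(A, A)
A3 = mat_mul(A2, A)

# c_A(A) = -10 I_3 - 23 A + 10 A^2 - A^3, computed entry by entry.
c_of_A = [[-10 * I3[i][j] - 23 * A[i][j] + 10 * A2[i][j] - A3[i][j]
           for j in range(n)] for i in range(n)]

print(A2)      # [[7, 10, 0], [15, 22, 0], [0, 0, 25]]
print(A3)      # [[37, 54, 0], [81, 118, 0], [0, 0, 125]]
print(c_of_A)  # [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
```
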

Exercise 6.4:

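Verify that the Cayley-Hamilton theorem is true for the following matrices $A \in M_n(\mathbb{R})$:

  i. $\begin{bmatrix} 1 & -1 \\ 2 & 0 \end{bmatrix}$,

  ii. $\begin{bmatrix} -1 & 2 & 0 \\ 0 & 2 & 1 \\ 1 & 0 & 0 \end{bmatrix}$,

  iii. $\begin{bmatrix} 0 & 1 & -1 \\ 1 & 0 & -1 \\ 1 & -1 & 0 \end{bmatrix}$.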

[End of Exercise]

The Cayley-Hamilton theorem lets us use matrix algebra to give a new way of computing powers of the matrix A. As an example of this method, consider the following.

Example 6.5.

Let $A = \begin{bmatrix} 1 & 2 & 0 \\ 3 & 4 & 0 \\ 0 & 0 & 5 \end{bmatrix}$ be the matrix from the previous example. Write $A^4$ and $A^{-1}$ as linear combinations of $I_3$, $A$, $A^2$.

(Solution:) The Cayley-Hamilton theorem tells us that

$$c_A(A) = -10 I_3 - 23 A + 10 A^2 - A^3 = 0.$$

By rearranging this equation, we know that

$$A^3 = 10 A^2 - 23 A - 10 I_3.$$

Now we multiply this by the matrix A (either on the left, or the right):

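$$\begin{aligned}
A^4 = 10 A^3 - 23 A^2 - 10 A &= 10(10 A^2 - 23 A - 10 I_3) - 23 A^2 - 10 A \\
&= 100 A^2 - 230 A - 100 I_3 - 23 A^2 - 10 A \\
&= 77 A^2 - 240 A - 100 I_3.
\end{aligned}$$

One could check that both the left and the right hand sides are equal to $\begin{bmatrix} 199 & 290 & 0 \\ 435 & 634 & 0 \\ 0 & 0 & 625 \end{bmatrix}$. So we have expressed $A^4$ as a linear combination of $I_3$, $A$, and $A^2$.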
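
The substitution step used for $A^4$ can be repeated mechanically for any power. As a rough Python sketch (the function names `times_A` and `power_coeffs` are hypothetical), store a combination $a I_3 + b A + c A^2$ as the triple `(a, b, c)`; multiplying by $A$ gives $aA + bA^2 + cA^3$, and replacing $A^3$ by $10A^2 - 23A - 10I_3$ yields a new triple. The rule is hard-coded for this particular matrix.

```python
def times_A(coeffs):
    """Multiply a*I + b*A + c*A^2 by A, using A^3 = 10 A^2 - 23 A - 10 I.

    a*A + b*A^2 + c*A^3 = (-10c) I + (a - 23c) A + (b + 10c) A^2.
    """
    a, b, c = coeffs
    return (-10 * c, a - 23 * c, b + 10 * c)


def power_coeffs(k):
    """Coefficients (a, b, c) with A^k = a*I + b*A + c*A^2, for k >= 0."""
    coeffs = (1, 0, 0)  # A^0 = I_3
    for _ in range(k):
        coeffs = times_A(coeffs)
    return coeffs


print(power_coeffs(3))  # (-10, -23, 10)
print(power_coeffs(4))  # (-100, -240, 77)
```

The second line of output agrees with $A^4 = 77A^2 - 240A - 100I_3$.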

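For $A^{-1}$, rearrange the Cayley-Hamilton equation as follows:

$$A(A^2 - 10 A + 23 I_3) = -10 I_3,$$

which implies

$$A\left(-\tfrac{1}{10} A^2 + A - \tfrac{23}{10} I_3\right) = I_3.$$

This proves that

$$A^{-1} = -\tfrac{1}{10} A^2 + A - \tfrac{23}{10} I_3.$$

One could also check that both the left and right hand sides of this equation are equal to $\begin{bmatrix} -2 & 1 & 0 \\ 3/2 & -1/2 & 0 \\ 0 & 0 & 1/5 \end{bmatrix}$. So we have expressed $A^{-1}$ as a linear combination of $I_3$, $A$, and $A^2$.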
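
The formula for the inverse can be checked with exact rational arithmetic. The following Python sketch is only a verification aid; `mat_mul`, `candidate` and `product` are our own names, and `Fraction` from the standard library is used so that no rounding occurs.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Product of two n x n matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


A = [[1, 2, 0], [3, 4, 0], [0, 0, 5]]
n = len(A)
I3 = [[int(i == j) for j in range(n)] for i in range(n)]
A2 = mat_mul(A, A)

# Candidate inverse: -(1/10) A^2 + A - (23/10) I_3.
candidate = [[Fraction(-1, 10) * A2[i][j] + A[i][j] - Fraction(23, 10) * I3[i][j]
              for j in range(n)] for i in range(n)]

# If the candidate really is the inverse, this product is the identity.
product = mat_mul(A, candidate)

print([[str(v) for v in row] for row in candidate])
print(product == I3)  # True
```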

Exercise 6.6:

For each of the matrices in $M_n(\mathbb{R})$ from Exercise 6.4, express both $A^4$ and $A^{-1}$ as a linear combination of $I_n, A, \ldots, A^{n-1}$.

[End of Exercise]