5.B Real symmetric matrices

Symmetric matrices occur naturally in applications. For example, the covariance matrix in statistics and the adjacency matrix in graph theory are both symmetric. In both situations it is desirable to find the eigenvalues of the matrix, because those eigenvalues have meaningful interpretations.

But you might ask: “What if there are non-real eigenvalues?” This is a great question, since in general real matrices might have non-real eigenvalues (see Exercise 5.6). Fortunately, we have the following theorem:

Theorem 5.5.

Let $A \in M_n(\mathbb{R})$ be a symmetric matrix. Then every eigenvalue of $A$ is a real number.

For the proof, see Exercise 5.31.

Exercise 5.6:
  i. Choose your own $3 \times 3$ real symmetric matrix which is not diagonal, and find its eigenvalues (they should be real!).

  ii. Find a $3 \times 3$ real matrix which has at least one non-real eigenvalue.

[End of Exercise]

The following theorem decomposes $A$ into simpler components, $P$ and $D$, which are easier to work with. Another way of finding the matrices $P$ and $D$ is to use the computer program R, with the command eigen.
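
For instance, here is a minimal sketch in R (the particular matrix is an arbitrary symmetric example, chosen only for illustration):

    # An arbitrary 2x2 symmetric matrix
    A <- matrix(c(5, 2,
                  2, 2), nrow = 2, byrow = TRUE)

    e <- eigen(A, symmetric = TRUE)
    e$values   # eigenvalues: the diagonal entries of D (here 6 and 1)
    e$vectors  # orthonormal eigenvectors: the columns of P

The argument symmetric = TRUE tells eigen to use a routine for symmetric matrices, which returns real eigenvalues and orthonormal eigenvectors.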

Theorem 5.7 (Spectral decomposition).

Let $A \in M_n(\mathbb{R})$.

  i. If $P$ is a matrix whose columns form an orthonormal basis of eigenvectors of $A$, and $D$ is the diagonal matrix of eigenvalues (in the same order), then

  \[ A = P D P^T. \]

  ii. $A$ has an orthonormal basis of eigenvectors if and only if $A$ is symmetric.

This is also called the spectral theorem. The name comes from applications (in particular in physics), where the set of eigenvalues of a matrix is called its “spectrum”.
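
Part (ii) can also be observed numerically: for a symmetric matrix the eigenvector matrix returned by R's eigen is orthogonal, while for a non-symmetric matrix it generally is not. A hedged sketch (both matrices are arbitrary illustrations):

    S <- matrix(c(5, 2,
                  2, 2), nrow = 2, byrow = TRUE)   # symmetric
    N <- matrix(c(1, 0,
                  1, 2), nrow = 2, byrow = TRUE)   # not symmetric

    Ps <- eigen(S)$vectors
    Pn <- eigen(N)$vectors
    round(t(Ps) %*% Ps, 10)  # the identity: the eigenvectors are orthonormal
    round(t(Pn) %*% Pn, 10)  # off-diagonal entries are non-zero in general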

Proof.

To prove (i), assume $\mathcal{B}$ is an orthonormal basis of eigenvectors of $A$, and $P$ and $D$ are as in the theorem. Then $P = [\operatorname{Id}]^{\mathcal{C}}_{\mathcal{B}}$ is the change of basis matrix from $\mathcal{B}$ to the standard basis $\mathcal{C}$ of $\mathbb{R}^n$. So by Theorem 4.54(i), $P^{-1}AP = D$ is a diagonal matrix. Rearranging this equation we get $A = PDP^{-1}$. But the columns of $P$ form an orthonormal basis, so by Theorem 5.1 we have $P^{-1} = P^T$ (i.e. $P$ is orthogonal). Therefore $A = PDP^T$.

To prove (ii), first notice that if an orthonormal basis of eigenvectors exists, then by part (i) we can write $A = PDP^T$. Since $(ABC)^T = C^T B^T A^T$, and diagonal matrices are symmetric, we see that $A^T = (PDP^T)^T = PD^TP^T = PDP^T = A$, which shows $A$ is symmetric.

The difficult part of (ii) is the other direction. We omit the rest of this proof from this module, but include it here for completeness. Assume $A$ is symmetric. To construct an orthonormal basis, we proceed by induction on the size of $A$. The base case, $n = 1$, follows because any unit vector is an eigenvector and also an orthonormal basis. So let $n > 1$, and assume that the statement of the theorem is true for every square matrix of size $n - 1$. By Theorem 5.5, we can choose an eigenvalue $\lambda \in \mathbb{R}$ of $A$, and let $x_1 \in \mathbb{R}^n$ be a corresponding eigenvector with norm 1.

By Corollary 3.39, we can extend $x_1$ to an orthonormal basis of $\mathbb{R}^n$, which we can write $\mathcal{B} := (x_1, y_2, \ldots, y_n)$. Notice that

\[ \langle x_1, A y_i \rangle = x_1^T A y_i = (A x_1)^T y_i = \lambda x_1^T y_i = 0 \]

for any $i = 2, \ldots, n$, where the second equality uses the symmetry of $A$. So by Theorem 3.33, the vectors $A y_i$ have $x_1$-coordinate equal to zero in the basis $\mathcal{B}$. Therefore, if $Q$ is the change of basis matrix from $\mathcal{B}$ to the standard basis, then by Theorem 4.52

\[ Q^{-1} A Q = \begin{bmatrix} \lambda & 0 \\ 0 & A' \end{bmatrix}. \]

Since $Q$ is orthogonal (Theorem 5.1), we know $Q^{-1} = Q^T$, so the matrix $Q^{-1}AQ$ is symmetric; and therefore so is $A'$. Since $A'$ is a real symmetric matrix of size $(n-1) \times (n-1)$, by our induction assumption there is an orthonormal basis of eigenvectors of $A'$ (in $\mathbb{R}^{n-1}$), and therefore an orthogonal matrix $P'$ such that $A' = P' D' (P')^T$, where $D'$ is diagonal. We create our final matrix $P$ as a matrix product:

\[ P := Q \begin{bmatrix} 1 & 0 \\ 0 & P' \end{bmatrix}, \]

because then

\[ A = P \begin{bmatrix} \lambda & 0 \\ 0 & D' \end{bmatrix} P^T. \]

So by Theorem 4.54(ii), the columns of $P$ form a basis of eigenvectors of $A$. Since $P$ is the product of orthogonal matrices, it is itself an orthogonal matrix, which means this basis is in fact orthonormal. Therefore the result holds for all $n$ by induction. ∎

Exercise 5.8:

Assume $A \in M_n(\mathbb{R})$ is symmetric with exactly one eigenvalue, $\lambda$. Prove that $A = \lambda I_n$. [Hint: use the spectral decomposition of $A$.]

[End of Exercise]

Example 5.9.

Find a basis of orthonormal eigenvectors for the following matrix, and hence find its spectral decomposition:

\[ A = \begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{bmatrix}. \]

Solution: First, we find the eigenvalues by finding the roots of the characteristic polynomial:

\[ 0 = c_A(\lambda) = \det(A - \lambda I_3) = \cdots = -\lambda^3 + 6\lambda^2 - 9\lambda + 4. \]

Cubic polynomials are, in general, hard to solve. If there is an integer root (which in general there is not, but one can hope!), then it must divide the constant term, which is 4. If we try $\lambda = 1$, we see that $c_A(1) = 0$. Therefore 1 is a root, so we can factor

\[ c_A(\lambda) = (\lambda - 1)(-\lambda^2 + 5\lambda - 4) = -(\lambda - 1)(\lambda - 1)(\lambda - 4). \]

Hence, the eigenvalues are $\lambda = 1$ and $4$. Next, we compute each of the eigenspaces. Omitting details, we find: $V_4 = \operatorname{span}\{(1,1,1)\}$ and $V_1 = \operatorname{span}\{(1,0,-1), (0,1,-1)\}$.
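
As a numerical cross-check on the roots, one could use R's built-in polyroot, which takes the coefficients in increasing order of degree (a sketch, not part of the written solution):

    # c_A(lambda) = 4 - 9*lambda + 6*lambda^2 - lambda^3
    polyroot(c(4, -9, 6, -1))
    # up to rounding: 1, 1 and 4, as complex numbers with zero imaginary part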

For a symmetric matrix, eigenvectors for two different eigenvalues are always orthogonal to each other (see Exercise 5.26). But if an eigenspace has dimension two or larger, then the basis you write down for it is not necessarily orthogonal.

In this example, both vectors $(1,0,-1), (0,1,-1) \in V_1$ are orthogonal to $(1,1,1) \in V_4$, but they are not orthogonal to each other. How do we produce eigenvectors in $V_1$ which are orthogonal to each other? The answer is to use the Gram-Schmidt process. Let $x_1 = (1,0,-1)$ and $x_2 = (0,1,-1)$. Then

  1. $b_1 := x_1 = (1, 0, -1)$,

  2. $b_2 := x_2 - \dfrac{\langle x_2, b_1 \rangle}{\|b_1\|^2}\, b_1 = (0, 1, -1) - \dfrac{1}{2}(1, 0, -1) = \left(-\dfrac{1}{2},\, 1,\, -\dfrac{1}{2}\right)$.

By Theorem 3.36, both $b_1$ and $b_2$ are eigenvectors in $V_1$, and are orthogonal to each other. Next, we scale them so that they have length 1 (of course, scaling doesn't change the fact that they are eigenvectors). So we obtain an orthonormal basis $\mathcal{B}$ of eigenvectors:

  1. $\left(\dfrac{1}{\sqrt{3}}, \dfrac{1}{\sqrt{3}}, \dfrac{1}{\sqrt{3}}\right)$

  2. $\left(\dfrac{1}{\sqrt{2}}, 0, -\dfrac{1}{\sqrt{2}}\right)$

  3. $\left(-\dfrac{1}{\sqrt{6}}, \dfrac{2}{\sqrt{6}}, -\dfrac{1}{\sqrt{6}}\right)$

Finally, we write down the corresponding change of basis matrix and spectral decomposition:

  1. $P = \begin{bmatrix} \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{3}} & 0 & \frac{2}{\sqrt{6}} \\ \frac{1}{\sqrt{3}} & -\frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{6}} \end{bmatrix}$

  2. $A = P \begin{bmatrix} 4 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} P^T$

As a final check, one could perform the matrix multiplication $PDP^T$ and confirm that the resulting matrix is indeed $A$; one could also check that $PP^T = I_3$.
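
Both checks are quick to carry out in R; a minimal sketch with the $P$ and $D$ found above:

    A <- matrix(c(2, 1, 1,
                  1, 2, 1,
                  1, 1, 2), nrow = 3, byrow = TRUE)

    # Columns of P: the orthonormal eigenvectors from Example 5.9
    P <- cbind(c(1, 1, 1) / sqrt(3),
               c(1, 0, -1) / sqrt(2),
               c(-1, 2, -1) / sqrt(6))
    D <- diag(c(4, 1, 1))

    round(P %*% D %*% t(P), 10)  # should reproduce A
    round(P %*% t(P), 10)        # should be the identity matrix I_3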

Summary of the method used in the above example (a short computational sketch in R follows the list):

  • For each eigenvalue $\lambda$, find a basis for the eigenspace $V_\lambda$,

  • For each $\lambda$ such that $\dim V_\lambda \geq 2$, find an orthogonal basis (use Gram-Schmidt),

  • Scale each vector to obtain an orthonormal basis $\mathcal{B}$ of $\mathbb{R}^n$,

  • $P$ is the matrix whose columns are the vectors in $\mathcal{B}$, and $D$ is the diagonal matrix whose entries are the eigenvalues of $A$ (in the same order as $\mathcal{B}$),

  • If done correctly, you should be able to check $A = PDP^T$.
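
Here is the promised sketch in R, applied to the eigenspace $V_1$ from Example 5.9. The helper name gram_schmidt is ours, not a built-in (and in practice eigen(A, symmetric = TRUE) already returns orthonormal eigenvectors directly):

    # Orthonormalise the columns of B by the classical Gram-Schmidt process.
    gram_schmidt <- function(B) {
      for (j in seq_len(ncol(B))) {
        for (i in seq_len(j - 1)) {
          # subtract the component of column j along the (already unit) column i
          B[, j] <- B[, j] - sum(B[, j] * B[, i]) * B[, i]
        }
        B[, j] <- B[, j] / sqrt(sum(B[, j]^2))  # scale to length 1
      }
      B
    }

    V1 <- cbind(c(1, 0, -1), c(0, 1, -1))  # basis of V_1, as columns
    gram_schmidt(V1)                       # orthonormal basis of V_1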

Exercise 5.10:

Let $P$ be the orthogonal matrix found in Example 5.9. Choose your own vector $x \in \mathbb{R}^3$ of length 1. Compute $Px$, and then compute $\|Px\|$. If your answer is not 1, then you have made a mistake, since by Theorem 5.1(v) orthogonal matrices preserve norms.

Exercise 5.11:

Find a basis of orthonormal eigenvectors for the following matrix, and hence obtain its spectral decomposition.

\[ A := \begin{bmatrix} 7 & -2 & -2 \\ -2 & 1 & 4 \\ -2 & 4 & 1 \end{bmatrix} \]

Exercise 5.12:

If $A = PDP^T$, where $P$ is orthogonal and $D$ is diagonal, prove that the columns of $P$ are all eigenvectors of $A$.

[End of Exercise]

A matrix formed by deleting a collection of rows and/or columns of a bigger matrix is known as a submatrix. Given a square matrix $A \in M_n(\mathbb{R})$, the leading principal minor of size $k$ is the determinant of the $k \times k$ submatrix in the upper-left corner of $A$, for any $k = 1, \ldots, n$. In other words, it is the determinant of the matrix formed by deleting the right-most $n - k$ columns and the bottom $n - k$ rows.

Example 5.13.

Find the leading principal minors of $A = \begin{bmatrix} 1 & -3 & 0 \\ -3 & 1 & 2 \\ 0 & 2 & 3 \end{bmatrix}$.

Solution: The determinant of the upper-left $1 \times 1$ submatrix is $1$. The determinant of the upper-left $2 \times 2$ submatrix is $-8$. The upper-left $3 \times 3$ submatrix is all of $A$, which has determinant $-28$. So the leading principal minors are $1$, $-8$, and $-28$.
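
A sketch of the same computation in R (the helper name leading_minors is ours):

    # k-th leading principal minor: det of the upper-left k x k submatrix
    leading_minors <- function(A) {
      sapply(seq_len(nrow(A)), function(k) det(A[1:k, 1:k, drop = FALSE]))
    }

    A <- matrix(c(1, -3, 0,
                  -3, 1, 2,
                  0, 2, 3), nrow = 3, byrow = TRUE)
    leading_minors(A)  # 1 -8 -28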

Exercise 5.14:

Find a matrix $A \in M_3(\mathbb{R})$ such that the entries of $A$ are all non-zero and the leading principal minors of $A$ are all positive numbers.

[End of Exercise]

We saw in Theorem 3.11 that a bilinear form is symmetric exactly when its associated matrix is symmetric. Similarly, we will call a matrix associated to a positive definite form a positive definite matrix; in other words, $A$ is positive definite when

\[ x^T A x > 0 \]

for all non-zero $x \in \mathbb{R}^n$.

Given a symmetric matrix, there are a few convenient tests for positive definiteness:

Theorem 5.15.

Let $A \in M_n(\mathbb{R})$ be real symmetric. The following are equivalent:

  i. $A$ is positive definite,

  ii. All of the eigenvalues of $A$ are positive (i.e. $> 0$),

  iii. (Sylvester's criterion) The leading principal minors of $A$ are positive (i.e. $> 0$).

Criterion (iii) is named after the English mathematician J. J. Sylvester (1814–1897), who discovered many fundamental results in matrix theory.

Proof.

Assume (ii). By the spectral theorem, we can write $A = P^T D P$, where $P$ is orthogonal (in particular invertible) and $D = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$ with $\lambda_i > 0$ for all $i = 1, \ldots, n$. Now take $x \ne 0$. Since $P$ is invertible, $y = Px \ne 0$. Therefore we have that

\[ x^T A x = x^T P^T D P x = (Px)^T D (Px) = y^T D y = \lambda_1 y_1^2 + \cdots + \lambda_n y_n^2 > 0. \]

So we have proved (ii) $\Rightarrow$ (i).

For the opposite direction, assume (i). Let $\lambda$ be an eigenvalue of $A$. By Theorem 5.5, $\lambda \in \mathbb{R}$, so we can find a corresponding eigenvector $x \in \mathbb{R}^n$. In other words, there is a vector $x \in \mathbb{R}^n$ such that $Ax = \lambda x$ and $x \ne 0$. Since $A$ is positive definite by assumption, $\lambda (x^T x) = x^T A x > 0$. But we also know that if $x \ne 0$ then $x^T x > 0$ (since the standard scalar product is a positive definite bilinear form; see Section 3.B). Therefore $\lambda > 0$. So we have shown (i) $\Rightarrow$ (ii).

We omit the proof that these are equivalent to (iii). ∎
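
Criteria (i)-(iii) are easy to compare numerically. A hedged sketch in R (the matrix is an arbitrary symmetric example; note that checking (i) on a single vector gives evidence, not a proof):

    A <- matrix(c(2, -1,
                  -1, 2), nrow = 2, byrow = TRUE)

    # (ii): are all eigenvalues positive?
    all(eigen(A, symmetric = TRUE)$values > 0)

    # (iii): Sylvester's criterion - are all leading principal minors positive?
    all(sapply(seq_len(nrow(A)), function(k) det(A[1:k, 1:k, drop = FALSE])) > 0)

    # (i): spot-check the definition on one non-zero vector
    x <- c(1, 5)
    t(x) %*% A %*% x  # positive for this x (and in fact for every non-zero x)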

Exercise 5.16:

Let $A = \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix}$.

  • Find the eigenvalues of $A$.

  • Calculate the leading principal minors of $A$.

  • Define two vectors $x$ of your choice, and check that for each of them $x^T A x > 0$.

In your opinion, which of these methods is the best way to show positive definiteness?

Exercise 5.17:

Let $A = \begin{bmatrix} 1 & -4 \\ 0 & 1 \end{bmatrix}$. Prove that the leading principal minors are all positive, and also prove that $A$ is not positive definite. Why doesn't this contradict Theorem 5.15?

[End of Exercise]