5.B Real symmetric matrices

Symmetric matrices occur naturally in applications. For example, the covariance matrix in statistics and the adjacency matrix in graph theory are both symmetric. In both situations it is desirable to find the eigenvalues of the matrix, because those eigenvalues have meaningful interpretations.

But you might ask: “What if there are non-real eigenvalues?” This is a great question, since in general real matrices might have non-real eigenvalues (see Exercise 5.6). Fortunately, we have the following theorem:

Theorem 5.5.

Let $A \in M_n(\mathbb{R})$ be a symmetric matrix. Then every eigenvalue of $A$ is a real number.

For the proof, see Exercise 5.31.

Exercise 5.6:
  i. Choose your own $3 \times 3$ real symmetric matrix which is not diagonal, and find its eigenvalues (they should be real!).

  ii. Find a $3 \times 3$ real matrix which has at least one non-real eigenvalue.

[End of Exercise]

The following theorem decomposes $A$ into simpler components, $P$ and $D$, which are easier to work with. Another way of finding the matrices $P$ and $D$ is to use the computer program R, with the command eigen.
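
For instance, here is a minimal sketch in R (the particular matrix is an arbitrary symmetric example, chosen only for illustration):

    # An arbitrary 2x2 symmetric matrix
    A <- matrix(c(5, 2,
                  2, 2), nrow = 2, byrow = TRUE)

    e <- eigen(A, symmetric = TRUE)
    e$values   # eigenvalues: the diagonal entries of D (here 6 and 1)
    e$vectors  # orthonormal eigenvectors: the columns of P

The argument symmetric = TRUE tells eigen to use a routine for symmetric matrices, which returns real eigenvalues and orthonormal eigenvectors.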

Theorem 5.7 (Spectral decomposition).

Let $A \in M_n(\mathbb{R})$.

  i. If $P$ is a matrix whose columns form an orthonormal basis of eigenvectors of $A$, and $D$ is the diagonal matrix of eigenvalues (in the same order), then

  \[ A = P D P^T. \]

  ii. $A$ has an orthonormal basis of eigenvectors if and only if $A$ is symmetric.

This is also called the spectral theorem. The name comes from applications (in particular in physics), where the set of eigenvalues of a matrix is called its “spectrum”.
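
Part (ii) can also be observed numerically: for a symmetric matrix the eigenvector matrix returned by R's eigen is orthogonal, while for a non-symmetric matrix it generally is not. A hedged sketch (both matrices are arbitrary illustrations):

    S <- matrix(c(5, 2,
                  2, 2), nrow = 2, byrow = TRUE)   # symmetric
    N <- matrix(c(1, 0,
                  1, 2), nrow = 2, byrow = TRUE)   # not symmetric

    Ps <- eigen(S)$vectors
    Pn <- eigen(N)$vectors
    round(t(Ps) %*% Ps, 10)  # the identity: the eigenvectors are orthonormal
    round(t(Pn) %*% Pn, 10)  # off-diagonal entries are non-zero in general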

Proof.

To prove (i), assume $\mathcal{B}$ is an orthonormal basis of eigenvectors of $A$, and $P$ and $D$ are as in the theorem. Then $P = [\operatorname{Id}]^{\mathcal{C}}_{\mathcal{B}}$ is the change of basis matrix from $\mathcal{B}$ to the standard basis $\mathcal{C}$ of $\mathbb{R}^n$. So by Theorem 4.54(i), $P^{-1}AP = D$ is a diagonal matrix. Rearranging this equation we get $A = PDP^{-1}$. But the columns of $P$ form an orthonormal basis, so by Theorem 5.1 we have $P^{-1} = P^T$ (i.e. $P$ is orthogonal). Therefore $A = PDP^T$.

To prove (ii), first notice that if an orthonormal basis of eigenvectors exists, then by part (i) we can write $A = PDP^T$. Since $(ABC)^T = C^T B^T A^T$, and diagonal matrices are symmetric, we see that $A^T = (PDP^T)^T = PD^TP^T = PDP^T = A$, which shows $A$ is symmetric.

The difficult part of (ii) is the other direction. We omit the rest of this proof from this module, but include it here for completeness. Assume $A$ is symmetric. To construct an orthonormal basis, we proceed by induction on the size of $A$. The base case, $n = 1$, follows because any unit vector is an eigenvector and also an orthonormal basis. So let $n > 1$, and assume that the statement of the theorem is true for every square matrix of size $n - 1$. By Theorem 5.5, we can choose an eigenvalue $\lambda \in \mathbb{R}$ of $A$, and let $x_1 \in \mathbb{R}^n$ be a corresponding eigenvector with norm 1.

By Corollary 3.39, we can extend $x_1$ to an orthonormal basis of $\mathbb{R}^n$, which we can write $\mathcal{B} := (x_1, y_2, \ldots, y_n)$. Notice that

\[ \langle x_1, A y_i \rangle = x_1^T A y_i = (A x_1)^T y_i = \lambda x_1^T y_i = 0 \]

for any $i = 2, \ldots, n$, where the second equality uses the symmetry of $A$. So by Theorem 3.33, the vectors $A y_i$ have $x_1$-coordinate equal to zero in the basis $\mathcal{B}$. Therefore, if $Q$ is the change of basis matrix from $\mathcal{B}$ to the standard basis, then by Theorem 4.52

\[ Q^{-1} A Q = \begin{bmatrix} \lambda & 0 \\ 0 & A' \end{bmatrix}. \]

Since $Q$ is orthogonal (Theorem 5.1), we know $Q^{-1} = Q^T$, so the matrix $Q^{-1}AQ$ is symmetric; and therefore so is $A'$. Since $A'$ is a real symmetric matrix of size $(n-1) \times (n-1)$, by our induction assumption there is an orthonormal basis of eigenvectors of $A'$ (in $\mathbb{R}^{n-1}$), and therefore an orthogonal matrix $P'$ such that $A' = P' D' (P')^T$, where $D'$ is diagonal. We create our final matrix $P$ as a matrix product:

\[ P := Q \begin{bmatrix} 1 & 0 \\ 0 & P' \end{bmatrix}, \]

because then

\[ A = P \begin{bmatrix} \lambda & 0 \\ 0 & D' \end{bmatrix} P^T. \]

So by Theorem 4.54(ii), the columns of $P$ form a basis of eigenvectors of $A$. Since $P$ is the product of orthogonal matrices, it is itself an orthogonal matrix, which means this basis is in fact orthonormal. Therefore the result holds for all $n$ by induction. ∎

Exercise 5.8:

Assume $A \in M_n(\mathbb{R})$ is symmetric with exactly one eigenvalue, $\lambda$. Prove that $A = \lambda I_n$. [Hint: use the spectral decomposition of $A$.]

[End of Exercise]

Example 5.9.

Find a basis of orthonormal eigenvectors for the following matrix, and hence find its spectral decomposition:

\[ A = \begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{bmatrix}. \]

Solution: First, we find the eigenvalues by finding the roots of the characteristic polynomial:

\[ 0 = c_A(\lambda) = \det(A - \lambda I_3) = \cdots = -\lambda^3 + 6\lambda^2 - 9\lambda + 4. \]

Cubic polynomials are, in general, hard to solve. If there is an integer root (which in general there is not, but one can hope!), then it must divide the constant term, which is 4. If we try $\lambda = 1$, we see that $c_A(1) = 0$. Therefore 1 is a root, so we can factor

\[ c_A(\lambda) = (\lambda - 1)(-\lambda^2 + 5\lambda - 4) = -(\lambda - 1)(\lambda - 1)(\lambda - 4). \]

Hence, the eigenvalues are $\lambda = 1$ and $4$. Next, we compute each of the eigenspaces. Omitting details, we find: $V_4 = \operatorname{span}\{(1,1,1)\}$ and $V_1 = \operatorname{span}\{(1,0,-1), (0,1,-1)\}$.
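
As a numerical cross-check on the roots, one could use R's built-in polyroot, which takes the coefficients in increasing order of degree (a sketch, not part of the written solution):

    # c_A(lambda) = 4 - 9*lambda + 6*lambda^2 - lambda^3
    polyroot(c(4, -9, 6, -1))
    # up to rounding: 1, 1 and 4, as complex numbers with zero imaginary part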

For a symmetric matrix, eigenvectors for two different eigenvalues are always orthogonal to each other (see Exercise 5.26). But if an eigenspace has dimension two or larger, then the basis you write down for it is not necessarily orthogonal.

In this example, both vectors $(1,0,-1), (0,1,-1) \in V_1$ are orthogonal to $(1,1,1) \in V_4$, but they are not orthogonal to each other. How do we produce eigenvectors in $V_1$ which are orthogonal to each other? The answer is to use the Gram-Schmidt process. Let $x_1 = (1,0,-1)$ and $x_2 = (0,1,-1)$. Then

  1. $b_1 := x_1 = (1, 0, -1)$,

  2. $b_2 := x_2 - \dfrac{\langle x_2, b_1 \rangle}{\|b_1\|^2}\, b_1 = (0, 1, -1) - \dfrac{1}{2}(1, 0, -1) = \left(-\dfrac{1}{2},\, 1,\, -\dfrac{1}{2}\right)$.

By Theorem 3.36, both $b_1$ and $b_2$ are eigenvectors in $V_1$, and are orthogonal to each other. Next, we scale them so that they have length 1 (of course, scaling doesn't change the fact that they are eigenvectors). So we obtain an orthonormal basis $\mathcal{B}$ of eigenvectors:

  1. $\left(\dfrac{1}{\sqrt{3}}, \dfrac{1}{\sqrt{3}}, \dfrac{1}{\sqrt{3}}\right)$

  2. $\left(\dfrac{1}{\sqrt{2}}, 0, -\dfrac{1}{\sqrt{2}}\right)$

  3. $\left(-\dfrac{1}{\sqrt{6}}, \dfrac{2}{\sqrt{6}}, -\dfrac{1}{\sqrt{6}}\right)$

Finally, we write down the corresponding change of basis matrix and spectral decomposition:

  1. $P = \begin{bmatrix} \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{3}} & 0 & \frac{2}{\sqrt{6}} \\ \frac{1}{\sqrt{3}} & -\frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{6}} \end{bmatrix}$

  2. $A = P \begin{bmatrix} 4 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} P^T$

As a final check, one could perform the matrix multiplication $PDP^T$ and confirm that the resulting matrix is indeed $A$; one could also check that $PP^T = I_3$.
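
Both checks are quick to carry out in R; a minimal sketch with the $P$ and $D$ found above:

    A <- matrix(c(2, 1, 1,
                  1, 2, 1,
                  1, 1, 2), nrow = 3, byrow = TRUE)

    # Columns of P: the orthonormal eigenvectors from Example 5.9
    P <- cbind(c(1, 1, 1) / sqrt(3),
               c(1, 0, -1) / sqrt(2),
               c(-1, 2, -1) / sqrt(6))
    D <- diag(c(4, 1, 1))

    round(P %*% D %*% t(P), 10)  # should reproduce A
    round(P %*% t(P), 10)        # should be the identity matrix I_3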

Summary of the method used in the above example (a short computational sketch in R follows the list):

  • For each eigenvalue $\lambda$, find a basis for the eigenspace $V_\lambda$,

  • For each $\lambda$ such that $\dim V_\lambda \geq 2$, find an orthogonal basis (use Gram-Schmidt),

  • Scale each vector to obtain an orthonormal basis $\mathcal{B}$ of $\mathbb{R}^n$,

  • $P$ is the matrix whose columns are the vectors in $\mathcal{B}$, and $D$ is the diagonal matrix whose entries are the eigenvalues of $A$ (in the same order as $\mathcal{B}$),

  • If done correctly, you should be able to check $A = PDP^T$.
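
Here is the promised sketch in R, applied to the eigenspace $V_1$ from Example 5.9. The helper name gram_schmidt is ours, not a built-in (and in practice eigen(A, symmetric = TRUE) already returns orthonormal eigenvectors directly):

    # Orthonormalise the columns of B by the classical Gram-Schmidt process.
    gram_schmidt <- function(B) {
      for (j in seq_len(ncol(B))) {
        for (i in seq_len(j - 1)) {
          # subtract the component of column j along the (already unit) column i
          B[, j] <- B[, j] - sum(B[, j] * B[, i]) * B[, i]
        }
        B[, j] <- B[, j] / sqrt(sum(B[, j]^2))  # scale to length 1
      }
      B
    }

    V1 <- cbind(c(1, 0, -1), c(0, 1, -1))  # basis of V_1, as columns
    gram_schmidt(V1)                       # orthonormal basis of V_1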

Exercise 5.10:

Let $P$ be the orthogonal matrix found in Example 5.9. Choose your own vector $x \in \mathbb{R}^3$ of length 1. Compute $Px$, and then compute $\|Px\|$. If your answer is not 1, then you have made a mistake, since by Theorem 5.1(v) orthogonal matrices preserve norms.

Exercise 5.11:

Find a basis of orthonormal eigenvectors for the following matrix, and hence obtain its spectral decomposition.

\[ A := \begin{bmatrix} 7 & -2 & -2 \\ -2 & 1 & 4 \\ -2 & 4 & 1 \end{bmatrix} \]

Exercise 5.12:

If $A = PDP^T$, where $P$ is orthogonal and $D$ is diagonal, prove that the columns of $P$ are all eigenvectors of $A$.

[End of Exercise]

A matrix formed by deleting a collection of rows and/or columns of a bigger matrix is known as a submatrix. Given a square matrix $A \in M_n(\mathbb{R})$, the leading principal minor of size $k$ is the determinant of the $k \times k$ submatrix in the upper-left corner of $A$, for any $k = 1, \ldots, n$. In other words, it is the determinant of the matrix formed by deleting the right-most $n - k$ columns and the bottom $n - k$ rows.

Example 5.13.

Find the leading principal minors of $A = \begin{bmatrix} 1 & -3 & 0 \\ -3 & 1 & 2 \\ 0 & 2 & 3 \end{bmatrix}$.

Solution: The determinant of the upper-left $1 \times 1$ submatrix is $1$. The determinant of the upper-left $2 \times 2$ submatrix is $-8$. The upper-left $3 \times 3$ submatrix is all of $A$, which has determinant $-28$. So the leading principal minors are $1$, $-8$, and $-28$.
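
A sketch of the same computation in R (the helper name leading_minors is ours):

    # k-th leading principal minor: det of the upper-left k x k submatrix
    leading_minors <- function(A) {
      sapply(seq_len(nrow(A)), function(k) det(A[1:k, 1:k, drop = FALSE]))
    }

    A <- matrix(c(1, -3, 0,
                  -3, 1, 2,
                  0, 2, 3), nrow = 3, byrow = TRUE)
    leading_minors(A)  # 1 -8 -28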

Exercise 5.14:

Find a matrix $A \in M_3(\mathbb{R})$ such that the entries of $A$ are all non-zero and the leading principal minors of $A$ are all positive numbers.

[End of Exercise]

We saw in Theorem 3.11 that a bilinear form is symmetric exactly when its associated matrix is symmetric. Similarly, we will call a matrix associated to a positive definite form a positive definite matrix; in other words, $A$ is positive definite when

\[ x^T A x > 0 \]

for all non-zero $x \in \mathbb{R}^n$.

Given a symmetric matrix, there are a few convenient tests for positive definiteness:

Theorem 5.15.

Let $A \in M_n(\mathbb{R})$ be real symmetric. The following are equivalent:

  i. $A$ is positive definite,

  ii. All of the eigenvalues of $A$ are positive (i.e. $> 0$),

  iii. (Sylvester's criterion) The leading principal minors of $A$ are positive (i.e. $> 0$).

Criterion (iii) is named after the English mathematician J. J. Sylvester (1814–1897), who discovered many fundamental results in matrix theory.

Proof.

Assume (ii). By the spectral theorem, we can write $A = P^T D P$, where $P$ is orthogonal (in particular invertible) and $D = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$ with $\lambda_i > 0$ for all $i = 1, \ldots, n$. Now take $x \ne 0$. Since $P$ is invertible, $y = Px \ne 0$. Therefore we have that

\[ x^T A x = x^T P^T D P x = (Px)^T D (Px) = y^T D y = \lambda_1 y_1^2 + \cdots + \lambda_n y_n^2 > 0. \]

So we have proved (ii) $\Rightarrow$ (i).

For the opposite direction, assume (i). Let $\lambda$ be an eigenvalue of $A$. By Theorem 5.5, $\lambda \in \mathbb{R}$, so we can find a corresponding eigenvector $x \in \mathbb{R}^n$. In other words, there is a vector $x \in \mathbb{R}^n$ such that $Ax = \lambda x$ and $x \ne 0$. Since $A$ is positive definite by assumption, $\lambda (x^T x) = x^T A x > 0$. But we also know that if $x \ne 0$ then $x^T x > 0$ (since the standard scalar product is a positive definite bilinear form; see Section 3.B). Therefore $\lambda > 0$. So we have shown (i) $\Rightarrow$ (ii).

We omit the proof that these are equivalent to (iii). ∎
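
Criteria (i)-(iii) are easy to compare numerically. A hedged sketch in R (the matrix is an arbitrary symmetric example; note that checking (i) on a single vector gives evidence, not a proof):

    A <- matrix(c(2, -1,
                  -1, 2), nrow = 2, byrow = TRUE)

    # (ii): are all eigenvalues positive?
    all(eigen(A, symmetric = TRUE)$values > 0)

    # (iii): Sylvester's criterion - are all leading principal minors positive?
    all(sapply(seq_len(nrow(A)), function(k) det(A[1:k, 1:k, drop = FALSE])) > 0)

    # (i): spot-check the definition on one non-zero vector
    x <- c(1, 5)
    t(x) %*% A %*% x  # positive for this x (and in fact for every non-zero x)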

Exercise 5.16:

Let $A = \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix}$.

  • Find the eigenvalues of $A$.

  • Calculate the leading principal minors of $A$.

  • Define two vectors $x$ of your choice, and check that for each of them $x^T A x > 0$.

In your opinion, which of these methods is the best way to show positive definiteness?

Exercise 5.17:

Let $A = \begin{bmatrix} 1 & -4 \\ 0 & 1 \end{bmatrix}$. Prove that the leading principal minors are all positive, and also prove that $A$ is not positive definite. Why doesn't this contradict Theorem 5.15?

[End of Exercise]