
4.9 $N$-step formula for transition matrices - intro.

Example 4.9.1.

Define

P:=\frac{1}{60}\left(\begin{array}{ccc}29&8&23\\ 8&44&8\\ 23&8&29\end{array}\right)

and set

A=\frac{1}{\sqrt{6}}\left(\begin{array}{ccc}\sqrt{2}&1&\sqrt{3}\\ \sqrt{2}&-2&0\\ \sqrt{2}&1&-\sqrt{3}\end{array}\right),\quad B=\frac{1}{\sqrt{6}}\left(\begin{array}{ccc}\sqrt{2}&\sqrt{2}&\sqrt{2}\\ 1&-2&1\\ \sqrt{3}&0&-\sqrt{3}\end{array}\right)\quad\mbox{and}\quad D=\left(\begin{array}{ccc}1&0&0\\ 0&3/5&0\\ 0&0&1/10\end{array}\right).

Show that $P=ADB$ and that $AB=I$, and hence find $P^{n}$.

Firstly, combining the two factors of $1/\sqrt{6}$ and multiplying $D$ into the matrix part of $B$,

\begin{aligned}
ADB&=\frac{1}{6}\left(\begin{array}{ccc}\sqrt{2}&1&\sqrt{3}\\ \sqrt{2}&-2&0\\ \sqrt{2}&1&-\sqrt{3}\end{array}\right)\left(\begin{array}{ccc}\sqrt{2}&\sqrt{2}&\sqrt{2}\\ 3/5&-6/5&3/5\\ \sqrt{3}/10&0&-\sqrt{3}/10\end{array}\right)\\
&=\frac{1}{6}\left(\begin{array}{ccc}2+3/5+3/10&2-6/5&2+3/5-3/10\\ 2-6/5&2+12/5&2-6/5\\ 2+3/5-3/10&2-6/5&2+3/5+3/10\end{array}\right)\\
&=\frac{1}{6}\left(\begin{array}{ccc}29/10&4/5&23/10\\ 4/5&22/5&4/5\\ 23/10&4/5&29/10\end{array}\right)=P.
\end{aligned}

Next, $AB$ is

\frac{1}{6}\left(\begin{array}{ccc}\sqrt{2}&1&\sqrt{3}\\ \sqrt{2}&-2&0\\ \sqrt{2}&1&-\sqrt{3}\end{array}\right)\left(\begin{array}{ccc}\sqrt{2}&\sqrt{2}&\sqrt{2}\\ 1&-2&1\\ \sqrt{3}&0&-\sqrt{3}\end{array}\right)=\frac{1}{6}\left(\begin{array}{ccc}6&0&0\\ 0&6&0\\ 0&0&6\end{array}\right)=I.

We have therefore shown that $B=A^{-1}$, so $P=ADA^{-1}$. Hence, since each interior product $A^{-1}A$ cancels to $I$,

P^{n}=(ADA^{-1})^{n}=ADA^{-1}\,ADA^{-1}\,\cdots\,ADA^{-1}=AD^{n}A^{-1}.

Therefore

\begin{aligned}
P^{n}&=\frac{1}{\sqrt{6}}\left(\begin{array}{ccc}\sqrt{2}&1&\sqrt{3}\\ \sqrt{2}&-2&0\\ \sqrt{2}&1&-\sqrt{3}\end{array}\right)\left(\begin{array}{ccc}1&0&0\\ 0&3/5&0\\ 0&0&1/10\end{array}\right)^{n}\frac{1}{\sqrt{6}}\left(\begin{array}{ccc}\sqrt{2}&\sqrt{2}&\sqrt{2}\\ 1&-2&1\\ \sqrt{3}&0&-\sqrt{3}\end{array}\right)\\
&=\frac{1}{6}\left(\begin{array}{ccc}\sqrt{2}&1&\sqrt{3}\\ \sqrt{2}&-2&0\\ \sqrt{2}&1&-\sqrt{3}\end{array}\right)\left(\begin{array}{ccc}1&0&0\\ 0&(3/5)^{n}&0\\ 0&0&(1/10)^{n}\end{array}\right)\left(\begin{array}{ccc}\sqrt{2}&\sqrt{2}&\sqrt{2}\\ 1&-2&1\\ \sqrt{3}&0&-\sqrt{3}\end{array}\right)\\
&=\frac{1}{6}\left(\begin{array}{ccc}\sqrt{2}&1&\sqrt{3}\\ \sqrt{2}&-2&0\\ \sqrt{2}&1&-\sqrt{3}\end{array}\right)\left(\begin{array}{ccc}\sqrt{2}&\sqrt{2}&\sqrt{2}\\ (3/5)^{n}&-2(3/5)^{n}&(3/5)^{n}\\ \sqrt{3}(1/10)^{n}&0&-\sqrt{3}(1/10)^{n}\end{array}\right)\\
&=\frac{1}{6}\left(\begin{array}{ccc}2+(3/5)^{n}+3(1/10)^{n}&2-2(3/5)^{n}&2+(3/5)^{n}-3(1/10)^{n}\\ 2-2(3/5)^{n}&2+4(3/5)^{n}&2-2(3/5)^{n}\\ 2+(3/5)^{n}-3(1/10)^{n}&2-2(3/5)^{n}&2+(3/5)^{n}+3(1/10)^{n}\end{array}\right).
\end{aligned}
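The closed form above can be checked against direct matrix multiplication. The following is a minimal sketch in Python using exact fractions; the function names `matmul`, `power_by_multiplication` and `closed_form` are ours, chosen for illustration, and are not part of the notes.

```python
from fractions import Fraction

# Transition matrix P from Example 4.9.1, held as exact fractions.
P = [[Fraction(x, 60) for x in row]
     for row in [[29, 8, 23], [8, 44, 8], [23, 8, 29]]]


def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    size = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def power_by_multiplication(M, n):
    """Compute M^n by multiplying n copies of M together (n >= 0)."""
    size = len(M)
    result = [[Fraction(int(i == j)) for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = matmul(result, M)
    return result


def closed_form(n):
    """Evaluate the n-step formula for P^n derived in the example."""
    a = Fraction(3, 5) ** n   # (3/5)^n
    b = Fraction(1, 10) ** n  # (1/10)^n
    rows = [[2 + a + 3 * b, 2 - 2 * a, 2 + a - 3 * b],
            [2 - 2 * a, 2 + 4 * a, 2 - 2 * a],
            [2 + a - 3 * b, 2 - 2 * a, 2 + a + 3 * b]]
    return [[entry / 6 for entry in row] for row in rows]


# The two computations agree exactly for the first few values of n.
for n in range(6):
    assert closed_form(n) == power_by_multiplication(P, n)
```

Exact fractions are used rather than floating point so that the comparison is an equality and not an approximation.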

Remark.

(a)

As $n\rightarrow\infty$, $(3/5)^{n}\rightarrow 0$ and $(1/10)^{n}\rightarrow 0$, so

$P^{n}\rightarrow\left(\begin{array}{ccc}1/3&1/3&1/3\\ 1/3&1/3&1/3\\ 1/3&1/3&1/3\end{array}\right)=\left(\begin{array}{c}\pi\\ \pi\\ \pi\end{array}\right),$

where each row is $\pi=(1/3,1/3,1/3)$. We could have found this limit more simply by showing that $\pi$ is the stationary distribution.
(b)

The formula tells us how quickly the Markov chain converges to its stationary distribution, i.e. the rate of convergence. As $n$ grows, $(1/10)^{n}$ quickly becomes negligible compared with $(3/5)^{n}$, so the main discrepancy from the stationary distribution is due to the terms in $(3/5)^{n}$.
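The shortcut mentioned in remark (a) can be checked in a few lines. This is a small sketch in Python with exact fractions; the helper `row_times_matrix` is a name introduced here for illustration. It confirms that $\pi=(1/3,1/3,1/3)$ satisfies $\pi P=\pi$ for the matrix $P$ of Example 4.9.1.

```python
from fractions import Fraction

# Transition matrix P from Example 4.9.1 and the candidate stationary distribution.
P = [[Fraction(x, 60) for x in row]
     for row in [[29, 8, 23], [8, 44, 8], [23, 8, 29]]]
pi = [Fraction(1, 3)] * 3


def row_times_matrix(v, M):
    """Return the row vector v M."""
    return [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(len(M[0]))]


# Each row of a transition matrix sums to 1.
assert all(sum(row) == 1 for row in P)

# Here each column also sums to 1 (P is symmetric), which is why the
# uniform distribution is stationary for this particular chain.
column_sums = [sum(P[i][j] for i in range(3)) for j in range(3)]
assert column_sums == [1, 1, 1]

# pi is stationary: pi P = pi.
assert row_times_matrix(pi, P) == pi
```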

4.9.1 MATH103 revision: eigenvectors and eigenvalues

Definition 4.9.2.

A matrix $M$ is defined to have a left eigenvector $e$ (a nonzero row vector) with eigenvalue $\lambda$ when $eM=\lambda e$.

If $e$ is a left eigenvector of $M$ with eigenvalue $\lambda$ then so is $ce$ for any $c\neq 0$.

If $P$ is a TPM with invariant distribution $\pi$ then $\pi$ is a left eigenvector with eigenvalue $1$, since $\pi P=\pi$.

Suppose that $M=ADA^{-1}$ with

D=\left(\begin{array}{cccc}\lambda_{1}&0&\dots&0\\ 0&\lambda_{2}&\dots&0\\ \vdots&\vdots&\ddots&\vdots\\ 0&0&\dots&\lambda_{m}\end{array}\right).

Set

e_{i}:=\left(\begin{array}{cccccccc}0&0&\dots&0&1&0&\dots&0\end{array}\right)

(i.e. $e_{i}$ has a single $1$ in the $i$th column with the other entries zero) and note that

e_{i}D=\left(\begin{array}{cccccccc}0&0&\dots&0&\lambda_{i}&0&\dots&0\end{array}\right)=\lambda_{i}e_{i}.

Now $e_{i}A^{-1}$ is a left eigenvector of $M$ and its eigenvalue is $\lambda_{i}$ since

e_{i}A^{-1}M=e_{i}A^{-1}\,ADA^{-1}=e_{i}DA^{-1}=\lambda_{i}e_{i}A^{-1}.

We will not be explicitly interested in the eigenvectors of $P$ (except, of course, for $\pi$), but we will use the eigenvalues of $P$, $\lambda_{1},\dots,\lambda_{m}$, through the decomposition $P=ADA^{-1}$.
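As an illustration of the argument above, the rows of $A^{-1}=B$ from Example 4.9.1 should be left eigenvectors of $P$. The sketch below, in Python with exact fractions, checks this after rescaling each row by a nonzero constant to clear the square roots (allowed, since any nonzero multiple of a left eigenvector is again a left eigenvector). The helper name `row_times_matrix` and the list `candidates` are ours, not from the notes.

```python
from fractions import Fraction

# Transition matrix P from Example 4.9.1.
P = [[Fraction(x, 60) for x in row]
     for row in [[29, 8, 23], [8, 44, 8], [23, 8, 29]]]

# Rows of B = A^{-1}, each rescaled by a nonzero constant to remove the surds,
# paired with the corresponding diagonal entry of D.
candidates = [
    ([1, 1, 1], Fraction(1)),       # row 1 of B, times sqrt(3)
    ([1, -2, 1], Fraction(3, 5)),   # row 2 of B, times sqrt(6)
    ([1, 0, -1], Fraction(1, 10)),  # row 3 of B, times sqrt(2)
]


def row_times_matrix(v, M):
    """Return the row vector v M."""
    return [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(len(M[0]))]


# Check e P = lambda e for each candidate pair.
for e, lam in candidates:
    assert row_times_matrix(e, P) == [lam * x for x in e]
```

The first candidate is a multiple of $\pi=(1/3,1/3,1/3)$, matching the earlier observation that $\pi$ is a left eigenvector with eigenvalue $1$.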

4.9.2 Remarks on eigenvalues (based on the Perron-Frobenius Theorem)

(i)

It can be shown that because the entries in any row of a TPM add up to $1$, $|\lambda_{i}|\leq 1$ for all $i$.
(ii)

Further, if $P$ is such that the Markov chain has an asymptotic distribution then $\pi$ is the only left eigenvector (up to scalar multiples) whose eigenvalue has modulus $1$. For all other eigenvectors, $\left|{\lambda_{i}}\right|<1$.
(iii)

Eigenvalues can be complex (i.e. have real and imaginary parts); we will not be dealing with such cases, but extension is straightforward.
(iv)

It is usual to set $\lambda_{1}=1$ and to arrange the other eigenvalues in order of decreasing magnitude; i.e. $1=\lambda_{1}\geq\left|{\lambda_{2}}\right|\geq\left|{\lambda_{3}}\right|\geq\dots\geq\left|{\lambda_{m}}\right|$.

4.9.3 The rate of convergence to the asymptotic distribution

In the example we showed that

P:=\frac{1}{60}\left(\begin{array}{ccc}29&8&23\\ 8&44&8\\ 23&8&29\end{array}\right)=ADA^{-1}\quad\mbox{with}\quad D=\left(\begin{array}{ccc}1&0&0\\ 0&3/5&0\\ 0&0&1/10\end{array}\right)

so the eigenvalues of $P$ are $1$, $3/5$ and $1/10$.

We then found that

P^{n}=\left(ADA^{-1}\right)^{n}=AD^{n}A^{-1}.

As $n\rightarrow\infty$,

D^{n}=\left(\begin{array}{ccc}1&0&0\\ 0&(3/5)^{n}&0\\ 0&0&(1/10)^{n}\end{array}\right)\rightarrow\left(\begin{array}{ccc}1&0&0\\ 0&0&0\\ 0&0&0\end{array}\right).

For any Markov chain with an asymptotic distribution, all eigenvalues except the first have $\left|{\lambda_{i}}\right|<1$ and so $\lambda_{i}^{n}\rightarrow 0$ as $n\rightarrow\infty$ for $i\geq 2$. The above limit for $D^{n}$ therefore holds for any such Markov chain.

As $n$ grows, $(1/10)^{n}$ quickly becomes negligible compared with $(3/5)^{n}$, so the main discrepancy between

P^{n}=A\left(\begin{array}{ccc}1&0&0\\ 0&(3/5)^{n}&0\\ 0&0&(1/10)^{n}\end{array}\right)A^{-1}\quad\mbox{and}\quad\left(\begin{array}{c}\pi\\ \pi\\ \pi\end{array}\right)=A\left(\begin{array}{ccc}1&0&0\\ 0&0&0\\ 0&0&0\end{array}\right)A^{-1}

is due to terms in $(3/5)^{n}$ (check back to the formula for the $n$-step transition matrix). In general the biggest discrepancy from the asymptotic distribution, $\pi$, is due to the second largest (in modulus) eigenvalue, $\lambda_{2}$.
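The identity on the right of the display, that $A\,\mathrm{diag}(1,0,0)\,A^{-1}$ is the matrix with every row equal to $\pi$, can be checked numerically. The following is a rough sketch in Python using floating point (because of the square roots in $A$), so comparisons are made up to a small tolerance. The names `matmul` and `conjugate_diagonal` are ours, and `B` is used as $A^{-1}$, as established in Example 4.9.1.

```python
import math

r2, r3 = math.sqrt(2), math.sqrt(3)
scale = 1 / math.sqrt(6)

# A and B = A^{-1} from Example 4.9.1, as floating-point matrices.
A = [[scale * x for x in row] for row in [[r2, 1, r3], [r2, -2, 0], [r2, 1, -r3]]]
B = [[scale * x for x in row] for row in [[r2, r2, r2], [1, -2, 1], [r3, 0, -r3]]]


def matmul(X, Y):
    """Multiply matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def conjugate_diagonal(d):
    """Return A diag(d) A^{-1} for a list d of three diagonal entries."""
    D = [[d[i] if i == j else 0.0 for j in range(3)] for i in range(3)]
    return matmul(matmul(A, D), B)


limit = conjugate_diagonal([1.0, 0.0, 0.0])          # the limit of A D^n A^{-1}
one_step = conjugate_diagonal([1.0, 3 / 5, 1 / 10])  # A D A^{-1}, which should be P

# Every entry of the limit is 1/3, up to rounding error.
assert all(math.isclose(x, 1 / 3, abs_tol=1e-12) for row in limit for x in row)

# A D A^{-1} reproduces the first row of P, up to rounding error.
expected_first_row = [29 / 60, 8 / 60, 23 / 60]
assert all(math.isclose(x, y, abs_tol=1e-12)
           for x, y in zip(one_step[0], expected_first_row))
```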

$\left|{\lambda_{2}}\right|$ is called the geometric rate of convergence of the Markov chain. The larger $\left|{\lambda_{2}}\right|$ is, the more slowly the Markov chain converges.

NB: If the eigenvalues were $1$, $-3/5$ and $1/10$ then the geometric rate of convergence would still be $3/5$, because only the modulus $\left|{\lambda_{2}}\right|$ matters.
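To see the geometric rate in the example, one can track the largest entrywise gap between $P^{n}$ and the limiting value $1/3$. The sketch below does this in Python with exact fractions; the names `matmul`, `max_discrepancies`, `ratios` and `first_below`, and the tolerance $1/1000$, are our own choices for illustration. In this example the gap shrinks by a factor of exactly $3/5$ at every step, because the largest gap sits in an entry with no $(1/10)^{n}$ term; for other chains one would expect the ratio only to approach $\left|{\lambda_{2}}\right|$.

```python
from fractions import Fraction

# Transition matrix P from Example 4.9.1.
P = [[Fraction(x, 60) for x in row]
     for row in [[29, 8, 23], [8, 44, 8], [23, 8, 29]]]
LIMIT_ENTRY = Fraction(1, 3)  # every entry of the limiting matrix


def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    size = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def max_discrepancies(steps):
    """Largest |(P^n)_{ij} - 1/3| for n = 0, 1, ..., steps."""
    size = len(P)
    current = [[Fraction(int(i == j)) for j in range(size)] for i in range(size)]
    gaps = []
    for _ in range(steps + 1):
        gaps.append(max(abs(x - LIMIT_ENTRY) for row in current for x in row))
        current = matmul(current, P)
    return gaps


gaps = max_discrepancies(15)

# Ratio of successive gaps: the factor by which the error shrinks per step.
ratios = [later / earlier for earlier, later in zip(gaps, gaps[1:])]
assert all(r == Fraction(3, 5) for r in ratios)

# First n at which every entry of P^n is within 1/1000 of 1/3.
first_below = next(n for n, gap in enumerate(gaps) if gap < Fraction(1, 1000))
```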
