
4.10 More general $N$ -step formulae

Example 4.10.1.

Suppose that $0<a,b\leq 1$ . Show that

P:=\left(\begin{array}[]{cc}1-a&a\\ b&1-b\end{array}\right)=ADA^{-1}

where

A=\frac{1}{\sqrt{a+b}}\left(\begin{array}[]{cc}1&\sqrt{a/b}\\ 1&-\sqrt{b/a}\end{array}\right)\quad\mbox{and}\quad D=\left(\begin{array}[]{cc}1&0\\ 0&1-(a+b)\end{array}\right).

Hence, show that

P^{n}=\left(\begin{array}[]{cc}1-a&a\\ b&1-b\end{array}\right)^{n}=\frac{1}{a+b}\left(\begin{array}[]{cc}b+a\lambda_{2}^{n}&a-a\lambda_{2}^{n}\\ b-b\lambda_{2}^{n}&a+b\lambda_{2}^{n}\end{array}\right),

where $\lambda_{2}=1-(a+b)$ .

Firstly, we find $A^{-1}$ and check the decomposition by multiplying out $ADA^{-1}$:

A^{-1}=\sqrt{a+b}\left(\begin{array}[]{cc}1&\sqrt{a/b}\\ 1&-\sqrt{b/a}\end{array}\right)^{-1}=\frac{1}{\sqrt{a+b}}\left(\begin{array}[]{cc}b&a\\ \sqrt{ab}&-\sqrt{ab}\end{array}\right).

$\displaystyle ADA^{-1}$	$\displaystyle=$	$\displaystyle\frac{1}{\sqrt{a+b}}\left(\begin{array}[]{cc}1&\sqrt{a/b}\\ 1&-\sqrt{b/a}\end{array}\right)\left(\begin{array}[]{cc}1&0\\ 0&1-(a+b)\end{array}\right)\frac{1}{\sqrt{a+b}}\left(\begin{array}[]{cc}b&a\\ \sqrt{ab}&-\sqrt{ab}\end{array}\right)$
	$\displaystyle=$	$\displaystyle\frac{1}{a+b}\left(\begin{array}[]{cc}1&\sqrt{a/b}\\ 1&-\sqrt{b/a}\end{array}\right)\left(\begin{array}[]{cc}b&a\\ \sqrt{ab}(1-(a+b))&-\sqrt{ab}(1-(a+b))\end{array}\right)$
	$\displaystyle=$	$\displaystyle\frac{1}{a+b}\left(\begin{array}[]{cc}b+a(1-(a+b))&a-a(1-(a+b))\\ b-b(1-(a+b))&a+b(1-(a+b))\end{array}\right)$
	$\displaystyle=$	$\displaystyle\left(\begin{array}[]{cc}1-a&a\\ b&1-b\end{array}\right)=P.$

Secondly, each interior product $A^{-1}A$ cancels, so

P^{n}=(ADA^{-1})^{n}=ADA^{-1}\,ADA^{-1}\cdots ADA^{-1}=AD^{n}A^{-1}.

Therefore

$\displaystyle P^{n}$	$\displaystyle=$	$\displaystyle\frac{1}{\sqrt{a+b}}\left(\begin{array}[]{cc}1&\sqrt{a/b}\\ 1&-\sqrt{b/a}\end{array}\right)\left(\begin{array}[]{cc}1&0\\ 0&(1-(a+b))^{n}\end{array}\right)\frac{1}{\sqrt{a+b}}\left(\begin{array}[]{cc}b&a\\ \sqrt{ab}&-\sqrt{ab}\end{array}\right)$
	$\displaystyle=$	$\displaystyle\frac{1}{a+b}\left(\begin{array}[]{cc}1&\sqrt{a/b}\\ 1&-\sqrt{b/a}\end{array}\right)\left(\begin{array}[]{cc}b&a\\ \sqrt{ab}(1-(a+b))^{n}&-\sqrt{ab}(1-(a+b))^{n}\end{array}\right)$
	$\displaystyle=$	$\displaystyle\frac{1}{a+b}\left(\begin{array}[]{cc}b+a(1-(a+b))^{n}&a-a(1-(a+b))^{n}\\ b-b(1-(a+b))^{n}&a+b(1-(a+b))^{n}\end{array}\right).$
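The closed form can be checked numerically. The following is a minimal sketch in R, assuming illustrative values $a=0.3$, $b=0.6$ and $n=7$ that do not come from the text; the helper names `two_state_power` and `direct_power` are hypothetical.

```r
# Hypothetical helper (not from the text): closed-form P^n for the two-state chain.
two_state_power <- function(a, b, n) {
  lambda2 <- 1 - (a + b)
  matrix(c(b + a * lambda2^n, a - a * lambda2^n,
           b - b * lambda2^n, a + b * lambda2^n),
         nrow = 2, byrow = TRUE) / (a + b)
}

# Direct computation of P^n by repeated multiplication, for comparison.
direct_power <- function(P, n) {
  out <- diag(nrow(P))
  for (k in seq_len(n)) out <- out %*% P
  out
}

a <- 0.3; b <- 0.6; n <- 7   # illustrative values, chosen arbitrarily
P <- matrix(c(1 - a, a, b, 1 - b), nrow = 2, byrow = TRUE)
max_diff <- max(abs(two_state_power(a, b, n) - direct_power(P, n)))
max_diff   # expected to be at the level of floating-point rounding error
```

Setting $n=0$ or $n=1$ in `two_state_power` should return the identity matrix and $P$ respectively, which is a quick way to catch a sign error in the second row.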

Remark.

(a)

Provided $a<1$ or $b<1$ (so that $a+b<2$), $|1-(a+b)|<1$ and so $(1-(a+b))^{n}\rightarrow 0$ as $n\rightarrow\infty$; the $n$-step transition matrix then converges to

$\frac{1}{a+b}\left(\begin{array}[]{cc}b&a\\ b&a\end{array}\right).$
(b)
The geometric rate of convergence is $|1-(a+b)|$. Discuss the behaviour of the chain when
(i) $a=b=0.9$, (ii) $a=b=0.1$, and (iii) $a=b=0.5$.
- (i)
  
  $1-(a+b)=-0.8<0$. The $n$-step probabilities converge while alternating above and below their limits.
- (ii)
  
  $1-(a+b)=0.8>0$. The $n$-step probabilities converge monotonically.
- (iii)
  
  $1-(a+b)=0$. The chain reaches its invariant distribution in one step.
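The three cases can be seen by tabulating $[P^{n}]_{11}=\frac{b+a(1-(a+b))^{n}}{a+b}$ for small $n$. This is only a sketch; the function name `p11` and the range $n=0,\dots,5$ are choices made here for illustration.

```r
# [P^n]_11 from the closed form of Example 4.10.1 (p11 is a hypothetical name).
p11 <- function(a, b, n) (b + a * (1 - (a + b))^n) / (a + b)

n <- 0:5
traj <- rbind("a=b=0.9" = p11(0.9, 0.9, n),
              "a=b=0.1" = p11(0.1, 0.1, n),
              "a=b=0.5" = p11(0.5, 0.5, n))
colnames(traj) <- paste0("n=", n)
round(traj, 4)
```

In each case the limit is $b/(a+b)=0.5$: the first row should jump either side of $0.5$, the second should decrease steadily towards it, and the third should equal it from $n=1$ onwards.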

Theorem 4.10.2.

Let the $m$ -state transition kernel $P$ have eigenvalues $\lambda_{1},\dots,\lambda_{m}$ . If either or both of the following hold

(a)

the eigenvalues are distinct (i.e. $\lambda_{i}\neq\lambda_{j}$ for $i\neq j$ ),
(b)

$P$ is reversible with respect to a stationary distribution $\pi$ ,

then it is possible to decompose $P$ as $P=ADA^{-1}$ , where $A$ is a square matrix and

D:=\left(\begin{array}[]{cccc}\lambda_{1}&0&\dots&0\\ 0&\lambda_{2}&\dots&0\\ \vdots&\vdots&\ddots&\vdots\\ 0&0&\dots&\lambda_{m}\end{array}\right).

Corollary 4.10.3.

If the transition matrix satisfies either or both of the conditions of Theorem 4.10.2 then the $n$ -step transition matrix is

P^{n}=A\left(\begin{array}[]{cccc}\lambda_{1}^{n}&0&\dots&0\\ 0&\lambda_{2}^{n}&\dots&0\\ \vdots&\vdots&\ddots&\vdots\\ 0&0&\dots&\lambda_{m}^{n}\end{array}\right)A^{-1}.

This corollary is very powerful, but actually finding the eigenvalues and the matrix $A$ can be time-consuming. Fortunately R (or other mathematical software) can do much of the hard work.

NB: Note the general form: $[P^{n}]_{ij}$ is a linear combination of $\lambda_{1}^{n},\lambda_{2}^{n},\dots,\lambda_{m}^{n}$ .
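As a rough illustration of the corollary, the sketch below wraps the decomposition in a small R function. The name `nstep_eigen` is hypothetical, the function assumes that $P$ is diagonalisable, and the two-state matrix used to try it out is simply Example 4.10.1 with the illustrative values $a=0.3$, $b=0.6$.

```r
# Hypothetical helper: n-step transition matrix via P = A D A^{-1}.
# Assumes P is diagonalisable (e.g. a condition of Theorem 4.10.2 holds).
nstep_eigen <- function(P, n) {
  e <- eigen(P)
  A <- e$vectors
  Dn <- diag(e$values^n, nrow = length(e$values))  # D^n: powers of the eigenvalues
  Re(A %*% Dn %*% solve(A))                        # discard any zero imaginary parts
}

# Illustrative two-state chain (a = 0.3, b = 0.6 are arbitrary choices).
P2 <- matrix(c(0.7, 0.3,
               0.6, 0.4), nrow = 2, byrow = TRUE)
via_eigen  <- nstep_eigen(P2, 3)
via_direct <- P2 %*% P2 %*% P2
via_eigen
via_direct
```

The two printed matrices should agree up to rounding error. For large $n$ the eigendecomposition route costs one call to `eigen` and one to `solve`, whatever the value of $n$.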

Example 4.10.4.

Consider a Markov chain with the following transition matrix

P=\left(\begin{array}[]{ccc}0.1&0.6&0.3\\ 0.1&0.8&0.1\\ 0.3&0.1&0.6\end{array}\right).

Use R to find the geometric rate of convergence, the asymptotic distribution and the general formula for the probability that a chain which was started in state 1 will be in state 2 after $n$ time-steps.

>   P<-matrix(data=c(0.1,0.6,0.3,0.1,0.8,0.1,0.3,0.1,0.6),byrow=T,nrow=3)
>   P
     [,1] [,2] [,3]
[1,]  0.1  0.6  0.3
[2,]  0.1  0.8  0.1
[3,]  0.3  0.1  0.6
>   a<-eigen(P) # find eigenvalues and matrix A
>   a$values
[1]  1.00000000  0.57015621 -0.07015621

The geometric rate of convergence is the second-largest eigenvalue in modulus, which here is $\approx 0.570$.

>   A<-a$vectors
>   D<-diag(a$values)
>   A
           [,1]        [,2]        [,3]
[1,] -0.5773503  0.04837303  0.91438384
[2,] -0.5773503 -0.41610883 -0.05905442
[3,] -0.5773503  0.90802725 -0.40051813
>   D
     [,1]      [,2]        [,3]
[1,]    1 0.0000000  0.00000000
[2,]    0 0.5701562  0.00000000
[3,]    0 0.0000000 -0.07015621
>   solve(A)
           [,1]       [,2]       [,3]
[1,] -0.2635729 -1.0166385 -0.4518393
[2,]  0.2358878 -0.9083522  0.6724644
[3,]  0.9147313 -0.5938609 -0.3208704
>   A %*% D %*% solve(A)
     [,1] [,2] [,3]
[1,]  0.1  0.6  0.3
[2,]  0.1  0.8  0.1
[3,]  0.3  0.1  0.6

By definition the chain approaches its asymptotic distribution whatever the initial state. Without loss of generality let us assume that it starts in state 1.

As $n\rightarrow\infty$

D^{n}\rightarrow\left(\begin{array}[]{ccc}1&0&0\\ 0&0&0\\ 0&0&0\end{array}\right)

$\displaystyle\left(\begin{array}[]{ccc}1&0&0\end{array}\right)P^{n}$	$\displaystyle\rightarrow$	$\displaystyle\left(\begin{array}[]{ccc}-0.577&0.0484&0.914\end{array}\right)\left(\begin{array}[]{ccc}1&0&0\\ 0&0&0\\ 0&0&0\end{array}\right)\left(\begin{array}[]{ccc}-0.264&-1.017&-0.452\\ 0.236&-0.908&0.672\\ 0.915&-0.594&-0.321\end{array}\right)$
	$\displaystyle\approx$	$\displaystyle\left(\begin{array}[]{ccc}-0.577&0&0\end{array}\right)\left(\begin{array}[]{ccc}-0.264&-1.017&-0.452\\ 0.236&-0.908&0.672\\ 0.915&-0.594&-0.321\end{array}\right)$
	$\displaystyle=$	$\displaystyle\left(\begin{array}[]{ccc}-0.577\times-0.264&-0.577\times-1.017&-0.577\times-0.452\end{array}\right)$
	$\displaystyle\approx$	$\displaystyle\left(\begin{array}[]{ccc}0.152&0.587&0.261\end{array}\right)\approx\pi.$

We can use R to do this calculation.

>   A[1,1]*solve(A)[1,]
[1] 0.1521739 0.5869565 0.2608696
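As a cross-check, the asymptotic distribution should also be the left eigenvector of $P$ for eigenvalue 1, rescaled to sum to 1. The sketch below is one possible way to compute this in R; the variable names `left`, `k`, `v` and `pi_hat` are illustrative.

```r
P <- matrix(c(0.1, 0.6, 0.3,
              0.1, 0.8, 0.1,
              0.3, 0.1, 0.6), nrow = 3, byrow = TRUE)

# Left eigenvectors of P are the (right) eigenvectors of t(P).
left <- eigen(t(P))
k <- which.min(abs(left$values - 1))   # position of the eigenvalue equal to 1
v <- Re(left$vectors[, k])
pi_hat <- v / sum(v)                   # rescale so the entries sum to 1

pi_hat                                 # should agree with the row found above
as.vector(pi_hat %*% P)                # invariance: should reproduce pi_hat
```

Dividing by `sum(v)` also removes the arbitrary sign that `eigen` attaches to each eigenvector.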

Finally we consider the probability that a chain which started in state 1 will be in state 2 after $n$ steps.

$\displaystyle[P^{n}]_{12}$	$\displaystyle=$	$\displaystyle\left(\begin{array}[]{ccc}1&0&0\end{array}\right)P^{n}\left(\begin{array}[]{c}0\\ 1\\ 0\end{array}\right)=[\mbox{top row of }A]~D^{n}~\left[\begin{array}[]{c}\mbox{middle}\\ \mbox{column}\\ \mbox{of }A^{-1}\end{array}\right]$
	$\displaystyle\approx$	$\displaystyle\left(\begin{array}[]{ccc}-0.577&0.0484&0.914\end{array}\right)\left(\begin{array}[]{ccc}1&0&0\\ 0&0.570^{n}&0\\ 0&0&(-0.070)^{n}\end{array}\right)\left(\begin{array}[]{c}-1.017\\ -0.908\\ -0.594\end{array}\right)$
	$\displaystyle\approx$	$\displaystyle-0.577\times-1.017+0.0484\times-0.908\times 0.570^{n}+0.914\times-0.594\times(-0.070)^{n}$
	$\displaystyle\approx$	$\displaystyle 0.587-0.044\times 0.570^{n}-0.543\times(-0.070)^{n}.$

NB: Recall the general form: $[P^{n}]_{ij}=c_{1}\lambda_{1}^{n}+c_{2}\lambda_{2}^{n}+c_{3}\lambda_{3}^{n}$ , a linear combination of the powers of the eigenvalues.

R can be used to obtain the coefficients in the above example.

>   A[1,]*solve(A)[,2]
[1]  0.58695652 -0.04393975 -0.54301677
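The formula for $[P^{n}]_{12}$ can be compared with the entry obtained by multiplying $P$ by itself $n$ times. The following sketch does this for $n=0,\dots,6$; the names `coefs`, `p12_formula`, `p12_direct` and `check` are introduced here and are not part of the example.

```r
P <- matrix(c(0.1, 0.6, 0.3,
              0.1, 0.8, 0.1,
              0.3, 0.1, 0.6), nrow = 3, byrow = TRUE)
e <- eigen(P)
A <- e$vectors

# Coefficients c_1, c_2, c_3: top row of A times middle column of A^{-1}.
coefs <- A[1, ] * solve(A)[, 2]

# [P^n]_12 as a linear combination of the powers of the eigenvalues.
p12_formula <- function(n) sum(coefs * e$values^n)

# [P^n]_12 by repeated multiplication.
p12_direct <- function(n) {
  out <- diag(3)
  for (k in seq_len(n)) out <- out %*% P
  out[1, 2]
}

check <- sapply(0:6, function(n) c(formula = p12_formula(n), direct = p12_direct(n)))
colnames(check) <- paste0("n=", 0:6)
check   # the two rows should agree up to rounding error
```

Two easy sanity checks are $n=0$, where the coefficients must sum to $[P^{0}]_{12}=0$, and $n=1$, where the formula must return $[P]_{12}=0.6$.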

Example 4.10.5.

Consider a Markov chain with the following transition matrix

P=\left(\begin{array}[]{ccc}3/5&2/5&0\\ 1/2&3/10&1/5\\ 0&2/5&3/5\end{array}\right).

The eigenvalues of $P$ are $1,3/5,-1/10$ . Find the invariant distribution and explain why this is the asymptotic distribution. Hence find the general formula for $P^{(n)}_{22}$ .

Note that unlike the previous example, we are not given a decomposition. The Markov chain is irreducible and aperiodic so it has a limiting distribution which is the unique invariant distribution.

Since the matrix is tridiagonal we can use detailed balance to find the invariant distribution.

\pi_{1}\times 2/5=\pi_{2}\times 1/2\Rightarrow\pi_{2}=\frac{4}{5}\pi_{1}

\pi_{2}\times 1/5=\pi_{3}\times 2/5\Rightarrow\pi_{3}=\frac{1}{2}\pi_{2}=\frac{2}{5}\pi_{1}.

So $\pi\propto(5,4,2)$ and, normalising so that the entries sum to 1, $\pi=(5/11,4/11,2/11)$.
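The invariant distribution can be checked in R. The sketch below, with illustrative names `pi_vec` and `flow`, verifies both detailed balance ($\pi_{i}P_{ij}=\pi_{j}P_{ji}$ for all $i,j$) and invariance ($\pi P=\pi$).

```r
P <- matrix(c(3/5, 2/5,  0,
              1/2, 3/10, 1/5,
              0,   2/5,  3/5), nrow = 3, byrow = TRUE)
pi_vec <- c(5, 4, 2) / 11

# flow[i, j] = pi_i * P_ij: the vector is recycled down the columns,
# so row i of P is multiplied by pi_vec[i].
flow <- pi_vec * P

isTRUE(all.equal(flow, t(flow)))                      # detailed balance: flow is symmetric
isTRUE(all.equal(as.vector(pi_vec %*% P), pi_vec))    # invariance: pi P = pi
```

Both lines should print `TRUE`; a symmetric `flow` matrix is exactly the statement that $P$ is reversible with respect to $\pi$.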

Now, since the eigenvalues are distinct, Corollary 4.10.3 gives, for some constants $a$, $b$ and $c$,

[P^{(n)}]_{22}=[P^{n}]_{22}=a+b(3/5)^{n}+c(-1/10)^{n}.

But $\lim_{n\rightarrow\infty}[P^{(n)}]_{22}=4/11$ , $[P^{(0)}]_{22}=1$ , $[P^{(1)}]_{22}=3/10$ . So

$\displaystyle a$	$\displaystyle=$	$\displaystyle 4/11$
$\displaystyle a+b+c$	$\displaystyle=$	$\displaystyle 1$
$\displaystyle a+3b/5-c/10$	$\displaystyle=$	$\displaystyle 3/10.$

Thus $b+c=7/11$ and $3b/5-c/10=-7/110$ . The second equation simplifies to $6b-c=-7/11$ . Adding this to the first equation gives $b=0$ , from which $c=7/11$ . Therefore

[P^{(n)}]_{22}=\frac{4}{11}+\frac{7}{11}\left(\frac{-1}{10}\right)^{n}.
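A quick numerical check of this formula, and of the stated eigenvalues, might look as follows. This is a sketch only; `p22_formula`, `p22_direct` and the range $n=0,\dots,6$ are choices made here.

```r
P <- matrix(c(3/5, 2/5,  0,
              1/2, 3/10, 1/5,
              0,   2/5,  3/5), nrow = 3, byrow = TRUE)

# Eigenvalues, sorted for comparison with 1, 3/5, -1/10.
eig_vals <- sort(Re(eigen(P)$values), decreasing = TRUE)
eig_vals

# Closed form derived above.
p22_formula <- function(n) 4/11 + (7/11) * (-1/10)^n

# [P^n]_22 for n = 0, ..., 6 by repeated multiplication.
p22_direct <- numeric(7)
Pn <- diag(3)
for (n in 0:6) {
  p22_direct[n + 1] <- Pn[2, 2]
  Pn <- Pn %*% P
}

max(abs(p22_direct - p22_formula(0:6)))   # expected to be rounding error only
```

The absence of a $(3/5)^{n}$ term means that this particular entry converges at rate $1/10$ rather than at the rate $3/5$ that governs the chain as a whole.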

NB: If we had not wished to find the asymptotic distribution, we could simply have found $[P^{2}]_{22}$ and solved the resulting, slightly different, set of three simultaneous equations.
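This alternative route can be sketched in R by writing the three equations for $n=0,1,2$ as a linear system and passing it to `solve`. The names `lambda`, `M`, `rhs` and `coefs` are illustrative, and the eigenvalues are taken as given in the example.

```r
P <- matrix(c(3/5, 2/5,  0,
              1/2, 3/10, 1/5,
              0,   2/5,  3/5), nrow = 3, byrow = TRUE)
lambda <- c(1, 3/5, -1/10)          # eigenvalues, as given in the example

# Row for step n holds (lambda_1^n, lambda_2^n, lambda_3^n), n = 0, 1, 2.
M <- outer(0:2, lambda, function(n, l) l^n)

# Known values of [P^n]_22 for n = 0, 1, 2.
P2 <- P %*% P
rhs <- c(1, P[2, 2], P2[2, 2])

coefs <- solve(M, rhs)              # the constants (a, b, c)
coefs
```

The result should match $(4/11,\,0,\,7/11)$ up to rounding error, in agreement with the constants found from the limit.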
