4 Markov chains

4.10 More general N-step formulae

Example 4.10.1.

Suppose that 0<a,b1. Show that

P:=(1-aab1-b)=ADA-1

where

A=1a+b(1a/b1-b/a)andD=(1001-(a+b)).

Hence, show that

Pn=(1-aab1-b)n=1a+b(b+aλ2na-aλ2nb+bλ2na-bλ2n),

where λ2=1-(a+b).

Firstly

A-1=a+b(1a/b1-b/a)-1=1a+b(baab-ab).

So

ADA-1 = 1a+b(1a/b1-b/a)(1001-(a+b))1a+b(baab-ab)
= 1a+b(1a/b1-b/a)(baab(1-(a+b))-ab(1-(a+b)))
= 1a+b(b+a(1-(a+b))a-a(1-(a+b))b-b(1-(a+b))a+b(1-(a+b)))
= (1-aab1-b)=P.

Next

Pn=(ADA-1)n=ADA-1ADA-1ADA-1=ADnA-1.

Therefore

Pn = 1a+b(1a/b1-b/a)(100(1-(a+b))n)1a+b(baab-ab)
= 1a+b(1a/b1-b/a)(baab(1-(a+b))n-ab(1-(a+b))n)
= 1a+b(b+a(1-(a+b))na-a(1-(a+b))nb-b(1-(a+b))na+b(1-(a+b))n).
Remark.
  • (a)

    Provided a<1 or b<1 , |1-(a+b)|<1 and so (1-(a+b))n0 and the n-step transition matrix converges to

    1a+b(baba).
  • (b)

    The rate of convergence is |1-(a+b)|. Discuss the behaviour of the chain when
    (i) a=b=0.9, (ii) a=b=0.1, and (iii) a=b=0.5.

    • (i)

      1-(a+b)=-0.8<0. Chain converges by alternating.

    • (ii)

      1-(a+b)=0.8>0. Chain converges monotonically.

    • (iii)

      1-(a+b)=0. Chain enters invariant distribution in 1 step.

Theorem 4.10.2.

Let the m-state transition kernel P have eigenvalues λ1,,λm. If either or both of the following hold

  • (a)

    the eigenvalues are distinct (i.e. λiλj for ij),

  • (b)

    P is reversible with respect to a stationary distribution π,

then it is possible to decompose P as P=ADA-1, where A is a square matrix and

D:=(λ1000λ2000λm).
Corollary 4.10.3.

If the transition matrix satisfies either or both of the conditions of Theorem 4.10.2 then the n-step transition matrix is

Pn=A(λ1n000λ2n000λmn)A-1.

This corollary is very powerful, but actually finding the eigenvalues and the matrix A can be time-consuming. Fortunately R (or other mathematical software) can do much of the hard work.

NB: Note the general form: [Pn]ij is a linear combination of λ1n,λ2n,,λmn.

Example 4.10.4.

Consider a Markov chain with the following transition matrix

P=(0.10.60.30.10.80.10.30.10.6).

Use R to find the geometric rate of convergence, the asymptotic distribution and the general formula for the probability that a chain which was started in state 1 will be in state 2 after n time-steps.

>   P<-matrix(data=c(0.1,0.6,0.3,0.1,0.8,0.1,0.3,0.1,0.6),byrow=T,nrow=3)
>   P
     [,1] [,2] [,3]
[1,]  0.1  0.6  0.3
[2,]  0.1  0.8  0.1
[3,]  0.3  0.1  0.6
>   a<-eigen(P) # find eigenvalues and matrix A
>   a$values
[1]  1.00000000  0.57015621 -0.07015621

The geometric rate of convergence is therefore 0.570.

>   A<-a$vectors
>   D<-diag(a$values)
>   A
           [,1]        [,2]        [,3]
[1,] -0.5773503  0.04837303  0.91438384
[2,] -0.5773503 -0.41610883 -0.05905442
[3,] -0.5773503  0.90802725 -0.40051813
>   D
     [,1]      [,2]        [,3]
[1,]    1 0.0000000  0.00000000
[2,]    0 0.5701562  0.00000000
[3,]    0 0.0000000 -0.07015621
>   solve(A)
           [,1]       [,2]       [,3]
[1,] -0.2635729 -1.0166385 -0.4518393
[2,]  0.2358878 -0.9083522  0.6724644
[3,]  0.9147313 -0.5938609 -0.3208704
>   A %*% D %*% solve(A)
     [,1] [,2] [,3]
[1,]  0.1  0.6  0.3
[2,]  0.1  0.8  0.1
[3,]  0.3  0.1  0.6
>

By definition the chain approaches its asymptotic distribution whatever the initial state. Without loss of generality let us assume that it starts in state 1.

As n

Dn(100000000)

so

(100)Pn (-0.5770.04840.914)(100000000)(-0.264-1.017-0.4520.236-0.9080.6720.915-0.594-0.321)
(-0.57700)(-0.264-1.017-0.4520.236-0.9080.6720.915-0.594-0.321)
= (-0.577×-0.264-0.577×-1.017-0.577×-0.452)
(0.1520.5870.261)π.

We can use R to do this calculation.

>   A[1,1]*solve(A)[1,]
[1] 0.1521739 0.5869565 0.2608696

Finally we consider the probability that a chain which started in state 1 will be in state 2 after n steps.

[Pn]12 = (100)Pn(010)=[top row of A]Dn[middlecolumnof A-1]
(-0.5770.04840.914)(10000.570n000(-0.070)n)(-1.017-0.908-0.594)
-0.577×-1.017+0.0484×-0.908×0.570n+0.914×-0.594×(-0.070)n
0.587-0.044×0.570n-0.543×(-0.070)n.

NB: Recall the general form: [Pn]ij=c1λ1n+c2λ2n+c3λ3n, a linear combination of the powers of the eigenvalues.

R can be used to obtain the coefficients in the above example.

>   A[1,]*solve(A)[,2]
[1]  0.58695652 -0.04393975 -0.54301677
Example 4.10.5.

Consider a Markov chain with the following transition matrix

P=(3/52/501/23/101/502/53/5).

The eigenvalues of P are 1,3/5,-1/10. Find the invariant distribution and explain why this is the asymptotic distribution. Hence find the general formula for P22(n).

Note that unlike the previous example, we are not given a decomposition. The Markov chain is irreducible and aperiodic so it has a limiting distribution which is the unique invariant distribution.

Since the matrix is tridiagonal we can use detailed balance to find the invariant distribution.

π1×2/5=π2×1/2π2=45π1
π2×1/5=π3×2/5π3=12π2=25π1.

So π(5,4,2) hence π=(5/11,4/11,2/11).

Now

[P(n)]22=[Pn]22=a+b(3/5)n+c(-1/10)n.

But limn[P(n)]22=4/11, [P(0)]22=1, [P(1)]22=3/10. So

a = 4/11
a+b+c = 1
a+3b/5-c/10 = 3/10.

Thus b+c=7/11 and 3b/5-c/10=-7/110. The second equation simplifies to 6b-c=-7/11. Adding this to the first equation gives b=0, from which c=7/11. Therefore

[P(n)]22=411+711(-110)n.

NB: If we had not wished to find the asymptotic distribution we could simple have found [P2]22 and solved the slightly different set of simultaneous equations.