MATH454/554: Project II

TO BE HANDED IN BY MONDAY 11/12/2017 (WEEK 10), 10:00.

This project will contribute 17% towards the final mark.

Submission: Upload the pdf of your answers and your R code file to the Moodle site. Your R code should be submitted as a .r or .txt file so that it can be copied and pasted to run. Also submit a printed copy of your answers (no need for a printed copy of the R code), together with a plagiarism cover sheet, to the MSc submissions pigeon hole. Please write your student ID, not your name, on your answers.

This project looks at modelling the number of movements of a fetal lamb over 240 consecutive periods, each of 5 seconds. The data, given in lamb.txt, come from Leroux and Puterman (1992) and have also been analysed in Fearnhead (2005).

Let $\mathbf{x}=(x_1,x_2,\ldots,x_{240})$ denote the 240 fetal lamb movements. It is assumed that the data arise from the following Markov-dependent mixture model, defined in Fearnhead (2005), Section 3.3. There are two underlying (unobserved) states, which we will term states 1 and 2. Let $y_t$ denote the state at time $t$. Then it is assumed that

\[
x_t \mid y_t \sim
\begin{cases}
\text{Po}(\lambda_1) & \text{if } y_t = 1,\\
\text{Po}(\lambda_2) & \text{if } y_t = 2.
\end{cases}
\]

Furthermore, it is assumed that the unobserved states $\mathbf{y}=(y_1,y_2,\ldots,y_{240})$ follow a Markovian structure with $y_1=1$ and, for $t>1$,

\[
\begin{aligned}
P(y_t=1 \mid y_{t-1}=1) &= p_1,\\
P(y_t=2 \mid y_{t-1}=1) &= 1-p_1,\\
P(y_t=1 \mid y_{t-1}=2) &= 1-p_2,\\
P(y_t=2 \mid y_{t-1}=2) &= p_2.
\end{aligned}
\]

Thus the process $\{y_t\}$ is a Markov chain with transition matrix

\[
\begin{pmatrix}
p_1 & 1-p_1\\
1-p_2 & p_2
\end{pmatrix}.
\]

We are interested in constructing a Gibbs sampler to obtain samples from $\pi(\boldsymbol{\theta} \mid \mathbf{x})$, where $\boldsymbol{\theta}=(\lambda_1,\lambda_2,p_1,p_2)$. We will use data augmentation of $\mathbf{y}$ to achieve this. (Note that $y_1=1$ is known and therefore fixed.)

Assume independent priors for the four parameters, with a Gamma(1,1) prior for $\lambda_1$ and $\lambda_2$ and a Beta(1,1) (uniform) prior for $p_1$ and $p_2$.
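For intuition, the generative model above can be simulated directly. The sketch below uses illustrative parameter values (not estimates from lamb.txt): it draws a state path with $y_1=1$ and then Poisson counts given the states.

```r
# Simulate from the two-state Markov-dependent Poisson mixture.
# Parameter values are illustrative only, not estimates from lamb.txt.
set.seed(1)
n <- 240
lambda <- c(0.3, 3)     # Poisson means for states 1 and 2 (assumed values)
p <- c(0.95, 0.7)       # self-transition probabilities p1, p2 (assumed values)

y <- integer(n)
y[1] <- 1               # the first state is fixed at 1
for (t in 2:n) {
  # stay in the current state with probability p[y[t-1]], otherwise switch
  y[t] <- if (runif(1) < p[y[t - 1]]) y[t - 1] else 3 - y[t - 1]
}
x <- rpois(n, lambda[y])  # observed counts given the state path
```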

  1. Write down $\pi(\mathbf{x},\mathbf{y} \mid \boldsymbol{\theta})$, the likelihood of the observed data (number of movements of the fetal lamb) and the augmented data (mixture component). [2]

  2. For $1<t<240$, show that [1]

     \[
     \pi(y_t \mid \mathbf{x},\mathbf{y}_{-t},\boldsymbol{\theta}) \propto \frac{\lambda_{y_t}^{x_t}}{x_t!}\exp(-\lambda_{y_t})\, p_{y_{t-1}}^{I(y_t=y_{t-1})}(1-p_{y_{t-1}})^{I(y_t \neq y_{t-1})}\, p_{y_t}^{I(y_{t+1}=y_t)}(1-p_{y_t})^{I(y_{t+1} \neq y_t)}.
     \]

     Hence, show that [2]

     \[
     P(y_t=1 \mid \mathbf{x},\mathbf{y}_{-t},\boldsymbol{\theta}) = \frac{Q(1;y_{t-1},y_{t+1},x_t,\boldsymbol{\theta})}{Q(1;y_{t-1},y_{t+1},x_t,\boldsymbol{\theta}) + Q(2;y_{t-1},y_{t+1},x_t,\boldsymbol{\theta})},
     \]

     where, for $s=1,2$,

     \[
     Q(s;y_{t-1},y_{t+1},x_t,\boldsymbol{\theta}) = \lambda_s^{x_t}\exp(-\lambda_s)\, p_{y_{t-1}}^{I(s=y_{t-1})}(1-p_{y_{t-1}})^{I(s \neq y_{t-1})}\, p_s^{I(y_{t+1}=s)}(1-p_s)^{I(y_{t+1} \neq s)}.
     \]

     Similarly,

     \[
     P(y_{240}=1 \mid \mathbf{x},\mathbf{y}_{-240},\boldsymbol{\theta}) = \frac{\lambda_1^{x_{240}}\exp(-\lambda_1)\, p_{y_{239}}^{I(1=y_{239})}(1-p_{y_{239}})^{I(1 \neq y_{239})}}{\lambda_1^{x_{240}}\exp(-\lambda_1)\, p_{y_{239}}^{I(1=y_{239})}(1-p_{y_{239}})^{I(1 \neq y_{239})} + \lambda_2^{x_{240}}\exp(-\lambda_2)\, p_{y_{239}}^{I(2=y_{239})}(1-p_{y_{239}})^{I(2 \neq y_{239})}}.
     \]
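As a numerical check on the algebra, the conditional probabilities above translate directly into R. This is a sketch, not the assessed routine; `lambda` holds $(\lambda_1,\lambda_2)$ and `p` holds $(p_1,p_2)$, and the function names are my own.

```r
# Q(s; y_{t-1}, y_{t+1}, x_t, theta) for s = 1, 2, as defined above.
# R coerces TRUE/FALSE to 1/0, so the indicator exponents can be written directly.
Q <- function(s, yprev, ynext, xt, lambda, p) {
  lambda[s]^xt * exp(-lambda[s]) *
    p[yprev]^(s == yprev) * (1 - p[yprev])^(s != yprev) *
    p[s]^(ynext == s) * (1 - p[s])^(ynext != s)
}

# P(y_t = 1 | x, y_{-t}, theta) for 1 < t < 240
prob_state1 <- function(yprev, ynext, xt, lambda, p) {
  q1 <- Q(1, yprev, ynext, xt, lambda, p)
  q2 <- Q(2, yprev, ynext, xt, lambda, p)
  q1 / (q1 + q2)
}
```

With both neighbours in state 1 and a count of zero, the returned probability is close to 1, as the model structure suggests it should be.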
  3. For $j=1,2$, compute the conditional distribution of $\lambda_j$ given $\mathbf{x}$, $\mathbf{y}$ and $\boldsymbol{\theta}_{-\lambda_j}$. [2]

  4. For $j,k=1,2$, let $M_{jk}=\sum_{t=2}^{240} I(y_{t-1}=j)\,I(y_t=k)$. Show that, for $j=1,2$, the conditional distribution of $p_j$ given $\mathbf{x}$, $\mathbf{y}$ and $\boldsymbol{\theta}_{-p_j}$ satisfies [2]

     \[
     p_j \mid \mathbf{x},\mathbf{y},\boldsymbol{\theta}_{-p_j} \sim \text{Beta}(M_{jj}+1,\, M_{j(3-j)}+1).
     \]
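Since the Beta full conditional is stated above, its use inside a sampler is mechanical: tabulate the transition counts $M_{jk}$ from the current state path and draw each $p_j$. A sketch (the function name is my own):

```r
# Tabulate M_jk = number of transitions from state j to state k in y,
# then draw p_1 and p_2 from their Beta full conditionals.
draw_p <- function(y) {
  M <- matrix(0, nrow = 2, ncol = 2)
  for (t in 2:length(y)) {
    M[y[t - 1], y[t]] <- M[y[t - 1], y[t]] + 1
  }
  c(rbeta(1, M[1, 1] + 1, M[1, 2] + 1),   # p1 | ... ~ Beta(M11 + 1, M12 + 1)
    rbeta(1, M[2, 2] + 1, M[2, 1] + 1))   # p2 | ... ~ Beta(M22 + 1, M21 + 1)
}
```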
  5. Write an R routine to implement the Gibbs sampler. [5]

  6. Run the R routine to obtain a sample of size 51000 from the posterior and discard the first 1000 iterations as burn-in. Compute the posterior means and standard deviations of the parameters. [3]
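Once the sampler has run, the summaries can be computed along the following lines. Here `theta` is a placeholder matrix of dummy draws standing in for the sampler's output, stored row-wise with one column per parameter.

```r
# Posterior summaries after burn-in. `theta` is a dummy stand-in for the
# 51000 x 4 matrix of Gibbs draws (columns: lambda1, lambda2, p1, p2).
set.seed(3)
theta <- matrix(runif(51000 * 4), ncol = 4,
                dimnames = list(NULL, c("lambda1", "lambda2", "p1", "p2")))

post <- theta[-(1:1000), , drop = FALSE]  # discard the first 1000 iterations
post_means <- colMeans(post)              # posterior means
post_sds <- apply(post, 2, sd)            # posterior standard deviations
```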

References

  • Fearnhead, P. (2005) Direct simulation for discrete mixture distributions. Statistics and Computing, 15, 125–133.
  • Leroux, B.G. and Puterman, M.L. (1992) Maximum-penalized-likelihood estimation for independent and Markov-dependent mixture models. Biometrics, 48, 545–558.