MATH454/554: Project II

TO BE HANDED IN BY MONDAY 11/12/2017 (WEEK 10), 10:00.

This project will contribute 17% towards the final mark.

Submission: Upload the pdf of your answers and your R code file to the Moodle site. Your R code should be submitted as a .r or .txt file so that it can be copied and pasted to run. Also submit a printed copy of your answers (no need for a printed copy of the R code), together with a plagiarism cover sheet, to the MSc submissions pigeon hole. Please write your student ID, not your name, on your answers.

This project looks at modelling the number of movements of a fetal lamb over 240 consecutive periods, each of 5 seconds. The data, given in lamb.txt, come from Leroux and Puterman (1992) and have also been analysed in Fearnhead (2005).

Let $\mathbf{x}=(x_1,x_2,\ldots,x_{240})$ denote the 240 fetal lamb movements. It is assumed that the data arise from the following Markov-dependent mixture model, defined in Fearnhead (2005), Section 3.3. There are two underlying (unobserved) states, which we will term states 1 and 2. Let $y_t$ denote the state at time $t$. Then it is assumed that

\[
x_t \mid y_t \sim
\begin{cases}
\text{Po}(\lambda_1) & \text{if } y_t = 1,\\
\text{Po}(\lambda_2) & \text{if } y_t = 2.
\end{cases}
\]

Furthermore, it is assumed that the unobserved states $\mathbf{y}=(y_1,y_2,\ldots,y_{240})$ follow a Markovian structure with $y_1=1$ and, for $t>1$,

\[
\begin{aligned}
P(y_t=1 \mid y_{t-1}=1) &= p_1,\\
P(y_t=2 \mid y_{t-1}=1) &= 1-p_1,\\
P(y_t=1 \mid y_{t-1}=2) &= 1-p_2,\\
P(y_t=2 \mid y_{t-1}=2) &= p_2.
\end{aligned}
\]

Thus the process $\{y_t\}$ is a Markov chain with transition matrix

\[
\begin{pmatrix}
p_1 & 1-p_1\\
1-p_2 & p_2
\end{pmatrix}.
\]

We are interested in constructing a Gibbs sampler to obtain samples from $\pi(\boldsymbol{\theta} \mid \mathbf{x})$, where $\boldsymbol{\theta}=(\lambda_1,\lambda_2,p_1,p_2)$. We will use data augmentation of $\mathbf{y}$ to achieve this. (Note that $y_1=1$ is known and therefore fixed.)

Assume independent priors for the four parameters, with a Gamma(1,1) prior for $\lambda_1$ and $\lambda_2$ and a Beta(1,1) (uniform) prior for $p_1$ and $p_2$.
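For intuition, the generative model above can be simulated directly. The sketch below uses illustrative parameter values (not estimates from lamb.txt): it draws a state path with $y_1=1$ and then Poisson counts given the states.

```r
# Simulate from the two-state Markov-dependent Poisson mixture.
# Parameter values are illustrative only, not estimates from lamb.txt.
set.seed(1)
n <- 240
lambda <- c(0.3, 3)     # Poisson means for states 1 and 2 (assumed values)
p <- c(0.95, 0.7)       # self-transition probabilities p1, p2 (assumed values)

y <- integer(n)
y[1] <- 1               # the first state is fixed at 1
for (t in 2:n) {
  # stay in the current state with probability p[y[t-1]], otherwise switch
  y[t] <- if (runif(1) < p[y[t - 1]]) y[t - 1] else 3 - y[t - 1]
}
x <- rpois(n, lambda[y])  # observed counts given the state path
```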

  1. Write down $\pi(\mathbf{x},\mathbf{y} \mid \boldsymbol{\theta})$, the likelihood of the observed data (number of movements of the fetal lamb) and the augmented data (mixture component). [2]

  2. For $1<t<240$, show that [1]

     \[
     \pi(y_t \mid \mathbf{x},\mathbf{y}_{-t},\boldsymbol{\theta}) \propto \frac{\lambda_{y_t}^{x_t}}{x_t!}\exp(-\lambda_{y_t})\, p_{y_{t-1}}^{I(y_t=y_{t-1})}(1-p_{y_{t-1}})^{I(y_t \neq y_{t-1})}\, p_{y_t}^{I(y_{t+1}=y_t)}(1-p_{y_t})^{I(y_{t+1} \neq y_t)}.
     \]

     Hence, show that [2]

     \[
     P(y_t=1 \mid \mathbf{x},\mathbf{y}_{-t},\boldsymbol{\theta}) = \frac{Q(1;y_{t-1},y_{t+1},x_t,\boldsymbol{\theta})}{Q(1;y_{t-1},y_{t+1},x_t,\boldsymbol{\theta}) + Q(2;y_{t-1},y_{t+1},x_t,\boldsymbol{\theta})},
     \]

     where, for $s=1,2$,

     \[
     Q(s;y_{t-1},y_{t+1},x_t,\boldsymbol{\theta}) = \lambda_s^{x_t}\exp(-\lambda_s)\, p_{y_{t-1}}^{I(s=y_{t-1})}(1-p_{y_{t-1}})^{I(s \neq y_{t-1})}\, p_s^{I(y_{t+1}=s)}(1-p_s)^{I(y_{t+1} \neq s)}.
     \]

     Similarly,

     \[
     P(y_{240}=1 \mid \mathbf{x},\mathbf{y}_{-240},\boldsymbol{\theta}) = \frac{\lambda_1^{x_{240}}\exp(-\lambda_1)\, p_{y_{239}}^{I(1=y_{239})}(1-p_{y_{239}})^{I(1 \neq y_{239})}}{\lambda_1^{x_{240}}\exp(-\lambda_1)\, p_{y_{239}}^{I(1=y_{239})}(1-p_{y_{239}})^{I(1 \neq y_{239})} + \lambda_2^{x_{240}}\exp(-\lambda_2)\, p_{y_{239}}^{I(2=y_{239})}(1-p_{y_{239}})^{I(2 \neq y_{239})}}.
     \]
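As a numerical check on the algebra, the conditional probabilities above translate directly into R. This is a sketch, not the assessed routine; `lambda` holds $(\lambda_1,\lambda_2)$ and `p` holds $(p_1,p_2)$, and the function names are my own.

```r
# Q(s; y_{t-1}, y_{t+1}, x_t, theta) for s = 1, 2, as defined above.
# R coerces TRUE/FALSE to 1/0, so the indicator exponents can be written directly.
Q <- function(s, yprev, ynext, xt, lambda, p) {
  lambda[s]^xt * exp(-lambda[s]) *
    p[yprev]^(s == yprev) * (1 - p[yprev])^(s != yprev) *
    p[s]^(ynext == s) * (1 - p[s])^(ynext != s)
}

# P(y_t = 1 | x, y_{-t}, theta) for 1 < t < 240
prob_state1 <- function(yprev, ynext, xt, lambda, p) {
  q1 <- Q(1, yprev, ynext, xt, lambda, p)
  q2 <- Q(2, yprev, ynext, xt, lambda, p)
  q1 / (q1 + q2)
}
```

With both neighbours in state 1 and a count of zero, the returned probability is close to 1, as the model structure suggests it should be.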
  3. For $j=1,2$, compute the conditional distribution of $\lambda_j$ given $\mathbf{x}$, $\mathbf{y}$ and $\boldsymbol{\theta}_{-\lambda_j}$. [2]

  4. For $j,k=1,2$, let $M_{jk}=\sum_{t=2}^{240} I(y_{t-1}=j)\,I(y_t=k)$. Show that, for $j=1,2$, the conditional distribution of $p_j$ given $\mathbf{x}$, $\mathbf{y}$ and $\boldsymbol{\theta}_{-p_j}$ satisfies [2]

     \[
     p_j \mid \mathbf{x},\mathbf{y},\boldsymbol{\theta}_{-p_j} \sim \text{Beta}(M_{jj}+1,\, M_{j(3-j)}+1).
     \]
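Since the Beta full conditional is stated above, its use inside a sampler is mechanical: tabulate the transition counts $M_{jk}$ from the current state path and draw each $p_j$. A sketch (the function name is my own):

```r
# Tabulate M_jk = number of transitions from state j to state k in y,
# then draw p_1 and p_2 from their Beta full conditionals.
draw_p <- function(y) {
  M <- matrix(0, nrow = 2, ncol = 2)
  for (t in 2:length(y)) {
    M[y[t - 1], y[t]] <- M[y[t - 1], y[t]] + 1
  }
  c(rbeta(1, M[1, 1] + 1, M[1, 2] + 1),   # p1 | ... ~ Beta(M11 + 1, M12 + 1)
    rbeta(1, M[2, 2] + 1, M[2, 1] + 1))   # p2 | ... ~ Beta(M22 + 1, M21 + 1)
}
```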
  5. Write an R routine to implement the Gibbs sampler. [5]

  6. Run the R routine to obtain a sample of size 51000 from the posterior and discard the first 1000 iterations as burn-in. Compute the posterior means and standard deviations of the parameters. [3]
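Once the sampler has run, the summaries can be computed along the following lines. Here `theta` is a placeholder matrix of dummy draws standing in for the sampler's output, stored row-wise with one column per parameter.

```r
# Posterior summaries after burn-in. `theta` is a dummy stand-in for the
# 51000 x 4 matrix of Gibbs draws (columns: lambda1, lambda2, p1, p2).
set.seed(3)
theta <- matrix(runif(51000 * 4), ncol = 4,
                dimnames = list(NULL, c("lambda1", "lambda2", "p1", "p2")))

post <- theta[-(1:1000), , drop = FALSE]  # discard the first 1000 iterations
post_means <- colMeans(post)              # posterior means
post_sds <- apply(post, 2, sd)            # posterior standard deviations
```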

References

  • Fearnhead, P. (2005) Direct simulation for discrete mixture distributions. Statistics and Computing, 15, 125–133.
  • Leroux, B.G. and Puterman, M.L. (1992) Maximum-penalized-likelihood estimation for independent and Markov-dependent mixture models. Biometrics, 48, 545–558.