Math330 Exercises Week 5

There is no coursework this week but here are some exercises.

WS5.1

Suppose that two models are fit to some data $\bm{x}$. Model A has 3 parameters and is nested in Model B, which has 4 parameters. The log-likelihoods evaluated at their respective MLEs are given in the table below.

| Model | $\ell(\hat{\bm{\theta}})$ |
|---|---|
| A | -110.3 |
| B | -108.8 |

Carry out a likelihood ratio test to determine a preferred model. Which model is preferred by the AIC? How do you explain this?
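
As a hedged sketch of the arithmetic involved, the following R snippet computes the likelihood ratio statistic and the AIC for the two models from the tabulated log-likelihoods. The variable names (`llA`, `kA`, and so on) are illustrative assumptions, the 5% significance level is an assumed choice, and the AIC is taken to be $-2\ell(\hat{\bm{\theta}})+2k$, where $k$ is the number of parameters.

```r
# Maximised log-likelihoods and parameter counts from the table
# (variable names are illustrative, not from the notes)
llA <- -110.3; kA <- 3
llB <- -108.8; kB <- 4

# Likelihood ratio statistic for the nested pair,
# compared with a chi-squared distribution on kB - kA degrees of freedom
W <- 2 * (llB - llA)
df <- kB - kA
crit <- qchisq(0.95, df = df)                    # assumed 5% level
pval <- pchisq(W, df = df, lower.tail = FALSE)

# AIC = -2 * log-likelihood + 2 * (number of parameters); smaller is preferred
aicA <- -2 * llA + 2 * kA
aicB <- -2 * llB + 2 * kB

c(W = W, crit = crit, pval = pval, aicA = aicA, aicB = aicB)
```

Comparing `W` with `crit`, and `aicA` with `aicB`, shows how each criterion penalises the extra parameter.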

WS5.2

Return to the hospital example (Example 6.2) in the notes. In fact, hospitals B, H, J and K are specialist hospitals, which therefore take riskier surgical cases than the remaining hospitals. Hence an intermediate hypothesis can be considered: $H_{2}:$ there are two mortality rates, one for the specialist hospitals and one for the other hospitals.

What would the MLEs for the mortality rates be under $H_{2}$?
The following R code is run along with the code on page 94 of the notes (check you agree this corresponds to $H_{2}$):
```
spec<-c(0,1,0,0,0,0,0,1,0,1,1,0)
ts<-sum(r[spec==1])/sum(m[spec==1])
tr<-sum(r[spec==0])/sum(m[spec==0])

theta2<-c(tr,ts,tr,tr,tr,tr,tr,ts,tr,ts,ts,tr)
hosploglhd(theta=theta2,r,m)
```
and gives a log-likelihood of -30.54259.
Carry out likelihood ratio tests comparing all relevant hypotheses. Which model is to be preferred? Is the same conclusion reached by the AIC?

Note qchisq(0.95,df=10)=18.31.
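
The degrees of freedom and critical values needed for these comparisons can be sketched in R as below. This is a minimal sketch under stated assumptions: it takes $H_{0}$ to be a single common mortality rate (1 parameter) and $H_{1}$ to be a separate rate for each of the 12 hospitals (12 parameters), which is how Example 6.2 is assumed to be set up, and it uses a 5% level. The names `npar`, `df`, `crit`, `ll2` and `aic2` are illustrative. Only the $H_{2}$ log-likelihood is quoted on this sheet, so the other log-likelihoods must be taken from the notes.

```r
# Number of free parameters under each hypothesis
# (assumption: H0 = one common rate, H1 = one rate per hospital, 12 hospitals)
npar <- c(H0 = 1, H2 = 2, H1 = 12)

# Degrees of freedom for each nested comparison
df <- c("H0 vs H2" = npar[["H2"]] - npar[["H0"]],
        "H2 vs H1" = npar[["H1"]] - npar[["H2"]],
        "H0 vs H1" = npar[["H1"]] - npar[["H0"]])

# 5% critical values of the chi-squared reference distributions
crit <- setNames(qchisq(0.95, df = unname(df)), names(df))
round(crit, 2)

# AIC for H2, using the log-likelihood quoted above
ll2 <- -30.54259
aic2 <- -2 * ll2 + 2 * npar[["H2"]]
aic2
```

Each likelihood ratio statistic, twice the difference in maximised log-likelihoods, is then compared with the matching entry of `crit`.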

WS5.3

In the UK lotto, 6 balls are drawn from a set of 49 consecutively numbered balls (we exclude the bonus ball here). Do odd numbered balls each have the same chance of being drawn as even numbered balls? The following table gives the number of odd balls drawn in each lotto up to 31/10/12:

| Odd balls | Frequency |
|---|---|
| 6 | 19 |
| 5 | 188 |
| 4 | 428 |
| 3 | 543 |
| 2 | 436 |
| 1 | 129 |
| 0 | 16 |

For example, 19 draws had no even balls (all balls drawn were odd). In summary, there have been 1759 draws in total, and 5396 odd balls drawn. Interest lies in $\theta_{0}$, the proportion of odd balls drawn.
- (a) Formulate appropriate null and alternative hypotheses for $\theta_{0}$.
- (b) One proposal is to model the number of odd balls in each lotto as Binomial $(6,\theta)$. Calculate the resulting log-likelihood and find the MLE. Calculate the deviance of $\theta$ assuming the null hypothesis is true and hence carry out the hypothesis test specified in (a).
- (c) Explain why the Binomial distribution is not exactly correct in this case. (HINT: What’s the probability that the second drawn ball is odd, given that the first ball was odd?)
- (d) [Extra: Not examinable] The exact distribution for this case is complex and the PMF is not available in closed form. One alternative is to simulate $B$ data sets, for some large number $B$, under the null hypothesis (that odd balls are not preferred over even balls) and calculate the MLE for each of the $B$ simulated data sets.
  Concretely, consider the following R code:
```
B<-1000 #large number of samples

nOdd<-matrix(nrow=B,ncol=1759)
#this large matrix will store the number of odd balls in each draw
#rows: different samples
#cols: different draws within a sample

for(b in 1:B){ #for-loop over B samples
  for(i in 1:1759){ #for-loop over all 1759 draws
    draw<-sample(1:49,6) #6 numbers drawn without replacement
    nOdd[b,i]<-sum(draw%%2) #remainder on division by 2 is 1 for
    #odd numbers, so the sum counts the odd balls in the draw
  }
}

thetahat<-rowMeans(nOdd)/6
#gives a vector of thetahats, one for each sample (row)

quantile(thetahat,probs=c(0.025,0.5,0.975))
#numerical 95% interval (and median) for thetahat under the null
```
  This approach is called the parametric bootstrap.
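
For part (b), the Binomial calculation can be sketched in R as follows. This is only an illustration under stated assumptions: the null value `theta0 <- 25/49` assumes that every ball is equally likely to be drawn (25 of the 49 balls are odd), which is one possible answer to part (a), and the names `odd`, `freq`, `loglik`, `theta_hat`, `theta0` and `D` are illustrative.

```r
# Observed frequencies of the number of odd balls per draw (from the table)
odd  <- 6:0
freq <- c(19, 188, 428, 543, 436, 129, 16)

# Binomial(6, theta) log-likelihood for the grouped data
loglik <- function(theta) {
  sum(freq * dbinom(odd, size = 6, prob = theta, log = TRUE))
}

# MLE: total number of odd balls divided by total number of balls drawn
theta_hat <- sum(odd * freq) / (6 * sum(freq))

# Assumed null value: 25 of the 49 balls are odd
theta0 <- 25 / 49

# Deviance at theta0, compared with a chi-squared distribution on 1 df
D <- 2 * (loglik(theta_hat) - loglik(theta0))
c(theta_hat = theta_hat, D = D, crit = qchisq(0.95, df = 1))
```

The binomial coefficients in `dbinom` do not depend on `theta`, so they cancel in the deviance.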
