Write down the likelihood of given , where denotes the total number of households with people infected.
The data are multinomial: there are three possible outcomes for each household, and the outcomes in different households are independent and identically distributed. Hence,
The log-likelihood is
(Leave this question to the end, if you are short on time.) Find . Solve to find the MLE, .
This forms a useful check for the EM algorithm and should hopefully convince you of the advantages of using the EM algorithm for this problem.
Setting , we get
Solving the quadratic gives
Therefore since .
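As a numerical cross-check on the quadratic solution, the log-likelihood can also be maximized directly. The sketch below assumes a Reed-Frost-type model for households of size three with one initial infective, in which final sizes 1, 2 and 3 occur with probabilities (1-p)^2, 2p(1-p)^2 and p^2(3-2p). That model, the counts, and the names `counts` and `loglik` are assumptions made for illustration, since the model statement is not reproduced in this section.

```r
# Hypothetical data: counts[k] = number of households with final size k.
counts <- c(30, 25, 45)

# Observed-data log-likelihood, up to an additive constant, under the
# ASSUMED outcome probabilities (1-p)^2, 2p(1-p)^2 and p^2 (3-2p).
loglik <- function(p, counts) {
  2 * (counts[1] + counts[2]) * log(1 - p) +
    (counts[2] + 2 * counts[3]) * log(p) +
    counts[3] * log(3 - 2 * p)
}

# One-dimensional numerical maximization over (0, 1).
fit <- optimize(loglik, interval = c(1e-6, 1 - 1e-6),
                counts = counts, maximum = TRUE, tol = 1e-10)
p_direct <- fit$maximum
print(p_direct)
```

If the assumed model matches the one in the question, `p_direct` should agree with the admissible root of the quadratic and with the value the EM algorithm converges to.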
Let denote the total number of households where the initial infective infects both of the other individuals in the household, i.e. outcome occurs.
Write down the likelihood of and given .
The data are again multinomial, but there are now four outcomes, since final size 3 is split into two. Hence,
Simplify
and compute the MLE, for .
(This will help form the M-step of the EM algorithm.)
From the previous answer,
Therefore
Setting gives
Hence
Notice the similarities with the genetic linkage example with
and .
What is the distribution of given and ?
Hint: Think about the similarities between this problem and the genetics example.
There are households where all three individuals are infected. For each of these households there are two possibilities and . The conditional probability of given that three people are infected in the household is
Hence, .
Write down .
(This will help form the E-step of the EM algorithm.)
Note that is not important since it is not a function of p. Therefore we only need to find which using , gives .
Write an EM algorithm in R to find the MLE .
See R code solutions.
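A minimal sketch of such an algorithm is given below; it is not the official solution code. It assumes a Reed-Frost-type model in which final sizes 1 and 2 have probabilities (1-p)^2 and 2p(1-p)^2, and final size 3 splits into "initial infective infects both others" with probability p^2 and "chain of infection" with probability 2p^2(1-p). Under that assumption the E-step replaces the unobserved count by x3 / (3 - 2p) and the M-step is a ratio of counts. The function name `em_household`, its arguments and the data are hypothetical.

```r
# EM for the household data under the ASSUMED four-outcome model.
# counts[k] = number of households with final size k (k = 1, 2, 3).
em_household <- function(counts, p0 = 0.5, tol = 1e-8, max_iter = 1000) {
  p <- p0
  for (iter in seq_len(max_iter)) {
    # E-step: expected number of size-3 households in which the
    # initial infective infected both others, given the current p.
    y_hat <- counts[3] / (3 - 2 * p)
    # M-step: complete-data MLE with the missing count replaced by y_hat.
    p_new <- (counts[2] + 2 * counts[3]) /
      (2 * counts[1] + 3 * counts[2] + 3 * counts[3] - y_hat)
    # Stop when successive estimates agree to within tol.
    if (abs(p_new - p) < tol) {
      return(list(p = p_new, iterations = iter))
    }
    p <- p_new
  }
  warning("EM did not converge within max_iter iterations")
  list(p = p, iterations = max_iter)
}

counts <- c(30, 25, 45)  # hypothetical data
fit <- em_household(counts)
print(fit)
```

The stopping rule compares successive estimates rather than log-likelihood values, which is the simpler choice for a one-parameter problem.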
Calculate the standard error of the MLE .
Firstly note that
Therefore , we have that
Secondly, since , we have that
Therefore the standard error of is .
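The calculation can be checked numerically. The sketch below again assumes the Reed-Frost-type model with final-size probabilities (1-p)^2, 2p(1-p)^2 and p^2(3-2p), and hypothetical counts. It computes the observed information in two ways: directly from the second derivative of the observed-data log-likelihood, and as complete-data information minus missing information. The second route is assumed to mirror the two steps above; all variable names are illustrative.

```r
counts <- c(30, 25, 45)  # hypothetical data: households with final size 1, 2, 3
x1 <- counts[1]; x2 <- counts[2]; x3 <- counts[3]

# Observed-data log-likelihood (up to a constant) under the ASSUMED model.
loglik <- function(p) {
  2 * (x1 + x2) * log(1 - p) + (x2 + 2 * x3) * log(p) + x3 * log(3 - 2 * p)
}
p_hat <- optimize(loglik, c(1e-6, 1 - 1e-6), maximum = TRUE, tol = 1e-10)$maximum

# Route 1: minus the second derivative of the observed-data log-likelihood.
info_direct <- 2 * (x1 + x2) / (1 - p_hat)^2 +
  (x2 + 2 * x3) / p_hat^2 +
  4 * x3 / (3 - 2 * p_hat)^2

# Route 2: complete-data information minus missing information, where the
# missing count is Binomial(x3, pi3) with pi3 = 1 / (3 - 2p).
pi3 <- 1 / (3 - 2 * p_hat)
info_complete <- (x2 + 2 * x3) / p_hat^2 +
  (2 * x1 + 2 * x2 + x3 - x3 * pi3) / (1 - p_hat)^2
info_missing <- x3 * pi3 * (1 - pi3) / (1 - p_hat)^2

se <- 1 / sqrt(info_direct)
print(c(se = se, difference = info_direct - (info_complete - info_missing)))
```

Under the assumed model the two routes agree algebraically, so the reported difference should be zero up to rounding.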
The likelihood of and is
where if and if
The log-likelihood of and given is
Now and . Note that if then , so . (Since only if .)
Therefore
(1.1)
as required.
To find the MLEs, differentiate (1.1) with respect to and , set each derivative equal to 0, and solve.
Firstly,
Therefore satisfies
Therefore .
Secondly,
Therefore satisfies
Therefore .
This involves finding for which is
For , and so .
For , and . Therefore
as required.
The EM algorithm alternates between:
E-step: given the current estimate of p.
Note that
Therefore it suffices to compute which is derived in .
M-step: Maximize with respect to and , where is replaced by the corresponding expected values from the E-step (part ); the MLEs are given in part .
The likelihood is
Note that the log-likelihood satisfies
where is a constant not depending upon .
Hence
Setting yields
Note that implies that . Therefore follows conditioned to be positive.
Therefore
Choose an initial value for , say .
E-step: For each , compute . For , is given by (c) and can be computed similarly.
M-step: Compute .
Stop when two consecutive estimates of agree to a predefined precision, say .