1 Week 6: EM Solutions

1.1 Lab Session

  1.

    Write down the likelihood $L(p;\mathbf{x})$ of $\mathbf{x}=(x_1,x_2,x_3)$ given $p$, where $x_i$ denotes the total number of households with $i$ people infected.

    The data are multinomial: there are three possible outcomes for each household, and the outcomes in different households are independent and identically distributed. Hence,

    \begin{align*}
    L(p;\mathbf{x}) &= \frac{(x_1+x_2+x_3)!}{x_1!\,x_2!\,x_3!}\left((1-p)^2\right)^{x_1}\left(2p(1-p)^2\right)^{x_2}\left\{2p^2(1-p)+p^2\right\}^{x_3}\\
    &= \frac{(x_1+x_2+x_3)!}{x_1!\,x_2!\,x_3!}\left((1-p)^2\right)^{x_1}\left(2p(1-p)^2\right)^{x_2}\left\{p^2(3-2p)\right\}^{x_3}\\
    &= \frac{(x_1+x_2+x_3)!}{x_1!\,x_2!\,x_3!}\,2^{x_2}(1-p)^{2x_1+2x_2}p^{x_2+2x_3}(3-2p)^{x_3}.
    \end{align*}

    The log-likelihood is

    \[ l(p;\mathbf{x}) = K + 2(x_1+x_2)\log(1-p) + (x_2+2x_3)\log p + x_3\log(3-2p). \]
  2.

    (Leave this question to the end, if you are short on time.) Find $\frac{d}{dp}l(p;\mathbf{x})=\frac{d}{dp}\log L(p;\mathbf{x})$. Solve $\frac{d}{dp}l(p;\mathbf{x})=0$ to find the MLE, $\hat{p}$.

    This forms a useful check for the EM algorithm and should hopefully convince you of the advantages of using the EM algorithm for this problem.

    \begin{align*}
    \frac{d}{dp}l(p;\mathbf{x}) &= -\frac{2(x_1+x_2)}{1-p}+\frac{x_2+2x_3}{p}-\frac{2x_3}{3-2p}\\
    &= -\frac{118}{1-p}+\frac{575}{p}-\frac{550}{3-2p}.
    \end{align*}

    Setting $\frac{d}{dp}l(p;\mathbf{x})=0$, we get

    \begin{align*}
    -118p(3-2p)+575(1-p)(3-2p)-550p(1-p) &= 0\\
    -354p+236p^2+1725-2875p+1150p^2-550p+550p^2 &= 0\\
    1936p^2-3779p+1725 &= 0.
    \end{align*}

    Solving the quadratic gives

    \[ \hat{p}=\frac{3779\pm\sqrt{3779^2-4\times 1936\times 1725}}{2\times 1936}=1.2240 \text{ or } 0.7279. \]

    Therefore $\hat{p}=0.7279$ since $0<p<1$.
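    As a quick numeric check (a Python sketch; the course's own code is in R), the score-equation quadratic can be solved directly:

    ```python
    import numpy as np

    # Roots of the score-equation quadratic 1936 p^2 - 3779 p + 1725 = 0.
    roots = np.roots([1936, -3779, 1725]).real
    p_hat = roots[(roots > 0) & (roots < 1)][0]
    print(round(p_hat, 4))  # 0.7279
    ```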

  3.

    Let $y$ denote the total number of households where the initial infective infects both the other individuals in the household, i.e. outcome $\{1,2\}$ occurs.

    Write down the likelihood $L(p;\mathbf{x},y)$ of $\mathbf{x}$ and $y$ given $p$.

    The data are again multinomial, but there are now four outcomes, since final size 3 is split into the two chains $\{1,1,1\}$ and $\{1,2\}$. Hence,

    \begin{align*}
    L(p;\mathbf{x},y) &= \frac{(x_1+x_2+x_3)!}{x_1!\,x_2!\,(x_3-y)!\,y!}\left((1-p)^2\right)^{x_1}\left(2p(1-p)^2\right)^{x_2}\left(2p^2(1-p)\right)^{x_3-y}\left(p^2\right)^{y}\\
    &= \frac{(x_1+x_2+x_3)!}{x_1!\,x_2!\,(x_3-y)!\,y!}\,2^{x_2+x_3-y}(1-p)^{2x_1+2x_2+x_3-y}p^{x_2+2x_3}.
    \end{align*}
  4.

    Simplify $l(p;\mathbf{x},y)=\log L(p;\mathbf{x},y)$ and compute the MLE, $\hat{p}$, for $(\mathbf{x},y)$.
    (This will help form the M-step of the EM algorithm.)

    From the previous answer,

    \[ l(p;\mathbf{x},y) = K+(2x_1+2x_2+x_3-y)\log(1-p)+(x_2+2x_3)\log p. \]

    Therefore

    \[ \frac{d}{dp}l(p;\mathbf{x},y)=-\frac{2x_1+2x_2+x_3-y}{1-p}+\frac{x_2+2x_3}{p}. \]

    Setting $\frac{d}{dp}l(p;\mathbf{x},y)=0$ gives

    \begin{align*}
    -(2x_1+2x_2+x_3-y)p+(1-p)(x_2+2x_3) &= 0\\
    x_2+2x_3 &= p\{2x_1+2x_2+x_3-y+x_2+2x_3\}.
    \end{align*}

    Hence

    \[ \hat{p}=\frac{x_2+2x_3}{2x_1+3x_2+3x_3-y}. \]

    Notice the similarities with the genetic linkage example with

    \[ l(p;\mathbf{x},y)=K+B\log(1-p)+A\log p \]

    and $\hat{p}=A/(A+B)$.
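    Since this complete-data MLE is the M-step, it is worth coding as a one-liner. A sketch (in Python here; the counts $x_1=34$, $x_2=25$, $x_3=275$ are those implied by the score function in question 2):

    ```python
    def m_step(x1, x2, x3, y):
        """Complete-data MLE: p = (x2 + 2*x3) / (2*x1 + 3*x2 + 3*x3 - y)."""
        return (x2 + 2 * x3) / (2 * x1 + 3 * x2 + 3 * x3 - y)

    # Evaluated at y = E[Y | x, p] with p = 0.7279 (question 5), the update
    # returns 0.7279 again: the MLE is a fixed point of the EM map.
    print(round(m_step(34, 25, 275, 275 / (3 - 2 * 0.7279)), 4))  # 0.7279
    ```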

  5.

    What is the distribution of $y$ given $\mathbf{x}$ and $p$?
    Hint: Think about similarities between this problem and the genetics example.

    There are $x_3=275$ households where all three individuals are infected. For each of these households there are two possibilities, $\{1,1,1\}$ and $\{1,2\}$. The conditional probability of $\{1,2\}$ given that three people are infected in the household is

    \[ P(\{1,2\}\,|\,\{1,1,1\}\cup\{1,2\})=\frac{P(\{1,2\})}{P(\{1,1,1\})+P(\{1,2\})}=\frac{p^2}{2p^2(1-p)+p^2}=\frac{1}{3-2p}. \]

    Hence, $y|\mathbf{x},p \sim \mathrm{Bin}(x_3(=275),1/(3-2p))$.
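    The algebraic simplification $p^2/\{2p^2(1-p)+p^2\}=1/(3-2p)$ is easy to check numerically (Python sketch; the function name is illustrative):

    ```python
    def p_chain_12_given_size3(p):
        """P({1,2} | final size 3) = p^2 / (2 p^2 (1-p) + p^2)."""
        return p**2 / (2 * p**2 * (1 - p) + p**2)

    # Agrees with 1/(3 - 2p) for any p in (0, 1).
    for p in (0.1, 0.5, 0.7279):
        assert abs(p_chain_12_given_size3(p) - 1 / (3 - 2 * p)) < 1e-12
    ```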

  6.

    Write down $Q(p,p^*)=E_{p^*}[l(p;\mathbf{x},Y)|\mathbf{x}]$.
    (This will help form the E-step of the EM algorithm.)

    \begin{align*}
    Q(p,p^*) &= E_{p^*}[l(p;\mathbf{x},Y)|\mathbf{x}]\\
    &= E_{p^*}[K+(2x_1+2x_2+x_3-Y)\log(1-p)+(x_2+2x_3)\log p\,|\,\mathbf{x}]\\
    &= E_{p^*}[K|\mathbf{x}]+(2x_1+2x_2+x_3-E_{p^*}[Y|\mathbf{x}])\log(1-p)+(x_2+2x_3)\log p.
    \end{align*}

    Note that $E_{p^*}[K|\mathbf{x}]$ is not important since it is not a function of $p$. Therefore we only need to find $E_{p^*}[Y|\mathbf{x}]$ which, using $Y|\mathbf{x},p^* \sim \mathrm{Bin}(x_3(=275),1/(3-2p^*))$, gives $E_{p^*}[Y|\mathbf{x}]=275/(3-2p^*)$.

  7.

    Write an EM algorithm in R to find the MLE p^.

    See R code solutions.
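    The full R code is in the separate solutions file; as a cross-check, the whole algorithm is only a few lines. A sketch (in Python here, but it translates line-for-line to R; the counts $x_1=34$, $x_2=25$, $x_3=275$ are those implied by question 2):

    ```python
    # EM algorithm for the household epidemic model (sketch).
    x1, x2, x3 = 34, 25, 275

    p = 0.5                                       # initial value
    for _ in range(100):
        y = x3 / (3 - 2 * p)                      # E-step: E[Y | x, p]
        p_new = (x2 + 2 * x3) / (2 * x1 + 3 * x2 + 3 * x3 - y)   # M-step
        if abs(p_new - p) < 1e-8:
            break
        p = p_new

    print(round(p, 4))  # 0.7279, agreeing with the direct MLE in question 2
    ```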

  8.

    Calculate the standard error of the MLE $\hat{p}$.

    Firstly note that

    \begin{align*}
    Q(p,p^*) &= E_{p^*}[l(p;\mathbf{x},Y)|\mathbf{x}]\\
    &= E_{p^*}[K+(2x_1+2x_2+x_3-Y)\log(1-p)+(x_2+2x_3)\log p\,|\,\mathbf{x}]\\
    &= E_{p^*}[K|\mathbf{x}]+(2x_1+2x_2+x_3-E_{p^*}[Y|\mathbf{x}])\log(1-p)+(x_2+2x_3)\log p\\
    &= E_{p^*}[K|\mathbf{x}]+(393-E_{p^*}[Y|\mathbf{x}])\log(1-p)+575\log p.
    \end{align*}

    Therefore, since $E_{p^*}[Y|\mathbf{x}]=275/(3-2p^*)$, we have that

    \[ -\frac{\partial^2 Q(p,p^*)}{\partial p^2}\bigg|_{p=p^*=\hat{p}} = \frac{393-275/(3-2\hat{p})}{(1-\hat{p})^2}+\frac{575}{\hat{p}^2}=3987.977. \]

    Secondly, since $Y|\mathbf{x},p \sim \mathrm{Bin}(275,1/(3-2p))$, we have that

    \begin{align*}
    \mathrm{Var}\left(\frac{d\log f(\mathbf{x},Y|p)}{dp}\bigg|\mathbf{x},\hat{p}\right) &= \mathrm{Var}\left(-\frac{2x_1+2x_2+x_3-Y}{1-p}+\frac{x_2+2x_3}{p}\bigg|\mathbf{x},\hat{p}\right)\\
    &= \frac{1}{(1-\hat{p})^2}\mathrm{Var}(Y|\mathbf{x},\hat{p})\\
    &= \frac{1}{(1-\hat{p})^2}\times 275\times\frac{1}{3-2\hat{p}}\times\frac{2(1-\hat{p})}{3-2\hat{p}}=847.6705.
    \end{align*}

    Therefore the standard error of $\hat{p}$ is $1/\sqrt{3987.977-847.6705}=0.01784$.
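    These numbers can be verified by plugging $\hat{p}=0.7279$ into the two expressions (Python sketch):

    ```python
    import math

    p = 0.7279                 # MLE from the EM algorithm
    x3, A, B = 275, 575, 393   # A = x2 + 2*x3, B = 2*x1 + 2*x2 + x3

    # Complete-data information, evaluated at p = p* = p_hat:
    ey = x3 / (3 - 2 * p)
    info_complete = (B - ey) / (1 - p) ** 2 + A / p ** 2        # approx. 3988

    # Missing information: Var of the complete-data score given x,
    # using the binomial variance of Y:
    var_y = x3 * (1 / (3 - 2 * p)) * (2 * (1 - p) / (3 - 2 * p))
    info_missing = var_y / (1 - p) ** 2                         # approx. 847.7

    se = 1 / math.sqrt(info_complete - info_missing)
    print(se)  # approx. 0.0178
    ```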

1.2 Practice Questions

  1.
    (a)

      The likelihood of $p$ and $\lambda$ given $\mathbf{x}$ is

      \[ L(p,\lambda|\mathbf{x})=\prod_{i=1}^{n}\left\{p\,\delta_{x_i 0}+(1-p)\frac{\lambda^{x_i}}{x_i!}e^{-\lambda}\right\}, \]

      where $\delta_{ij}=1$ if $i=j$ and $\delta_{ij}=0$ if $i\neq j$.

    (b)

      The log-likelihood of $p$ and $\lambda$ given $\mathbf{x}$ and $\mathbf{u}$ is

      \begin{align*}
      l(p,\lambda|\mathbf{x},\mathbf{u}) &= \sum_{i;u_i=1}\{\log p+\log f(x_i)\}+\sum_{i;u_i=0}\{\log(1-p)+\log g(x_i)\}\\
      &= \sum_{i;u_i=1}\{\log p+\log f(x_i)\}+\sum_{i;u_i=0}\{\log(1-p)+x_i\log\lambda-\log x_i!-\lambda\}.
      \end{align*}

      Now $m=\sum_{i;u_i=1}1$ and $n-m=\sum_{i;u_i=0}1$. Note that if $u_i=1$ then $x_i=0$, so $\sum_{i;u_i=0}x_i=\sum_{i=1}^{n}x_i=n\mu$ (since $U_i=1$ only if $x_i=0$); also $f(x_i)=1$ in the first sum, so $\log f(x_i)=0$.

      Therefore

      \begin{align*}
      l(p,\lambda|\mathbf{x},\mathbf{u}) &= m\log p+(n-m)\log(1-p)+n\mu\log\lambda-\sum_{i;u_i=0}\log x_i!-(n-m)\lambda \qquad (1.1)\\
      &= m\log p+(n-m)\log(1-p)+n\mu\log\lambda-(n-m)\lambda+A
      \end{align*}

      as required.

    (c)

      To find the MLEs simply differentiate (1.1) with respect to $p$ and $\lambda$, set each derivative equal to 0 and solve.

      Firstly,

      \[ \frac{\partial l}{\partial p} = \frac{m}{p}-\frac{n-m}{1-p}. \]

      Therefore $\hat{p}$ satisfies

      \begin{align*}
      \frac{m}{p}-\frac{n-m}{1-p} &= 0\\
      m(1-p)-(n-m)p &= 0\\
      m-np &= 0.
      \end{align*}

      Therefore $\hat{p}=\frac{m}{n}$.

      Secondly,

      \[ \frac{\partial l}{\partial\lambda} = \frac{n\mu}{\lambda}-(n-m). \]

      Therefore $\hat{\lambda}$ satisfies

      \begin{align*}
      \frac{n\mu}{\lambda}-(n-m) &= 0\\
      n\mu-(n-m)\lambda &= 0.
      \end{align*}

      Therefore $\hat{\lambda}=\frac{n}{n-m}\mu$.
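      As a check of these closed-form MLEs on a small hypothetical complete data set (Python sketch; the values of $\mathbf{x}$ and $\mathbf{u}$ are made up, with $u_i=1$ only where $x_i=0$):

      ```python
      # Hypothetical complete data for the zero-inflated Poisson.
      x = [0, 0, 0, 2, 1, 3, 0, 2]
      u = [1, 1, 0, 0, 0, 0, 1, 0]

      n = len(x)
      m = sum(u)
      mu = sum(x) / n                  # sample mean, written mu in the notes

      p_hat = m / n                    # m/n = 3/8
      lam_hat = n * mu / (n - m)       # equals the mean of x over the u_i = 0 group
      print(p_hat, lam_hat)            # 0.375 1.6
      ```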

    (d)

      This involves finding $E[U_i|p,x_i]$ for $i=1,2,\ldots,n$, which is

      \begin{align*}
      E[U_i|p,x_i] &= \frac{P(U_i=1,X_i=x_i|p,\lambda)}{P(X_i=x_i|p,\lambda)}\\
      &= \frac{pf(x_i)}{pf(x_i)+(1-p)g(x_i)}.
      \end{align*}

      For $x_i>0$, $f(x_i)=0$ and so $E[U_i|p,x_i]=0$.

      For $x_i=0$, $f(x_i)=1$ and $g(x_i)=\exp(-\lambda)$. Therefore

      \[ E[U_i|p,x_i]=\frac{p\times 1}{p\times 1+(1-p)\times\exp(-\lambda)}=\frac{p}{p+(1-p)\exp(-\lambda)} \]

      as required.

    (e)

      The EM algorithm alternates between:

      • E-step: compute $E[l(p,\lambda|\mathbf{x},\mathbf{U})]$ given the current estimates of $p$ and $\lambda$.

        Note that

        \[ l(p,\lambda|\mathbf{x},\mathbf{U})=\left(\sum_{i=1}^{n}U_i\right)\log p+\left(n-\sum_{i=1}^{n}U_i\right)\log(1-p)+n\mu\log\lambda-\left(n-\sum_{i=1}^{n}U_i\right)\lambda+A. \]

        Therefore it suffices to compute $E[U_i|p,x_i]$ $(i=1,2,\ldots,n)$, which is derived in (d).

      • M-step: maximise $l(p,\lambda|\mathbf{x},\mathbf{u})$ with respect to $p$ and $\lambda$, where $\mathbf{u}$ is replaced by the corresponding expected values from the E-step (part (d)); the MLEs are given in part (c).
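      Putting (c), (d) and (e) together, a minimal implementation might look like the following (Python sketch; variable names are illustrative):

      ```python
      import math

      def em_zip(x, n_iter=2000, tol=1e-10):
          """EM for the zero-inflated Poisson: alternate the E-step of (d)
          with the closed-form M-step of (c), with m replaced by E[sum U_i]."""
          n, sx = len(x), sum(x)
          n0 = sum(1 for xi in x if xi == 0)
          p, lam = 0.5, max(sx / n, 0.1)            # crude starting values
          for _ in range(n_iter):
              # E-step: E[U_i | x_i] = p / (p + (1-p) e^{-lam}) if x_i = 0, else 0.
              u_sum = n0 * p / (p + (1 - p) * math.exp(-lam))
              # M-step: p = m/n and lam = n*mu/(n - m), with m -> E[sum U_i].
              p_new, lam_new = u_sum / n, sx / (n - u_sum)
              if abs(p_new - p) + abs(lam_new - lam) < tol:
                  break
              p, lam = p_new, lam_new
          return p, lam
      ```

      At convergence the fit matches both the sample mean, $(1-\hat{p})\hat{\lambda}=\mu$, and the observed proportion of zeros, which gives a simple sanity check.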

  2.
    (a)

      The likelihood is

      \begin{align*}
      L(\beta;\mathbf{y},\mathbf{z}) &= f(\mathbf{y},\mathbf{z}|\beta)\\
      &= \prod_{i=1}^{n}f(y_i,z_i|\beta)\\
      &= \prod_{i=1}^{n}f(y_i|z_i,\beta)f(z_i|\beta)\\
      &= \prod_{i=1}^{n}\left\{1_{\{z_i>0\}}^{y_i}1_{\{z_i\leq 0\}}^{1-y_i}\right\}\times\frac{1}{\sqrt{2\pi}}\exp\left(-\tfrac{1}{2}(z_i-x_i\beta)^2\right)\\
      &= \prod_{i=1}^{n}\left\{1_{\{z_i>0\}}^{y_i}1_{\{z_i\leq 0\}}^{1-y_i}\right\}(2\pi)^{-n/2}\prod_{i=1}^{n}\exp\left(-\tfrac{1}{2}(z_i-x_i\beta)^2\right)\\
      &= \prod_{i=1}^{n}1_{\{z_i>0\}}^{y_i}1_{\{z_i\leq 0\}}^{1-y_i}\,(2\pi)^{-n/2}\exp\left(-\frac{1}{2}\sum_{i=1}^{n}(z_i-x_i\beta)^2\right).
      \end{align*}
    (b)

      Note that the log-likelihood satisfies

      \[ l(\beta;\mathbf{y},\mathbf{z})=K-\frac{1}{2}\sum_{i=1}^{n}(z_i-x_i\beta)^2, \]

      where $K$ is a constant not depending upon $\beta$.
      Hence

      \begin{align*}
      \frac{dl(\beta;\mathbf{y},\mathbf{z})}{d\beta} &= -\frac{1}{2}\sum_{i=1}^{n}2(z_i-x_i\beta)(-x_i)\\
      &= \sum_{i=1}^{n}z_ix_i-\beta\sum_{i=1}^{n}x_i^2.
      \end{align*}

      Setting $\frac{dl(\beta;\mathbf{y},\mathbf{z})}{d\beta}=0$ yields

      \[ \tilde{\beta}=\frac{\sum_{i=1}^{n}z_ix_i}{\sum_{i=1}^{n}x_i^2}. \]
    (c)

      Note that $y_i=1$ implies that $z_i>0$. Therefore $z_i|y_i=1,x_i,\beta$ follows a $N(x_i\beta,1)$ distribution conditioned to be positive.
      Therefore

      \begin{align*}
      E[z_i|y_i=1,x_i,\beta] &= E[z_i|z_i>0]\\
      &= x_i\beta+\frac{\phi(-x_i\beta)}{1-\Phi(-x_i\beta)}\\
      &= x_i\beta+\frac{\phi(x_i\beta)}{\Phi(x_i\beta)}.
      \end{align*}
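      The formula $x_i\beta+\phi(x_i\beta)/\Phi(x_i\beta)$ can be checked against scipy's truncated normal (Python sketch; `mu = 0.8` is an arbitrary test value standing in for $x_i\beta$):

      ```python
      import numpy as np
      from scipy.stats import norm, truncnorm

      mu = 0.8   # plays the role of x_i * beta

      # Closed form: mu + phi(mu) / Phi(mu).
      formula = mu + norm.pdf(mu) / norm.cdf(mu)

      # Direct: mean of N(mu, 1) truncated to (0, inf); scipy standardises the bounds.
      direct = truncnorm.mean(a=(0 - mu) / 1, b=np.inf, loc=mu, scale=1)

      assert abs(formula - direct) < 1e-8
      ```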
    (d)
      • Choose an initial value for $\beta$, say $\beta=0$.

      • E-step: for each $i$, compute $E[z_i|y_i,x_i,\beta]$. For $y_i=1$, $E[z_i|y_i,x_i,\beta]$ is given by (c), and $E[z_i|y_i=0,x_i,\beta]$ can be computed similarly.

      • M-step: compute $\tilde{\beta}=\frac{\sum_{i=1}^{n}z_ix_i}{\sum_{i=1}^{n}x_i^2}$, with each $z_i$ replaced by its conditional expectation from the E-step.

      • Stop when two consecutive estimates of $\tilde{\beta}$ agree to a predefined precision, say $10^{-4}$.
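      The steps above can be sketched as follows (Python; the $y_i=0$ branch uses the analogous truncated-mean formula $x_i\beta-\phi(x_i\beta)/\Phi(-x_i\beta)$, which is the "computed similarly" step):

      ```python
      import numpy as np
      from scipy.stats import norm

      def em_probit(x, y, n_iter=1000, tol=1e-8):
          """EM for the probit model via the latent normals z_i (sketch)."""
          x = np.asarray(x, dtype=float)
          y = np.asarray(y)
          beta = 0.0                                       # initial value
          for _ in range(n_iter):
              mu = x * beta
              # E-step: conditional means of the truncated N(mu_i, 1).
              ez = np.where(y == 1,
                            mu + norm.pdf(mu) / norm.cdf(mu),    # z_i > 0
                            mu - norm.pdf(mu) / norm.cdf(-mu))   # z_i <= 0
              # M-step: beta = sum(z_i x_i) / sum(x_i^2) with z_i -> E[z_i | ...].
              beta_new = np.sum(ez * x) / np.sum(x ** 2)
              if abs(beta_new - beta) < tol:
                  break
              beta = beta_new
          return beta
      ```

      With a single constant covariate $x_i=1$ this reduces to the intercept-only probit, whose MLE satisfies $\Phi(\beta)=\bar{y}$, giving an easy check of the implementation.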