3 Multi-parameter likelihoods

The likelihood function and maximum likelihood estimator

The definitions of likelihood given in Chapter 2 are unchanged in the multi-parameter case; we simply need to replace the scalar $\theta$ by the parameter vector $\boldsymbol{\theta} = (\theta_1, \ldots, \theta_d)$. In general, for observed data $X$ with probability (density or mass) function $f(x \mid \boldsymbol{\theta})$, the likelihood is defined simply as

\[
L(\boldsymbol{\theta}) \propto f(x \mid \boldsymbol{\theta}),
\]

where $x$ is the observed value of $X$. In particular, for independent data $x_1, x_2, \ldots, x_n$ such that $x_i$ is the realisation of $X_i$ having probability (density or mass) function $f_i(x_i \mid \boldsymbol{\theta})$, we define the likelihood

\[
L(\boldsymbol{\theta}) \propto f(x_1, \ldots, x_n \mid \boldsymbol{\theta}) = \prod_{i=1}^{n} f_i(x_i \mid \boldsymbol{\theta}).
\]

Specialising further to the case where each of the $X_i$ has the same distribution $f(x \mid \boldsymbol{\theta})$, the likelihood becomes

\[
L(\boldsymbol{\theta}) \propto f(x_1, \ldots, x_n \mid \boldsymbol{\theta}) = \prod_{i=1}^{n} f(x_i \mid \boldsymbol{\theta}).
\]

As in the one-parameter case, it is often more convenient to work with the log-likelihood, which, in the latter case, becomes

\[
\ell(\boldsymbol{\theta}) = \log L(\boldsymbol{\theta}) = \sum_{i=1}^{n} \log f(x_i \mid \boldsymbol{\theta}).
\]
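For example, for an independent sample $x_1, \ldots, x_n$ from a Normal$(\mu, \sigma^2)$ distribution, so that $\boldsymbol{\theta} = (\mu, \sigma^2)$ with $d = 2$, the log-likelihood is

\[
\ell(\mu, \sigma^2) = -\frac{n}{2} \log(2\pi) - \frac{n}{2} \log \sigma^2 - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - \mu)^2.
\]

We return to this example below.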

The maximum likelihood estimator (MLE), $\hat{\boldsymbol{\theta}}$, of $\boldsymbol{\theta}$ is the value that maximises $L(\boldsymbol{\theta})$ (or $\ell(\boldsymbol{\theta})$). Notice, though, that this now requires maximisation in $d$-dimensional space, where $d$ is the length of the parameter vector.
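In practice this maximisation can often only be carried out numerically. As a minimal sketch in R, using the built-in optim() function on the Normal$(\mu, \sigma^2)$ example above (the data vector x is invented purely for illustration):

# Negative log-likelihood for an iid Normal(mu, sigma^2) sample.
# We optimise over (mu, log(sigma)) so that sigma stays positive
# without needing box constraints.
negloglik <- function(theta, x) {
  mu    <- theta[1]
  sigma <- exp(theta[2])
  -sum(dnorm(x, mean = mu, sd = sigma, log = TRUE))
}

x   <- c(4.2, 5.1, 3.8, 5.6, 4.9)   # hypothetical data
fit <- optim(par = c(0, 0), fn = negloglik, x = x)
mu.hat    <- fit$par[1]
sigma.hat <- exp(fit$par[2])

Note that optim() minimises by default, which is why we work with the negative log-likelihood.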

Calculating the MLE

For multi-parameter models, we still maximise the log-likelihood $\ell(\boldsymbol{\theta})$ with respect to the parameters, $\boldsymbol{\theta}$, to find the MLE, $\hat{\boldsymbol{\theta}}$. Assuming the log-likelihood function is differentiable at the MLE, the maximum will be a turning point, and we find it by solving the set of simultaneous equations

\[
\frac{\partial \ell(\hat{\boldsymbol{\theta}})}{\partial \theta_i} = 0 \quad \text{for } i = 1, \ldots, d.
\]
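For the Normal$(\mu, \sigma^2)$ example above, these equations are

\[
\frac{\partial \ell}{\partial \mu} = \frac{1}{\sigma^2} \sum_{i=1}^{n} (x_i - \mu) = 0, \qquad
\frac{\partial \ell}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4} \sum_{i=1}^{n} (x_i - \mu)^2 = 0,
\]

and solving them simultaneously gives $\hat{\mu} = \bar{x}$ and $\hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$.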

Note: For some models (particularly those where the support of the density function depends on one or more of the parameters), the log-likelihood function is not differentiable at the MLE (recall the Uniform$[0, \theta]$ example from MATH235). In these cases, plotting the log-likelihood surface can help. Naturally, R is useful here.
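A minimal sketch of such a plot in R, reusing the hypothetical Normal sample from the optim() example above (the grid ranges are arbitrary choices):

# Evaluate the Normal log-likelihood over a grid of (mu, sigma) values
x <- c(4.2, 5.1, 3.8, 5.6, 4.9)   # hypothetical data, as before
mu.grid    <- seq(3, 7, length.out = 100)
sigma.grid <- seq(0.2, 3, length.out = 100)
loglik <- function(mu, sigma) sum(dnorm(x, mean = mu, sd = sigma, log = TRUE))
ll <- outer(mu.grid, sigma.grid, Vectorize(loglik))

# Contour plot of the log-likelihood surface; the MLE sits at the peak
contour(mu.grid, sigma.grid, ll, nlevels = 30,
        xlab = expression(mu), ylab = expression(sigma))

For a model whose support depends on the parameters, the same approach applies with the relevant log-likelihood substituted, taking care with the region where the likelihood is zero (log-likelihood $-\infty$).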