10.2 The Multivariate Normal Distribution

With the matrix notation set up, the bivariate normal distribution is easily extended to higher dimensions. The vector $\boldsymbol{X}=(X_{1},\ldots,X_{d})^{\prime}$ is said to have a $d$-dimensional normal distribution if its pdf is given for all $\boldsymbol{x}$ by

$$f_{\boldsymbol{X}}(\boldsymbol{x})=\frac{1}{(2\pi)^{d/2}\sqrt{\det\Sigma}}\exp\left\{-\frac{1}{2}(\boldsymbol{x}-\boldsymbol{\mu})^{\prime}\Sigma^{-1}(\boldsymbol{x}-\boldsymbol{\mu})\right\},$$

where $\boldsymbol{\mu}=(\mu_{1},\ldots,\mu_{d})^{\prime}$ is the vector of expectations and $\Sigma$ is the variance-covariance matrix, whose $(i,j)$-th element $\Sigma_{ij}$ is the covariance of $X_{i}$ and $X_{j}$. Since

$$\rho_{ij}=\operatorname{\mathsf{Corr}}\left[X_{i},X_{j}\right]=\frac{\operatorname{\mathsf{Cov}}\left[X_{i},X_{j}\right]}{\sqrt{\operatorname{\mathsf{Var}}\left[X_{i}\right]\operatorname{\mathsf{Var}}\left[X_{j}\right]}}=\frac{\Sigma_{ij}}{\sigma_{i}\sigma_{j}},$$

the variance-covariance matrix can be written

$$\Sigma=\begin{bmatrix}\Sigma_{11}&\ldots&\ldots&\Sigma_{1d}\\ \ldots&\ldots&\Sigma_{ij}&\ldots\\ \ldots&\Sigma_{ji}&\ldots&\ldots\\ \Sigma_{d1}&\ldots&\ldots&\Sigma_{dd}\end{bmatrix}=\begin{bmatrix}\sigma_{1}^{2}&\ldots&\ldots&\sigma_{1}\sigma_{d}\rho_{1d}\\ \ldots&\ldots&\sigma_{i}\sigma_{j}\rho_{ij}&\ldots\\ \ldots&\sigma_{i}\sigma_{j}\rho_{ji}&\ldots&\ldots\\ \sigma_{1}\sigma_{d}\rho_{d1}&\ldots&\ldots&\sigma_{d}^{2}\end{bmatrix},$$

where $\sigma_{i}^{2}$ is the variance of $X_{i}$, and $\rho_{ij}$ is the correlation of $X_{i}$ and $X_{j}$.
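The relationship between the standard deviations, the correlations and $\Sigma$ can be checked numerically. The following is a minimal sketch using NumPy; the values of `sigma` and `rho` are made up for illustration and do not come from the notes.

```python
import numpy as np

# Illustrative values only (not from the notes): three standard deviations
# and a symmetric correlation matrix with ones on the diagonal.
sigma = np.array([2.0, 1.0, 3.0])
rho = np.array([
    [1.0, 0.5, -0.2],
    [0.5, 1.0, 0.3],
    [-0.2, 0.3, 1.0],
])

# Sigma_ij = sigma_i * sigma_j * rho_ij, written as an outer product.
Sigma = np.outer(sigma, sigma) * rho

# Going back: rho_ij = Sigma_ij / (sigma_i * sigma_j).
sd = np.sqrt(np.diag(Sigma))
rho_back = Sigma / np.outer(sd, sd)
```

The diagonal of `Sigma` holds the variances $\sigma_{i}^{2}$ because each $\rho_{ii}=1$.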

We often denote the distribution of $\boldsymbol{X}$ by

$$\boldsymbol{X}\sim\operatorname{\mathsf{MVN}}_{d}(\boldsymbol{\mu},\Sigma),$$

where $\operatorname{\mathsf{MVN}}_{d}$ stands for multivariate normal distribution of $d$ dimensions.
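As a rough illustration, the pdf of $\operatorname{\mathsf{MVN}}_{d}(\boldsymbol{\mu},\Sigma)$ can be evaluated directly from its definition. The function name `mvn_pdf` is hypothetical, and the sketch assumes $\Sigma$ is symmetric and positive definite so that the determinant is positive and the linear system has a unique solution.

```python
import numpy as np

def mvn_pdf(x, mu, Sigma):
    """Evaluate the d-dimensional normal pdf at x (hypothetical helper).

    Assumes Sigma is symmetric positive definite.
    """
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    d = mu.shape[0]
    diff = x - mu
    # Quadratic form (x - mu)' Sigma^{-1} (x - mu), computed by solving
    # Sigma z = (x - mu) rather than forming the inverse explicitly.
    quad = diff @ np.linalg.solve(Sigma, diff)
    norm_const = (2.0 * np.pi) ** (d / 2.0) * np.sqrt(np.linalg.det(Sigma))
    return np.exp(-0.5 * quad) / norm_const

# At the mean of a standard bivariate normal the density is 1 / (2 * pi).
value_at_mean = mvn_pdf([0.0, 0.0], [0.0, 0.0], np.eye(2))
```

Solving the linear system instead of inverting $\Sigma$ is a numerical convenience; it gives the same quadratic form.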

Proposition 10.1.1 showed that a multivariate normal vector can always be written as the sum of an expectation vector, $\boldsymbol{\mu}$, and a linear combination of iid $\operatorname{\mathsf{N}}(0,1)$ random variables.
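This representation also suggests a way to simulate a multivariate normal vector. Proposition 10.1.1 is not reproduced in this section, so the sketch below assumes the linear combination takes the form $\boldsymbol{X}=\boldsymbol{\mu}+L\boldsymbol{Z}$ with $LL^{\prime}=\Sigma$, and uses the Cholesky factor as one possible choice of $L$. The values of `mu` and `Sigma` are illustrative.

```python
import numpy as np

# Illustrative parameters (not from the notes).
mu = np.array([1.0, -2.0])
Sigma = np.array([[4.0, 1.2],
                  [1.2, 1.0]])

# One matrix L with L @ L.T == Sigma (assumed form of the proposition).
L = np.linalg.cholesky(Sigma)

rng = np.random.default_rng(0)
n = 200_000
Z = rng.standard_normal((n, 2))   # each row holds iid N(0, 1) variables
X = mu + Z @ L.T                  # each row is mu + L z

sample_mean = X.mean(axis=0)
sample_cov = np.cov(X, rowvar=False)
```

With this many draws, `sample_mean` and `sample_cov` should lie close to `mu` and `Sigma`, up to simulation error.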

Therefore, as in the bivariate case, the univariate marginal distributions are all normal:

$$X_{i}\sim N(\mu_{i},\sigma_{i}^{2}),$$

for $i=1,\ldots,d$. In fact, any linear combination of the $X$'s is again normal:

$$\sum_{i=1}^{d}a_{i}X_{i}\sim N\left(\sum_{i=1}^{d}a_{i}\mu_{i},\sum_{i=1}^{d}\sum_{j=1}^{d}a_{i}a_{j}\Sigma_{ij}\right),$$

where the expressions for the expectation and variance follow from Chapter 8, and normality follows from the convolution property of the normal distribution. Since $\Sigma_{ii}=\sigma_{i}^{2}$ and $\Sigma_{ij}=\Sigma_{ji}$, the variance can also be written as

$$\sum_{i=1}^{d}a_{i}^{2}\sigma_{i}^{2}+2\sum_{i=1}^{d}\sum_{j=1}^{i-1}a_{i}a_{j}\Sigma_{ij}=\sum_{i=1}^{d}a_{i}^{2}\operatorname{\mathsf{Var}}\left[X_{i}\right]+2\sum_{i=1}^{d}\sum_{j=1}^{i-1}a_{i}a_{j}\operatorname{\mathsf{Cov}}\left[X_{i},X_{j}\right].$$
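The two expressions for the variance of a linear combination can be compared numerically. In the sketch below the vector `a`, the mean `mu` and the matrix `Sigma` are illustrative values, not taken from the notes.

```python
import numpy as np

# Illustrative parameters (not from the notes).
mu = np.array([1.0, -1.0, 2.0])
Sigma = np.array([
    [4.0, 1.0, -0.6],
    [1.0, 1.0, 0.9],
    [-0.6, 0.9, 9.0],
])
a = np.array([2.0, -1.0, 0.5])
d = len(a)

# Expectation: sum_i a_i mu_i.
mean_lin = a @ mu

# Variance as the full double sum over i and j, i.e. a' Sigma a.
var_double_sum = a @ Sigma @ a

# Variance split into the diagonal terms plus twice the terms with j < i.
var_split = sum(a[i] ** 2 * Sigma[i, i] for i in range(d)) + 2 * sum(
    a[i] * a[j] * Sigma[i, j] for i in range(d) for j in range(i)
)
```

The two variances agree because $\Sigma$ is symmetric, so each off-diagonal pair $(i,j)$ and $(j,i)$ contributes the same term.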

The bivariate marginal distribution of any pair of variables $(X_{i},X_{j})$ is also normal, with expectation and variance given by

$$\begin{bmatrix}X_{i}\\ X_{j}\end{bmatrix}\sim\operatorname{\mathsf{MVN}}_{2}\left(\begin{bmatrix}\mu_{i}\\ \mu_{j}\end{bmatrix},\begin{bmatrix}\sigma_{i}^{2}&\rho_{ij}\sigma_{i}\sigma_{j}\\ \rho_{ij}\sigma_{i}\sigma_{j}&\sigma_{j}^{2}\end{bmatrix}\right).$$

In general, all the lower-order marginal distributions, i.e. the distributions of any subset of the $X$'s, are normal, with expectation vector and variance matrix obtained by deleting the rows and columns corresponding to the variables we leave out.

Indeed, any multi-dimensional linear transformation of $\boldsymbol{X}$ is normal:

$$A\boldsymbol{X}\sim\operatorname{\mathsf{MVN}}_{m}(A\boldsymbol{\mu},A\Sigma A^{\prime}),$$

where $A$ is an $m\times d$ matrix of constants.
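The parameters of $A\boldsymbol{X}$ are simple matrix products. The sketch below uses a hypothetical helper `transform_mvn` and illustrative values for `mu`, `Sigma` and `A`; it also shows that choosing $A$ as a selection matrix (rows of the identity) reproduces the rule for marginal distributions.

```python
import numpy as np

def transform_mvn(A, mu, Sigma):
    """Return (A mu, A Sigma A') for X ~ MVN(mu, Sigma) (hypothetical helper)."""
    A = np.atleast_2d(np.asarray(A, dtype=float))
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    return A @ mu, A @ Sigma @ A.T

# Illustrative parameters (not from the notes).
mu = np.array([1.0, -1.0, 2.0])
Sigma = np.array([
    [4.0, 1.0, -0.6],
    [1.0, 1.0, 0.9],
    [-0.6, 0.9, 9.0],
])

# A 2 x 3 matrix of constants: (X_1 + X_2, X_2 - X_3).
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, -1.0]])
mean_AX, var_AX = transform_mvn(A, mu, Sigma)

# A selection matrix picks out (X_1, X_3), which gives their marginal.
S = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
mean_sel, var_sel = transform_mvn(S, mu, Sigma)
```

Taking $A$ to be a single row $(a_{1},\ldots,a_{d})$ recovers the univariate linear combination result.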

It also turns out that every conditional distribution of a subset of the $X$'s given the rest is again normal.

Example 10.2.1.

Let $(X_{1},X_{2},X_{3},X_{4},X_{5})^{\prime}$ be multivariate normal with

$$\begin{bmatrix}X_{1}\\ X_{2}\\ X_{3}\\ X_{4}\\ X_{5}\end{bmatrix}\sim\operatorname{\mathsf{MVN}}_{5}\left(\begin{bmatrix}0\\ 1\\ -2\\ 0\\ 4\end{bmatrix},\begin{bmatrix}26&15&15&-7&-1\\ 15&18&-20&13&0\\ 15&-20&99&-44&0\\ -7&13&-44&143&73\\ -1&0&0&73&90\end{bmatrix}\right).$$

Find the marginal distribution of $(X_{1},X_{3},X_{4})$.

Solution. The marginal distribution is obtained by deleting rows $2$ and $5$ of the expectation vector, and rows $2$ and $5$ and columns $2$ and $5$ of the variance matrix. Thus

$$\begin{bmatrix}X_{1}\\ X_{3}\\ X_{4}\end{bmatrix}\sim\operatorname{\mathsf{MVN}}_{3}\left(\begin{bmatrix}0\\ -2\\ 0\end{bmatrix},\begin{bmatrix}26&15&-7\\ 15&99&-44\\ -7&-44&143\end{bmatrix}\right).$$
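The deletion step in this example can be reproduced with NumPy indexing. This is only a sketch of the bookkeeping; the variable names are illustrative, and positions are zero-based, so $X_{1},X_{3},X_{4}$ correspond to indices 0, 2 and 3.

```python
import numpy as np

# Parameters of Example 10.2.1.
mu = np.array([0, 1, -2, 0, 4])
Sigma = np.array([
    [26, 15, 15, -7, -1],
    [15, 18, -20, 13, 0],
    [15, -20, 99, -44, 0],
    [-7, 13, -44, 143, 73],
    [-1, 0, 0, 73, 90],
])

# Keep X_1, X_3, X_4 (zero-based indices), i.e. drop rows/columns 2 and 5.
keep = [0, 2, 3]
mu_marg = mu[keep]
Sigma_marg = Sigma[np.ix_(keep, keep)]
```

`np.ix_` selects the same set of indices for rows and columns, which matches deleting rows 2 and 5 and columns 2 and 5.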

The most important special case is when $X_{1},\ldots,X_{d}$ are i.i.d. $N(\mu,\sigma^{2})$ random variables. In this case

$$\boldsymbol{\mu}=\begin{bmatrix}\mu\\ \vdots\\ \mu\end{bmatrix},$$

and

$$\Sigma=\begin{bmatrix}\sigma^{2}&0&\cdots&0\\ 0&\ddots&\ddots&\vdots\\ \vdots&\ddots&\ddots&0\\ 0&\cdots&0&\sigma^{2}\end{bmatrix}=\sigma^{2}I_{d},$$

where $I_{d}$ is the identity matrix in $d$ dimensions, i.e. a $d\times d$ matrix with ones on the diagonal and zeros everywhere else. The pdf becomes

$$f_{\boldsymbol{X}}(\boldsymbol{x})=\frac{1}{(2\pi)^{d/2}\sigma^{d}}\exp\left\{-\frac{1}{2\sigma^{2}}\sum_{i=1}^{d}(x_{i}-\mu)^{2}\right\},$$

since $\Sigma^{-1}=\frac{1}{\sigma^{2}}I_{d}$ and $\sqrt{\det\Sigma}=\sigma^{d}$.
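In the i.i.d. case the joint pdf should equal the product of the $d$ univariate $N(\mu,\sigma^{2})$ densities. The sketch below checks this numerically; the function names and the test point are illustrative assumptions, not part of the notes.

```python
import math

def iid_normal_joint_pdf(x, mu, sigma):
    """Joint pdf of d iid N(mu, sigma^2) variables, using Sigma = sigma^2 I_d."""
    d = len(x)
    sum_sq = sum((xi - mu) ** 2 for xi in x)
    norm_const = (2.0 * math.pi) ** (d / 2.0) * sigma ** d
    return math.exp(-sum_sq / (2.0 * sigma ** 2)) / norm_const

def normal_pdf(x, mu, sigma):
    """Univariate N(mu, sigma^2) pdf."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / (
        math.sqrt(2.0 * math.pi) * sigma
    )

# Illustrative point and parameters.
x = [0.3, -1.2, 2.5]
mu, sigma = 0.5, 1.5

joint = iid_normal_joint_pdf(x, mu, sigma)
product = math.prod(normal_pdf(xi, mu, sigma) for xi in x)
```

Here `joint` and `product` agree up to floating-point rounding, reflecting that a diagonal $\Sigma$ makes the exponent a sum of univariate terms.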