
10.1 The Bivariate Normal Distribution

Two continuous random variables $X$ and $Y$ are said to have a bivariate normal distribution with parameters $\boldsymbol{\theta}=(\mu_X,\mu_Y,\sigma_X^2,\sigma_Y^2,\rho)$, where $\sigma_X^2>0$, $\sigma_Y^2>0$ and $-1<\rho<1$, if their joint pdf is given for all $x$ and $y$ by

\[ f_{XY}(x,y;\boldsymbol{\theta}) = \frac{1}{2\pi\sqrt{\sigma_X^2\sigma_Y^2(1-\rho^2)}} \times \exp\left\{-\frac{1}{2}\cdot\frac{1}{1-\rho^2}\,Q(x,y)\right\}, \]

where

\[ Q(x,y) = \left(\frac{x-\mu_X}{\sigma_X}\right)^2 - 2\rho\left(\frac{x-\mu_X}{\sigma_X}\right)\left(\frac{y-\mu_Y}{\sigma_Y}\right) + \left(\frac{y-\mu_Y}{\sigma_Y}\right)^2. \]

It will be shown below that the marginal distributions of $X$ and $Y$ are both normal with $X \sim \mathsf{N}(\mu_X,\sigma_X^2)$, $Y \sim \mathsf{N}(\mu_Y,\sigma_Y^2)$ and $\mathsf{Corr}[X,Y]=\rho$. This explains the notation and the restrictions on the parameters.
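As a quick sanity check of the definition, the following sketch evaluates the joint pdf directly from the formula above and compares it with scipy.stats.multivariate_normal; the parameter values and the evaluation point are illustrative assumptions, not taken from the notes.

```python
# A minimal check of the bivariate normal pdf formula, assuming scipy is available;
# the parameter values below are illustrative only.
import numpy as np
from scipy.stats import multivariate_normal

def bvn_pdf(x, y, mu_x, mu_y, var_x, var_y, rho):
    """Evaluate f_XY(x, y; theta) directly from the formula above."""
    sx, sy = np.sqrt(var_x), np.sqrt(var_y)
    zx, zy = (x - mu_x) / sx, (y - mu_y) / sy
    Q = zx**2 - 2 * rho * zx * zy + zy**2          # the quadratic form Q(x, y)
    norm_const = 2 * np.pi * np.sqrt(var_x * var_y * (1 - rho**2))
    return np.exp(-0.5 * Q / (1 - rho**2)) / norm_const

mu_x, mu_y, var_x, var_y, rho = 1.0, 2.0, 1.0, 4.0, 0.8
cov = [[var_x, rho * np.sqrt(var_x * var_y)],
       [rho * np.sqrt(var_x * var_y), var_y]]

x, y = 0.3, 1.5
print(bvn_pdf(x, y, mu_x, mu_y, var_x, var_y, rho))
print(multivariate_normal.pdf([x, y], mean=[mu_x, mu_y], cov=cov))  # should agree
```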

In the special case $\rho=0$, i.e. no correlation, the joint pdf factorises into:

\[ f_{XY}(x,y;\boldsymbol{\theta}) = \frac{1}{\sqrt{2\pi\sigma_X^2}} \exp\left\{-\frac{1}{2}\left(\frac{x-\mu_X}{\sigma_X}\right)^2\right\} \times \frac{1}{\sqrt{2\pi\sigma_Y^2}} \exp\left\{-\frac{1}{2}\left(\frac{y-\mu_Y}{\sigma_Y}\right)^2\right\}, \]

and so $X$ and $Y$ are independent. That is, for bivariate normal random variables, zero correlation implies independence. Recall from Section 7.1 that this is NOT true in general.
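The factorisation is easy to illustrate numerically; the sketch below, which assumes scipy is available and uses an arbitrary evaluation point, compares the joint pdf with $\rho=0$ to the product of the two marginal normal pdfs.

```python
# A quick numerical illustration that the joint pdf factorises when rho = 0,
# assuming scipy; the point (x, y) and the parameters are arbitrary choices.
import numpy as np
from scipy.stats import multivariate_normal, norm

mu_x, mu_y, var_x, var_y = 0.0, 0.0, 1.0, 2.0
x, y = 0.7, -1.2

joint = multivariate_normal.pdf([x, y], mean=[mu_x, mu_y],
                                cov=[[var_x, 0.0], [0.0, var_y]])
product = norm.pdf(x, mu_x, np.sqrt(var_x)) * norm.pdf(y, mu_y, np.sqrt(var_y))
print(joint, product)  # identical: the joint pdf is the product of the marginals
```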

Realisations from this bivariate distribution and contour plots of the corresponding pdfs are shown in Figures 10.1, 10.2 and 10.3 for $\mu_X=\mu_Y=0$, $\sigma_X^2=\sigma_Y^2=1$ and varying values of $\rho$. Note that the contours of the pdf are ellipses centred at the origin, with orientation determined by $\rho$. In general the contours are ellipses centred at $(\mu_X,\mu_Y)$.

Figure 10.1: Left: 1000 realisations of a bivariate normal distribution with $\rho=0$. The marginal distributions are standard normal. Right: Contour plot of the corresponding pdf.
Figure 10.2: Left: 1000 realisations of a bivariate normal distribution with $\rho=0.8$. The marginal distributions are standard normal. Right: Contour plot of the corresponding pdf.
Figure 10.3: Left: 1000 realisations of a bivariate normal distribution with $\rho=-0.6$. The marginal distributions are standard normal. Right: Contour plot of the corresponding pdf.
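Plots in the style of Figures 10.1–10.3 could be reproduced along the following lines; this is a sketch only, assuming numpy, scipy and matplotlib are available, and is not the code used to produce the figures.

```python
# A sketch of scatter-plus-contour plots like Figures 10.1-10.3: standard normal
# margins, 1000 realisations, and a contour plot of the joint pdf.
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import multivariate_normal

rho = 0.8                                   # try 0, 0.8, -0.6 as in the figures
cov = [[1.0, rho], [rho, 1.0]]
rng = np.random.default_rng(1)
samples = rng.multivariate_normal([0.0, 0.0], cov, size=1000)

fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(10, 4))
ax_left.scatter(samples[:, 0], samples[:, 1], s=5)           # realisations
grid = np.linspace(-3, 3, 200)
X, Y = np.meshgrid(grid, grid)
Z = multivariate_normal.pdf(np.dstack([X, Y]), mean=[0.0, 0.0], cov=cov)
ax_right.contour(X, Y, Z)                                    # elliptical contours
plt.show()
```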

The pdf is more conveniently expressed in matrix notation, which also makes the analogy with the univariate case clearer and the extension to higher dimensions easier. Set

\[ \Sigma = \begin{bmatrix} \sigma_X^2 & \rho\sigma_X\sigma_Y \\ \rho\sigma_X\sigma_Y & \sigma_Y^2 \end{bmatrix} \]

so that

\[ \Sigma^{-1} = \frac{1}{\sigma_X^2\sigma_Y^2(1-\rho^2)} \begin{bmatrix} \sigma_Y^2 & -\rho\sigma_X\sigma_Y \\ -\rho\sigma_X\sigma_Y & \sigma_X^2 \end{bmatrix} = \frac{1}{1-\rho^2} \begin{bmatrix} \frac{1}{\sigma_X^2} & \frac{-\rho}{\sigma_X\sigma_Y} \\ \frac{-\rho}{\sigma_X\sigma_Y} & \frac{1}{\sigma_Y^2} \end{bmatrix}. \]

Then

\[ f_{XY}(x,y;\boldsymbol{\theta}) = \frac{1}{2\pi\sqrt{\det\Sigma}} \exp\left\{-\frac{1}{2} \begin{bmatrix} x-\mu_X & y-\mu_Y \end{bmatrix} \Sigma^{-1} \begin{bmatrix} x-\mu_X \\ y-\mu_Y \end{bmatrix}\right\}. \]
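The matrix form can be checked against the earlier scalar form at any point; the sketch below does this with illustrative parameter values (assumptions, not from the notes), assuming numpy is available.

```python
# A small check that the matrix form of the pdf matches the scalar form with Q(x, y).
import numpy as np

mu = np.array([1.0, 2.0])
var_x, var_y, rho = 1.0, 4.0, -0.6
Sigma = np.array([[var_x, rho * np.sqrt(var_x * var_y)],
                  [rho * np.sqrt(var_x * var_y), var_y]])

z = np.array([0.5, 1.0]) - mu                 # (x - mu_X, y - mu_Y)

# Matrix form: quadratic form (x - mu)^T Sigma^{-1} (x - mu)
matrix_form = np.exp(-0.5 * z @ np.linalg.inv(Sigma) @ z) / (2 * np.pi * np.sqrt(np.linalg.det(Sigma)))

# Scalar form with Q(x, y) as defined at the start of the section
zx, zy = z[0] / np.sqrt(var_x), z[1] / np.sqrt(var_y)
Q = zx**2 - 2 * rho * zx * zy + zy**2
scalar_form = np.exp(-0.5 * Q / (1 - rho**2)) / (2 * np.pi * np.sqrt(var_x * var_y * (1 - rho**2)))

print(matrix_form, scalar_form)  # should agree
```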

Generating the bivariate normal

To derive properties of the bivariate normal distribution it is often a good idea to think of it as a transformation of two independent standard normal random variables $U$ and $V$, say.

Now consider the linear transformation

\[ \begin{bmatrix} X \\ Y \end{bmatrix} = \begin{bmatrix} \mu_X \\ \mu_Y \end{bmatrix} + A \begin{bmatrix} U \\ V \end{bmatrix}, \qquad (10.1) \]

where

\[ A = \begin{bmatrix} \sigma_X & 0 \\ \rho\sigma_Y & \sqrt{1-\rho^2}\,\sigma_Y \end{bmatrix}. \]

Or equivalently

\[ X = \mu_X + \sigma_X U, \qquad (10.2) \]
\[ Y = \mu_Y + \rho\sigma_Y U + \sqrt{1-\rho^2}\,\sigma_Y V. \qquad (10.3) \]

Clearly

  1. $\mathsf{E}[X] = \mu_X + \sigma_X\,\mathsf{E}[U] = \mu_X$,

  2. $\mathsf{E}[Y] = \mu_Y + \rho\sigma_Y\,\mathsf{E}[U] + \sqrt{1-\rho^2}\,\sigma_Y\,\mathsf{E}[V] = \mu_Y$.

Using the independence of U and V, the variances are

  1. $\mathsf{Var}[X] = \sigma_X^2\,\mathsf{Var}[U] = \sigma_X^2$,

  2. $\mathsf{Var}[Y] = \rho^2\sigma_Y^2\,\mathsf{Var}[U] + (1-\rho^2)\sigma_Y^2\,\mathsf{Var}[V] = \sigma_Y^2$.

Finally, the covariance and correlation are (again using independence of U and V)

\[ \mathsf{Cov}[X,Y] = \mathsf{Cov}\!\left[\sigma_X U,\ \rho\sigma_Y U\right] + \mathsf{Cov}\!\left[\sigma_X U,\ \sqrt{1-\rho^2}\,\sigma_Y V\right] = \rho\sigma_X\sigma_Y\,\mathsf{Var}[U] = \rho\sigma_X\sigma_Y \]

and

\[ \mathsf{Corr}[X,Y] = \rho. \]
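These calculations can be checked empirically by simulating from the transformation (10.1)–(10.3); the sketch below, assuming numpy and with illustrative parameter values, compares the sample moments with $(\mu_X,\mu_Y)$, $(\sigma_X^2,\sigma_Y^2)$ and $\rho$.

```python
# An empirical check of the moment calculations: generate many (X, Y) pairs via
# the transformation (10.1)-(10.3) and compare sample moments with the parameters.
import numpy as np

mu_x, mu_y, sigma_x, sigma_y, rho = 1.0, -2.0, 1.0, 3.0, -0.6   # illustrative values
A = np.array([[sigma_x, 0.0],
              [rho * sigma_y, np.sqrt(1 - rho**2) * sigma_y]])

rng = np.random.default_rng(2)
UV = rng.standard_normal((200_000, 2))         # iid N(0, 1) pairs (U, V)
XY = np.array([mu_x, mu_y]) + UV @ A.T         # apply (10.1) row by row

print(XY.mean(axis=0))                         # approx (mu_X, mu_Y)
print(XY.var(axis=0))                          # approx (sigma_X^2, sigma_Y^2)
print(np.corrcoef(XY[:, 0], XY[:, 1])[0, 1])   # approx rho
```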

We have shown that the parameters have their intuitive interpretations. We also need to show that $(X,Y)$ has the correct joint distribution. We do this for a general one-to-one linear transformation of a vector of $d$ iid $\mathsf{N}(0,1)$ random variables (i.e. not just for $d=2$).

Proposition 10.1.1.

Let $\boldsymbol{\mu}$ be a $d\times 1$ vector, $A$ an invertible $d\times d$ matrix and $U_1,\ldots,U_d$ iid $\mathsf{N}(0,1)$ random variables with $\boldsymbol{U}=(U_1,\ldots,U_d)^T$. Then $\boldsymbol{X}=\boldsymbol{\mu}+A\boldsymbol{U}$ has density

\[ f_{\boldsymbol{X}}(\boldsymbol{x}) = \frac{1}{(2\pi)^{d/2}(\det\Sigma)^{1/2}} \exp\!\left[-\frac{1}{2}(\boldsymbol{x}-\boldsymbol{\mu})^T \Sigma^{-1} (\boldsymbol{x}-\boldsymbol{\mu})\right], \qquad (10.4) \]

where $\Sigma = AA^T$.

Proof.

By independence, the joint density of 𝑼 is

\[ f_{\boldsymbol{U}}(\boldsymbol{u}) = \prod_{i=1}^{d} \frac{1}{\sqrt{2\pi}} \exp\!\left[-\frac{1}{2}u_i^2\right] = \frac{1}{(2\pi)^{d/2}} \exp\!\left[-\frac{1}{2}\sum_{i=1}^{d}u_i^2\right] = \frac{1}{(2\pi)^{d/2}} \exp\!\left[-\frac{1}{2}\boldsymbol{u}^T\boldsymbol{u}\right]. \]

Since $A$ is invertible, the transformation $\boldsymbol{U}\mapsto\boldsymbol{X}$ is one-to-one and we may use the density method. Now

\[ X_i = \mu_i + \sum_{j=1}^{d} A_{ij} U_j, \]

so

\[ \frac{\partial X_i}{\partial U_j} = A_{ij}. \]

Hence

\[ \left|\det\frac{\partial\boldsymbol{U}}{\partial\boldsymbol{X}}\right| = \left|\det\frac{\partial\boldsymbol{X}}{\partial\boldsymbol{U}}\right|^{-1} = |\det A|^{-1}. \]

Thus

\begin{align*}
f_{\boldsymbol{X}}(\boldsymbol{x}) &= \frac{1}{(2\pi)^{d/2}|\det A|} \exp\!\left[-\frac{1}{2}\left(A^{-1}(\boldsymbol{x}-\boldsymbol{\mu})\right)^T A^{-1}(\boldsymbol{x}-\boldsymbol{\mu})\right] \\
&= \frac{1}{(2\pi)^{d/2}|\det A|} \exp\!\left[-\frac{1}{2}(\boldsymbol{x}-\boldsymbol{\mu})^T (A^{-1})^T A^{-1}(\boldsymbol{x}-\boldsymbol{\mu})\right] \\
&= \frac{1}{(2\pi)^{d/2}|\det A|} \exp\!\left[-\frac{1}{2}(\boldsymbol{x}-\boldsymbol{\mu})^T (A A^T)^{-1}(\boldsymbol{x}-\boldsymbol{\mu})\right].
\end{align*}

The result follows since $\Sigma = AA^T$ and $\det\Sigma = \det A\,\det A^T = (\det A)^2$.

In our case with A as in (10.1),

\[ \Sigma = \begin{bmatrix} \sigma_X & 0 \\ \rho\sigma_Y & \sqrt{1-\rho^2}\,\sigma_Y \end{bmatrix} \begin{bmatrix} \sigma_X & \rho\sigma_Y \\ 0 & \sqrt{1-\rho^2}\,\sigma_Y \end{bmatrix} = \begin{bmatrix} \sigma_X^2 & \rho\sigma_X\sigma_Y \\ \rho\sigma_X\sigma_Y & \sigma_Y^2 \end{bmatrix}, \qquad (10.5) \]

as claimed. ∎
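Proposition 10.1.1 also suggests a way of simulating from a general multivariate normal: factor a target $\Sigma$ as $AA^T$ and transform iid standard normals. The sketch below, assuming numpy and with an illustrative $3\times 3$ choice of $\Sigma$, uses the Cholesky factorisation (which in the bivariate case yields exactly the matrix $A$ following (10.1)).

```python
# A sketch of Proposition 10.1.1 in use: factor Sigma = A A^T via Cholesky,
# then set X = mu + A U for iid N(0, 1) variables U. Numbers are illustrative.
import numpy as np

mu = np.array([1.0, -1.0, 0.5])
Sigma = np.array([[2.0, 0.6, 0.3],
                  [0.6, 1.0, 0.2],
                  [0.3, 0.2, 1.5]])

A = np.linalg.cholesky(Sigma)                # lower triangular, with A @ A.T == Sigma
print(A @ A.T)                               # recovers Sigma

rng = np.random.default_rng(4)
U = rng.standard_normal((100_000, 3))        # rows are iid N(0, 1) vectors
X = mu + U @ A.T                             # X = mu + A U, applied row-wise
print(np.cov(X, rowvar=False))               # approx Sigma
print(X.mean(axis=0))                        # approx mu
```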

Marginal distributions

This shows that the marginal distributions of X and Y are both normal since X and Y are both linear combinations of the independent normal random variables U and V (the convolution property).
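The marginal claim can also be verified numerically: integrating the joint pdf over $y$ should recover the $\mathsf{N}(\mu_X,\sigma_X^2)$ density at any fixed $x$. The sketch below assumes scipy is available and uses illustrative parameter values.

```python
# A numerical sanity check of the marginal distribution of X: integrate the joint
# pdf over y and compare with the N(mu_X, sigma_X^2) density at a fixed x.
import numpy as np
from scipy.stats import multivariate_normal, norm
from scipy.integrate import quad

mu_x, mu_y, var_x, var_y, rho = 1.0, -1.0, 2.0, 1.0, 0.7   # illustrative values
cov = [[var_x, rho * np.sqrt(var_x * var_y)],
       [rho * np.sqrt(var_x * var_y), var_y]]

x = 0.5
marginal, _ = quad(lambda y: multivariate_normal.pdf([x, y], mean=[mu_x, mu_y], cov=cov),
                   -np.inf, np.inf)
print(marginal, norm.pdf(x, mu_x, np.sqrt(var_x)))          # should agree
```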

Linear transformations

Suppose X and Y are bivariate Normal random variables. Consider the linear transformation

\[ \begin{bmatrix} S \\ T \end{bmatrix} = \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} + B \begin{bmatrix} X \\ Y \end{bmatrix}, \]

and assume this is one-to-one, i.e. $\det(B)\neq 0$. Then

\[ \begin{bmatrix} S \\ T \end{bmatrix} = \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} + B\left(\begin{bmatrix} \mu_X \\ \mu_Y \end{bmatrix} + A \begin{bmatrix} U \\ V \end{bmatrix}\right). \]

But this can be written as

\[ \begin{bmatrix} S \\ T \end{bmatrix} = \boldsymbol{\mu}^* + A^* \begin{bmatrix} U \\ V \end{bmatrix}, \]

where

\[ \boldsymbol{\mu}^* = \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} + B \begin{bmatrix} \mu_X \\ \mu_Y \end{bmatrix} \]

and $A^* = BA$.

Since $\det(A)\neq 0$ and $\det(B)\neq 0$, we have $\det(BA)\neq 0$, and so $(S,T)$ also has a bivariate normal distribution; i.e. one-to-one linear transformations of bivariate normal random variables are again bivariate normal. Since $U$ and $V$ are independent $\mathsf{N}(0,1)$ random variables, the expectation and variance of $(S,T)$ are $\boldsymbol{\mu}^*$ and $A^*(A^*)^T$ respectively.
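A short sketch of this closure property, assuming numpy and with a hypothetical shift $c$ and invertible matrix $B$: the mean of $(S,T)$ is $\boldsymbol{\mu}^* = c + B\boldsymbol{\mu}$ and its covariance matrix is $A^*(A^*)^T = B\Sigma B^T$.

```python
# Linear transformations of a bivariate normal: mean c + B mu, covariance B Sigma B^T.
# The values of mu, Sigma, c and B below are illustrative assumptions.
import numpy as np

mu = np.array([0.0, 0.0])
sigma_x, sigma_y, rho = 1.0, 2.0, 0.5
Sigma = np.array([[sigma_x**2, rho * sigma_x * sigma_y],
                  [rho * sigma_x * sigma_y, sigma_y**2]])
A = np.linalg.cholesky(Sigma)                 # one choice of A with A A^T = Sigma

c = np.array([1.0, -1.0])
B = np.array([[2.0, 1.0],
              [0.0, 3.0]])                    # det(B) = 6 != 0, so the map is one-to-one

mu_star = c + B @ mu
A_star = B @ A
print(mu_star)                                # mean of (S, T)
print(A_star @ A_star.T)                      # covariance of (S, T)
print(B @ Sigma @ B.T)                        # the same matrix, computed directly
```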

Example 10.1.1.

The joint distribution of $(X,Y)$ is bivariate Normal with expectation $\begin{bmatrix} 1 \\ 2 \end{bmatrix}$ and variance

\[ \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}. \]

Find the distribution of $T = aX + bY$.

Solution.  As linear combinations of MVN variables are normally distributed, we just have to find the expectation and variance.

\begin{align*}
\mathsf{E}[T] &= \begin{bmatrix} a & b \end{bmatrix} \mathsf{E}\!\left[\begin{bmatrix} X \\ Y \end{bmatrix}\right] = \begin{bmatrix} a & b \end{bmatrix} \begin{bmatrix} 1 \\ 2 \end{bmatrix} = a + 2b, \\
\mathsf{Var}[T] &= \begin{bmatrix} a & b \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = a^2 + 4ab + 4b^2 = (a+2b)^2.
\end{align*}

Hence $T \sim \mathsf{N}\!\left(a+2b,\,(a+2b)^2\right)$.
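The example can be checked by simulation; the sketch below assumes numpy and uses arbitrary illustrative coefficients $a$ and $b$, comparing the sample mean and variance of $T$ with $a+2b$ and $(a+2b)^2$.

```python
# A numerical check of Example 10.1.1: simulate (X, Y) with the given mean and
# covariance, form T = aX + bY, and compare with N(a + 2b, (a + 2b)^2).
import numpy as np

a, b = 1.5, -0.5                              # arbitrary illustrative coefficients
mean = np.array([1.0, 2.0])
cov = np.array([[1.0, 2.0],
                [2.0, 4.0]])

rng = np.random.default_rng(3)
XY = rng.multivariate_normal(mean, cov, size=100_000)
T = a * XY[:, 0] + b * XY[:, 1]

print(T.mean(), a + 2 * b)                    # expectation a + 2b
print(T.var(), (a + 2 * b)**2)                # variance (a + 2b)^2
```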