
7.4 Variances of Linear Transformations

We are interested in ${\operatorname{\mathsf{Var}}}\left[{\boldsymbol{Y}}\right]$ where $\boldsymbol{Y}=A\boldsymbol{X}$ . First note that for any $n$ -vector $\boldsymbol{v}$ ,

\displaystyle\boldsymbol{v}\boldsymbol{v^{\prime}}=\begin{bmatrix}v_{1}v_{1}&v_{1}v_{2}&\dots&v_{1}v_{n}\\ v_{2}v_{1}&v_{2}v_{2}&\dots&v_{2}v_{n}\\ \vdots&\vdots&\ddots&\vdots\\ v_{n}v_{1}&v_{n}v_{2}&\dots&v_{n}v_{n}\end{bmatrix},

i.e. $[\boldsymbol{v}\boldsymbol{v^{\prime}}]_{ij}=v_{i}v_{j}$ . Since ${\operatorname{\mathsf{Cov}}}\left[{Y_{i},Y_{j}}\right]=\operatorname{\mathsf{E}}\left[{Y_{i}Y_{j}}\right]-\operatorname{\mathsf{E}}\left[{Y_{i}}\right]\operatorname{\mathsf{E}}\left[{Y_{j}}\right]$ , we can therefore write

	$\displaystyle{\operatorname{\mathsf{Var}}}\left[{\boldsymbol{Y}}\right]_{ij}$	$\displaystyle=\operatorname{\mathsf{E}}\left[{(\boldsymbol{Y}\boldsymbol{Y^{\prime}})_{ij}}\right]-(\operatorname{\mathsf{E}}\left[{\boldsymbol{Y}}\right]\operatorname{\mathsf{E}}\left[{\boldsymbol{Y^{\prime}}}\right])_{ij}$
		$\displaystyle=\operatorname{\mathsf{E}}\left[{\boldsymbol{Y}\boldsymbol{Y^{\prime}}}\right]_{ij}-(\operatorname{\mathsf{E}}\left[{\boldsymbol{Y}}\right]\operatorname{\mathsf{E}}\left[{\boldsymbol{Y^{\prime}}}\right])_{ij}$

by the definition of matrix expectation. Hence

\displaystyle{\operatorname{\mathsf{Var}}}\left[{\boldsymbol{Y}}\right]=\operatorname{\mathsf{E}}\left[{\boldsymbol{Y}\boldsymbol{Y^{\prime}}}\right]-\operatorname{\mathsf{E}}\left[{\boldsymbol{Y}}\right]\operatorname{\mathsf{E}}\left[{\boldsymbol{Y}}\right]^{\prime},

and similarly ${\operatorname{\mathsf{Var}}}\left[{\boldsymbol{X}}\right]=\operatorname{\mathsf{E}}\left[{\boldsymbol{X}\boldsymbol{X^{\prime}}}\right]-\operatorname{\mathsf{E}}\left[{\boldsymbol{X}}\right]\operatorname{\mathsf{E}}\left[{\boldsymbol{X}}\right]^{\prime}$ .

With the ground work above we can now find the variance matrix for $\boldsymbol{Y}=A\boldsymbol{X}$ .

Theorem 7.4.1.

\displaystyle{\operatorname{\mathsf{Var}}}\left[{A\boldsymbol{X}}\right]=A{\operatorname{\mathsf{Var}}}\left[{\boldsymbol{X}}\right]A^{\prime}.

Proof.

	$\displaystyle{\operatorname{\mathsf{Var}}}\left[{\boldsymbol{Y}}\right]$	$\displaystyle=\operatorname{\mathsf{E}}\left[{\boldsymbol{Y}\boldsymbol{Y^{\prime}}}\right]-\operatorname{\mathsf{E}}\left[{\boldsymbol{Y}}\right]\operatorname{\mathsf{E}}\left[{\boldsymbol{Y}}\right]^{\prime}$
		$\displaystyle=\operatorname{\mathsf{E}}\left[{A\boldsymbol{X}(A\boldsymbol{X})^{\prime}}\right]-\operatorname{\mathsf{E}}\left[{A\boldsymbol{X}}\right]\operatorname{\mathsf{E}}\left[{A\boldsymbol{X}}\right]^{\prime}$
		$\displaystyle=A\operatorname{\mathsf{E}}\left[{\boldsymbol{X}\boldsymbol{X^{\prime}}}\right]A^{\prime}-A\operatorname{\mathsf{E}}\left[{\boldsymbol{X}}\right]\operatorname{\mathsf{E}}\left[{\boldsymbol{X}}\right]^{\prime}A^{\prime}$
		$\displaystyle=A\left(\operatorname{\mathsf{E}}\left[{\boldsymbol{X}\boldsymbol{X^{\prime}}}\right]-\operatorname{\mathsf{E}}\left[{\boldsymbol{X}}\right]\operatorname{\mathsf{E}}\left[{\boldsymbol{X}}\right]^{\prime}\right)A^{\prime}$
		$\displaystyle=A{\operatorname{\mathsf{Var}}}\left[{\boldsymbol{X}}\right]A^{\prime}.$

∎
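As a numerical illustration of Theorem 7.4.1, the sketch below relies on the fact that the empirical distribution of a data set is itself a distribution, so the identity should hold (up to rounding) for sample variance matrices computed with divisor $m$. The matrix `A`, the simulated data and the helper names are illustrative assumptions, not part of the notes.

```python
import random

def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def transpose(P):
    return [list(col) for col in zip(*P)]

def sample_var_matrix(rows):
    """Variance matrix of the empirical distribution (divisor m) of the rows."""
    m, d = len(rows), len(rows[0])
    mean = [sum(r[k] for r in rows) / m for k in range(d)]
    return [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows) / m
             for j in range(d)] for i in range(d)]

random.seed(0)
A = [[2.0, 4.0, 0.0], [1.0, -1.0, 1.0]]          # hypothetical 2 x 3 matrix
xs = [[random.gauss(0, 1), random.gauss(0, 2), random.uniform(-1, 1)]
      for _ in range(500)]                        # observations of X
ys = [[sum(a * x for a, x in zip(row, xvec)) for row in A] for xvec in xs]  # Y = AX

lhs = sample_var_matrix(ys)                                        # Var[AX]
rhs = matmul(matmul(A, sample_var_matrix(xs)), transpose(A))       # A Var[X] A'
max_gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(max_gap)  # only floating-point rounding error remains
```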

Setting $A=\boldsymbol{a^{\prime}}$ gives

\displaystyle{\operatorname{\mathsf{Var}}}\left[{\boldsymbol{a^{\prime}}\boldsymbol{X}}\right]=\boldsymbol{a^{\prime}}{\operatorname{\mathsf{Var}}}\left[{\boldsymbol{X}}\right]\boldsymbol{a}=\sum_{i=1}^{n}\sum_{j=1}^{n}a_{i}{\operatorname{\mathsf{Var}}}\left[{\boldsymbol{X}}\right]_{i,j}a_{j}=\sum_{i=1}^{n}\sum_{j=1}^{n}a_{i}{\operatorname{\mathsf{Cov}}}\left[{X_{i},X_{j}}\right]a_{j}.

For $n=1$ we get back the familiar expression

\displaystyle{\operatorname{\mathsf{Var}}}\left[{a_{1}X_{1}}\right]=a_{1}^{2}{% \operatorname{\mathsf{Var}}}\left[{X_{1}}\right].

For $n=2$ we get Equation (7.1), since ${\operatorname{\mathsf{Var}}}\left[{\boldsymbol{a^{\prime}}\boldsymbol{X}}\right]$ is

	$\displaystyle{\operatorname{\mathsf{Var}}}\left[{a_{1}X_{1}+a_{2}X_{2}}\right]$	$\displaystyle=\begin{pmatrix}a_{1}&a_{2}\end{pmatrix}\begin{pmatrix}{% \operatorname{\mathsf{Var}}}\left[{X_{1}}\right]&{\operatorname{\mathsf{Cov}}}% \left[{X_{1},X_{2}}\right]\\ {\operatorname{\mathsf{Cov}}}\left[{X_{2},X_{1}}\right]&{\operatorname{\mathsf% {Var}}}\left[{X_{2}}\right]\end{pmatrix}\begin{pmatrix}a_{1}\\ a_{2}\end{pmatrix}$
		$\displaystyle=\sum_{i=1}^{2}\sum_{j=1}^{2}a_{i}a_{j}{\operatorname{\mathsf{Cov% }}}\left[{X_{i},X_{j}}\right]$
		$\displaystyle=a_{1}^{2}{\operatorname{\mathsf{Var}}}\left[{X_{1}}\right]+a_{2}% ^{2}{\operatorname{\mathsf{Var}}}\left[{X_{2}}\right]+2a_{1}a_{2}{% \operatorname{\mathsf{Cov}}}\left[{X_{1},X_{2}}\right].$
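The double sum above can be sketched in a few lines of Python and compared with the expanded $n=2$ expression. The variance matrix `V` and coefficients `a` are made-up values chosen only for illustration.

```python
def var_linear_combination(a, V):
    """Var[a'X] = sum_i sum_j a_i a_j Cov[X_i, X_j], with V the variance matrix of X."""
    n = len(a)
    return sum(a[i] * V[i][j] * a[j] for i in range(n) for j in range(n))

# Hypothetical inputs: Var[X1] = 4, Var[X2] = 9, Cov[X1, X2] = 1.5
V = [[4.0, 1.5],
     [1.5, 9.0]]
a = [2.0, -1.0]

double_sum = var_linear_combination(a, V)
expanded = a[0] ** 2 * V[0][0] + a[1] ** 2 * V[1][1] + 2 * a[0] * a[1] * V[0][1]
print(double_sum, expanded)  # 19.0 19.0

# The n = 1 case reduces to a1^2 Var[X1]
print(var_linear_combination([3.0], [[2.0]]))  # 18.0
```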

Finally, setting

\displaystyle A=\begin{bmatrix}\boldsymbol{a}_{1}^{\prime}\\ \boldsymbol{a}_{2}^{\prime}\end{bmatrix}

with $\boldsymbol{Y}=A\boldsymbol{X}$ gives

\displaystyle{\operatorname{\mathsf{Var}}}\left[{\boldsymbol{Y}}\right]=\begin{bmatrix}\boldsymbol{a}_{1}^{\prime}\\ \boldsymbol{a}_{2}^{\prime}\end{bmatrix}{\operatorname{\mathsf{Var}}}\left[{\boldsymbol{X}}\right]\begin{bmatrix}\boldsymbol{a}_{1}&\boldsymbol{a}_{2}\end{bmatrix},

and reading off the $(1,2)$ entry of this product gives

\displaystyle{\operatorname{\mathsf{Cov}}}\left[{\boldsymbol{a}_{1}^{\prime}\boldsymbol{X},\boldsymbol{a}_{2}^{\prime}\boldsymbol{X}}\right]={\operatorname{\mathsf{Cov}}}\left[{Y_{1},Y_{2}}\right]={\operatorname{\mathsf{Var}}}\left[{\boldsymbol{Y}}\right]_{1,2}=\boldsymbol{a}_{1}^{\prime}{\operatorname{\mathsf{Var}}}\left[{\boldsymbol{X}}\right]\boldsymbol{a}_{2}.
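A small sketch of this covariance formula, checking that $\boldsymbol{a}_{1}^{\prime}{\operatorname{\mathsf{Var}}}[\boldsymbol{X}]\boldsymbol{a}_{2}$ agrees with the off-diagonal entry of $A{\operatorname{\mathsf{Var}}}[\boldsymbol{X}]A^{\prime}$. The matrix `V` and the vectors `a1`, `a2` are hypothetical values, not taken from the notes.

```python
def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def transpose(P):
    return [list(col) for col in zip(*P)]

def cov_linear_combinations(a1, a2, V):
    """Cov[a1'X, a2'X] = a1' V a2, where V is the variance matrix of X."""
    n = len(V)
    return sum(a1[i] * V[i][j] * a2[j] for i in range(n) for j in range(n))

# Hypothetical variance matrix and coefficient vectors
V = [[2, 1, 0],
     [1, 3, 1],
     [0, 1, 4]]
a1 = [1, 0, 2]
a2 = [0, 1, -1]

cov_12 = cov_linear_combinations(a1, a2, V)
A = [a1, a2]                                    # rows of A are a1' and a2'
var_Y = matmul(matmul(A, V), transpose(A))      # full 2 x 2 variance matrix of Y
print(cov_12, var_Y[0][1], var_Y[1][0])         # -5 -5 -5
```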

Example 7.4.1.

A square $d\times d$ matrix $M$ is positive semi-definite if for any $d$ -vector $\boldsymbol{a}$ , $\boldsymbol{a}^{T}M\boldsymbol{a}\geq 0$ . Why are all variance matrices positive semi-definite?

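Solution. Let $M$ be the variance matrix of some random vector $\boldsymbol{X}$ . Then, by Theorem 7.4.1 with $A=\boldsymbol{a}^{T}$ , the variance of the scalar $\boldsymbol{a}^{T}\boldsymbol{X}$ is $\boldsymbol{a}^{T}M\boldsymbol{a}$ . Since a variance cannot be negative, $\boldsymbol{a}^{T}M\boldsymbol{a}\geq 0$ for every $\boldsymbol{a}$ .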
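To make the definition concrete, the following sketch evaluates the quadratic form $\boldsymbol{a}^{T}M\boldsymbol{a}$ for many randomly drawn vectors, borrowing the variance matrix that appears in Example 7.4.6. A random search like this can only fail to find a counterexample; it is not a proof. The second matrix `not_psd` is a made-up symmetric matrix included for contrast.

```python
import random

def quadratic_form(a, M):
    """Return a' M a for a vector a and a square matrix M."""
    d = len(a)
    return sum(a[i] * M[i][j] * a[j] for i in range(d) for j in range(d))

# Variance matrix from Example 7.4.6
M = [[1, 0, 1],
     [0, 3, 2],
     [1, 2, 5]]

random.seed(1)
trial_vectors = [[random.uniform(-10, 10) for _ in range(3)] for _ in range(1000)]
smallest = min(quadratic_form(a, M) for a in trial_vectors)
print(smallest >= 0)  # True: no trial vector gives a negative "variance"

# A symmetric matrix that is NOT positive semi-definite (hypothetical example):
not_psd = [[1, 2],
           [2, 1]]
print(quadratic_form([1, -1], not_psd))  # -2, so it cannot be a variance matrix
```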

Example 7.4.2.

Suppose the variances of $X_{1}$ and $X_{2}$ are both $1$ , and that their correlation is $\rho$ .

(a)

Show that the variance of $X_{1}+X_{2}$ lies between $0$ and $4$ .
(b)

What is the value of $\rho$ , and hence the relationship between $X_{1}$ and $X_{2}$ , when ${\operatorname{\mathsf{Var}}}\left[{X_{1}+X_{2}}\right]$ equals $2$ , $4$ and $0$ respectively?

Solution.

(a)

$\displaystyle{\operatorname{\mathsf{Var}}}\left[{X_{1}+X_{2}}\right]$ $\displaystyle={\operatorname{\mathsf{Var}}}\left[{X_{1}}\right]+{\operatorname% {\mathsf{Var}}}\left[{X_{2}}\right]+2{\operatorname{\mathsf{Cov}}}\left[{X_{1}% ,X_{2}}\right]$

$\displaystyle={\color[rgb]{0.76,0.01,0}1+1+2\times 1\times 1\times\rho}$

$\displaystyle={\color[rgb]{0.76,0.01,0}2(1+\rho).}$

The inequality follows as ${\color[rgb]{0.76,0.01,0}-1\leq\rho\leq 1.}$

(b)

For ${\operatorname{\mathsf{Var}}}\left[{X_{1}+X_{2}}\right]=2$ , need ${\color[rgb]{0.76,0.01,0}\rho=0}$ (uncorrelated).

For ${\operatorname{\mathsf{Var}}}\left[{X_{1}+X_{2}}\right]=4$ , need ${\color[rgb]{0.76,0.01,0}\rho=1},$ i.e. ${\color[rgb]{0.76,0.01,0}X_{2}=X_{1}+c}$ ; then ${\operatorname{\mathsf{Var}}}\left[{c+2X_{1}}\right]=4{\operatorname{\mathsf{Var}}}\left[{X_{1}}\right]=4$ .

For ${\operatorname{\mathsf{Var}}}\left[{X_{1}+X_{2}}\right]=0$ , need ${\color[rgb]{0.76,0.01,0}\rho=-1},$ i.e. ${\color[rgb]{0.76,0.01,0}X_{2}=-X_{1}+c}$ ; then ${\operatorname{\mathsf{Var}}}\left[{c}\right]=0$ .
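The three cases in part (b) can be tabulated with a short sketch of the $n=2$ variance formula, using ${\operatorname{\mathsf{Cov}}}[X_{1},X_{2}]=\rho\sigma_{1}\sigma_{2}$. The function name `var_of_sum` is an illustrative choice.

```python
import math

def var_of_sum(var1, var2, rho):
    """Var[X1 + X2] = Var[X1] + Var[X2] + 2 Cov[X1, X2], with Cov = rho * sd1 * sd2."""
    cov = rho * math.sqrt(var1) * math.sqrt(var2)
    return var1 + var2 + 2 * cov

# Unit variances, as in Example 7.4.2: the result is 2(1 + rho)
values = {rho: var_of_sum(1.0, 1.0, rho) for rho in (-1.0, 0.0, 1.0)}
print(values)  # {-1.0: 0.0, 0.0: 2.0, 1.0: 4.0}
```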

Example 7.4.3.

Find ${\operatorname{\mathsf{Cov}}}\left[{X+Y,X-Y}\right]$ , when $X$ and $Y$ have variances $\sigma^{2}_{X}$ and $\sigma^{2}_{Y}$ and correlation $\rho$ .

Solution.

(i)

Matrix multiplication:

$\displaystyle{\operatorname{\mathsf{Cov}}}\left[{X+Y,X-Y}\right]$ $\displaystyle={\color[rgb]{0.76,0.01,0}\begin{bmatrix}{\color[rgb]{0.76,0.01,0% }1}&{\color[rgb]{0.76,0.01,0}1}\end{bmatrix}}\begin{bmatrix}{\sigma_{X}^{2}}&{% \rho\sigma_{X}\sigma_{Y}}\\ {\rho\sigma_{X}\sigma_{Y}}&{\sigma_{Y}^{2}}\end{bmatrix}{\color[rgb]{% 0.76,0.01,0}\begin{bmatrix}{\color[rgb]{0.76,0.01,0}1}\\ {\color[rgb]{0.76,0.01,0}-1}\end{bmatrix}}$

$\displaystyle={\color[rgb]{0.76,0.01,0}\sigma^{2}_{X}-\sigma^{2}_{Y}.}$
(ii)

Bilinearity:

$\displaystyle{\operatorname{\mathsf{Cov}}}\left[{X+Y,X-Y}\right]$ $\displaystyle={\color[rgb]{0.76,0.01,0}{\operatorname{\mathsf{Cov}}}\left[{X,X% }\right]+{\operatorname{\mathsf{Cov}}}\left[{Y,X}\right]-{\operatorname{% \mathsf{Cov}}}\left[{X,Y}\right]-{\operatorname{\mathsf{Cov}}}\left[{Y,Y}% \right]}$

$\displaystyle={\color[rgb]{0.76,0.01,0}{\operatorname{\mathsf{Var}}}\left[{X}% \right]-{\operatorname{\mathsf{Var}}}\left[{Y}\right]=\sigma^{2}_{X}-\sigma^{2% }_{Y}}$

This uses ${\operatorname{\mathsf{Cov}}}\left[{X,Y}\right]={\operatorname{\mathsf{Cov}}}% \left[{Y,X}\right]$ , ${\operatorname{\mathsf{Var}}}\left[{X}\right]={\operatorname{\mathsf{Cov}}}% \left[{X,X}\right]$ , and ${\operatorname{\mathsf{Var}}}\left[{Y}\right]={\operatorname{\mathsf{Cov}}}% \left[{Y,Y}\right]$ .
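A quick numerical sketch of the matrix route: it evaluates $\begin{bmatrix}1&1\end{bmatrix}{\operatorname{\mathsf{Var}}}[(X,Y)^{\prime}]\begin{bmatrix}1\\-1\end{bmatrix}$ for several correlations and confirms that $\rho$ drops out. The standard deviations used (3 and 2) are arbitrary illustrative values.

```python
def cov_sum_diff(sigma_x, sigma_y, rho):
    """Cov[X + Y, X - Y] computed as a1' V a2 with a1 = (1, 1)' and a2 = (1, -1)'."""
    V = [[sigma_x ** 2, rho * sigma_x * sigma_y],
         [rho * sigma_x * sigma_y, sigma_y ** 2]]
    a1, a2 = [1, 1], [1, -1]
    return sum(a1[i] * V[i][j] * a2[j] for i in range(2) for j in range(2))

# Hypothetical standard deviations; the answer should be 9 - 4 = 5 for every rho
results = [cov_sum_diff(3.0, 2.0, rho) for rho in (-0.9, -0.3, 0.0, 0.4, 1.0)]
print(results)
```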

Independence

When $X_{1},\ldots,X_{n}$ are independent and $Y=\boldsymbol{a^{\prime}}\boldsymbol{X}$ , the variance formula simplifies to

\displaystyle{\operatorname{\mathsf{Var}}}\left[{Y}\right]=\sum_{i=1}^{n}\sum_% {j=1}^{n}a_{i}a_{j}{\operatorname{\mathsf{Cov}}}\left[{X_{i},X_{j}}\right]=% \sum_{i=1}^{n}a_{i}a_{i}{\operatorname{\mathsf{Cov}}}\left[{X_{i},X_{i}}\right% ]=\sum_{i=1}^{n}a_{i}^{2}{\operatorname{\mathsf{Var}}}\left[{X_{i}}\right],

because ${\operatorname{\mathsf{Cov}}}\left[{X_{i},X_{j}}\right]=0$ for $i\neq j$ .

In particular, we get the following, which we will use repeatedly through the remainder of this module.

Corollary 7.4.2.

Let $X_{1},\dots,X_{n}$ be independent and define

$S_{n}=X_{1}+\dots+X_{n}$ ,
$\overline{X}_{n}=\frac{1}{n}S_{n}$ .

Then

${\operatorname{\mathsf{Var}}}\left[{S_{n}}\right]={\operatorname{\mathsf{Var}}% }\left[{X_{1}}\right]+\ldots+{\operatorname{\mathsf{Var}}}\left[{X_{n}}\right]$ ,
${\operatorname{\mathsf{Var}}}\left[{\overline{X}_{n}}\right]=\frac{1}{n^{2}}% \sum_{i=1}^{n}{\operatorname{\mathsf{Var}}}\left[{X_{i}}\right]$ .

The variance of the sum is the sum of the variances, when $X_{1},\ldots,X_{n}$ are independent.

Further, if $X_{1},\ldots,X_{n}$ have the same variance, $\sigma^{2}$ , this simplifies to

${\operatorname{\mathsf{Var}}}\left[{S_{n}}\right]=n\sigma^{2}$ ,
${\operatorname{\mathsf{Var}}}\left[{\overline{X}_{n}}\right]=\frac{\sigma^{2}}% {n}$ .

In particular, these formulae hold when $X_{1},\ldots,X_{n}$ are i.i.d. (independent, identically distributed).
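Corollary 7.4.2 can be checked exactly on a small discrete example. The sketch below assumes $n=3$ independent rolls of a fair six-sided die (a choice made here for illustration), enumerates all $6^{3}$ equally likely outcomes, and uses exact fractions so no rounding is involved.

```python
from fractions import Fraction
from itertools import product

def variance(values):
    """Exact variance of a list of equally likely values."""
    m = len(values)
    mean = Fraction(sum(values), m)
    return sum((Fraction(v) - mean) ** 2 for v in values) / m

faces = [1, 2, 3, 4, 5, 6]
n = 3                                            # number of independent rolls

sigma2 = variance(faces)                         # variance of a single roll
sums = [sum(rolls) for rolls in product(faces, repeat=n)]   # all outcomes of S_n
means = [Fraction(s, n) for s in sums]           # all outcomes of the sample mean

var_sum = variance(sums)
var_mean = variance(means)
print(sigma2, var_sum, var_mean)                 # 35/12 35/4 35/36
```

Here `var_sum` equals $n\sigma^{2}$ and `var_mean` equals $\sigma^{2}/n$, as the corollary states.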

Example 7.4.4.

Suppose $X_{1},\dots,X_{n}$ are independent and, for $i=1,\dots,n$ , $X_{i}\sim N(1/i,1/i)$ , where the second parameter is the variance. What is ${\operatorname{\mathsf{Var}}}\left[{\sum_{i=1}^{n}iX_{i}}\right]$ ? Do the $X_{i}$ need to be independent for this result to always hold?

Solution.

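\displaystyle{\operatorname{\mathsf{Var}}}\left[{\sum_{i=1}^{n}iX_{i}}\right]={\color[rgb]{0.76,0.01,0}\sum_{i=1}^{n}i^{2}{\operatorname{\mathsf{Var}}}\left[{X_{i}}\right]=\sum_{i=1}^{n}i^{2}/i=n(n+1)/2.}

The first equality relies on every covariance term vanishing. Independence guarantees this (pairwise uncorrelated $X_{i}$ would also suffice); without it the cross terms $ij\,{\operatorname{\mathsf{Cov}}}\left[{X_{i},X_{j}}\right]$ remain, so the result does not hold in general.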
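The closed form $n(n+1)/2$ and the effect of dropping independence can both be seen in a short exact computation. The helper names and the common off-diagonal covariance `cov` are assumptions introduced for this sketch; `cov=0` corresponds to the example as stated.

```python
from fractions import Fraction

def var_weighted_sum(weights, V):
    """Var[sum_i w_i X_i] = sum_i sum_j w_i w_j Cov[X_i, X_j]."""
    n = len(weights)
    return sum(weights[i] * V[i][j] * weights[j] for i in range(n) for j in range(n))

def example_variance(n, cov=0):
    """Weights w_i = i and Var[X_i] = 1/i; `cov` is a hypothetical common covariance."""
    weights = list(range(1, n + 1))
    V = [[Fraction(1, i) if i == j else Fraction(cov) for j in range(1, n + 1)]
         for i in range(1, n + 1)]
    return var_weighted_sum(weights, V)

print(example_variance(5))                  # 15, which is 5 * 6 / 2
print(example_variance(2))                  # 3
print(example_variance(2, Fraction(1, 2)))  # 5, so the formula fails when Cov != 0
```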

Example 7.4.5.

Two packs of batteries are for sale: pack A contains $4$ batteries, each with an exponentially distributed lifetime with expectation $5$ hours; pack B contains $2$ batteries, each with an exponentially distributed lifetime with expectation $10$ hours. The batteries in a pack are used consecutively. Show that the expected lifetimes for the packs are the same; which is the more reliable pack?

Solution. For pack A the total lifetime is $T_{A}=X_{1}+\dots+X_{4}$ , where the $X_{i}\sim\operatorname{\mathsf{Exp}}(1/5)$ are assumed independent. So $\operatorname{\mathsf{E}}\left[{X_{i}}\right]={\color[rgb]{0.76,0.01,0}5}$ and ${\operatorname{\mathsf{Var}}}\left[{X_{i}}\right]={\color[rgb]{0.76,0.01,0}5^{2}}$ .

Hence $\operatorname{\mathsf{E}}\left[{T_{A}}\right]={\color[rgb]{0.76,0.01,0}4\times 5=20}$ hours, and ${\operatorname{\mathsf{Var}}}\left[{T_{A}}\right]={\color[rgb]{0.76,0.01,0}4{\operatorname{\mathsf{Var}}}\left[{X_{1}}\right]={4}\times{5^{2}}=100}$ .

For pack B the total lifetime is $T_{B}=Y_{1}+Y_{2}$ , where the $Y_{i}\sim\operatorname{\mathsf{Exp}}(1/10)$ are assumed independent. So $\operatorname{\mathsf{E}}\left[{Y_{i}}\right]={\color[rgb]{0.76,0.01,0}10}$ and ${\operatorname{\mathsf{Var}}}\left[{Y_{i}}\right]={\color[rgb]{0.76,0.01,0}10^{2}}$ . Hence $\operatorname{\mathsf{E}}\left[{T_{B}}\right]={\color[rgb]{0.76,0.01,0}2\times 10=20}$ hours, the same as A, and ${\operatorname{\mathsf{Var}}}\left[{T_{B}}\right]={\color[rgb]{0.76,0.01,0}2{\operatorname{\mathsf{Var}}}\left[{Y_{1}}\right]=2\times 10^{2}=200}$ .

Since the expectations are the same, the more reliable pack is the one with the smaller variance. So choose pack A.
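The comparison of the two packs can be reproduced with a minimal sketch. It assumes, as the solution implicitly does, that battery lifetimes within a pack are independent, and uses the fact that an exponential lifetime with mean $m$ has variance $m^{2}$. The function name `pack_summary` is hypothetical.

```python
import math

def pack_summary(num_batteries, mean_lifetime):
    """Mean, variance and standard deviation of the total lifetime of a pack.

    Assumes independent exponential lifetimes, each with the given mean
    (so each has variance mean_lifetime ** 2).
    """
    mean_total = num_batteries * mean_lifetime
    var_total = num_batteries * mean_lifetime ** 2
    return mean_total, var_total, math.sqrt(var_total)

pack_a = pack_summary(4, 5)    # (20, 100, 10.0)
pack_b = pack_summary(2, 10)   # (20, 200, about 14.14)
print(pack_a, pack_b)
```

Both packs have the same expected lifetime, but pack A has the smaller standard deviation (10 hours against roughly 14 hours).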

Example 7.4.6.

$\boldsymbol{X}=(X_{1},X_{2},X_{3})^{\prime}$ has expectation vector and variance matrix given by

$\operatorname{\mathsf{E}}\left[{\boldsymbol{X}}\right]=\begin{pmatrix}1\\ 2\\ -1\end{pmatrix}$ ,
${\operatorname{\mathsf{Var}}}\left[{\boldsymbol{X}}\right]=\begin{pmatrix}1&0&% 1\\ 0&3&2\\ 1&2&5\end{pmatrix}$ .

Find the expectations, variances and covariance of $Y_{1}=2X_{1}+4X_{2}$ and $Y_{2}=X_{1}-X_{2}+X_{3}$ .

Solution. We have

\displaystyle\begin{pmatrix}Y_{1}\\ Y_{2}\end{pmatrix}=A\begin{pmatrix}X_{1}\\ X_{2}\\ X_{3}\end{pmatrix},

where

\displaystyle A=\begin{pmatrix}2&4&0\\ 1&-1&1\end{pmatrix}.

Thus

\displaystyle\operatorname{\mathsf{E}}\left[{\boldsymbol{Y}}\right]=\begin{pmatrix}\operatorname{\mathsf{E}}\left[{Y_{1}}\right]\\ \operatorname{\mathsf{E}}\left[{Y_{2}}\right]\end{pmatrix}=A\operatorname{\mathsf{E}}\left[{\boldsymbol{X}}\right]=\begin{pmatrix}2&4&0\\ 1&-1&1\end{pmatrix}\begin{pmatrix}1\\ 2\\ -1\end{pmatrix}=\begin{pmatrix}10\\ -2\end{pmatrix},

and

	$\displaystyle{\operatorname{\mathsf{Var}}}\left[{\boldsymbol{Y}}\right]$	$\displaystyle=\begin{pmatrix}{\operatorname{\mathsf{Var}}}\left[{Y_{1}}\right]% &{\operatorname{\mathsf{Cov}}}\left[{Y_{1},Y_{2}}\right]\\ {\operatorname{\mathsf{Cov}}}\left[{Y_{2},Y_{1}}\right]&{\operatorname{\mathsf% {Var}}}\left[{Y_{2}}\right]\end{pmatrix}=A{\operatorname{\mathsf{Var}}}\left[{% \boldsymbol{X}}\right]A^{\prime}$
		$\displaystyle=\begin{pmatrix}2&4&0\\ 1&-1&1\end{pmatrix}\begin{pmatrix}1&0&1\\ 0&3&2\\ 1&2&5\end{pmatrix}\begin{pmatrix}2&1\\ 4&-1\\ 0&1\end{pmatrix}=\begin{pmatrix}52&0\\ 0&7\end{pmatrix}.$

Note that $Y_{1}$ and $Y_{2}$ are uncorrelated even though the $X$ ’s are not.
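The arithmetic in Example 7.4.6 can be reproduced with plain Python lists. The helpers `matmul` and `transpose` are small illustrative functions written for this sketch; the numbers are those given in the example.

```python
def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def transpose(P):
    return [list(col) for col in zip(*P)]

A = [[2, 4, 0],
     [1, -1, 1]]
mean_X = [[1], [2], [-1]]          # E[X] as a column vector
var_X = [[1, 0, 1],
         [0, 3, 2],
         [1, 2, 5]]

mean_Y = matmul(A, mean_X)                        # E[Y] = A E[X]
var_Y = matmul(matmul(A, var_X), transpose(A))    # Var[Y] = A Var[X] A'

print(mean_Y)  # [[10], [-2]]
print(var_Y)   # [[52, 0], [0, 7]]
```

The zero off-diagonal entries of `var_Y` are the covariance of $Y_{1}$ and $Y_{2}$.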
