4.A The matrix of a linear transformation

Throughout this chapter we will use the letter $F$ to denote any field; in exercises and applications it will usually mean either $F={\mathbb{R}}$ or $F={\mathbb{C}}$. The notion of a linear transformation was introduced in MATH105 as a function from ${\mathbb{R}}^{n}$ to ${\mathbb{R}}^{m}$. We restate the definition here in terms of arbitrary vector spaces.

Definition 4.1:

Let $V$ and $W$ be vector spaces over the same field $F$ . A function $T:V\to W$ is called a linear transformation if it satisfies the following two conditions:

T1. $T(\vec{v}+\vec{w})=T(\vec{v})+T(\vec{w})$ for any $\vec{v},\vec{w}\in V$,

T2. $T(\alpha\vec{v})=\alpha T(\vec{v})$ for any $\vec{v}\in V$ and $\alpha\in F$.

Here $V$ is the domain of $T$ , and $W$ is the codomain of $T$ .

Example 4.2.

Let $A\in\operatorname{M}_{{n\times m}}({F})$ for a field $F$ . Then the function $T:F^{m}\to F^{n}$ defined as follows is a linear transformation:

T(\vec{x}):=A\vec{x}

for all $\vec{x}\in F^{m}$ . Here we consider elements of $F^{m}$ as $m\times 1$ column vectors.
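Conditions T1 and T2 can be spot-checked for a map of this form. The following is a minimal sketch, assuming $F=\mathbb{Q}$ (modelled with Python's `Fraction`) and a hypothetical $2\times 3$ matrix `A`; the sample vectors and the scalar are also arbitrary choices. Passing on a few samples illustrates the definition but does not prove linearity.

```python
from fractions import Fraction

def mat_vec(A, x):
    """Return the product A x for a matrix A (list of rows) and a vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def add(u, v):
    """Componentwise sum of two vectors of the same length."""
    return [a + b for a, b in zip(u, v)]

def scale(alpha, v):
    """Scalar multiple alpha * v."""
    return [alpha * a for a in v]

# Hypothetical 2x3 matrix over Q, so T maps Q^3 to Q^2.
A = [[Fraction(1), Fraction(2), Fraction(0)],
     [Fraction(0), Fraction(1), Fraction(-1)]]

def T(x):
    return mat_vec(A, x)

# Arbitrary sample vectors and scalar (assumptions for illustration).
v = [Fraction(1, 2), Fraction(3), Fraction(-2)]
w = [Fraction(4), Fraction(-1, 3), Fraction(5)]
alpha = Fraction(7, 5)

t1_holds = T(add(v, w)) == add(T(v), T(w))            # condition T1
t2_holds = T(scale(alpha, v)) == scale(alpha, T(v))   # condition T2
print(t1_holds, t2_holds)
```

Exact rational arithmetic is used so that the equality tests are not affected by floating-point rounding.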

As we have seen in MATH105 for $F=\mathbb{R}$ , every linear transformation $T:F^{m}\to F^{n}$ can be expressed as $T(\vec{x})=A\vec{x}$ for some matrix $A$ .

[Caution: The phrase “linear transformation” is used differently in MATH230. In that module, functions of the form $T(\vec{x})=A\vec{x}+\vec{b}$, where $\vec{b}$ is a non-zero vector, are also considered “linear transformations”; in this module they are not. Other sources sometimes prefer the name “linear map” or “vector space morphism”.]
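One way to see why such maps are excluded by Definition 4.1: condition T2 with $\alpha=0$ forces a linear transformation to send the zero vector to the zero vector, whereas a map of the form $A\vec{x}+\vec{b}$ with $\vec{b}\neq\vec{0}$ does not. A short check, using only T2:

```latex
T(\vec{0}) = T(0\,\vec{v}) = 0\,T(\vec{v}) = \vec{0}
\quad\text{for every linear } T,
\qquad\text{whereas}\qquad
A\vec{0}+\vec{b} = \vec{b} \neq \vec{0}.
```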

Example 4.3.

Let $T:{\mathbb{R}}^{3}\to{\mathbb{R}}^{2}$ be defined by $T(x,y,z):=(x+2y,y-z)$ . Find a matrix $A$ such that $T(\vec{v})=A\vec{v}$ .

Solution: Write $\vec{e_{1}},\vec{e_{2}},\vec{e_{3}}$ for the standard basis of $\mathbb{R}^{3}$ and, to avoid duplicating notation, write $\vec{f_{1}},\vec{f_{2}}$ for the standard basis of $\mathbb{R}^{2}$. Then we compute that

	$\displaystyle T(\vec{e_{1}})$	$\displaystyle=(1,0)=\vec{f_{1}}+0\vec{f_{2}}$
	$\displaystyle T(\vec{e_{2}})$	$\displaystyle=(2,1)=2\vec{f_{1}}+\vec{f_{2}}$
	$\displaystyle T(\vec{e_{3}})$	$\displaystyle=(0,-1)=0\vec{f_{1}}-\vec{f_{2}}.$

Finally, create $A$ by taking the columns to be the coordinates of $T(\vec{e_{i}})$ with respect to the standard basis of $\mathbb{R}^{2}$ . So $A=\begin{bmatrix}1&2&0\\ 0&1&-1\end{bmatrix}$ .
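The recipe in this solution (apply $T$ to each standard basis vector and use the results as columns) can be sketched in a few lines. The helper names `standard_basis` and `standard_matrix` are hypothetical and introduced only for this illustration.

```python
def T(v):
    """The map from Example 4.3: T(x, y, z) = (x + 2y, y - z)."""
    x, y, z = v
    return (x + 2 * y, y - z)

def standard_basis(n):
    """Return e_1, ..., e_n as tuples."""
    return [tuple(1 if i == j else 0 for i in range(n)) for j in range(n)]

def standard_matrix(T, n):
    """Matrix whose j-th column is T(e_j), returned as a list of rows."""
    columns = [T(e) for e in standard_basis(n)]
    # zip(*columns) transposes the list of columns into a list of rows.
    return [list(row) for row in zip(*columns)]

A = standard_matrix(T, 3)
print(A)  # [[1, 2, 0], [0, 1, -1]]
```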

The above example should be familiar from MATH105. It makes use of the standard basis of $\mathbb{R}^{n}$ . The following generalization allows for non-standard bases as well.

Definition 4.4:

Let $V$ , $W$ be vector spaces over the same field $F$ , and assume:

\mathcal{B}=(\vec{b_{1}},\cdots,\vec{b_{m}})

is a basis of $V$ , and

\mathcal{C}=(\vec{c_{1}},\cdots,\vec{c_{n}})

is a basis of $W$. If $T:V\to W$ is a linear transformation, then the matrix of $T$ with domain basis $\mathcal{B}$ and codomain basis $\mathcal{C}$ is constructed as follows:

{}_{\mathcal{C}}[T]_{\mathcal{B}}=\begin{bmatrix}[T(\vec{b_{1}})]_{\mathcal{C}}&\cdots&[T(\vec{b_{m}})]_{\mathcal{C}}\end{bmatrix}\in\operatorname{M}_{{n\times m}}({F}).

In other words, the columns are the coordinates of $T(\vec{b_{i}})$ with respect to the basis $\mathcal{C}$ . In the case when $\mathcal{B}=\mathcal{C}$ we also simply write:

[T]_{\mathcal{B}}:={}_{\mathcal{B}}[T]_{\mathcal{B}}.

If no basis is specified, then the matrix of a linear transformation $T:F^{m}\to F^{n}$ is defined as above, using the standard bases of $F^{m}$ and $F^{n}$, as in Example 4.3.

Example 4.5.

Let $T:{\mathbb{R}}^{2}\to{\mathbb{R}}^{2}$ be defined by $T(x,y):=(4y,-x-4y)$ , and let $\mathcal{B}=((2,-1),(1,0))$ be a basis for ${\mathbb{R}}^{2}$ . Compute the matrix of $T$ with respect to the basis $\mathcal{B}$ in the domain and codomain.

Solution: We compute the coordinates of $T(\vec{b_{i}})$ with respect to $\mathcal{B}$ as follows:

	$\displaystyle T(\vec{b_{1}})$	$\displaystyle=(-4,2)=-2\vec{b_{1}}+0\vec{b_{2}}$
	$\displaystyle T(\vec{b_{2}})$	$\displaystyle=(0,-1)=\vec{b_{1}}-2\vec{b_{2}}.$

Using these coordinates as the column vectors, we find ${}_{\mathcal{B}}[T]_{\mathcal{B}}=\begin{bmatrix}-2&1\\ 0&-2\end{bmatrix}$ .
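The construction in Definition 4.4 can also be carried out mechanically: for each domain basis vector, solve a linear system to find the coordinates of its image in the codomain basis. Below is a minimal sketch applied to the data of Example 4.5. The helpers `coords` and `matrix_of` are hypothetical names, the field is taken to be $\mathbb{Q}$ via `Fraction`, and the code assumes the lists passed in really are bases of $F^{n}$.

```python
from fractions import Fraction

def coords(basis, v):
    """Coordinates of v with respect to `basis`, by Gauss-Jordan elimination.

    Solves sum_j c_j * basis[j] = v exactly; assumes `basis` is a basis of F^n.
    """
    n = len(basis)
    # Augmented matrix: basis vectors as columns, v as the last column.
    M = [[Fraction(basis[j][i]) for j in range(n)] + [Fraction(v[i])]
         for i in range(n)]
    for col in range(n):
        # Find a row with a non-zero entry in this column and swap it up.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        lead = M[col][col]
        M[col] = [entry / lead for entry in M[col]]
        # Clear the column in every other row.
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

def matrix_of(T, dom_basis, cod_basis):
    """Matrix of T whose columns are the cod_basis-coordinates of T(b_i)."""
    columns = [coords(cod_basis, T(b)) for b in dom_basis]
    return [list(row) for row in zip(*columns)]

def T(v):
    """The map from Example 4.5: T(x, y) = (4y, -x - 4y)."""
    x, y = v
    return (4 * y, -x - 4 * y)

B = [(2, -1), (1, 0)]
T_B = matrix_of(T, B, B)
print([[str(entry) for entry in row] for row in T_B])
```

The printed entries should agree with the matrix found by hand above.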

Exercise 4.6:

Consider the linear transformation $T:{\mathbb{R}}^{2}\to{\mathbb{R}}^{2}$ defined by $T(x,y):=(-x+2y,-6x+6y)$ . Prove that the matrix of $T$ with respect to the basis $\mathcal{B}=((2,3),(1,2))$ in both the domain and codomain is:

{}_{\mathcal{B}}[T]_{\mathcal{B}}=\begin{bmatrix}2&0\\ 0&3\end{bmatrix}.

[End of Exercise]

Theorem 4.7.

Let $T:V\to W$ be a linear transformation, and $\mathcal{B},\mathcal{C}$ bases for $V$ and $W$ respectively. Then for any vector $\vec{v}\in V$ we have

({}_{\mathcal{C}}[T]_{\mathcal{B}})[\vec{v}]_{\mathcal{B}}=[T(\vec{v})]_{\mathcal{C}}.

Recall that $[\vec{v}]_{\mathcal{B}}$ is the column vector of coordinates of $\vec{v}$ with respect to $\mathcal{B}$ , and $[T(\vec{v})]_{\mathcal{C}}$ is the column vector of coordinates of $T(\vec{v})$ with respect to $\mathcal{C}$ .

In other words, the matrix ${}_{\mathcal{C}}[T]_{\mathcal{B}}$ transforms the coordinate vector $[\vec{v}]_{\mathcal{B}}$ to $[T(\vec{v})]_{\mathcal{C}}$. The following exercise verifies this theorem in some specific cases.

Exercise 4.8:

Let $T:{\mathbb{R}}^{3}\to{\mathbb{R}}^{3}$ be defined by $T(x,y,z):=(x,x+y,x+y+z)$, let $\vec{v}=(1,0,0)$, and let $\mathcal{C}$ be the standard basis of ${\mathbb{R}}^{3}$. For each of the following bases $\mathcal{B}$, compute ${}_{\mathcal{C}}[T]_{\mathcal{B}}$ and $[\vec{v}]_{\mathcal{B}}$. Hence verify Theorem 4.7 for the vector $\vec{v}$ in each case:

i. $\mathcal{B}$ is the standard basis of ${\mathbb{R}}^{3}$.

ii. $\mathcal{B}=((0,1,0),(1,-1,0),(0,1,3))$.

iii. $\mathcal{B}=((0,1,1),(1,0,0),(-2,0,1))$.

[End of Exercise]
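Theorem 4.7 can also be checked numerically. The sketch below does so for the map and basis of Example 4.5 (so $\mathcal{B}=\mathcal{C}$), leaving the cases of the exercise above to the reader. The test vector `v` and the helper `coords2` are assumptions made for illustration; `coords2` uses Cramer's rule and only handles the 2-dimensional case.

```python
from fractions import Fraction

def coords2(basis, v):
    """Coordinates of v in a basis of a 2-dimensional space, via Cramer's rule."""
    (a, c), (b, d) = basis  # the basis vectors are the columns (a, c) and (b, d)
    det = Fraction(a * d - b * c)
    x, y = v
    return [(x * d - b * y) / det, (a * y - c * x) / det]

def T(v):
    """The map from Example 4.5: T(x, y) = (4y, -x - 4y)."""
    x, y = v
    return (4 * y, -x - 4 * y)

B = [(2, -1), (1, 0)]
T_B = [[-2, 1], [0, -2]]  # matrix of T with respect to B, from Example 4.5

v = (3, -1)  # hypothetical test vector
v_B = coords2(B, v)

# Left-hand side of Theorem 4.7: the matrix applied to the coordinates of v.
lhs = [sum(a * c for a, c in zip(row, v_B)) for row in T_B]
# Right-hand side: the coordinates of T(v).
rhs = coords2(B, T(v))
print(lhs == rhs)
```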

Corollary 4.9.

If $\mathcal{B}$ , $\mathcal{C}$ , and $\mathcal{D}$ are all bases of $V$ , and $T,S:V\to V$ are linear transformations, then we have

({}_{\mathcal{D}}[T]_{\mathcal{C}})({}_{\mathcal{C}}[S]_{\mathcal{B}})={}_{\mathcal{D}}[T\circ S]_{\mathcal{B}}.

Proof.

The proof repeatedly uses Theorem 4.7. For any $\vec{v}\in V$ we have:

({}_{\mathcal{D}}[T]_{\mathcal{C}})({}_{\mathcal{C}}[S]_{\mathcal{B}})[\vec{v}]_{\mathcal{B}}=({}_{\mathcal{D}}[T]_{\mathcal{C}})[S(\vec{v})]_{\mathcal{C}}=[T(S(\vec{v}))]_{\mathcal{D}}=({}_{\mathcal{D}}[T\circ S]_{\mathcal{B}})[\vec{v}]_{\mathcal{B}}.

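But if $P[\vec{v}]_{\mathcal{B}}=Q[\vec{v}]_{\mathcal{B}}$ for all vectors $\vec{v}$, then $P=Q$: taking $\vec{v}$ to be the $i$th vector of $\mathcal{B}$ makes $[\vec{v}]_{\mathcal{B}}$ the $i$th standard basis vector, so the $i$th columns of $P$ and $Q$ agree. The result follows. ∎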

It’s as if the neighbouring “ $\mathcal{C}$ ”s cancel each other out. This is the reason for writing the notation as it is, and is a good trick for manipulating these matrices.
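The cancellation in Corollary 4.9 can be illustrated on a small case. The following sketch takes $V=\mathbb{R}^{2}$ with three different bases and checks that the product of the two matrices equals the matrix of the composite. The map `S`, the choice of bases, and the helper names are assumptions made for this illustration; `T` is the map from Example 4.5.

```python
from fractions import Fraction

def coords2(basis, v):
    """Coordinates of v in a basis of a 2-dimensional space, via Cramer's rule."""
    (a, c), (b, d) = basis  # the basis vectors are the columns (a, c) and (b, d)
    det = Fraction(a * d - b * c)
    x, y = v
    return [(x * d - b * y) / det, (a * y - c * x) / det]

def matrix_of(F, dom_basis, cod_basis):
    """Matrix of F whose columns are the cod_basis-coordinates of F(b_i)."""
    columns = [coords2(cod_basis, F(b)) for b in dom_basis]
    return [list(row) for row in zip(*columns)]

def mat_mul(P, Q):
    """Product of two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))]
            for i in range(len(P))]

def T(v):
    """The map from Example 4.5: T(x, y) = (4y, -x - 4y)."""
    x, y = v
    return (4 * y, -x - 4 * y)

def S(v):
    """A hypothetical second linear map: S(x, y) = (x + y, y)."""
    x, y = v
    return (x + y, y)

B = [(1, 0), (0, 1)]   # standard basis
C = [(2, -1), (1, 0)]  # basis from Example 4.5
D = [(2, 3), (1, 2)]   # basis from Exercise 4.6

# (D[T]C)(C[S]B) on the left, D[T o S]B on the right.
lhs = mat_mul(matrix_of(T, C, D), matrix_of(S, B, C))
rhs = matrix_of(lambda v: T(S(v)), B, D)
print(lhs == rhs)
```

Note that the middle basis `C` appears in both factors on the left and nowhere on the right, matching the "cancelling" pattern described above.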