The next arithmetic operation that we discuss is the multiplication of two matrices. Unlike addition, the good notion of product of matrices, that is, the one that is useful in practice, is not the obvious one given by multiplying corresponding coefficients entrywise. An explanation for why it is defined this way is given in Section 6.2, where multiplication corresponds to composing linear transformations. The multiplication of two matrices is defined as follows:
Let $A = (a_{ij}) \in M_{m \times n}(\mathbb{R})$ and $B = (b_{ij}) \in M_{q \times p}(\mathbb{R})$, for positive integers $m, n, q, p$. Then,
the product $AB$ exists if and only if $n = q$.
Assume $n = q$, and define coefficients
\[
c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}, \qquad 1 \le i \le m, \quad 1 \le j \le p.
\]
Then the product of $A$ and $B$ is defined to be the matrix $C = (c_{ij})$. We write the product as $AB$ or $A \cdot B$, and note that it lives in $M_{m \times p}(\mathbb{R})$.
We use the exponential notation as usual: $A^2 = AA$ and $A^3 = AAA$, etc.
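The formula for the coefficients $c_{ij}$ translates directly into code. Here is a minimal Python sketch of the definition, purely for illustration (the helper name `mat_mul` is ours):

```python
def mat_mul(A, B):
    """Multiply matrices given as lists of rows, following c_ij = sum_k a_ik * b_kj."""
    m, n = len(A), len(A[0])   # A is m x n
    q, p = len(B), len(B[0])   # B is q x p
    if n != q:
        raise ValueError("product undefined: A has %d columns but B has %d rows" % (n, q))
    # c_ij is the sum over k of a_ik * b_kj
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]
```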
Example 1.4.2.
Let $A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}$. Then, the product $AB$ is defined, and it is
\[
AB = \begin{pmatrix} 1 & 2 & 1 \\ 3 & 4 & 3 \end{pmatrix}.
\]
Notice that the product $BA$ is not defined, since $B$ has $3$ columns while $A$ has $2$ rows.
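This example is easy to reproduce numerically; a small NumPy check (an illustrative aside, using the matrices above):

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])          # 2 x 2
B = np.array([[1, 0, 1],
              [0, 1, 0]])       # 2 x 3

print(A @ B)  # [[1 2 1], [3 4 3]] -- the 2 x 3 product AB
# B @ A raises a ValueError: B has 3 columns but A has only 2 rows.
```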
This definition generalises the definition of the product of a matrix and a column vector, in the sense that if $B$ is a column vector, i.e. $p = 1$, then the product $AB$ is as given in Definition 1.3.1.
Observe that $A^2$ exists if and only if $A$ is a square matrix.
Given matrices $A$ and $B$ such that $AB$ and $BA$ are both defined, it is not true that $AB = BA$ in general. That is, matrix multiplication is not a commutative operation (see Example 1.4.4).
Example 1.4.4.
Let $A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$. We calculate
\[
AB = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}
\quad \text{and} \quad
BA = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},
\]
so $AB \neq BA$.
For real numbers, , if , then either or . This is not true for matrices; indeed the above example shows there are non-zero matrices such that .
If $u$ is a $1 \times n$ row vector and $v$ is an $n \times 1$ column vector, then $uv$ is a $1 \times 1$ matrix whose single entry is the scalar product of the vectors corresponding to $u$ and $v$. The product $vu$ is an $n \times n$ matrix whose $(i, j)$ coefficient is $v_i u_j$.
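A short NumPy illustration of the two products (the entries chosen here are arbitrary):

```python
import numpy as np

u = np.array([[1, 2, 3]])       # a 1 x 3 row vector
v = np.array([[4], [5], [6]])   # a 3 x 1 column vector

print(u @ v)  # [[32]] -- 1 x 1: the scalar product 1*4 + 2*5 + 3*6
print(v @ u)  # 3 x 3 matrix with (i, j) coefficient v_i * u_j
```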
Example 1.4.6. (Zero Matrix)
See also Example 1.2. For any matrix $A$, matrix multiplication gives $A \cdot 0 = 0$ and $0 \cdot A = 0$, where $0$ is the zero matrix of the appropriate dimensions.
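A quick numerical check, illustrative only:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
Z = np.zeros((2, 2), dtype=int)   # the 2 x 2 zero matrix

print(A @ Z)  # [[0 0], [0 0]]
print(Z @ A)  # [[0 0], [0 0]]
```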
Example 1.4.7. (Identity Matrix)
Define the identity matrix as
\[
I_n = \begin{pmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix} \in M_{n \times n}(\mathbb{R}),
\]
that is, the square matrix with $1$'s on the diagonal and $0$'s elsewhere.
Then for any matrix $A \in M_{m \times n}(\mathbb{R})$, matrix multiplication gives $I_m A = A$ and $A I_n = A$.
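A quick NumPy check of these identities (`np.eye(n)` builds $I_n$; the matrix $A$ is arbitrary):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])        # 2 x 3

I2 = np.eye(2, dtype=int)        # I_2
I3 = np.eye(3, dtype=int)        # I_3

print(np.array_equal(I2 @ A, A))  # True: I_m A = A
print(np.array_equal(A @ I3, A))  # True: A I_n = A
```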
Before we establish further results, we need one more piece of notation which we will use throughout.
Let $n$ be a positive integer. For each integer $i$ with $1 \le i \le n$, we define the standard basis vectors:
\[
e_i = \begin{pmatrix} 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \end{pmatrix}
\quad \text{and} \quad
e_i^t = \begin{pmatrix} 0 & \cdots & 1 & \cdots & 0 \end{pmatrix},
\]
where in both cases the only non-zero coefficient is the $i$-th one.
For example, if $n = 3$, then we consider vectors of size $3$ and we have
\[
e_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad
e_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \quad
e_3 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}.
\]
The vectors $e_i$ are ubiquitous, in the sense that we can express any vector of $\mathbb{R}^n$, for any $n$, as a linear combination of the $e_i$'s. In other words, if $v$ has coefficients $v_1, \ldots, v_n$, then
\[
v = v_1 e_1 + v_2 e_2 + \cdots + v_n e_n.
\]
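A short NumPy sketch of this decomposition (the columns of the identity matrix are exactly $e_1, \ldots, e_n$; the vector `v` is arbitrary):

```python
import numpy as np

n = 3
e = [np.eye(n, dtype=int)[:, i] for i in range(n)]  # e[0], e[1], e[2] stand for e_1, e_2, e_3

v = np.array([7, -2, 5])
combo = sum(v[i] * e[i] for i in range(n))  # v_1 e_1 + v_2 e_2 + v_3 e_3
print(np.array_equal(combo, v))             # True
```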
The next result illustrates the usefulness of the $e_i$'s.
Given positive integers $m$ and $n$, and a matrix $A \in M_{m \times n}(\mathbb{R})$, the following hold.
(i) For all integers $1 \le j \le n$, the column vector $A e_j$ is the $j$-th column of $A$,
(ii) For all integers $1 \le i \le m$, the row vector $e_i^t A$ is the $i$-th row of $A$.
(iii) If $B \in M_{m \times n}(\mathbb{R})$ is such that $Av = Bv$ for all column vectors $v \in \mathbb{R}^n$, then $A = B$.
(i) First, we check that the sizes are correct, i.e. $e_j$ is a column vector of size $n$, so the product $A e_j$ is defined and is a column vector of size $m$, the size of a column of $A$.
By definition of matrix-vector multiplication, the $i$-th coefficient of $A e_j$ is
\[
\sum_{k=1}^{n} a_{ik} (e_j)_k = a_{ij},
\]
because all the coefficients of $e_j$ are zero except the $j$-th one. It follows that
\[
A e_j = \begin{pmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{mj} \end{pmatrix}
\]
is the $j$-th column of $A$, as required.
(ii) The row version is proven likewise. Try it!
For (iii), assume that we are given a matrix $B$ with $Av = Bv$ for all $v \in \mathbb{R}^n$. Since any column vector $v$ is a linear combination $v = v_1 e_1 + \cdots + v_n e_n$ of the $e_j$'s,
it suffices to consider the vectors $e_j$. Now, the assumption says in particular that we have $A e_j = B e_j$ for all $1 \le j \le n$. Therefore, by (i), the $j$-th columns of $A$ and $B$ are equal, for all $j$, and hence $A$ and $B$ are equal. ∎
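Parts (i) and (ii) are easy to observe numerically; a minimal NumPy sketch with an arbitrary matrix:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])    # a 2 x 3 matrix

e2 = np.array([0, 1, 0])     # e_2 in R^3
print(A @ e2)                # [2 5] -- the 2nd column of A, as in (i)

e1 = np.array([1, 0])        # e_1 in R^2, used as a row on the left
print(e1 @ A)                # [1 2 3] -- the 1st row of A, as in (ii)
```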
Although matrix multiplication is not commutative (i.e. $AB \neq BA$ in general), it is associative and distributes over addition. In other words:
Let $A \in M_{m \times n}(\mathbb{R})$, let $B, B' \in M_{n \times p}(\mathbb{R})$ and let $C \in M_{p \times q}(\mathbb{R})$. The following properties hold.
Associativity: $(AB)C = A(BC)$.
Distributivity: $A(B + B') = AB + AB'$ and $(B + B')C = BC + B'C$.
For convenience in this proof, let us write $[M]_{ij}$ for the $(i, j)$ coefficient of a matrix $M$. We need to show that
$[(AB)C]_{il} = [A(BC)]_{il}$, for arbitrary $i, l$ with $1 \le i \le m$ and $1 \le l \le q$. By definition,
\[
[(AB)C]_{il} = \sum_{k=1}^{p} [AB]_{ik} [C]_{kl}
= \sum_{k=1}^{p} \Big( \sum_{j=1}^{n} [A]_{ij} [B]_{jk} \Big) [C]_{kl}.
\]
From associativity, commutativity and distributivity properties of the operations in $\mathbb{R}$, we may rearrange this as
\[
\sum_{j=1}^{n} [A]_{ij} \Big( \sum_{k=1}^{p} [B]_{jk} [C]_{kl} \Big)
= \sum_{j=1}^{n} [A]_{ij} [BC]_{jl} = [A(BC)]_{il},
\]
and so $(AB)C = A(BC)$, as claimed.
We show one equality and leave the other as an exercise. We have
\[
[A(B + B')]_{ik} = \sum_{j=1}^{n} [A]_{ij} \big( [B]_{jk} + [B']_{jk} \big),
\]
where this latter equality holds by definition of matrix addition. As in (i), we use the well-known properties of the operations in $\mathbb{R}$ in order to recast this as
\[
\sum_{j=1}^{n} [A]_{ij} [B]_{jk} + \sum_{j=1}^{n} [A]_{ij} [B']_{jk} = [AB]_{ik} + [AB']_{ik}.
\]
Therefore $A(B + B') = AB + AB'$, as required.
∎
An important consequence of this result is that for products of any number of matrices, brackets are unimportant. Instead of $(AB)C$ or $A(BC)$ we just write $ABC$. Similarly, we write $ABCD$ to denote any product such as $((AB)C)D$ or $(AB)(CD)$.
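Both properties are easy to test numerically; a minimal NumPy sketch with randomly generated matrices of compatible (arbitrarily chosen) sizes:

```python
import numpy as np

rng = np.random.default_rng(0)
A  = rng.integers(-5, 5, size=(2, 3))
B  = rng.integers(-5, 5, size=(3, 4))
B2 = rng.integers(-5, 5, size=(3, 4))   # plays the role of B'
C  = rng.integers(-5, 5, size=(4, 2))

print(np.array_equal((A @ B) @ C, A @ (B @ C)))      # True: associativity
print(np.array_equal(A @ (B + B2), A @ B + A @ B2))  # True: distributivity
```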