The next arithmetic operation that we discuss is the multiplication of two matrices. Unlike addition, the good notion of product of matrices, that is, the one that is useful in practice, is not the obvious one given by multiplying corresponding coefficients entrywise. An explanation for why it is defined this way is given in Section 6.2, where multiplication corresponds to composing linear transformations. The multiplication of two matrices is defined as follows:
Let $A = (a_{ij}) \in M_{m \times n}(\mathbb{R})$ and $B = (b_{ij}) \in M_{q \times p}(\mathbb{R})$, for positive integers $m, n, q, p$. Then,
the product $AB$ exists if and only if $n = q$.
Assume $n = q$, and define coefficients
\[
c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}, \qquad 1 \le i \le m, \quad 1 \le j \le p.
\]
Then the product of $A$ and $B$ is defined to be the matrix $C = (c_{ij})$. We write the product as $AB$ or $A \cdot B$, and note that it lives in $M_{m \times p}(\mathbb{R})$.
We use the exponential notation as usual: $A^2 = AA$ and $A^3 = AAA$, etc.
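The formula for the coefficients $c_{ij}$ translates directly into code. Here is a minimal Python sketch of the definition, purely for illustration (the helper name `mat_mul` is ours):

```python
def mat_mul(A, B):
    """Multiply matrices given as lists of rows, following c_ij = sum_k a_ik * b_kj."""
    m, n = len(A), len(A[0])   # A is m x n
    q, p = len(B), len(B[0])   # B is q x p
    if n != q:
        raise ValueError("product undefined: A has %d columns but B has %d rows" % (n, q))
    # c_ij is the sum over k of a_ik * b_kj
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]
```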
Example 1.4.2.
Let $A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}$. Then, the product $AB$ is defined, and it is
\[
AB = \begin{pmatrix} 1 & 2 & 1 \\ 3 & 4 & 3 \end{pmatrix}.
\]
Notice that the product $BA$ is not defined, since $B$ has $3$ columns while $A$ has $2$ rows.
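This example is easy to reproduce numerically; a small NumPy check (an illustrative aside, using the matrices above):

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])          # 2 x 2
B = np.array([[1, 0, 1],
              [0, 1, 0]])       # 2 x 3

print(A @ B)  # [[1 2 1], [3 4 3]] -- the 2 x 3 product AB
# B @ A raises a ValueError: B has 3 columns but A has only 2 rows.
```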
This definition generalises the definition of the product of a matrix and a column vector, in the sense that if $B$ is a column vector, i.e. $p = 1$, then the product $AB$ is as given in Definition 1.3.1.
Observe that $A^2$ exists if and only if $A$ is a square matrix.
Given matrices $A$ and $B$ such that $AB$ and $BA$ are both defined, it is not true that $AB = BA$ in general. That is, matrix multiplication is not a commutative operation (see Example 1.4.4).
Example 1.4.4.
Let $A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$. We calculate
\[
AB = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}
\quad \text{and} \quad
BA = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},
\]
so $AB \neq BA$.
For real numbers, , if , then either or . This is not true for matrices; indeed the above example shows there are non-zero matrices such that .
If $u$ is a $1 \times n$ row vector and $v$ is an $n \times 1$ column vector, then $uv$ is a $1 \times 1$ matrix whose single entry is the scalar product of the vectors corresponding to $u$ and $v$. The product $vu$ is an $n \times n$ matrix whose $(i, j)$ coefficient is $v_i u_j$.
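A short NumPy illustration of the two products (the entries chosen here are arbitrary):

```python
import numpy as np

u = np.array([[1, 2, 3]])       # a 1 x 3 row vector
v = np.array([[4], [5], [6]])   # a 3 x 1 column vector

print(u @ v)  # [[32]] -- 1 x 1: the scalar product 1*4 + 2*5 + 3*6
print(v @ u)  # 3 x 3 matrix with (i, j) coefficient v_i * u_j
```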
Example 1.4.6. (Zero Matrix)
See also Example 1.2. For any matrix $A$, matrix multiplication gives $A \cdot 0 = 0$ and $0 \cdot A = 0$, where $0$ is the zero matrix of the appropriate dimensions.
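A quick numerical check, illustrative only:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
Z = np.zeros((2, 2), dtype=int)   # the 2 x 2 zero matrix

print(A @ Z)  # [[0 0], [0 0]]
print(Z @ A)  # [[0 0], [0 0]]
```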
Example 1.4.7. (Identity Matrix)
Define the identity matrix as
\[
I_n = \begin{pmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix} \in M_{n \times n}(\mathbb{R}),
\]
that is, the square matrix with $1$'s on the diagonal and $0$'s elsewhere.
Then for any matrix $A \in M_{m \times n}(\mathbb{R})$, matrix multiplication gives $I_m A = A$ and $A I_n = A$.
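A quick NumPy check of these identities (`np.eye(n)` builds $I_n$; the matrix $A$ is arbitrary):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])        # 2 x 3

I2 = np.eye(2, dtype=int)        # I_2
I3 = np.eye(3, dtype=int)        # I_3

print(np.array_equal(I2 @ A, A))  # True: I_m A = A
print(np.array_equal(A @ I3, A))  # True: A I_n = A
```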
Before we establish further results, we need one more piece of notation which we will use throughout.
Let $n$ be a positive integer. For each integer $i$ with $1 \le i \le n$, we define the standard basis vectors:
\[
e_i = \begin{pmatrix} 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \end{pmatrix}
\quad \text{and} \quad
e_i^t = \begin{pmatrix} 0 & \cdots & 1 & \cdots & 0 \end{pmatrix},
\]
where in both cases the only non-zero coefficient is the $i$-th one.
For example, if $n = 3$, then we consider vectors of size $3$ and we have
\[
e_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad
e_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \quad
e_3 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}.
\]
The vectors $e_i$ are ubiquitous, in the sense that we can express any vector of $\mathbb{R}^n$, for any $n$, as a linear combination of the $e_i$'s. In other words, if $v$ has coefficients $v_1, \ldots, v_n$, then
\[
v = v_1 e_1 + v_2 e_2 + \cdots + v_n e_n.
\]
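A short NumPy sketch of this decomposition (the columns of the identity matrix are exactly $e_1, \ldots, e_n$; the vector `v` is arbitrary):

```python
import numpy as np

n = 3
e = [np.eye(n, dtype=int)[:, i] for i in range(n)]  # e[0], e[1], e[2] stand for e_1, e_2, e_3

v = np.array([7, -2, 5])
combo = sum(v[i] * e[i] for i in range(n))  # v_1 e_1 + v_2 e_2 + v_3 e_3
print(np.array_equal(combo, v))             # True
```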
The next result illustrates the usefulness of the $e_i$'s.
Given positive integers $m$ and $n$, and a matrix $A \in M_{m \times n}(\mathbb{R})$, the following hold.
(i) For all integers $1 \le j \le n$, the column vector $A e_j$ is the $j$-th column of $A$,
(ii) For all integers $1 \le i \le m$, the row vector $e_i^t A$ is the $i$-th row of $A$.
(iii) If $B \in M_{m \times n}(\mathbb{R})$ is such that $Av = Bv$ for all column vectors $v \in \mathbb{R}^n$, then $A = B$.
(i) First, we check that the sizes are correct, i.e. $e_j$ is a column vector of size $n$, so the product $A e_j$ is defined and is a column vector of size $m$, the size of a column of $A$.
By definition of matrix-vector multiplication, the $i$-th coefficient of $A e_j$ is
\[
\sum_{k=1}^{n} a_{ik} (e_j)_k = a_{ij},
\]
because all the coefficients of $e_j$ are zero except the $j$-th one. It follows that
\[
A e_j = \begin{pmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{mj} \end{pmatrix}
\]
is the $j$-th column of $A$, as required.
(ii) The row version is proven likewise. Try it!
For (iii), assume that we are given a matrix $B$ with $Av = Bv$ for all $v \in \mathbb{R}^n$. Since any column vector $v$ is a linear combination $v = v_1 e_1 + \cdots + v_n e_n$ of the $e_j$'s,
it suffices to consider the vectors $e_j$. Now, the assumption says in particular that we have $A e_j = B e_j$ for all $1 \le j \le n$. Therefore, by (i), the $j$-th columns of $A$ and $B$ are equal, for all $j$, and hence $A$ and $B$ are equal. ∎
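Parts (i) and (ii) are easy to observe numerically; a minimal NumPy sketch with an arbitrary matrix:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])    # a 2 x 3 matrix

e2 = np.array([0, 1, 0])     # e_2 in R^3
print(A @ e2)                # [2 5] -- the 2nd column of A, as in (i)

e1 = np.array([1, 0])        # e_1 in R^2, used as a row on the left
print(e1 @ A)                # [1 2 3] -- the 1st row of A, as in (ii)
```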
Although matrix multiplication is not commutative (i.e. $AB \neq BA$ in general), it is associative and distributes over addition. In other words:
Let $A \in M_{m \times n}(\mathbb{R})$, let $B, B' \in M_{n \times p}(\mathbb{R})$ and let $C \in M_{p \times q}(\mathbb{R})$. The following properties hold.
Associativity: $(AB)C = A(BC)$.
Distributivity: $A(B + B') = AB + AB'$ and $(B + B')C = BC + B'C$.
For convenience in this proof, let us write $[M]_{ij}$ for the $(i, j)$ coefficient of a matrix $M$. We need to show that
$[(AB)C]_{il} = [A(BC)]_{il}$, for arbitrary $i, l$ with $1 \le i \le m$ and $1 \le l \le q$. By definition,
\[
[(AB)C]_{il} = \sum_{k=1}^{p} [AB]_{ik} [C]_{kl}
= \sum_{k=1}^{p} \Big( \sum_{j=1}^{n} [A]_{ij} [B]_{jk} \Big) [C]_{kl}.
\]
From associativity, commutativity and distributivity properties of the operations in $\mathbb{R}$, we may rearrange this as
\[
\sum_{j=1}^{n} [A]_{ij} \Big( \sum_{k=1}^{p} [B]_{jk} [C]_{kl} \Big)
= \sum_{j=1}^{n} [A]_{ij} [BC]_{jl} = [A(BC)]_{il},
\]
and so $(AB)C = A(BC)$, as claimed.
We show one equality and leave the other as an exercise. We have
\[
[A(B + B')]_{ik} = \sum_{j=1}^{n} [A]_{ij} \big( [B]_{jk} + [B']_{jk} \big),
\]
where this latter equality holds by definition of matrix addition. As in (i), we use the well-known properties of the operations in $\mathbb{R}$ in order to recast this as
\[
\sum_{j=1}^{n} [A]_{ij} [B]_{jk} + \sum_{j=1}^{n} [A]_{ij} [B']_{jk} = [AB]_{ik} + [AB']_{ik}.
\]
Therefore $A(B + B') = AB + AB'$, as required.
∎
An important consequence of this result is that for products of any number of matrices, brackets are unimportant. Instead of $(AB)C$ or $A(BC)$ we just write $ABC$. Similarly, we write $ABCD$ to denote any product such as $((AB)C)D$ or $(AB)(CD)$.
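Both properties are easy to test numerically; a minimal NumPy sketch with randomly generated matrices of compatible (arbitrarily chosen) sizes:

```python
import numpy as np

rng = np.random.default_rng(0)
A  = rng.integers(-5, 5, size=(2, 3))
B  = rng.integers(-5, 5, size=(3, 4))
B2 = rng.integers(-5, 5, size=(3, 4))   # plays the role of B'
C  = rng.integers(-5, 5, size=(4, 2))

print(np.array_equal((A @ B) @ C, A @ (B @ C)))      # True: associativity
print(np.array_equal(A @ (B + B2), A @ B + A @ B2))  # True: distributivity
```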