In this section, we use elementary row operations to put a matrix into a particularly simple form, called reduced echelon form. This process, known as row reduction (or sometimes Gaussian elimination), consists of a sequence of elementary row operations that bring the matrix into a much simpler form, while keeping the essential information it contains.
The leading coefficient (also called a pivot) of a non-zero row in a matrix is its leftmost non-zero coefficient.
A matrix is said to be in row echelon form (or simply echelon form) if, for every non-zero row $R_i$ with $i \geq 2$, the row $R_{i-1}$ immediately above is also non-zero, and the leading coefficient of $R_{i-1}$ is to the left of the leading coefficient of $R_i$.
Example 2.2.2.
The following matrices are in echelon form:
The following matrices are not in echelon form:
A matrix is said to be in reduced row echelon form (or simply reduced echelon form) if the following conditions hold:
It is in echelon form,
Every leading coefficient is 1, and
Every other coefficient in a column containing a leading coefficient is zero.
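To make these two definitions concrete, here is a small Python sketch (not part of the text; the function names `is_echelon` and `is_rref` are ours) that tests whether a matrix, given as a list of rows, satisfies them.

```python
def leading_index(row):
    """Column index of the leading coefficient (pivot), or None for a zero row."""
    for j, x in enumerate(row):
        if x != 0:
            return j
    return None

def is_echelon(M):
    """Echelon form: zero rows sit at the bottom, and each leading
    coefficient is strictly to the right of the one in the row above."""
    last = -1
    seen_zero_row = False
    for row in M:
        j = leading_index(row)
        if j is None:
            seen_zero_row = True
        elif seen_zero_row or j <= last:
            return False
        else:
            last = j
    return True

def is_rref(M):
    """Reduced echelon form: echelon form, every pivot equals 1,
    and every other entry in a pivot column is zero."""
    if not is_echelon(M):
        return False
    for i, row in enumerate(M):
        j = leading_index(row)
        if j is None:
            continue
        if row[j] != 1:
            return False
        if any(M[k][j] != 0 for k in range(len(M)) if k != i):
            return False
    return True
```

Checking the conditions separately in this order mirrors the definition: a matrix can be in echelon form without being reduced, but not the other way around.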
Example 2.2.4.
The following matrices are in reduced row echelon form:
The following matrices are not in reduced row echelon form, but they are in echelon form:
We now introduce a general algorithm to put a matrix into reduced row echelon form using row operations. We have already used part of this algorithm in Example 2.1 to put the matrix of that example into echelon form. To describe how it works for an arbitrary matrix, let $A$ be any matrix.
From $A$ to the echelon form of $A$. Look at the leftmost non-zero column. Use the swapping operation to ensure that the top entry of that column is non-zero. Now, by successive row operations $R_i \to R_i + \lambda_i R_1$ (for appropriate scalars $\lambda_i$), we clear the entries of that column below the top one, bringing $A$ into the form:
Next, focus on the part of the matrix from the second row downwards, and repeat the above step. That is, look at the leftmost column which is not zero from the second row downwards, and swap rows (leaving the first row in place) to ensure that the second-row entry of that column is non-zero. Then, using elementary row operations of the form $R_i \to R_i + \lambda_i R_2$ (for appropriate scalars $\lambda_i$), we bring the matrix into the form:
Now focus on the part of the matrix from the third row downwards, and repeat the above steps. The algorithm terminates when we reach the last row, at which point the result is in echelon form. At this point you should see a flight of stairs of $0$'s filling the bottom left corner of the matrix, which goes down (from left to right) at most one step at a time.
…This is the midpoint of the algorithm …
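The forward half of the algorithm just described can be sketched in Python as follows. This is an illustration under our own naming (`echelon_form`), not the text's notation; exact rational arithmetic via `fractions.Fraction` avoids rounding issues.

```python
from fractions import Fraction

def echelon_form(M):
    """Forward pass of row reduction: in each column, swap a non-zero
    entry up to the current row, then clear everything below it,
    working down and to the right."""
    A = [[Fraction(x) for x in row] for row in M]
    m, n = len(A), len(A[0])
    r = 0                                        # row where the next pivot goes
    for c in range(n):
        # find a row at or below r with a non-zero entry in column c
        pivot = next((i for i in range(r, m) if A[i][c] != 0), None)
        if pivot is None:
            continue                             # column is zero from row r down
        A[r], A[pivot] = A[pivot], A[r]          # swapping operation
        for i in range(r + 1, m):                # R_i <- R_i - (A[i][c]/A[r][c]) R_r
            f = A[i][c] / A[r][c]
            A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        r += 1
        if r == m:
            break
    return A
```

For instance, `echelon_form([[0, 2, 4], [1, 1, 1], [2, 2, 3]])` first swaps the first two rows, then clears the first and second columns below their pivots.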
From the echelon form of $A$ to the reduced echelon form of $A$. Say our matrix is in echelon form, as follows:
Then, we use the last non-zero row in order to create zeros in the column where its first non-zero term occurs. This term is the leading coefficient of that row, as in Definition 2.2.1. We also scale that row, that is, we multiply it by the inverse of its leading coefficient, so that the leading coefficient becomes $1$. The result looks like this:
Next, we use the row just above the previous one, and elementary row operations, to create zeros above its leading coefficient; that is, we annihilate all the coefficients sitting above the leading coefficient in its column. We also scale that row so that its leading coefficient becomes $1$. And so on, until we reach the first row and the end of the algorithm. Once completed, the matrix is the reduced echelon form of $A$.
…This is the end of the algorithm.
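Putting both halves together, a hypothetical `rref` function (our own sketch, not the text's notation) might look like this: a forward pass to echelon form, followed by the backward pass that scales each pivot to $1$ and clears the entries above it.

```python
from fractions import Fraction

def rref(M):
    """Row reduction: forward pass to echelon form, then, starting from
    the last non-zero row, scale each pivot to 1 and clear above it."""
    A = [[Fraction(x) for x in row] for row in M]
    m, n = len(A), len(A[0])
    # --- forward pass (first half of the algorithm) ---
    r = 0
    pivots = []                                  # (row, column) of each pivot
    for c in range(n):
        piv = next((i for i in range(r, m) if A[i][c] != 0), None)
        if piv is None:
            continue
        A[r], A[piv] = A[piv], A[r]
        for i in range(r + 1, m):
            f = A[i][c] / A[r][c]
            A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        pivots.append((r, c))
        r += 1
    # --- backward pass (second half): from the last pivot up ---
    for r, c in reversed(pivots):
        A[r] = [a / A[r][c] for a in A[r]]       # scale so the pivot is 1
        for i in range(r):                       # annihilate entries above it
            f = A[i][c]
            A[i] = [a - f * b for a, b in zip(A[i], A[r])]
    return A
```

Recording the pivot positions during the forward pass makes the backward pass a simple loop over them in reverse order.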
The algorithm is best understood by playing around with the row operations until your matrix is in reduced echelon form. One may notice that at each stage there may be several possible row operations to choose from. For this reason, the echelon form of a matrix is not uniquely defined. But the reduced row echelon form is always unique:
The reduced echelon form of a matrix is unique. In other words, different sequences of row operations will produce the same final answer.
The proof of this theorem is not obvious, but it is not too complicated either. We will omit the proof; for the basic idea, the reader may wish to look at the proof of Theorem 5.2.4.
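Although the proof is omitted, the theorem is easy to observe experimentally. In the hypothetical sketch below (the helper names are ours), two different sequences of row operations, starting from the same $2 \times 2$ matrix but making different pivot choices, arrive at the same reduced echelon form.

```python
from fractions import Fraction

def scale(A, i, c):          # R_i <- c * R_i
    A = [row[:] for row in A]
    A[i] = [c * x for x in A[i]]
    return A

def add_mult(A, i, j, c):    # R_i <- R_i + c * R_j
    A = [row[:] for row in A]
    A[i] = [x + c * y for x, y in zip(A[i], A[j])]
    return A

def swap(A, i, j):           # R_i <-> R_j
    A = [row[:] for row in A]
    A[i], A[j] = A[j], A[i]
    return A

A = [[Fraction(2), Fraction(4)], [Fraction(1), Fraction(3)]]

# Route 1: scale the first row, then eliminate.
B = scale(A, 0, Fraction(1, 2))     # [[1, 2], [1, 3]]
B = add_mult(B, 1, 0, -1)           # [[1, 2], [0, 1]]
B = add_mult(B, 0, 1, -2)           # [[1, 0], [0, 1]]

# Route 2: swap first, a different pivot choice.
C = swap(A, 0, 1)                   # [[1, 3], [2, 4]]
C = add_mult(C, 1, 0, -2)           # [[1, 3], [0, -2]]
C = scale(C, 1, Fraction(-1, 2))    # [[1, 3], [0, 1]]
C = add_mult(C, 0, 1, -3)           # [[1, 0], [0, 1]]
```

Both routes end at the identity matrix, as the theorem predicts.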
As before, we use the following convention: we write $R_i \to R_i + \lambda R_j$ for the elementary operation which consists in replacing the $i$-th row of the given matrix by $R_i + \lambda R_j$. Similarly for the elementary operations of the two other types. Recall that an elementary operation is performed by multiplication on the left by the corresponding elementary matrix.
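As a quick illustration of this last point (a sketch with our own helper names, not the book's notation): the elementary matrix for a row-addition operation is the identity with one extra off-diagonal entry, and multiplying by it on the left performs that operation.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def elem_add(n, i, j, c):
    """Elementary matrix for R_i <- R_i + c * R_j (rows numbered from 1):
    the identity matrix with an extra entry c in position (i, j)."""
    E = identity(n)
    E[i - 1][j - 1] = c
    return E

A = [[1, 2], [3, 4], [5, 6]]
E = elem_add(3, 2, 3, -1)     # the operation R_2 <- R_2 - R_3
EA = matmul(E, A)             # left multiplication performs the operation
```

Here the second row of `EA` is the second row of `A` minus its third row, while the other rows are untouched.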
Example 2.2.6.
Find the reduced echelon form of the matrix $A$. We begin by doing the following row operations to put the matrix into echelon form.
which is an echelon form of $A$.
Before proceeding further, we need to explain the notation
used above. Recall that elementary operations are not commutative, in the sense that the order in which we perform them matters. Thus, when two operations are written together as above, the notation means: do the first one first, and then the second.
To now obtain the reduced echelon form of $A$, we may start by scaling all the rows, so that their leading coefficients become $1$: we multiply each row by the inverse of its leading coefficient. We write this as follows:
Then, we use the last row to annihilate the coefficients in the third column above its leading coefficient. To do that, the elementary row operations are as follows:
Finally, we use the row above to kill the remaining coefficient, with one more elementary row operation. The result is the reduced echelon form of $A$, namely:
Using the elementary matrices defined in Definition 2.1.3, we can keep track of the succession of elementary row operations, which we will do in the next example.
Example 2.2.7.
Let $A$ be the matrix
We want to find a sequence of elementary matrices $E_1, E_2, \dots, E_k$ such that the matrix $E_k \cdots E_2 E_1 A$ is in reduced echelon form.
Therefore, we first find an echelon form of $A$ using elementary row operations, recording step by step the operations that we do:
Since the leading coefficients in the echelon form are all $1$, we do not need to scale the rows, and can work out the reduced echelon form of $A$ using a single elementary row operation:
Now, retracing the steps from $A$ to its reduced echelon form, we use Definition 2.1.3 to determine the associated elementary matrices. The first operation corresponds to an elementary matrix $E_1$. The next operation corresponds to a matrix $E_2$ and gives $E_2 E_1 A$ (notice the order of these matrices). Finally, the last operation corresponds to a matrix $E_3$. The final answer is:
It’s a good idea to check this answer by multiplying:
Note that for two of the operations performed above, it doesn’t matter which one you do first; in mathematics we say that these two operations commute (because you can move one past the other). Equivalently, the two corresponding elementary matrices commute, i.e.
It is however not true in general that elementary matrices commute with each other. For instance, the following two elementary matrices do not commute:
whereas,
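Since the matrices of this example were displayed rather than named, here is a stand-in computation (hypothetical $2 \times 2$ matrices, not the ones from the text) making the same two points: elementary matrices acting on different rows may commute, but in general elementary matrices do not.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# Two scaling matrices acting on different rows: these commute.
D1 = [[2, 0], [0, 1]]    # R_1 <- 2 R_1
D2 = [[1, 0], [0, 3]]    # R_2 <- 3 R_2

# Two row-addition matrices: these do not commute.
E = [[1, 1], [0, 1]]     # R_1 <- R_1 + R_2
F = [[1, 0], [1, 1]]     # R_2 <- R_2 + R_1
```

Computing the products in both orders shows `matmul(D1, D2) == matmul(D2, D1)`, while `matmul(E, F)` and `matmul(F, E)` differ.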