2.2. Row reduction of matrices to echelon form

In this section, we use elementary row operations to bring a matrix into a particularly simple form, called reduced echelon form. This process, known as row reduction of a matrix (or sometimes Gaussian elimination), consists of a sequence of elementary row operations that simplify the matrix while keeping its essential information.

Definition 2.2.1.
  • The leading coefficient (also called a pivot) of a non-zero row in a matrix is the leftmost coefficient which is non-zero.

  • A matrix is said to be in row echelon form (or simply echelon form) if for every non-zero row $R_i$ with $i \geq 2$, the row immediately above, $R_{i-1}$, is also non-zero, and the leading coefficient of $R_{i-1}$ is to the left of the leading coefficient of $R_i$.

Example 2.2.2.

  • The following matrices are in echelon form:

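    $$\begin{pmatrix} 1 & 2 & 3 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} 1 & 0 & 3 \\ 0 & 1 & 1 \\ 0 & 0 & 4 \end{pmatrix}, \quad \begin{pmatrix} 2 & 0 & 3 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}.$$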

    The following matrices are not in echelon form:

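    $$\begin{pmatrix} 1 & 2 & 3 \\ 0 & 0 & 2 \\ 0 & 0 & 1 \end{pmatrix}, \quad \begin{pmatrix} 0 & 1 & 0 \\ 1 & 2 & 3 \\ 0 & 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} 1 & 2 & 3 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}.$$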
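
The echelon condition of Definition 2.2.1 can be checked mechanically. The following is a minimal Python sketch, not part of the original notes; the helper names `leading_index` and `is_echelon` are illustrative assumptions, and matrices are represented as lists of rows.

```python
def leading_index(row):
    """Return the column index of the leading coefficient, or None for a zero row."""
    for col, entry in enumerate(row):
        if entry != 0:
            return col
    return None


def is_echelon(matrix):
    """Check Definition 2.2.1: every non-zero row (after the first) sits below a
    non-zero row whose leading coefficient lies strictly further to the left."""
    previous = -1  # leading column of the row above; -1 before the first row
    for row in matrix:
        lead = leading_index(row)
        if lead is None:
            previous = None      # zero row: every later row must be zero too
        elif previous is None or lead <= previous:
            return False
        else:
            previous = lead
    return True


# The six matrices of Example 2.2.2.
in_echelon = [
    [[1, 2, 3], [0, 0, 1], [0, 0, 0]],
    [[1, 0, 3], [0, 1, 1], [0, 0, 4]],
    [[2, 0, 3], [0, 0, 1], [0, 0, 0]],
]
not_in_echelon = [
    [[1, 2, 3], [0, 0, 2], [0, 0, 1]],
    [[0, 1, 0], [1, 2, 3], [0, 0, 0]],
    [[1, 2, 3], [0, 0, 1], [0, 1, 0]],
]
```

Calling `is_echelon` on each matrix of the first list should return `True`, and on each matrix of the second list `False`.
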
Definition 2.2.3.

A matrix is said to be in reduced row echelon form (or simply reduced echelon form) if the following conditions hold:

  (i) It is in echelon form,

  (ii) Every leading coefficient is 1, and

  (iii) Every other coefficient in a column containing a leading coefficient is zero.

Example 2.2.4.

  • The following matrices are in reduced row echelon form:

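    $$\begin{pmatrix} 1 & 0 & 3 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}.$$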

    The following matrices are not in reduced row echelon form, but they are in echelon form:

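    $$\begin{pmatrix} 1 & 2 & 3 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} 2 & 0 & 0 \\ 0 & 1 & 3 \\ 0 & 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} 1 & 0 & 3 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}.$$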
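
The three conditions of Definition 2.2.3 translate directly into a test. The sketch below is an illustration only, assuming matrices given as lists of rows; the function names are hypothetical and not taken from the notes.

```python
def leading_index(row):
    """Column index of the leading coefficient, or None for a zero row."""
    return next((c for c, x in enumerate(row) if x != 0), None)


def is_echelon(matrix):
    """Condition (i): leading coefficients move strictly right, zero rows last."""
    leads = [leading_index(row) for row in matrix]
    for above, below in zip(leads, leads[1:]):
        if below is not None and (above is None or above >= below):
            return False
    return True


def is_reduced_echelon(matrix):
    if not is_echelon(matrix):                       # condition (i)
        return False
    for i, row in enumerate(matrix):
        lead = leading_index(row)
        if lead is None:
            continue                                 # zero rows impose nothing
        if row[lead] != 1:                           # condition (ii)
            return False
        if any(other[lead] != 0
               for k, other in enumerate(matrix) if k != i):
            return False                             # condition (iii)
    return True


# The six matrices of Example 2.2.4.
reduced = [
    [[1, 0, 3], [0, 1, 1], [0, 0, 0]],
    [[0, 1, 0], [0, 0, 1], [0, 0, 0]],
    [[1, 0, 0], [0, 0, 1], [0, 0, 0]],
]
echelon_only = [
    [[1, 2, 3], [0, 0, 1], [0, 0, 0]],
    [[2, 0, 0], [0, 1, 3], [0, 0, 0]],
    [[1, 0, 3], [0, 1, 1], [0, 0, 1]],
]
```

Every matrix in `reduced` should pass `is_reduced_echelon`, while every matrix in `echelon_only` should pass `is_echelon` but fail `is_reduced_echelon`.
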

We now introduce a general algorithm to put a matrix into reduced row echelon form by using row operations. We have already used part of this algorithm in Example 2.1 to put the 3×3 matrix A into echelon form. To describe how it works for an arbitrary matrix, let A be any n×m matrix, say,

$$A = \begin{pmatrix} * & * & * \\ * & * & * \\ * & * & * \end{pmatrix},$$ where each $*$ stands for a coefficient of $A$.
  • Ech.

    From A to the echelon form of A.  Look at the leftmost non-zero column. Use the swapping operation to ensure the top row of that column has a non-zero entry. Now, by successive row operations $R_j = r_j - \lambda_j r_1$ (for appropriate $\lambda_j$), we bring $A$ into the form:

    $$A_1 = \begin{pmatrix} * & * & * \\ 0 & * & * \\ 0 & * & * \end{pmatrix}.$$

    Next, focus on the part of the matrix from $R_2$ downwards, and repeat the above step. That is, look at the leftmost column which is not zero from $R_2$ downwards, and swap rows to ensure the $R_2$ entry of that column is non-zero. Then, using $R_2$ and elementary row operations of the form $R_j = r_j - \mu_j r_2$ (for appropriate $\mu_j$), we bring $A_1$ into the form:

    $$A_2 = \begin{pmatrix} * & * & * & * \\ 0 & * & * & * \\ 0 & 0 & * & * \\ 0 & 0 & * & * \end{pmatrix}.$$

    Now focus on the part of the matrix from $R_3$ downwards, and repeat the above steps. The algorithm terminates when we reach the last row, at which point the result should be in echelon form. At this point you should see a flight of stairs of 0’s forming the bottom left corner of the matrix, which goes down (from left to right) at most one step at a time.

    This is the midpoint of the algorithm.

  • Red.

    From echelon form of A to the reduced echelon form of A.  Say our matrix is in echelon form, as follows:

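    $$A_e = \begin{pmatrix} x_{11} & \cdots & x_{1j} & x_{1,j+1} & \cdots & x_{1m} \\ \vdots & & \vdots & \vdots & & \vdots \\ 0 & \cdots & x_{kj} & x_{k,j+1} & \cdots & x_{km} \\ 0 & \cdots & 0 & 0 & \cdots & 0 \end{pmatrix}, \quad \text{where } k \leq j \text{ and } x_{kj} \neq 0.$$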

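    Then, we use the last non-zero row, here $R_k$, in order to create zeros in the $j$-th column, which is where the first non-zero term of $R_k$ occurs. This term, here $x_{kj}$, is the leading coefficient, as in Definition 2.2.1. We also scale that row, that is, we multiply $R_k$ by $x_{kj}^{-1}$, so that the leading coefficient becomes 1. The result looks like this:

    $$A_e = \begin{pmatrix} 1 & \cdots & 0 & x_{1,j+1} & \cdots & x_{1m} \\ \vdots & & \vdots & \vdots & & \vdots \\ 0 & \cdots & 0 & x_{k-1,j+1} & \cdots & x_{k-1,m} \\ 0 & \cdots & 1 & x_{k,j+1} & \cdots & x_{km} \\ 0 & \cdots & 0 & 0 & \cdots & 0 \end{pmatrix}.$$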

    Next, we use the row just above the previous one, namely $R_{k-1}$ in our case, and elementary row operations to create zeros above the leading coefficient in $R_{k-1}$. Here, the leading coefficient is the coefficient $x_{k-1,i}$ for some $i < j$. So, using $R_{k-1}$, we need to annihilate all the coefficients $x_{li}$ for $1 \leq l \leq k-2$. We also scale $R_{k-1}$ so that its leading coefficient becomes 1. And so on until we reach the first row, and the end of the algorithm. Once completed, the matrix should be the reduced echelon form of $A$.

This is the end of the algorithm.
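
The two phases described above can be sketched in Python. This is one possible reading of the algorithm rather than a definitive implementation: the function names are hypothetical, exact arithmetic is done with `fractions.Fraction` to avoid rounding, and in the second phase each pivot row is scaled before it is used to clear the column above it (the text leaves this order open).

```python
from fractions import Fraction


def echelon_form(matrix):
    """Phase 'Ech': return an echelon form of the matrix as a new list of rows."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    top = 0                                     # first row not yet processed
    for col in range(n_cols):
        if top == n_rows:
            break
        # Look for a non-zero entry in this column, from row `top` downwards.
        pivot = next((r for r in range(top, n_rows) if rows[r][col] != 0), None)
        if pivot is None:
            continue                            # column is zero from `top` down
        rows[top], rows[pivot] = rows[pivot], rows[top]    # swapping operation
        for r in range(top + 1, n_rows):        # R_r = r_r - factor * r_top
            factor = rows[r][col] / rows[top][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[top])]
        top += 1
    return rows


def reduced_echelon_form(matrix):
    """Phases 'Ech' then 'Red': return the reduced echelon form."""
    rows = echelon_form(matrix)
    for k in reversed(range(len(rows))):        # from the last row upwards
        lead = next((c for c, x in enumerate(rows[k]) if x != 0), None)
        if lead is None:
            continue                            # zero row: nothing to do
        pivot_value = rows[k][lead]
        rows[k] = [x / pivot_value for x in rows[k]]       # leading coefficient 1
        for r in range(k):                      # create zeros above the pivot
            factor = rows[r][lead]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[k])]
    return rows


A = [[1, 1, 1, 6], [2, 0, 1, 7], [1, 3, -3, 6]]
for row in reduced_echelon_form(A):
    print([str(x) for x in row])
```

The matrix `A` used here is the one from Example 2.2.6 below, so the printed rows can be compared with the hand computation there.
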

The algorithm is best understood by playing around with the row operations until your matrix is in reduced echelon form. Notice that at each stage there may be several possible row operations to choose from. For this reason, the echelon form of A is not uniquely defined. The reduced row echelon form, however, is always unique:

Theorem 2.2.5.

The reduced echelon form of a matrix A is unique. In other words, different sequences of row operations will produce the same final answer.

The proof of this theorem is not obvious, but it is not too complicated either. We will omit the proof; for the basic idea, the reader may wish to look at the proof of Theorem 5.2.4.
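
To see the theorem at work on one case, the sketch below reduces the matrix $A$ of Example 2.2.7 along two different routes. It is an illustration only, not a proof; the helper names are hypothetical, and rows are indexed from 0 as is usual in Python.

```python
from fractions import Fraction


def swap(m, i, j):
    """Return a copy of m with rows i and j exchanged."""
    m = [row[:] for row in m]
    m[i], m[j] = m[j], m[i]
    return m


def scale(m, i, c):
    """Return a copy of m with row i multiplied by the non-zero scalar c."""
    m = [row[:] for row in m]
    m[i] = [Fraction(c) * x for x in m[i]]
    return m


def add_multiple(m, j, i, lam):
    """Return a copy of m with row j replaced by row j + lam * row i."""
    m = [row[:] for row in m]
    m[j] = [a + Fraction(lam) * b for a, b in zip(m[j], m[i])]
    return m


A = [[1, 2, 3], [1, 3, 5], [2, 4, 6]]

# Route 1: clear the first column directly, then reduce.
ech1 = add_multiple(add_multiple(A, 1, 0, -1), 2, 0, -2)
rref1 = add_multiple(ech1, 0, 1, -2)

# Route 2: swap the first two rows before eliminating.
ech2 = swap(A, 0, 1)
ech2 = add_multiple(ech2, 1, 0, -1)
ech2 = add_multiple(ech2, 2, 0, -2)
ech2 = add_multiple(ech2, 2, 1, -2)
rref2 = add_multiple(scale(ech2, 1, -1), 0, 1, -3)

print(ech1 == ech2)    # the two echelon forms
print(rref1 == rref2)  # the two reduced echelon forms
```

The two intermediate echelon forms differ, while the two reduced echelon forms coincide, as the theorem predicts.
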

As before, we use the following convention: we write $R_j = r_j + \lambda r_i$ for the elementary operation which consists in replacing the $j$-th row $R_j$ of the given matrix by $R_j + \lambda R_i$. Similarly for the elementary operations of the two other types. Recall that this elementary operation is performed by multiplication on the left by the matrix $E_{ji}(\lambda)$.
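
The link between a row operation and left multiplication can be checked numerically. The sketch below assumes that $E_{ji}(\lambda)$ is the identity matrix with an extra entry $\lambda$ in row $j$, column $i$ (this matches the matrices written out in Example 2.2.7); the function names are hypothetical and indices are 1-based to follow the notes.

```python
def elementary(n, j, i, lam):
    """E_ji(lam): the n x n identity with lam placed in row j, column i (j != i)."""
    e = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
    e[j - 1][i - 1] = lam
    return e


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def add_multiple(m, j, i, lam):
    """Row operation R_j = r_j + lam * r_i (1-based indices), on a copy of m."""
    m = [row[:] for row in m]
    m[j - 1] = [a + lam * b for a, b in zip(m[j - 1], m[i - 1])]
    return m


M = [[1, 2, 3], [1, 3, 5], [2, 4, 6]]
by_row_operation = add_multiple(M, 2, 1, -1)            # R_2 = r_2 - r_1
by_multiplication = matmul(elementary(3, 2, 1, -1), M)  # E_21(-1) M
print(by_row_operation == by_multiplication)
```

Both computations should produce the same matrix, whichever matrix `M` with three rows is used.
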

Example 2.2.6.

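  • Find the reduced echelon form of $A = \begin{pmatrix} 1 & 1 & 1 & 6 \\ 2 & 0 & 1 & 7 \\ 1 & 3 & -3 & 6 \end{pmatrix}$. We begin by doing the following row operations to put the matrix into echelon form.

    $$\begin{pmatrix} 1 & 1 & 1 & 6 \\ 2 & 0 & 1 & 7 \\ 1 & 3 & -3 & 6 \end{pmatrix} \xrightarrow{\substack{R_2 = r_2 - 2r_1 \\ R_3 = r_3 - r_1}} \begin{pmatrix} 1 & 1 & 1 & 6 \\ 0 & -2 & -1 & -5 \\ 0 & 2 & -4 & 0 \end{pmatrix} \xrightarrow{R_3 = r_3 + r_2} \begin{pmatrix} 1 & 1 & 1 & 6 \\ 0 & -2 & -1 & -5 \\ 0 & 0 & -5 & -5 \end{pmatrix}$$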

    which is an echelon form of A.

    Before proceeding further, we need to explain the notation

    $$\begin{array}{c} R_2 = r_2 - 2r_1 \\ R_3 = r_3 - r_1 \end{array}$$

    used above. Recall that elementary operations do not commute in general: the order in which we perform them can matter. Thus, the above notation means: “first do $R_2 = r_2 - 2r_1$, and then $R_3 = r_3 - r_1$”.

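    To obtain the reduced echelon form of $A$, we may start by scaling all the rows so that their leading coefficients become 1. So, we multiply $R_2$ by $-\frac{1}{2}$ and $R_3$ by $-\frac{1}{5}$. We write this as follows:

    $$\xrightarrow{\substack{R_2 = -\frac{1}{2} r_2 \\ R_3 = -\frac{1}{5} r_3}} \begin{pmatrix} 1 & 1 & 1 & 6 \\ 0 & 1 & \frac{1}{2} & \frac{5}{2} \\ 0 & 0 & 1 & 1 \end{pmatrix}.$$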

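    Then, we use $R_3$ to annihilate the coefficients $1$ and $\frac{1}{2}$ in the third column (i.e. above the leading coefficient of $R_3$). To do that, the elementary row operations are as follows:

    $$\xrightarrow{\substack{R_1 = r_1 - r_3 \\ R_2 = r_2 - \frac{1}{2} r_3}} \begin{pmatrix} 1 & 1 & 0 & 5 \\ 0 & 1 & 0 & 2 \\ 0 & 0 & 1 & 1 \end{pmatrix}$$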

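    Finally, we use $R_2$ to kill the $(1,2)$ coefficient, by doing $R_1 = r_1 - r_2$. The result is the reduced echelon form of $A$, namely:

    $$\xrightarrow{R_1 = r_1 - r_2} \begin{pmatrix} 1 & 0 & 0 & 3 \\ 0 & 1 & 0 & 2 \\ 0 & 0 & 1 & 1 \end{pmatrix}.$$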
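
The hand computation of Example 2.2.6 can be replayed step by step. The sketch below is only a check of the arithmetic; the helper names are hypothetical, row indices are 1-based to match the notation $R_j = r_j + \lambda r_i$, and stacked operations are applied in the stated order.

```python
from fractions import Fraction


def add_multiple(m, j, i, lam):
    """Row operation R_j = r_j + lam * r_i (1-based), applied to a copy of m."""
    m = [row[:] for row in m]
    m[j - 1] = [a + lam * b for a, b in zip(m[j - 1], m[i - 1])]
    return m


def scale(m, j, c):
    """Row operation R_j = c * r_j (1-based, c non-zero), applied to a copy of m."""
    m = [row[:] for row in m]
    m[j - 1] = [c * x for x in m[j - 1]]
    return m


A = [[1, 1, 1, 6], [2, 0, 1, 7], [1, 3, -3, 6]]

# From A to an echelon form.
m = add_multiple(A, 2, 1, -2)                # R2 = r2 - 2 r1
m = add_multiple(m, 3, 1, -1)                # R3 = r3 - r1
m = add_multiple(m, 3, 2, 1)                 # R3 = r3 + r2
echelon = m

# From the echelon form to the reduced echelon form.
m = scale(m, 2, Fraction(-1, 2))             # R2 = -1/2 r2
m = scale(m, 3, Fraction(-1, 5))             # R3 = -1/5 r3
m = add_multiple(m, 1, 3, -1)                # R1 = r1 - r3
m = add_multiple(m, 2, 3, Fraction(-1, 2))   # R2 = r2 - 1/2 r3
m = add_multiple(m, 1, 2, -1)                # R1 = r1 - r2
reduced = m

for row in reduced:
    print([str(x) for x in row])
```
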

Using the elementary matrices defined in Definition 2.1.3, we can keep track of the succession of elementary row operations, which we will do in the next example.

Example 2.2.7.

  • Let A be the matrix

    $$A = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 5 \\ 2 & 4 & 6 \end{pmatrix}.$$

    We want to find a sequence $L_1, \dots, L_k$ of elementary matrices such that the matrix $L_k \cdots L_1 A$ is in reduced echelon form.

    Therefore, we first find an echelon form of A using elementary row operations, recording the step-by-step operations that we do:

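    $$\begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 5 \\ 2 & 4 & 6 \end{pmatrix} \xrightarrow{\substack{R_2 = r_2 - r_1 \\ R_3 = r_3 - 2r_1}} \begin{pmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{pmatrix} = A_e, \quad \text{which is echelon.}$$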

    Since the leading coefficients in $A_e$ are all 1, we do not need to scale the rows, and can work out the reduced echelon form of $A$ using a single elementary row operation:

    $$A_e \xrightarrow{R_1 = r_1 - 2r_2} \begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{pmatrix} = A_r, \quad \text{which is the reduced echelon form of } A.$$

    Now, from $A$ to $A_r$, we use Definition 2.1.3 to determine the associated elementary matrices. The first operation was $R_2 = r_2 - r_1$, which corresponds to the matrix $E_{21}(-1)$. Next, $R_3 = r_3 - 2r_1$ corresponds to the matrix $E_{31}(-2)$ and gives $A_e = E_{31}(-2) E_{21}(-1) A$ (notice the order of these matrices). Finally, $R_1 = r_1 - 2r_2$ corresponds to $E_{12}(-2)$. The final answer is:

    $$A_r = E_{12}(-2) E_{31}(-2) E_{21}(-1) A.$$

    It’s a good idea to check this answer by multiplying:

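    $$\underbrace{\begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{pmatrix}}_{A_r} = \underbrace{\begin{pmatrix} 1 & -2 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}}_{E_{12}(-2)} \underbrace{\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -2 & 0 & 1 \end{pmatrix}}_{E_{31}(-2)} \underbrace{\begin{pmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}}_{E_{21}(-1)} \underbrace{\begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 5 \\ 2 & 4 & 6 \end{pmatrix}}_{A}$$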
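
The multiplication check suggested in Example 2.2.7 can also be done by machine. This is a small sketch with a hand-written matrix product; the three elementary matrices are typed in exactly as they appear in the example, and the function name `matmul` is an illustrative choice.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


A = [[1, 2, 3], [1, 3, 5], [2, 4, 6]]
E21 = [[1, 0, 0], [-1, 1, 0], [0, 0, 1]]    # E_21(-1): R2 = r2 - r1
E31 = [[1, 0, 0], [0, 1, 0], [-2, 0, 1]]    # E_31(-2): R3 = r3 - 2 r1
E12 = [[1, -2, 0], [0, 1, 0], [0, 0, 1]]    # E_12(-2): R1 = r1 - 2 r2

# The operation performed first sits closest to A, i.e. furthest to the right.
A_e = matmul(E31, matmul(E21, A))
A_r = matmul(E12, A_e)
print(A_e)
print(A_r)
```

The first printed matrix should be the echelon form $A_e$ and the second the reduced echelon form $A_r$ found in the example.
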

Note that when you perform the operations $R_2 = r_2 - r_1$ and $R_3 = r_3 - 2r_1$, it doesn’t matter which one you do first; in mathematics we say that these two operations commute (because you can move one past the other). Equivalently, the matrices $E_{21}(-1)$ and $E_{31}(-2)$ commute, i.e.

$$E_{21}(-1) E_{31}(-2) = E_{31}(-2) E_{21}(-1).$$

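It is however not true in general that elementary matrices commute with each other. For instance, the $2 \times 2$ elementary matrices $E_{21}(-1)$ and $E_{12}(-2)$ do not commute:

$$E_{21}(-1) E_{12}(-2) = \begin{pmatrix} 1 & 0 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} 1 & -2 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & -2 \\ -1 & 3 \end{pmatrix},$$

whereas

$$E_{12}(-2) E_{21}(-1) = \begin{pmatrix} 1 & -2 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ -1 & 1 \end{pmatrix} = \begin{pmatrix} 3 & -2 \\ -1 & 1 \end{pmatrix}.$$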
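
Both statements about commuting can be verified with a few lines of Python. This is a sketch using a hand-written matrix product; the variable names are illustrative and the matrices are entered by hand.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


# 3 x 3 case: E_21(-1) and E_31(-2) commute.
E21 = [[1, 0, 0], [-1, 1, 0], [0, 0, 1]]
E31 = [[1, 0, 0], [0, 1, 0], [-2, 0, 1]]
commute_3x3 = matmul(E21, E31) == matmul(E31, E21)

# 2 x 2 case: E_21(-1) and E_12(-2) do not commute.
F21 = [[1, 0], [-1, 1]]
F12 = [[1, -2], [0, 1]]
left = matmul(F21, F12)     # E_21(-1) E_12(-2)
right = matmul(F12, F21)    # E_12(-2) E_21(-1)

print(commute_3x3, left, right)
```
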