In this section, we use elementary row operations to put a matrix into a particularly simple form, called reduced echelon form. This process, known as row reduction (or sometimes Gaussian elimination), consists of a sequence of elementary row operations that bring the matrix into a much simpler form, while keeping the essential information it contains.
The leading coefficient (also called a pivot) of a non-zero row in a matrix is its leftmost non-zero coefficient.
A matrix is said to be in row echelon form (or simply echelon form) if, for every non-zero row $R_i$ with $i \geq 2$, the row $R_{i-1}$ immediately above is also non-zero, and the leading coefficient of $R_{i-1}$ is to the left of the leading coefficient of $R_i$.
Example 2.2.2.
The following matrices are in echelon form:
The following matrices are not in echelon form:
A matrix is said to be in reduced row echelon form (or simply reduced echelon form) if the following conditions hold:
It is in echelon form,
Every leading coefficient is 1, and
Every other coefficient in a column containing a leading coefficient is zero.
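To make these two definitions concrete, here is a small Python sketch (not part of the text; the function names `is_echelon` and `is_rref` are ours) that tests whether a matrix, given as a list of rows, satisfies them.

```python
def leading_index(row):
    """Column index of the leading coefficient (pivot), or None for a zero row."""
    for j, x in enumerate(row):
        if x != 0:
            return j
    return None

def is_echelon(M):
    """Echelon form: zero rows sit at the bottom, and each leading
    coefficient is strictly to the right of the one in the row above."""
    last = -1
    seen_zero_row = False
    for row in M:
        j = leading_index(row)
        if j is None:
            seen_zero_row = True
        elif seen_zero_row or j <= last:
            return False
        else:
            last = j
    return True

def is_rref(M):
    """Reduced echelon form: echelon form, every pivot equals 1,
    and every other entry in a pivot column is zero."""
    if not is_echelon(M):
        return False
    for i, row in enumerate(M):
        j = leading_index(row)
        if j is None:
            continue
        if row[j] != 1:
            return False
        if any(M[k][j] != 0 for k in range(len(M)) if k != i):
            return False
    return True
```

Checking the conditions separately in this order mirrors the definition: a matrix can be in echelon form without being reduced, but not the other way around.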
Example 2.2.4.
The following matrices are in reduced row echelon form:
The following matrices are not in reduced row echelon form, but they are in echelon form:
We now introduce a general algorithm to put a matrix into reduced row echelon form using row operations. We have already used part of this algorithm in Example 2.1 to put the matrix of that example into echelon form. To describe how it works for an arbitrary matrix, let $A$ be any matrix.
From $A$ to the echelon form of $A$. Look at the leftmost non-zero column. Use the swapping operation to ensure that the top entry of that column is non-zero. Now, by successive row operations $R_i \to R_i + \lambda_i R_1$ (for appropriate scalars $\lambda_i$), we clear the entries of that column below the top one, bringing $A$ into the form:
Next, focus on the part of the matrix from the second row downwards, and repeat the above step. That is, look at the leftmost column which is not zero from the second row downwards, and swap rows (leaving the first row in place) to ensure that the second-row entry of that column is non-zero. Then, using elementary row operations of the form $R_i \to R_i + \lambda_i R_2$ (for appropriate scalars $\lambda_i$), we bring the matrix into the form:
Now focus on the part of the matrix from the third row downwards, and repeat the above steps. The algorithm terminates when we reach the last row, at which point the result is in echelon form. At this point you should see a flight of stairs of $0$'s filling the bottom left corner of the matrix, which goes down (from left to right) at most one step at a time.
…This is the midpoint of the algorithm …
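The forward half of the algorithm just described can be sketched in Python as follows. This is an illustration under our own naming (`echelon_form`), not the text's notation; exact rational arithmetic via `fractions.Fraction` avoids rounding issues.

```python
from fractions import Fraction

def echelon_form(M):
    """Forward pass of row reduction: in each column, swap a non-zero
    entry up to the current row, then clear everything below it,
    working down and to the right."""
    A = [[Fraction(x) for x in row] for row in M]
    m, n = len(A), len(A[0])
    r = 0                                        # row where the next pivot goes
    for c in range(n):
        # find a row at or below r with a non-zero entry in column c
        pivot = next((i for i in range(r, m) if A[i][c] != 0), None)
        if pivot is None:
            continue                             # column is zero from row r down
        A[r], A[pivot] = A[pivot], A[r]          # swapping operation
        for i in range(r + 1, m):                # R_i <- R_i - (A[i][c]/A[r][c]) R_r
            f = A[i][c] / A[r][c]
            A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        r += 1
        if r == m:
            break
    return A
```

For instance, `echelon_form([[0, 2, 4], [1, 1, 1], [2, 2, 3]])` first swaps the first two rows, then clears the first and second columns below their pivots.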
From the echelon form of $A$ to the reduced echelon form of $A$. Say our matrix is in echelon form, as follows:
Then, we use the last non-zero row in order to create zeros in the column where its first non-zero term occurs. This term is the leading coefficient of that row, as in Definition 2.2.1. We also scale that row, that is, we multiply it by the inverse of its leading coefficient, so that the leading coefficient becomes $1$. The result looks like this:
Next, we use the row just above the previous one, and elementary row operations, to create zeros above its leading coefficient; that is, we annihilate all the coefficients sitting above the leading coefficient in its column. We also scale that row so that its leading coefficient becomes $1$. And so on, until we reach the first row and the end of the algorithm. Once completed, the matrix is the reduced echelon form of $A$.
…This is the end of the algorithm.
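Putting both halves together, a hypothetical `rref` function (our own sketch, not the text's notation) might look like this: a forward pass to echelon form, followed by the backward pass that scales each pivot to $1$ and clears the entries above it.

```python
from fractions import Fraction

def rref(M):
    """Row reduction: forward pass to echelon form, then, starting from
    the last non-zero row, scale each pivot to 1 and clear above it."""
    A = [[Fraction(x) for x in row] for row in M]
    m, n = len(A), len(A[0])
    # --- forward pass (first half of the algorithm) ---
    r = 0
    pivots = []                                  # (row, column) of each pivot
    for c in range(n):
        piv = next((i for i in range(r, m) if A[i][c] != 0), None)
        if piv is None:
            continue
        A[r], A[piv] = A[piv], A[r]
        for i in range(r + 1, m):
            f = A[i][c] / A[r][c]
            A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        pivots.append((r, c))
        r += 1
    # --- backward pass (second half): from the last pivot up ---
    for r, c in reversed(pivots):
        A[r] = [a / A[r][c] for a in A[r]]       # scale so the pivot is 1
        for i in range(r):                       # annihilate entries above it
            f = A[i][c]
            A[i] = [a - f * b for a, b in zip(A[i], A[r])]
    return A
```

Recording the pivot positions during the forward pass makes the backward pass a simple loop over them in reverse order.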
The algorithm is best understood by playing around with the row operations until your matrix is in reduced echelon form. One may notice that at each stage there may be several possible row operations to choose from. For this reason, the echelon form of a matrix is not uniquely defined. But the reduced row echelon form is always unique:
The reduced echelon form of a matrix is unique. In other words, different sequences of row operations will produce the same final answer.
The proof of this theorem is not obvious, but it is not too complicated either. We will omit the proof; for the basic idea, the reader may wish to look at the proof of Theorem 5.2.4.
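Although the proof is omitted, the theorem is easy to observe experimentally. In the hypothetical sketch below (the helper names are ours), two different sequences of row operations, starting from the same $2 \times 2$ matrix but making different pivot choices, arrive at the same reduced echelon form.

```python
from fractions import Fraction

def scale(A, i, c):          # R_i <- c * R_i
    A = [row[:] for row in A]
    A[i] = [c * x for x in A[i]]
    return A

def add_mult(A, i, j, c):    # R_i <- R_i + c * R_j
    A = [row[:] for row in A]
    A[i] = [x + c * y for x, y in zip(A[i], A[j])]
    return A

def swap(A, i, j):           # R_i <-> R_j
    A = [row[:] for row in A]
    A[i], A[j] = A[j], A[i]
    return A

A = [[Fraction(2), Fraction(4)], [Fraction(1), Fraction(3)]]

# Route 1: scale the first row, then eliminate.
B = scale(A, 0, Fraction(1, 2))     # [[1, 2], [1, 3]]
B = add_mult(B, 1, 0, -1)           # [[1, 2], [0, 1]]
B = add_mult(B, 0, 1, -2)           # [[1, 0], [0, 1]]

# Route 2: swap first, a different pivot choice.
C = swap(A, 0, 1)                   # [[1, 3], [2, 4]]
C = add_mult(C, 1, 0, -2)           # [[1, 3], [0, -2]]
C = scale(C, 1, Fraction(-1, 2))    # [[1, 3], [0, 1]]
C = add_mult(C, 0, 1, -3)           # [[1, 0], [0, 1]]
```

Both routes end at the identity matrix, as the theorem predicts.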
As before, we use the following convention: we write $R_i \to R_i + \lambda R_j$ for the elementary operation which consists in replacing the $i$-th row of the given matrix by $R_i + \lambda R_j$. Similarly for the elementary operations of the two other types. Recall that an elementary operation is performed by multiplication on the left by the corresponding elementary matrix.
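As a quick illustration of this last point (a sketch with our own helper names, not the book's notation): the elementary matrix for a row-addition operation is the identity with one extra off-diagonal entry, and multiplying by it on the left performs that operation.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def elem_add(n, i, j, c):
    """Elementary matrix for R_i <- R_i + c * R_j (rows numbered from 1):
    the identity matrix with an extra entry c in position (i, j)."""
    E = identity(n)
    E[i - 1][j - 1] = c
    return E

A = [[1, 2], [3, 4], [5, 6]]
E = elem_add(3, 2, 3, -1)     # the operation R_2 <- R_2 - R_3
EA = matmul(E, A)             # left multiplication performs the operation
```

Here the second row of `EA` is the second row of `A` minus its third row, while the other rows are untouched.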
Example 2.2.6.
Find the reduced echelon form of the matrix $A$. We begin by doing the following row operations to put the matrix into echelon form.
which is an echelon form of $A$.
Before proceeding further, we need to explain the notation
used above. Recall that elementary operations are not commutative, in the sense that the order in which we perform them matters. Thus, when two operations are written together as above, the notation means: do the first one first, and then the second.
To now obtain the reduced echelon form of $A$, we may start by scaling all the rows, so that their leading coefficients become $1$: we multiply each row by the inverse of its leading coefficient. We write this as follows:
Then, we use the last row to annihilate the coefficients in the third column above its leading coefficient. To do that, the elementary row operations are as follows:
Finally, we use the row above to kill the remaining coefficient, with one more elementary row operation. The result is the reduced echelon form of $A$, namely:
Using the elementary matrices defined in Definition 2.1.3, we can keep track of the succession of elementary row operations, which we will do in the next example.
Example 2.2.7.
Let $A$ be the matrix
We want to find a sequence of elementary matrices $E_1, E_2, \dots, E_k$ such that the matrix $E_k \cdots E_2 E_1 A$ is in reduced echelon form.
Therefore, we first find an echelon form of $A$ using elementary row operations, recording step by step the operations that we do:
Since the leading coefficients in the echelon form are all $1$, we do not need to scale the rows, and can work out the reduced echelon form of $A$ using a single elementary row operation:
Now, retracing the steps from $A$ to its reduced echelon form, we use Definition 2.1.3 to determine the associated elementary matrices. The first operation corresponds to an elementary matrix $E_1$. The next operation corresponds to a matrix $E_2$ and gives $E_2 E_1 A$ (notice the order of these matrices). Finally, the last operation corresponds to a matrix $E_3$. The final answer is:
It’s a good idea to check this answer by multiplying:
Note that for two of the operations performed above, it doesn’t matter which one you do first; in mathematics we say that these two operations commute (because you can move one past the other). Equivalently, the two corresponding elementary matrices commute, i.e.
It is however not true in general that elementary matrices commute with each other. For instance, the following two elementary matrices do not commute:
whereas,
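Since the matrices of this example were displayed rather than named, here is a stand-in computation (hypothetical $2 \times 2$ matrices, not the ones from the text) making the same two points: elementary matrices acting on different rows may commute, but in general elementary matrices do not.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# Two scaling matrices acting on different rows: these commute.
D1 = [[2, 0], [0, 1]]    # R_1 <- 2 R_1
D2 = [[1, 0], [0, 3]]    # R_2 <- 3 R_2

# Two row-addition matrices: these do not commute.
E = [[1, 1], [0, 1]]     # R_1 <- R_1 + R_2
F = [[1, 0], [1, 1]]     # R_2 <- R_2 + R_1
```

Computing the products in both orders shows `matmul(D1, D2) == matmul(D2, D1)`, while `matmul(E, F)` and `matmul(F, E)` differ.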