Home page for accesible maths 7 Polynomials 7 Polynomials 7.2 HCFs and the Euclidean algorithm revisited

Style control - access keys in brackets

Font (2 3) - + Letter spacing (4 5) - + Word spacing (6 7) - + Line spacing (8 9) - +

7.1 Polynomial arithmetic

Our first definition will be familiar to most, although our notation and terminology may be a little different.

Definition 7.1.1

A (real) polynomial in the indeterminate $X$ is an expression of the form

a_{0}+a_{1}X+a_{2}X^{2}+\cdots+a_{n}X^{n},

where $n\in\mathbb{N}_{0}$ and $a_{0},a_{1},a_{2},\dots,a_{n}\in\mathbb{R}$ . The numbers $a_{0},a_{1},a_{2},\dots,a_{n}$ are called the coefficients of the polynomial.

We shall often write $f(X)$ , $g(X)$ , etc., to represent polynomials.

Example 7.1.2

$f(X)=1+2X+3X^{2}$ and $g(X)=\frac{3}{2}-7X$ are both polynomials.

Remark 7.1.3

(i)

In a polynomial the terms may be written in any order; for instance, in Example 7.1.2 we could also write

$f(X)=2X+3X^{2}+1,\qquad\text{or}\qquad f(X)=3X^{2}+2X+1.$
(ii)

Any term with coefficient zero can be omitted, so that $2+0X+3X^{2}$ is the same polynomial as $2+3X^{2}$ . The zero polynomial is simply written $0$ ; by removing or inserting zero terms if necessary, any non-zero polynomial can be written in the form $a_{0}+a_{1}X+a_{2}X^{2}+\cdots+a_{n}X^{n}$ with $a_{n}\neq 0$ .

Definition 7.1.4

If $f(X)=a_{0}+a_{1}X+a_{2}X^{2}+\cdots+a_{n}X^{n}$ is a non-zero polynomial with $a_{n}\neq 0$ , then $n$ is the degree of $f(X)$ , written $\deg f(X)$ , and $a_{n}X^{n}$ is its highest (or leading) term. The zero polynomial has no degree and no highest term.

Example 7.1.5

The polynomial $1-5X+2X^{3}$ has degree $3$ and highest term $2X^{3}.$

Note that polynomials of degree $0$ are just non-zero real numbers, such as $-6$ or $\frac{7}{3}$ (called constants, or scalars). Polynomials of degree $1$ , such as $X+3$ or $4X-7$ , are called linear polynomials (because the graph of the function $f(x)=ax+b$ is a straight line); those of degree $2$ , such as $X^{2}-3X+7$ , are quadratic; those of degree $3$ , such as $2X^{3}-3X+5$ , are cubic.

Remark 7.1.6

We can now state precisely what it means for two polynomials $f(X)$ and $g(X)$ to be equal. Indeed, $f(X)=g(X)$ if and only if either $f(X)$ and $g(X)$ are both the zero polynomial, or they have the same degree, say $n\in\mathbb{N}_{0}$ , and when they are written in the form

f(X)=a_{0}+a_{1}X+a_{2}X^{2}+\cdots+a_{n}X^{n}\quad\ \text{and}\quad\ g(X)=b_{% 0}+b_{1}X+b_{2}X^{2}+\cdots+b_{n}X^{n},

we have $a_{j}=b_{j}$ for each $j\in\{0,1,\ldots,n\}.$

Polynomials have an arithmetic which is rather similar to that of the integers. We can always add or subtract two polynomials to obtain another, by adding or subtracting ‘‘like terms’’ (in which the power of $X$ is the same). We can also multiply any two polynomials to obtain another; this is more complicated, but just involves multiplying ‘‘term by term’’ according to the rule $a_{j}X^{j}\cdot b_{k}X^{k}=a_{j}b_{k}X^{j+k}$ for $j,k\in\mathbb{N}_{0}$ , and then gathering together like terms.

Example 7.1.7

Let $f(X)=2+3X$ and $g(X)=1+4X-6X^{2}$ . Then

	$\displaystyle f(X)+g(X)$	$\displaystyle=3+7X-6X^{2},\qquad\qquad f(X)-g(X)=1-X+6X^{2},$
	$\displaystyle f(X)g(X)$	$\displaystyle=(2+3X)(1+4X-6X^{2})$
		$\displaystyle=2+8X-12X^{2}+3X+12X^{2}-18X^{3}=2+11X-18X^{3}.$

Note that if $f(X)$ and $g(X)$ are non-zero polynomials with highest terms $a_{m}X^{m}$ and $b_{n}X^{n}$ , respectively, then the highest term of the product $f(X)g(X)$ is $a_{m}b_{n}X^{m+n}.$ Thus the degree of the product $f(X)g(X)$ is the sum of the degrees of $f(X)$ and $g(X)$ :

\deg\bigl(f(X)g(X)\bigr)=\deg f(X)+\deg g(X).

(7.1.1)

This easy observation has an important consequence. We saw in the example above that when multiplying two polynomials it may sometimes happen that certain terms ‘‘cancel out’’ (as $-12X^{2}$ and $12X^{2}$ ). However, it is not possible for all terms to do so, because the highest term cannot. Thus the product of two non-zero polynomials is non-zero; or to put it another way,

\hbox{if}\quad f(X)g(X)=0,\quad\hbox{then}\quad f(X)=0\quad\hbox{or}\quad g(X)% =0.

(7.1.2)

As in the case of $\mathbb{Z}$ , we refer to this fact by saying that the polynomials have no zero divisors.

So far everything has been fairly straightforward; just as with the integers, however, it is when we come to division that things get difficult. If $g(X)$ divides $f(X)$ , meaning that $f(X)=q(X)g(X)$ for some $q(X)$ , we shall write $g(X)|f(X)$ ; but our observation above on highest terms in products makes it easy to see that this does not always happen. For example, let $f(X)=X+1$ and $g(X)=X^{2}+2$ . If $f(X)=q(X)g(X)$ for some $q(X)$ , then $q(X)$ would have to be non-zero (as $f(X)$ is), so that the degree of the right-hand side would be at least $2$ (the degree of $g(X)$ ), while the degree of the left-hand side is $1$ — a contradiction.

Essentially the problem here is that we are trying to divide something ‘‘small’’ by something ‘‘large’’; $X^{2}+2$ is ‘‘too big’’ to divide $X+1$ . We met a similar situation when we first came across division of integers: $8$ is ‘‘too big’’ to divide $3$ , for example. In that setting we found that there are two possible ways forward. One is to accept that, in dividing by a number $d$ , we cannot divide anything smaller than $d$ ; thus we look for an answer of the form ‘‘quotient with remainder’’ (as we did in Chapter 4). The other is to enlarge our number system by introducing fractions (as we did when constructing $\mathbb{Q}$ from $\mathbb{Z}$ in Section 6.4), which then allows us to divide any number by any other, provided of course that the second is non-zero.

Each of these two approaches has its counterpart in the world of polynomials. Here we shall take the first. Thus, given two polynomials $f(X)$ and $g(X)$ (with $g(X)\neq 0$ ), we seek to divide $f(X)$ by $g(X)$ to obtain a quotient $q(X)$ and a remainder $r(X)$ satisfying

f(X)=q(X)g(X)+r(X).

The aim is to do this in such a way that $r(X)$ is ‘‘smaller’’ than $g(X)$ ; we use the degree of a polynomial as a measure of its size, so we want the degree of $r(X)$ to be less than that of $g(X)$ (or $r(X)=0$ ). It is not immediately obvious that we can always do this, but it is in fact possible. Briefly, the trick is to do the division step by step, reducing the degree of the polynomial being divided at each stage. We call this method of division with remainder the division algorithm for polynomials. Before we formally state and prove this theorem, let us consider an example.

Example 7.1.8

Use the division algorithm to divide $6X^{3}+7X^{2}+4X-1$ by $3X+2$ .

Solution.

\begin{array}[]{r|rrrr}\omit&&2X^{2}&\!\!\!\!\!{}+\phantom{4}X&\!\!\!\!\!{}+% \frac{2}{3}\\ \cline{2-5}3X+2&6X^{3}&\!\!\!\!\!{}+7X^{2}&\!\!\!\!\!{}+4X&\!\!\!\!\!{}-1\\ \omit&6X^{3}&\!\!\!\!\!{}+4X^{2}&&\\ \cline{2-3}\omit&&\!\!\!\!\!3X^{2}&\!\!\!\!\!{}+4X&\\ \omit&&\!\!\!\!\!3X^{2}&\!\!\!\!\!{}+2X&\\ \cline{3-4}\omit&&&\!\!\!\!\!2X&\!\!\!\!\!{}-1\\ \omit&&&\!\!\!\!\!2X&\!\!\!\!\!{}+\frac{4}{3}\\ \cline{4-5}\omit&&&&\!\!\!\!\!{}-\frac{7}{3}{\text{\makebox[0.0pt][l]{$;$}}}\\ \end{array}

thus $6X^{3}+7X^{2}+4X-1=(2X^{2}+X+\frac{2}{3})(3X+2)+\bigl(-\frac{7}{3}\bigr).$

The following lemma formalizes the ‘‘reduction-of-degree’’ approach that the division algorithm relies on.

Lemma 7.1.9

Let $f(X)$ and $g(X)$ be non-zero polynomials with $\deg f(X)\geqslant\deg gX),$ say

f(X)=a_{0}+a_{1}X+a_{2}X^{2}+\cdots+a_{m}X^{m}\quad\text{and}\quad g(X)=b_{0}+% b_{1}X+b_{2}X^{2}+\cdots+b_{n}X^{n},

where $a_{m}\neq 0$ and $b_{n}\neq 0$ . Define $h(X)=\frac{a_{m}}{b_{n}}X^{m-n}$ . Then either $f(X)=h(X)g(X)$ or

\deg\big(f(X)-h(X)g(X)\bigr)<\deg f(X).

Proof. We have

	$\displaystyle h(X)g(X)$	$\displaystyle=\frac{a_{m}}{b_{n}}X^{m-n}(b_{n}X^{n}+b_{n-1}X^{n-1}+\cdots+b_{1% }X+b_{0})$
		$\displaystyle=a_{m}X^{m}+\frac{a_{m}b_{n-1}}{b_{n}}X^{m-1}+\cdots+\frac{a_{m}b% _{1}}{b_{n}}X^{m-n+1}+\frac{a_{m}b_{0}}{b_{n}}X^{m-n},$

so that

	$\displaystyle f(X)-h(X)g(X)$	$\displaystyle=(a_{m}X^{m}+a_{m-1}X^{m-1}+\cdots+a_{1}X+a_{0})$
		$\displaystyle\quad\ \mbox{}-\Bigl(a_{m}X^{m}+\frac{a_{m}b_{n-1}}{b_{n}}X^{m-1}% +\cdots+\frac{a_{m}b_{1}}{b_{n}}X^{m-n+1}+\frac{a_{m}b_{0}}{b_{n}}X^{m-n}\Bigr)$
		$\displaystyle=\Bigl(a_{m-1}-\frac{a_{m}b_{n-1}}{b_{n}}\Bigr)X^{m-1}+\cdots.$

This shows that either $f(X)-h(X)g(X)=0$ (if all the coefficients on the right-hand side vanish), or this polynomial has degree at most $m-1.$ $\Box$

Theorem 7.1.10

(Polynomial division with remainder.) Let $f(X)$ and $g(X)$ be polynomials, and suppose that $g(X)\neq 0$ . Then there are unique polynomials $q(X)$ and $r(X)$ such that $f(X)=q(X)g(X)+r(X)$ and either $r(X)=0$ or $\deg r(X)<\deg g(X)$ .

Terminology: the polynomial $q(X)$ is called the quotient, while $r(X)$ is the remainder.

Proof. We begin with the existence part. In the special case where $g(X)$ divides $f(X)$ , we have $f(X)=q(X)g(X)$ for some polynomial $q(X)$ , and we may simply take $r(X)=0$ . Note: this case covers $f(X)=0$ (because all polynomials divide $0$ ), and also the case where $g(X)$ has degree $0$ (because if $g(X)=c$ for some $c\in\mathbb{R}\setminus\{0\}$ , then we have $f(X)=(c^{-1}f(X))g(X)$ ). Hence we may suppose that $f(X)\neq 0$ , so that $m:=\deg f(X)\in\mathbb{N}_{0}$ is defined, and that $n:=\deg g(X)\geqslant 1$ . In this case we establish the existence of $q(X)$ and $r(X)$ by induction on $m$ , using the Generalized (or Second) Principle of Induction, where the inductive step consists of showing that if the result holds for each $k<m$ (not just for $k=m-1$ ), then it also holds for $m$ .

The base step ( $m=0$ ) is easy. Since $m=0<n$ , $q(X)=0$ and $r(X)=f(X)$ have the required properties.

Now let $m\geqslant 1$ and assume inductively that, for each non-zero polynomial $f_{0}(X)$ of degree less than $m$ , there are polynomials $q_{0}(X)$ and $r_{0}(X)$ such that

f_{0}(X)=q_{0}(X)g(X)+r_{0}(X)\quad\text{and either}\quad r_{0}(X)=0\quad\text% {or}\quad\deg r_{0}(X)<n.

(7.1.3)

If $m<n$ , then we can take $q(X)=0$ and $r(X)=f(X)$ . Otherwise we choose $h(X)$ as in Lemma 7.1.9 and define $f_{0}(X)=f(X)-h(X)g(X)$ , so that either $f_{0}(X)=0$ or $\deg f_{0}(X)<m$ . In the first case, we can take $q(X)=h(X)$ and $r(X)=0$ . In the second case, the induction hypothesis implies that there are polynomials $q_{0}(X)$ and $r_{0}(X)$ satisfying (7.1.3), and we have

	$\displaystyle f(X)=f_{0}(X)+h(X)g(X)$	$\displaystyle=q_{0}(X)g(X)+r_{0}(X)+h(X)g(X)$
		$\displaystyle=\bigl(q_{0}(X)+h(X)\bigr)g(X)+r_{0}(X).$

Hence the identity $f(X)=q(X)g(X)+r(X)$ holds if we define $q(X)=q_{0}(X)+h(X)$ and $r(X)=r_{0}(X)$ , and the second half of (7.1.3) ensures that this choice of $r(X)$ has the required properties. Thus the induction continues. Note that this inductive argument is essentially just a formal version of the division algorithm that we used in Example 7.1.8.

To prove that the polynomials $q(X)$ and $r(X)$ are unique, suppose that $q_{1}(X)$ , $q_{2}(X)$ , $r_{1}(X)$ and $r_{2}(X)$ are polynomials such that $f(X)=q_{j}(X)g(X)+r_{j}(X)$ and either we have $r_{j}(X)=0$ or $\deg r_{j}(X)<\deg g(X)$ for $j=1,2$ . Our aim is to show that $q_{1}(X)=q_{2}(X)$ and $r_{1}(X)=r_{2}(X)$ . Rearranging the defining equations, we see that

r_{j}(X)=f(X)-q_{j}(X)g(X)\qquad\text{for}\qquad j=1,2,

which implies that

$\displaystyle r_{1}(X)-r_{2}(X)$	$\displaystyle=\bigl(f(X)-q_{1}(X)g(X)\bigr)-\bigl(f(X)-q_{2}(X)g(X)\bigr)$
	$\displaystyle=f(X)-q_{1}(X)g(X)-f(X)+q_{2}(X)g(X)$
	$\displaystyle=q_{2}(X)g(X)-q_{1}(X)g(X)=\bigl(q_{2}(X)-q_{1}(X)\bigr)g(X).$	(7.1.4)

Assume towards a contradiction that $r_{1}(X)\neq r_{2}(X)$ , so that the left-hand side of (7.1) is non-zero. On the one hand, this implies that

\deg\bigl(r_{1}(X)-r_{2}(X)\bigr)=\deg\bigl(q_{1}(X)-q_{2}(X)\bigr)+\deg g(X)% \geqslant\deg g(X)

by (7.1.1). On the other hand, $\deg\bigl(r_{1}(X)-r_{2}(X)\bigr)<\deg g(X)$ because $r_{1}(X)$ and $r_{2}(X)$ are either $0$ or of smaller degree than $g(X)$ . This is clearly impossible, so we must have $r_{1}(X)=r_{2}(X)$ . Combining this with (7.1), we obtain $\bigl(q_{2}(X)-q_{1}(X)\bigr)g(X)=0$ . Since $g(X)\neq 0$ , we deduce that $q_{1}(X)=q_{2}(X)$ by (7.1.2). $\Box$

We conclude this section with two remarks about polynomials in general.

Remark 7.1.11

(i)

One can consider polynomials over number systems other than $\mathbb{R}$ , such as $\mathbb{Z}$ (called ‘‘integral polynomials’’), $\mathbb{Q}$ (‘‘rational polynomials’’) or $\mathbb{C}$ (‘‘complex polynomials’’), or even $\mathbb{Z}_{m}$ (called ‘‘polynomials over $\mathbb{Z}_{m}$ ’’) for $m\in\{2,3,4,\ldots\}$ .
(ii)

We do not think of polynomials as functions (that is, ‘‘machines’’ that map one real number to another). This is why we refer to $X$ as an ‘‘indeterminate’’, not a ‘‘variable’’. To emphasize this difference, we use a capital $X$ rather than a small $x$ .

To exemplify the difference between polynomials and functions, consider the polynomials $f(X)=X$ and $g(X)=X^{2}$ over $\mathbb{Z}_{2}$ . They are, by definition, distinct polynomials; however, we have

$f(\widehat{0})=\widehat{0}=g(\widehat{0})\qquad\hbox{and}\qquad f(\widehat{1})% =\widehat{1}=g(\widehat{1})$

so if we regarded $f$ and $g$ as functions from $\mathbb{Z}_{2}$ to $\mathbb{Z}_{2}$ , they would actually be equal!