Home page for accesible maths 7 Polynomials

Style control - access keys in brackets

Font (2 3) - + Letter spacing (4 5) - + Word spacing (6 7) - + Line spacing (8 9) - +

7.2 Highest common factors and the Euclidean algorithm revisited

We have already seen that polynomials have an arithmetic quite similar to that of \mathbb{Z}: we have addition, subtraction and multiplication; division causes problems, but as in \mathbb{Z} we have the concept of ‘‘division with remainder’’, performed using the division algorithm. In this section we shall continue to see how far we can take the analogy.

The next thing we did in \mathbb{Z} was to take two integers and consider the set of their common factors, found by taking the two sets of their factors separately and then forming their intersection. In theory we can do the same thing for two polynomials; but there is a complication here. Given a non-zero integer nn, any factor dd must be smaller than nn in the sense that |d||n||d|\leqslant|n|; thus there are only finitely many possible factors dd (as they must all lie between -|n|-|n| and |n||n|), and in principle we can check each one to determine the set of factors, which must thus be finite.

In contrast, given a non-zero polynomial f(X)f(X), any factor d(X)d(X) must be smaller than f(X)f(X) in the sense that degd(X)degf(X)\deg d(X)\leqslant\deg f(X); but there are infinitely many possible such factors d(X)d(X), so checking them all is rather tricky! Indeed, the set of factors will certainly be infinite, because if d(X)d(X) is a factor of f(X)f(X), so that f(X)=d(X)q(X)f(X)=d(X)q(X) for some q(X)q(X), then for any r{0}r\in\mathbb{R}\setminus\{0\} we have

f(X)=rd(X)1rq(X),f(X)=rd(X)\cdot\frac{1}{r}q(X),

so that rd(X)rd(X) is also a factor of f(X)f(X).

The fact that it may be hard to determine the set of factors of a given polynomial may make it difficult to do examples, but it should not prevent us from trying to develop a theory (indeed, we shall see that ultimately this is not too severe an obstacle). However, the point about being able to multiply factors by arbitrary elements of {0}\mathbb{R}\setminus\{0\} can and should be addressed. One approach would be to define an equivalence relation on the set of polynomials by setting d(X)rd(X)d(X)\sim rd(X) for all r{0}r\in\mathbb{R}\setminus\{0\}; we would then work with equivalence classes of polynomials when it came to considering factors. Alternatively, we could choose the class representative whose leading term has coefficient 11; such a polynomial is called monic (for example, X3+2X^{3}+2 is monic but 3X2+13X^{2}+1 is not). Less formally, we can just regard d(X)d(X) and rd(X)rd(X) as being ‘‘equally good’’ factors, and allow ourselves to interchange between them freely.

Now the reason we considered the set of common factors of two integers was that it enabled us to define the highest common factor as the largest one. Here there will be no unique ‘‘largest’’ common factor (meaning one of greatest degree) because of the ‘‘scalars’’ issue; even if we were to demand that we choose a monic polynomial, uniqueness would not be clear. We need to take a slightly different approach, based on the equivalence of conditions (a) and (c) of Proposition 4.4.1.

Definition 7.2.1

Given two polynomials f(X)f(X) and g(X)g(X) (not both zero), a polynomial d(X)d(X) is called a highest common factor (or hcf) of them if

  1. (i)

    d(X)d(X) is a common factor of f(X)f(X) and g(X);g(X); and

  2. (ii)

    any other common factor of f(X)f(X) and g(X)g(X) is a factor of d(X)d(X).

This definition makes no claim for the uniqueness of d(X)d(X). However, if c(X)c(X) and d(X)d(X) are both highest common factors, then c(X)|d(X)c(X)|d(X) so that d(X)=q(X)c(X)d(X)=q(X)c(X) for some polynomial q(X)q(X), and

degd(X)=degq(X)+degc(X)degc(X).\deg d(X)=\deg q(X)+\deg c(X)\geqslant\deg c(X).

On the other hand, we also have d(X)|c(X)d(X)|c(X), and thus degd(X)degc(X).\deg d(X)\leqslant\deg c(X). Hence we conclude that degd(X)=degc(X)\deg d(X)=\deg c(X), so that degq(X)=0\deg q(X)=0, and therefore q(X)q(X) is a non-zero scalar. Thus if there is a highest common factor, it is unique up to multiplication by non-zero scalars, so in this sense we may speak of ‘‘the’’ highest common factor.

However, it is far from clear that there will be any such polynomial d(X)d(X). To show that there is, we use linear combinations of polynomials and imitate the proof of Theorem 4.2.17.

Definition 7.2.2

A (polynomial) linear combination of two polynomials f(X)f(X) and g(X)g(X) is a polynomial d(X)d(X) which can be written in the form d(X)=h(X)f(X)+k(X)g(X)d(X)=h(X)f(X)+k(X)g(X) for some polynomials h(X)h(X) and k(X)k(X).

Example 7.2.3

Let f(X)=X2+X-6f(X)=X^{2}+X-6 and g(X)=X3-5X+2g(X)=X^{3}-5X+2. Then

(X2+1)\displaystyle(X^{2}+1) f(X)-(X+1)g(X)=(X2+1)(X2+X-6)-(X+1)(X3-5X+2)\displaystyle f(X)-(X+1)g(X)=(X^{2}+1)(X^{2}+X-6)-(X+1)(X^{3}-5X+2)
=X4+X3-6X2+X2+X-6-(X4-5X2+2X+X3-5X+2)\displaystyle=X^{4}+X^{3}-6X^{2}+X^{2}+X-6-(X^{4}-5X^{2}+2X+X^{3}-5X+2)
=4X-8\displaystyle=4X-8

is a polynomial linear combination of f(X)f(X) and g(X)g(X), corresponding to the choice h(X)=X2+1h(X)=X^{2}+1 and k(X)=-(X+1)k(X)=-(X+1) in the definition above.

As in the case of \mathbb{Z}, we have the following result.

Lemma 7.2.4

Let c(X)c(X) be a common factor of the polynomials f(X)f(X) and g(X)g(X). Then c(X)c(X) is a factor of every polynomial linear combination of f(X)f(X) and g(X)g(X).

Proof. Suppose that p(X)p(X) is a polynomial linear combination of f(X)f(X) and g(X)g(X), say

p(X)=h(X)f(X)+k(X)g(X)p(X)=h(X)f(X)+k(X)g(X)

for some polynomials h(X)h(X) and k(X)k(X). By assumption, we can write f(X)=c(X)q(X)f(X)=c(X)q(X) and g(X)=c(X)r(X)g(X)=c(X)r(X) for some polynomials q(X)q(X) and r(X)r(X). Hence we have

p(X)\displaystyle p(X) =h(X)f(X)+k(X)g(X)\displaystyle=h(X)f(X)+k(X)g(X)
=h(X)c(X)q(X)+k(X)c(X)r(X)=c(X)(h(X)q(X)+k(X)r(X)),\displaystyle=h(X)c(X)q(X)+k(X)c(X)r(X)=c(X)\bigl(h(X)q(X)+k(X)r(X)\bigr),

so that c(X)c(X) divides p(X)p(X) as desired. \Box

Corollary 7.2.5

Let f(X)f(X) and g(X)g(X) be non-zero polynomials, and suppose that d(X)d(X) is a common factor of f(X)f(X) and g(X)g(X) and also a polynomial linear combination of them. Then d(X)d(X) is a highest common factor of f(X)f(X) and g(X)g(X).

Example 7.2.6

Let f(X)=X2+X-6f(X)=X^{2}+X-6 and g(X)=X3-5X+2g(X)=X^{3}-5X+2 as in Example 7.2.3. We saw there that

4X-8=(X2+1)f(X)-(X+1)g(X),4X-8=(X^{2}+1)f(X)-(X+1)g(X),

and hence

X-2=14(X2+1)f(X)-14(X+1)g(X)X-2=\frac{1}{4}(X^{2}+1)f(X)-\frac{1}{4}(X+1)g(X)

is a polynomial linear combination of f(X)f(X) and g(X)g(X). Since

f(2)=4+2-6=0  and  g(2)=8-10+2=0,f(2)=4+2-6=0\qquad\text{and}\qquad g(2)=8-10+2=0,

the Factor Theorem implies that X-2X-2 is also a common factor of f(X)f(X) and g(X)g(X), so it must be a highest common factor.

Proof. Lemma 7.2.4 implies that every common factor of f(X)f(X) and g(X)g(X) is a factor of d(X)d(X), so d(X)d(X) is a highest common factor. \Box

Theorem 7.2.7

Let f(X)f(X) and g(X)g(X) be non-zero polynomials. Then there exists a non-zero polynomial linear combination d(X)d(X) of them of the smallest possible degree, and d(X)d(X) is a highest common factor of f(X)f(X) and g(X)g(X).

Proof. Let LL be the set of non-zero polynomial linear combinations of f(X)f(X) and g(X)g(X):

L={p(X):p(X)0andp(X)=h(X)f(X)+k(X)g(X)for some polynomialsh(X),k(X)}.L=\bigl\{p(X):p(X)\neq 0\ \text{and}\ p(X)=h(X)f(X)+k(X)g(X)\\ \text{for some polynomials}\ h(X),k(X)\bigr\}.

We note that LL is non-empty (as f(X)=1f(X)+0g(X)Lf(X)=1\cdot f(X)+0\cdot g(X)\in L), so it contains a polynomial d(X)d(X) of the smallest possible degree. Choose polynomials h(X)h(X) and k(X)k(X) such that d(X)=h(X)f(X)+k(X)g(X)d(X)=h(X)f(X)+k(X)g(X). We shall prove that d(X)d(X) is a highest common factor of f(X)f(X) and g(X)g(X).

To see that d(X)|f(X)d(X)|f(X), we use polynomial division with remainder (Theorem 7.1.10) to find polynomials q(X)q(X) and r(X)r(X) such that f(X)=q(X)d(X)+r(X)f(X)=q(X)d(X)+r(X) and either r(X)=0r(X)=0 or degr(X)<degd(X)\deg r(X)<\deg d(X). Then we have

r(X)\displaystyle r(X) =f(X)-q(X)d(X)=f(X)-q(X)(h(X)f(X)+k(X)g(X))\displaystyle=f(X)-q(X)d(X)=f(X)-q(X)\bigl(h(X)f(X)+k(X)g(X)\bigr)
=f(X)-q(X)h(X)f(X)-q(X)k(X)g(X)\displaystyle=f(X)-q(X)h(X)f(X)-q(X)k(X)g(X)
=(1-q(X)h(X))f(X)-(q(X)k(X))g(X).\displaystyle=\bigl(1-q(X)h(X)\bigr)f(X)-\bigl(q(X)k(X)\bigr)g(X).

Hence if r(X)0r(X)\neq 0 we would have r(X)Lr(X)\in L, contrary to the minimality of the degree of d(X)d(X); so we must have r(X)=0r(X)=0, and so f(X)=q(X)d(X)f(X)=q(X)d(X), i.e., d(X)|f(X)d(X)|f(X).

Similarly we see that d(X)|g(X)d(X)|g(X), and so d(X)d(X) is a common factor of f(X)f(X) and g(X)g(X). Since d(X)d(X) is also a polynomial linear combination of f(X)f(X) and g(X)g(X), Corollary 7.2.5 shows that d(X)d(X) is a highest common factor. \Box

In Examples 7.2.3 and 7.2.6 the polynomials h(X)h(X) and k(X)k(X) appeared to come out of thin air; we need a systematic way to determine a highest common factor of two given polynomials and to express it as a polynomial linear combination of them. Just as in \mathbb{Z}, we use the Euclidean algorithm. It operates exactly as before, but to perform each division we may use the division algorithm.

Example 7.2.8

Find a highest common factor of f(X)=6X3+20X2+7X-12f(X)=6X^{3}+20X^{2}+7X-12 and g(X)=6X5+8X4-27X3-4X2+35X-15g(X)=6X^{5}+8X^{4}-27X^{3}-4X^{2}+35X-15, and express it as a polynomial linear combination of f(X)f(X) and g(X)g(X).

Solution. Since degf(X)<degg(X)\deg f(X)<\deg g(X), we begin by dividing g(X)g(X) by f(X)f(X):

X2-2X+16X3+20X2+7X-126X5+8X4-27X3-4X2+35X-156X5+20X4+7X3-12X2-12X4-34X3+8X2+35X-12X4-40X3-14X2+24X6X3+22X2+11X-156X3+20X2+7X-122X2+4X-3\begin{array}[]{r|rrrrrr}\omit&&&&X^{2}&\!\!\!\!\!{}-2X&\!\!\!\!\!{}+\phantom{% 0}1\\ \cline{2-7}6X^{3}+20X^{2}+7X-12&6X^{5}&\!\!\!\!\!{}+\phantom{2}8X^{4}&\!\!\!\!% \!{}-27X^{3}&\!\!\!\!\!{}-\phantom{2}4X^{2}&\!\!\!\!\!{}+35X&\!\!\!\!\!{}-15\\ \omit&6X^{5}&\!\!\!\!\!{}+20X^{4}&\!\!\!\!\!{}+\phantom{2}7X^{3}&\!\!\!\!\!{}-% 12X^{2}\\ \cline{2-5}\omit&&\!\!\!\!\!{}-12X^{4}&\!\!\!\!\!{}-34X^{3}&\!\!\!\!\!{}+% \phantom{1}8X^{2}&\!\!\!\!\!{}+35X\\ \omit&&\!\!\!\!\!{}-12X^{4}&\!\!\!\!\!{}-40X^{3}&\!\!\!\!\!{}-14X^{2}&\!\!\!\!% \!{}+24X\\ \cline{3-6}\omit&&&\!\!\!\!\!\phantom{4}6X^{3}&\!\!\!\!\!{}+22X^{2}&\!\!\!\!\!% {}+11X&\!\!\!\!\!{}-15\\ \omit&&&\!\!\!\!\!\phantom{4}6X^{3}&\!\!\!\!\!{}+20X^{2}&\!\!\!\!\!{}+\phantom% {1}7X&\!\!\!\!\!{}-12\\ \cline{4-7}\omit&&&&\!\!\!\!\!\phantom{2}2X^{2}&\!\!\!\!\!{}+\phantom{1}4X&\!% \!\!\!\!{}-\phantom{1}3\end{array}

so that

g(X)=(X2-2X+1)f(X)+2X2+4X-3.g(X)=(X^{2}-2X+1)f(X)+2X^{2}+4X-3. (7.2.1)

Next, we divide f(X)f(X) by the remainder:

3X+42X2+4X-36X3+20X2+7X-126X3+12X2-9X8X2+16X-128X2+16X-120.\begin{array}[]{r|rrrr}\omit&&&\phantom{1}3X&\!\!\!\!\!{}+\phantom{1}4\\ \cline{2-5}2X^{2}+4X-3&6X^{3}&\!\!\!\!\!{}+20X^{2}&\!\!\!\!\!{}+\phantom{1}7X&% \!\!\!\!\!{}-12\\ \omit&6X^{3}&\!\!\!\!\!{}+12X^{2}&\!\!\!\!\!{}-\phantom{1}9X\\ \cline{2-4}\omit&&\!\!\!\!\!\phantom{2}8X^{2}&\!\!\!\!\!{}+16X&\!\!\!\!\!{}-12% \\ \omit&&\!\!\!\!\!\phantom{2}8X^{2}&\!\!\!\!\!{}+16X&\!\!\!\!\!{}-12\\ \cline{3-5}\omit&&&&\!\!\!\!\!\phantom{+1}0{\text{\makebox[0.0pt][l]{$.$}}}% \end{array}

Hence the algorithm has terminated, so the last non-zero remainder, 2X2+4X-32X^{2}+4X-3, is a highest common factor of f(X)f(X) and g(X)g(X). To write this highest common factor as a polynomial linear combination of f(X)f(X) and g(X)g(X), we simply rearrange (7.2.1) to obtain

2X2+4X-3=g(X)-(X2-2X+1)f(X).2X^{2}+4X-3=g(X)-(X^{2}-2X+1)f(X).
Example 7.2.9

Find a highest common factor of f(X)=X4+3X3-5X2-18X-9f(X)=X^{4}+3X^{3}-5X^{2}-18X-9 and g(X)=X3+X2-8X-6g(X)=X^{3}+X^{2}-8X-6, and express it as a polynomial linear combination of f(X)f(X) and g(X)g(X).

Solution. Since degg(X)<degf(X)\deg g(X)<\deg f(X), we first divide f(X)f(X) by g(X)g(X):

X+2X3+X2-8X-6X4+3X3-5X2-18X-9X4+X3-8X2-6X2X3+3X2-12X-92X3+2X2-16X-12X2+4X+3,\begin{array}[]{r|rrrrr}\omit&&&&X&\!\!\!\!\!{}+\phantom{0}2\\ \cline{2-6}X^{3}+X^{2}-8X-6&X^{4}&\!\!\!\!\!{}+3X^{3}&\!\!\!\!\!{}-5X^{2}&\!\!% \!\!\!{}-18X&\!\!\!\!\!{}-\phantom{0}9\\ \omit&X^{4}&\!\!\!\!\!{}+\phantom{1}X^{3}&\!\!\!\!\!{}-8X^{2}&\!\!\!\!\!{}-% \phantom{0}6X&\\ \cline{2-5}\omit&&\!\!\!\!\!2X^{3}&\!\!\!\!\!{}+3X^{2}&\!\!\!\!\!{}-12X&\!\!\!% \!\!{}-\phantom{0}9\\ \omit&&\!\!\!\!\!2X^{3}&\!\!\!\!\!{}+2X^{2}&\!\!\!\!\!{}-16X&\!\!\!\!\!{}-12\\ \cline{3-6}\omit&&&\!\!\!\!\!X^{2}&\!\!\!\!\!{}+\phantom{0}4X&\!\!\!\!\!{}+% \phantom{0}3{\text{\makebox[0.0pt][l]{$,$}}}\end{array}

so that f(X)=(X+2)g(X)+X2+4X+3.f(X)=(X+2)g(X)+X^{2}+4X+3. Next, we divide g(X)g(X) by the remainder:

X-3X2+4X+3X3+X2-8X-6X3+4X2+3X-3X2-11X-6-3X2-12X-9X+3,\begin{array}[]{r|rrrr}\omit&&&X&\!\!\!\!\!{}-3\\ \cline{2-5}X^{2}+4X+3&X^{3}&\!\!\!\!\!{}+\phantom{1}X^{2}&\!\!\!\!\!{}-% \phantom{0}8X&\!\!\!\!\!{}-6\\ \omit&X^{3}&\!\!\!\!\!{}+4X^{2}&\!\!\!\!\!{}+\phantom{0}3X&\\ \cline{2-4}\omit&&\!\!\!\!\!{}-3X^{2}&\!\!\!\!\!{}-11X&\!\!\!\!\!{}-6\\ \omit&&\!\!\!\!\!{}-3X^{2}&\!\!\!\!\!{}-12X&\!\!\!\!\!{}-9\\ \cline{3-5}\omit&&&\!\!\!\!\!X&\!\!\!\!\!{}+3{\text{\makebox[0.0pt][l]{$,$}}}% \end{array}

so that g(X)=(X-3)(X2+4X+3)+(X+3).g(X)=(X-3)(X^{2}+4X+3)+(X+3). Since X2+4X+3=(X+1)(X+3)X^{2}+4X+3=(X+1)(X+3), we conclude that X+3X+3 is a highest common factor.

Working backwards, we obtain

X+3\displaystyle X+3 =\displaystyle\!\!\!= g(X)-(X-3)(X2+4X+3)\displaystyle g(X)-(X-3)(X^{2}+4X+3)
=\displaystyle\!\!\!= g(X)-(X-3)(f(X)-(X+2)g(X))\displaystyle g(X)-(X-3)(f(X)-(X+2)g(X))
=\displaystyle\!\!\!= (1+(X-3)(X+2))g(X)-(X-3)f(X)\displaystyle(1+(X-3)(X+2))g(X)-(X-3)f(X)
=\displaystyle\!\!\!= (X2-X-5)g(X)-(X-3)f(X).\displaystyle(X^{2}-X-5)g(X)-(X-3)f(X).