
3.5 Constrained maxima and minima

Problem: Find the nearest point to $(0,0)$ on the curve $x^2+8xy+7y^2=225$. In other words, find the point on the curve where $x^2+y^2$ is smallest.

This is one example of problems of the following type:

Find maxima and minima of a function f(x,y) on the curve defined by g(x,y)=0.

Two possible methods are as follows.

First method. It may be easy to solve g(x,y)=0 to express y as y(x). For example, if  g(x,y)=xy-1=0, then  y=1/x. In this case, we can just substitute y=y(x) and look for extrema of  f[x,y(x)], a function of x only. However, in most cases, this isn’t easy!

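Example of the first method. Find the minimum of $x^3+3y^2$ on the curve $xy=1$, $x>0$.

Answer: Substituting $y=1/x$ gives $h(x)=x^3+3/x^2$ for $x>0$. Then $h'(x)=3x^2-6/x^3$, which vanishes exactly when $x^5=2$, that is, $x=2^{1/5}$ and $y=2^{-1/5}$. Since $h''(x)=6x+18/x^4>0$ for $x>0$, this is a minimum, with value $h(2^{1/5})=2^{3/5}+3\cdot2^{-2/5}=5\cdot2^{-2/5}\approx3.79$.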
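As a quick numerical sanity check of the first method on this example, the sketch below substitutes $y=1/x$ and evaluates the resulting one-variable function at the candidate $x=2^{1/5}$ (the solution of $3x^2-6/x^3=0$). The names `h`, `h_prime` and `x_star` are illustrative choices and do not come from the notes.

```python
def h(x):
    """f(x, y) = x^3 + 3y^2 restricted to the curve xy = 1, i.e. with y = 1/x."""
    y = 1.0 / x
    return x**3 + 3.0 * y**2


def h_prime(x):
    """Derivative of h, computed by hand: 3x^2 - 6/x^3."""
    return 3.0 * x**2 - 6.0 / x**3


# Candidate minimiser: h'(x) = 0 is equivalent to x^5 = 2.
x_star = 2.0 ** (1.0 / 5.0)

if __name__ == "__main__":
    print("x* =", x_star)
    print("h'(x*) =", h_prime(x_star))
    print("h(x*) =", h(x_star))
```

Comparing `h(x_star)` with values of `h` at nearby points is a cheap way to convince yourself that the stationary point is a minimum rather than a maximum.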

Second method. Instead of actually solving g(x,y)=0 to give y=y(x), just imagine that it has been done (it’s at least possible in principle!). This means that  g[x,y(x)]=0  for all x. Differentiating by the chain rule (with “t” equal to x), we deduce that

$g_x+g_y\,y'(x)=0$  for all $x$. (1)

As we saw above, we are looking for stationary points of $f[x,y(x)]$. Again by the chain rule, such points satisfy

$f_x+f_y\,y'(x)=0$. (2)

To eliminate $y'(x)$, multiply (1) by $f_y$ and (2) by $g_y$ and subtract. We obtain $f_yg_x-g_yf_x=0$, or:

$f_xg_y=f_yg_x$. (3)

We solve (3) together with g(x,y)=0 to find the points. We don’t need to know y(x) explicitly!
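To illustrate that condition (3) can be used without ever solving for $y(x)$, the sketch below evaluates $f_xg_y-f_yg_x$ and the constraint for the earlier example $f=x^3+3y^2$, $g=xy-1$. The partial derivatives are computed by hand, and the function names are hypothetical.

```python
def condition_3(x, y):
    """Residual f_x*g_y - f_y*g_x for f = x^3 + 3y^2 and g = xy - 1."""
    f_x, f_y = 3.0 * x**2, 6.0 * y
    g_x, g_y = y, x
    return f_x * g_y - f_y * g_x


def constraint(x, y):
    """The constraint g(x, y) = xy - 1."""
    return x * y - 1.0


# Solving 3x^3 = 6y^2 together with xy = 1 by hand gives x^5 = 2.
x0 = 2.0 ** 0.2
y0 = 1.0 / x0

if __name__ == "__main__":
    print("condition (3) residual:", condition_3(x0, y0))
    print("constraint residual:", constraint(x0, y0))
```

Both residuals should be zero up to rounding at $(2^{1/5},2^{-1/5})$, whereas a point such as $(1,1)$ satisfies the constraint but not (3).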

Aside on geometry

Equation (3) has a simple geometrical interpretation. It says that $(f_x,f_y)=\lambda(g_x,g_y)$ for some $\lambda$ (assuming $(g_x,g_y)\neq(0,0)$). In other words, $\nabla f$ and $\nabla g$ are parallel. Hence at the points in question, the curves $g(x,y)=0$ and $f(x,y)=c$ (for the appropriate $c$) touch each other. When you think about it, this is what you would expect:

[Figure: the constraint curve $g(x,y)=0$ drawn together with level curves of $f$; the surviving labels are $f(x,y)=c$ and $f(x,y)=c''$.]

Changing the parameter $c$ moves the curve $f(x,y)=c$ around. In the diagram above, we might start with one value of the parameter and see how far we can increase or decrease it while the curves $f(x,y)=c$ and $g(x,y)=0$ still intersect. The point is that if these two curves intersect, then there exist $x,y$ satisfying $g(x,y)=0$ and $f(x,y)=c$ (that is, $c$ is a value of $f(x,y)$ which occurs on the curve $g(x,y)=0$). Suppose we keep on increasing or decreasing the parameter until we reach a value $c$ such that, if we carry on moving in the same direction, the curves $f(x,y)=c$ and $g(x,y)=0$ no longer have any points of intersection. This is demonstrated by the curve $f(x,y)=c$ in the diagram above. Then we have pushed (either increased or decreased) $c$ as far as it will go; locally, $c$ is an extreme (either maximum or minimum) value of $f(x,y)$ on the curve $g(x,y)=0$. This is what it means for the curves $f(x,y)=c$ and $g(x,y)=0$ to have a common tangent.

The auxiliary function

Let’s look a little bit more at the formulation $\nabla f=\lambda\nabla g$. Consider the function:

$\Lambda(x,y,\lambda)=f(x,y)-\lambda g(x,y)$

Think of $\lambda$ as a variable, and let’s try to find maxima and minima of $\Lambda(x,y,\lambda)$ as $x$, $y$ and $\lambda$ vary. At such points all partial derivatives are zero, hence $f_x-\lambda g_x=f_y-\lambda g_y=0$ and $g(x,y)=0$. The final equation is just the constraint that we want to impose on $x$ and $y$ from the outset; the first two equations together say $\nabla f=\lambda\nabla g$, which is exactly what we want. This indicates a useful method for finding the (constrained) maxima and minima of $f(x,y)$: introduce the Lagrange multiplier $\lambda$ and look for stationary points of the auxiliary function $\Lambda(x,y,\lambda)$. In a moment we will see some examples that should make things a bit clearer; however, first we will see that a very similar method allows us to find constrained maxima and minima for functions of three or more variables.


Note that this method identifies stationary points, but gives no test to determine whether they are maxima or minima. Just as for a function of one variable, we can find stationary points which are neither maxima nor minima. To see whether a point is a local maximum or a local minimum, you have to use the special features of any particular problem.
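A minimal sketch of the auxiliary-function idea, assuming the illustrative choice $f=x^3+3y^2$, $g=xy-1$: it approximates the three partial derivatives of $\Lambda=f-\lambda g$ by central differences and evaluates them at a hand-computed candidate. The helper `numerical_gradient`, the step size and the candidate (with $\lambda=3x^3$, obtained from $\Lambda_x=0$ and $y=1/x$) are assumptions made for this example, not part of the notes.

```python
def aux(x, y, lam):
    """Auxiliary function Lambda = f - lam*g with f = x^3 + 3y^2, g = xy - 1."""
    return x**3 + 3.0 * y**2 - lam * (x * y - 1.0)


def numerical_gradient(func, point, step=1e-6):
    """Central-difference approximation to every partial derivative of func."""
    grad = []
    for i in range(len(point)):
        up = list(point)
        down = list(point)
        up[i] += step
        down[i] -= step
        grad.append((func(*up) - func(*down)) / (2.0 * step))
    return grad


# Candidate stationary point of Lambda: x^5 = 2, y = 1/x, lam = 3x^3.
x0 = 2.0 ** 0.2
candidate = (x0, 1.0 / x0, 3.0 * x0**3)

if __name__ == "__main__":
    print(numerical_gradient(aux, candidate))
```

All three entries should be close to zero. As the note above says, this only confirms a stationary point of $\Lambda$; it does not decide whether the point is a constrained maximum or minimum.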


Three variables. We want to find extreme values of $f(x,y,z)$ subject to the condition $g(x,y,z)=0$.

In principle, the equation g(x,y,z)=0 can be solved to give z=z(x,y). This means that

g[x,y,z(x,y)]=0

for all x,y. Taking partial derivatives with respect to x and y by the chain rule, we obtain

$g_x+g_zz_x=0,\quad g_y+g_zz_y=0.$ (4)

We are looking for stationary points of $f[x,y,z(x,y)]$ (note that this is a function of just $x$ and $y$). At such points, the partial derivatives with respect to $x$ and $y$ will be 0, in other words:

$f_x+f_zz_x=0,\quad f_y+f_zz_y=0.$ (5)

From the $x$-equations in (4) and (5), eliminating the second term as before, we obtain $f_xg_z=f_zg_x$. From the $y$-equations, we obtain $f_yg_z=f_zg_y$. Taken together, these two identities say

$\frac{f_x}{g_x}=\frac{f_y}{g_y}=\frac{f_z}{g_z},$

and we denote this common value by $\lambda$. We can rewrite this as

$f_x=\lambda g_x,\quad f_y=\lambda g_y,\quad f_z=\lambda g_z,$ (6)

or more concisely as $\nabla f=\lambda\nabla g$.

Just as for functions of two variables, we can find the solutions by looking for the stationary points of the auxiliary function Λ(x,y,z,λ)=f(x,y,z)-λg(x,y,z).

Geometric aside. For functions of two variables, we interpreted the statement $\nabla f=\lambda\nabla g$ to mean that the curves $f(x,y)=c$ and $g(x,y)=0$ just touched each other (they have parallel normal vectors) at the stationary point. In three variables we have a similar statement: since $\nabla f$ and $\nabla g$ are parallel, the surfaces $g(x,y,z)=0$ and $f(x,y,z)=c$ have the same tangent plane. The value of $\lambda$ itself may or may not be of interest.

Let’s now apply this method of Lagrange multipliers to the example given in the beginning of this section.

Example 3.12

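Find the least value of $x^2+y^2$ on the curve with equation $x^2+8xy+7y^2-225=0$.

The auxiliary function is $\Lambda(x,y,\lambda)=x^2+y^2-\lambda(x^2+8xy+7y^2-225)$. Now find the partial derivatives:

$\Lambda_x=2x-\lambda(2x+8y),\qquad \Lambda_y=2y-\lambda(8x+14y),$

and $\Lambda_\lambda=-(x^2+8xy+7y^2-225)$. So if $(x,y)$ is a stationary point, then

$(1-\lambda)x=4\lambda y,\qquad (1-7\lambda)y=4\lambda x.$

Multiplying the second equation by $(1-\lambda)$ and using the first gives $(1-\lambda)(1-7\lambda)y=16\lambda^2y$, that is, $y(9\lambda^2+8\lambda-1)=0$. Thus $y(9\lambda-1)(\lambda+1)=0$, and so we have 3 cases: (i) $y=0$, (ii) $\lambda=1/9$ and (iii) $\lambda=-1$.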

Case (i): y=0.

Then, since $\Lambda_\lambda=0$, we have $x^2=225$, and hence $x=\pm15$. However, plugging $(15,0)$ or $(-15,0)$ into $\Lambda_x=0$ gives $\lambda=1$, and into $\Lambda_y=0$ gives $\lambda=0$, a contradiction.


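Case (ii): $\lambda=1/9$.

Then $(1-\lambda)x=4\lambda y$ becomes $\frac{8}{9}x=\frac{4}{9}y$.

So $y=2x$. Now we can determine possible values of $x$ and $y$ using the final equation $\Lambda_\lambda=0$, that is, $g(x,y)=x^2+8xy+7y^2-225=0$. We obtain

$x^2+16x^2+28x^2-225=45x^2-225=0,$

and hence $x=\pm\sqrt5$. Thus we obtain two stationary points for $\lambda=1/9$: $(\sqrt5,2\sqrt5)$ and $(-\sqrt5,-2\sqrt5)$. For both points, $x^2+y^2=25$.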

Case (iii): $\lambda=-1$.

Then $(1-\lambda)x=4\lambda y$ becomes $2x=-4y$.

So $x=-2y$. Now $g(x,y)=4y^2-16y^2+7y^2-225=-5y^2-225=0,$

hence $-5y^2=225$, which has no solution.

We have therefore found all of the stationary points of $\Lambda(x,y,\lambda)$: the two points $\pm(\sqrt5,2\sqrt5)$. We don’t yet know whether they are maxima or minima. An ad hoc way to decide is as follows: the curve $x^2+8xy+7y^2-225=0$ passes through the point $(15,0)$, for which $x^2+y^2=225$. Thus 25 must be the minimum (not maximum) value of $x^2+y^2$. From a strict mathematical standpoint, this argument isn’t really good enough! As mentioned above, not all stationary points are maxima or minima: so perhaps it’s possible that there are some points on the curve $x^2+8xy+7y^2=225$ for which $x^2+y^2>25$, and other points for which $x^2+y^2<25$? However, for the purposes of this course, such ad hoc arguments will be considered acceptable. (We don’t want you to get bogged down in details of the nature of the stationary points.)

For those who are interested, there is (in this case) an easy trick to be really sure that 25 is a minimum value: suppose there exist $x,y$ such that $x^2+8xy+7y^2=225$ and $x^2+y^2=d<25$. Then $9(x^2+y^2)-(x^2+8xy+7y^2)=9d-225<0$. But $9(x^2+y^2)-(x^2+8xy+7y^2)=8x^2-8xy+2y^2=2(2x-y)^2\geq0$, hence this is impossible.
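The conclusion of this example can also be checked by brute force. The sketch below, which is only a numerical illustration and not a proof, solves the curve equation $x^2+8xy+7y^2=225$ for $x$ at each sampled $y$ (giving $x=-4y\pm\sqrt{9y^2+225}$) and records the smallest value of $x^2+y^2$ found. The sampling range $|y|\leq20$ is an assumption; it is safe here because $x^2+y^2\geq y^2>25$ outside it.

```python
import math


def curve_points(y):
    """Both x-values satisfying x^2 + 8xy + 7y^2 = 225 for a given y."""
    root = math.sqrt(9.0 * y * y + 225.0)
    return (-4.0 * y + root, -4.0 * y - root)


def min_squared_distance(y_max=20.0, samples=40001):
    """Smallest x^2 + y^2 over sampled points of the curve with |y| <= y_max."""
    best = float("inf")
    for i in range(samples):
        y = -y_max + 2.0 * y_max * i / (samples - 1)
        for x in curve_points(y):
            best = min(best, x * x + y * y)
    return best


if __name__ == "__main__":
    print("sampled minimum of x^2 + y^2:", min_squared_distance())
```

The sampled minimum should sit just above 25, in agreement with the stationary points $\pm(\sqrt5,2\sqrt5)$ found with the Lagrange multiplier.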

Example 3.13

Consider an open-top box with side lengths $x,y,z$ and volume $4a^3$. Find the minimum possible surface area of the box.

First of all, let’s write the equations down:

$xyz-4a^3=0,\qquad A(x,y,z)=2xy+2xz+yz$

There are three ways to find the minimum value of A.

First method:

Since $xyz=4a^3$, we can replace $z$ by $4a^3/(xy)$, hence $A(x,y,z)=2xy+8a^3/y+4a^3/x$. Now look for the stationary points of $B(x,y)=2xy+8a^3/y+4a^3/x$; we have

$B_x=2y-\frac{4a^3}{x^2}=0,\qquad B_y=2x-\frac{8a^3}{y^2}=0,$

hence there is one stationary point: $(a,2a)$. We get a value of $12a^2$ for $B(a,2a)$. To see that this is a minimum value, one could check the second derivatives of $B(x,y)$.
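The following sketch checks the first method numerically. It assumes the hand-computed partial derivatives $B_x=2y-4a^3/x^2$, $B_y=2x-8a^3/y^2$, $B_{xx}=8a^3/x^3$, $B_{yy}=16a^3/y^3$ and $B_{xy}=2$; the function names and the default value of `a` are illustrative.

```python
def area(x, y, a=1.0):
    """B(x, y) = 2xy + 8a^3/y + 4a^3/x, the surface area after eliminating z."""
    return 2.0 * x * y + 8.0 * a**3 / y + 4.0 * a**3 / x


def area_gradient(x, y, a=1.0):
    """First partial derivatives (B_x, B_y), computed by hand."""
    return (2.0 * y - 4.0 * a**3 / x**2, 2.0 * x - 8.0 * a**3 / y**2)


def hessian_determinant(x, y, a=1.0):
    """B_xx * B_yy - B_xy^2, with B_xx = 8a^3/x^3, B_yy = 16a^3/y^3, B_xy = 2."""
    return (8.0 * a**3 / x**3) * (16.0 * a**3 / y**3) - 4.0


if __name__ == "__main__":
    a = 1.5
    print(area_gradient(a, 2.0 * a, a))
    print(area(a, 2.0 * a, a), 12.0 * a**2)
    print(hessian_determinant(a, 2.0 * a, a))
```

At $(a,2a)$ the gradient vanishes, and a positive determinant together with $B_{xx}>0$ is the usual second-derivative criterion for a local minimum.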

Second method:

The second method is based on the following useful result:

Proposition: Let $a_1,a_2,\ldots,a_n$ be positive numbers. Let

$A=\frac1n(a_1+\cdots+a_n)$ (the average, or “arithmetic mean”),
$G=(a_1a_2\cdots a_n)^{1/n}$ (the “geometric mean”).

Then $G\leq A$. Equivalently, $a_1a_2\cdots a_n\leq A^n$.

Proof

Index the numbers so that $a_1\geq a_2\geq\cdots\geq a_n$. Unless they are all equal, we have $a_1>A>a_n$. Form a new set of numbers as follows: replace $a_1$ and $a_n$ by $A$ and $a_1+a_n-A$. The sum (hence the arithmetic mean) is unchanged, since $A+(a_1+a_n-A)=a_1+a_n$. However, the product is increased, since

$A(a_1+a_n-A)-a_1a_n=(a_1-A)(A-a_n)>0.$

Order the new set of numbers $b_1\geq b_2\geq\cdots\geq b_n$. At least one of them is $A$. Unless they are all equal, repeat the process. The product is increased again, and at least two of the numbers are now $A$. Continue: after at most $n-1$ repetitions, the numbers are all equal to $A$, so their product (which is more than the original product $a_1a_2\cdots a_n$) is $A^n$. So we have shown that $a_1a_2\cdots a_n\leq A^n$.
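The replacement process in the proof can be run directly. The sketch below is one possible rendering, using exact rational arithmetic so that the test "all numbers equal $A$" is reliable; the function name `smooth_to_mean` and the choice to record the product after each step are assumptions made for illustration.

```python
from fractions import Fraction
from math import prod


def smooth_to_mean(values):
    """Repeat the proof's step until every number equals the arithmetic mean.

    Each step replaces the largest number by A and the smallest by
    (largest + smallest - A). Returns A and the list of products, starting
    with the original product and adding one entry per step.
    """
    nums = [Fraction(v) for v in values]
    mean = sum(nums) / len(nums)
    products = [prod(nums)]
    while any(v != mean for v in nums):
        nums.sort(reverse=True)
        largest, smallest = nums[0], nums[-1]
        nums[0] = mean                        # keeps the sum unchanged ...
        nums[-1] = largest + smallest - mean  # ... while increasing the product
        products.append(prod(nums))
    return mean, products


if __name__ == "__main__":
    mean, products = smooth_to_mean([1, 2, 3, 10])
    print(mean, products)
```

For the input `[1, 2, 3, 10]` the mean is 4 and the recorded products increase strictly until they reach $4^4$, after $n-1=3$ steps.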

Consider $2xy$, $2xz$, $yz$. These are positive numbers, hence their arithmetic mean is at least their geometric mean:

$\frac{2xy+2xz+yz}{3}\geq(4x^2y^2z^2)^{1/3}=(64a^6)^{1/3}=4a^2$

Thus $2xy+2xz+yz\geq12a^2$, and equality holds when $2xy=2xz=yz$ (i.e. for $(x,y,z)=(a,2a,2a)$). This is by far the easiest of the three methods! It also has the advantage that we know instantly that $(a,2a,2a)$ is a minimum. However, this method can only be used in certain cases. Keep your eyes peeled for cases where it does work! Similarly, there are some cases where the Cauchy-Schwarz inequality quickly gives us a minimum (constrained) value. The method of Lagrange multipliers will also work, but will take more time and effort.

Third method:

We can use Lagrange multipliers to find the minimum value of $2xy+2xz+yz$. In this case we have the auxiliary function $\Lambda(x,y,z,\lambda)=2xy+2xz+yz-\lambda(xyz-4a^3)$. Thus we would need to find the points where the four partial derivatives are zero. These partial derivatives are: $\Lambda_x=2y+2z-\lambda yz$, $\Lambda_y=2x+z-\lambda xz$, $\Lambda_z=2x+y-\lambda xy$ and $\Lambda_\lambda=-(xyz-4a^3)$. Without going any further, you can probably already see that this will be more difficult than the first method, and much more arduous than the second! Lagrange multipliers are very important for cases where substitution isn’t possible and neither Cauchy-Schwarz nor the theorem on arithmetic and geometric means can be used.
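Although the notes do not carry the third method through, the answer from the other two methods can be checked against it. The sketch below evaluates the four partial derivatives of $\Lambda=2xy+2xz+yz-\lambda(xyz-4a^3)$, namely $2y+2z-\lambda yz$, $2x+z-\lambda xz$, $2x+y-\lambda xy$ and $-(xyz-4a^3)$, at $(a,2a,2a)$. The multiplier value $\lambda=2/a$ is an assumption obtained by substituting that point into $\Lambda_x=0$; it is not stated in the notes.

```python
def aux_gradient(x, y, z, lam, a):
    """Partial derivatives of Lambda = 2xy + 2xz + yz - lam*(xyz - 4a^3)."""
    return (
        2.0 * y + 2.0 * z - lam * y * z,  # d/dx
        2.0 * x + z - lam * x * z,        # d/dy
        2.0 * x + y - lam * x * y,        # d/dz
        -(x * y * z - 4.0 * a**3),        # d/dlam (the constraint)
    )


def box_candidate(a):
    """Candidate stationary point (x, y, z, lam) = (a, 2a, 2a, 2/a)."""
    return (a, 2.0 * a, 2.0 * a, 2.0 / a)


if __name__ == "__main__":
    a = 2.0
    print(aux_gradient(*box_candidate(a), a))
```

All four components vanish at the candidate, so $(a,2a,2a)$ is indeed a stationary point of the auxiliary function.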