Problem: Find the nearest point to on the curve . In other words, find the point on the curve where is smallest.
This is one example of problems of the following type:
Find maxima and minima of a function $f(x,y)$ on the curve defined by $g(x,y) = 0$.
Two possible methods are as follows.
First method. It may be easy to solve $g(x,y) = 0$ to express $y$ as a function of $x$, say $y = h(x)$. For example, if , then . In this case, we can just substitute and look for extrema of $f(x, h(x))$, a function of $x$ only. However, in most cases, this isn’t easy!
Example of the first method. Find the minimum of on the curve , .
Answer:
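Second method. Instead of actually solving $g(x,y) = 0$ to give $y = h(x)$, just imagine that it has been done (it’s at least possible in principle!). This means that $g(x, h(x)) = 0$ for all $x$. Differentiating by the chain rule (with “$y$” equal to $h(x)$), we deduce that
$$\frac{\partial g}{\partial x} + \frac{\partial g}{\partial y}\,h'(x) = 0. \tag{1}$$
As we saw above, we are looking for stationary points of $f(x, h(x))$. Again by the chain rule, such points satisfy
$$\frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\,h'(x) = 0. \tag{2}$$
To eliminate $h'(x)$, multiply (1) by $\partial f/\partial y$ and (2) by $\partial g/\partial y$ and subtract. We obtain $\frac{\partial f}{\partial y}\frac{\partial g}{\partial x} - \frac{\partial f}{\partial x}\frac{\partial g}{\partial y} = 0$, or:
$$\frac{\partial f}{\partial x}\,\frac{\partial g}{\partial y} = \frac{\partial f}{\partial y}\,\frac{\partial g}{\partial x}. \tag{3}$$
We solve (3) together with $g(x,y) = 0$ to find the points. We don’t need to know $h$ explicitly!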
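As a minimal sketch of the second method, take the hypothetical example $f(x,y) = x + y$ on the unit circle $g(x,y) = x^2 + y^2 - 1 = 0$ (this example and the names `condition_3`, `candidates` are illustrative assumptions, not part of the notes). The condition $f_x g_y = f_y g_x$ reduces to $2y = 2x$, which together with the constraint gives the two points $\pm(1/\sqrt{2}, 1/\sqrt{2})$. The script checks these by hand-derived values against a brute-force walk around the circle.

```python
import math

# Hypothetical example (not from the notes): f(x, y) = x + y on the unit circle.
def f(x, y):
    return x + y

def g(x, y):
    return x * x + y * y - 1.0

def condition_3(x, y):
    """f_x * g_y - f_y * g_x for this particular f and g (zero at the candidates)."""
    f_x, f_y = 1.0, 1.0
    g_x, g_y = 2.0 * x, 2.0 * y
    return f_x * g_y - f_y * g_x

# Candidates obtained by hand: y = x together with x^2 + y^2 = 1.
r = 1.0 / math.sqrt(2.0)
candidates = [(r, r), (-r, -r)]

# Brute-force check: walk around the circle and record the extreme values of f.
samples = [(math.cos(t), math.sin(t))
           for t in (2.0 * math.pi * k / 100000 for k in range(100000))]
values = [f(x, y) for x, y in samples]
brute_max, brute_min = max(values), min(values)

for x, y in candidates:
    print(f"point ({x:.4f}, {y:.4f}): g = {g(x, y):.1e}, "
          f"condition = {condition_3(x, y):.1e}, f = {f(x, y):.4f}")
print(f"brute-force max = {brute_max:.4f}, min = {brute_min:.4f}")
```

Note that the function $h$ never appears in the code: only the partial derivatives of $f$ and $g$ and the constraint itself are used.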
Aside on geometry
Equation (3) has a simple geometrical interpretation. It says that $\nabla f = \lambda \nabla g$ for some $\lambda$ (assuming $\nabla g \neq 0$). In other words, $\nabla f$ and $\nabla g$ are parallel. Hence at the points in question, the curves $g(x,y) = 0$ and $f(x,y) = c$ (for the appropriate $c$) touch each other. When you think about it, this is what you would expect:
Changing the parameter $c$ moves the curve $f(x,y) = c$ around. In the diagram above, we might start with some value for $c$, and see how far we can increase or decrease $c$. We want to do so in such a way that the curves $f(x,y) = c$ and $g(x,y) = 0$ still intersect; the point being that if these two curves intersect, it means that there exist $x, y$ satisfying $f(x,y) = c$ and $g(x,y) = 0$ (that is, $c$ is a value of $f$ which occurs on the curve $g(x,y) = 0$).
Suppose we keep on increasing or decreasing $c$ until we reach a value $c_0$ such that if we carry on moving in the same direction then the curves $f(x,y) = c$ and $g(x,y) = 0$ no longer have any points of intersection. This is demonstrated by the curve $f(x,y) = c_0$ in the diagram above. Then we have pushed $c$ (either increased or decreased) as far as it will go; locally, $c_0$ is an extreme (either maximum or minimum) value of $f$ on the curve $g(x,y) = 0$. This is what it means for the curves $f(x,y) = c_0$ and $g(x,y) = 0$ to have a common tangent.
The auxiliary function
Let’s look a little bit more at the formulation $\nabla f = \lambda \nabla g$. Consider the function:
$$F(x, y, \lambda) = f(x,y) - \lambda\, g(x,y).$$
Think of $\lambda$ as a variable, and let’s try to find maxima and minima of $F$ as $x$, $y$ and $\lambda$ vary. At such points all partial derivatives of $F$ are zero, hence: $\frac{\partial f}{\partial x} - \lambda \frac{\partial g}{\partial x} = 0$, $\frac{\partial f}{\partial y} - \lambda \frac{\partial g}{\partial y} = 0$ and $g(x,y) = 0$. The final equation is just the constraint that we want to impose on $x$ and $y$ from the outset; the first two equations together imply $\nabla f = \lambda \nabla g$, which is exactly what we want. This indicates a useful method for finding the (constrained) maxima and minima of $f$, by introducing the Lagrange multiplier $\lambda$ and looking for stationary points of the auxiliary function $F$. In a moment we will see some examples that should make things a bit clearer; however, first we will see that a very similar method allows us to find constrained maxima and minima for functions of three or more variables.
Note that this method identifies stationary points, but gives no test to determine whether they are maxima or minima. Just as for a function of one variable, we can find stationary points which are neither maxima nor minima. To see whether a point is a local maximum or a local minimum, you have to use the special features of any particular problem.
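The following sketch illustrates the auxiliary function on a hypothetical example that is not taken from the notes: $f(x,y) = x + y$ with constraint $g(x,y) = x^2 + y^2 - 1 = 0$, so that $F(x,y,\lambda) = x + y - \lambda(x^2 + y^2 - 1)$. Solving $1 = 2\lambda x$, $1 = 2\lambda y$, $x^2 + y^2 = 1$ by hand gives $x = y = \lambda = \pm 1/\sqrt{2}$. The code estimates the three partial derivatives of $F$ by central differences (the helper `grad_F` and the step size are assumptions made for illustration) and shows that they vanish at these points but not at another point of the circle.

```python
import math

# Hypothetical example (not from the notes): f = x + y, g = x^2 + y^2 - 1.
def F(x, y, lam):
    """Auxiliary function F = f - lam * g."""
    return (x + y) - lam * (x * x + y * y - 1.0)

def grad_F(x, y, lam, h=1e-6):
    """Central-difference estimate of (dF/dx, dF/dy, dF/dlam)."""
    return (
        (F(x + h, y, lam) - F(x - h, y, lam)) / (2 * h),
        (F(x, y + h, lam) - F(x, y - h, lam)) / (2 * h),
        (F(x, y, lam + h) - F(x, y, lam - h)) / (2 * h),
    )

r = 1.0 / math.sqrt(2.0)
# Solved by hand: 1 = 2*lam*x, 1 = 2*lam*y, x^2 + y^2 = 1, so lam = 1/(2x).
stationary = [(r, r, r), (-r, -r, -r)]  # triples (x, y, lam)

for point in stationary:
    print("stationary point", point, "-> grad F ~", grad_F(*point))

# A point that lies on the circle but is not stationary: dF/dy does not vanish.
other = (1.0, 0.0, 0.5)
print("other point", other, "-> grad F ~", grad_F(*other))
```

The code only confirms that the points are stationary; as the note above says, deciding which is the maximum and which the minimum needs a separate argument (here, comparing the two values of $f$).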
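Three variables. To find extreme values of $f(x,y,z)$ subject to the condition $g(x,y,z) = 0$.
In principle, the equation $g(x,y,z) = 0$ can be solved to give $z = h(x,y)$. This means that
$$g(x, y, h(x,y)) = 0$$
for all $x, y$. Taking partial derivatives with respect to $x$ and $y$ by the chain rule, we obtain
$$\frac{\partial g}{\partial x} + \frac{\partial g}{\partial z}\frac{\partial h}{\partial x} = 0, \qquad \frac{\partial g}{\partial y} + \frac{\partial g}{\partial z}\frac{\partial h}{\partial y} = 0. \tag{4}$$
We are looking for stationary points of $f(x, y, h(x,y))$ (note that this is a function of just $x$ and $y$). At such points, the partial derivatives with respect to $x$ and $y$ will be 0, in other words:
$$\frac{\partial f}{\partial x} + \frac{\partial f}{\partial z}\frac{\partial h}{\partial x} = 0, \qquad \frac{\partial f}{\partial y} + \frac{\partial f}{\partial z}\frac{\partial h}{\partial y} = 0. \tag{5}$$
From the $x$-equations in (4) and (5), eliminating the second term as before, we obtain $\frac{\partial f}{\partial x}\frac{\partial g}{\partial z} = \frac{\partial f}{\partial z}\frac{\partial g}{\partial x}$. From the $y$-equations, we obtain $\frac{\partial f}{\partial y}\frac{\partial g}{\partial z} = \frac{\partial f}{\partial z}\frac{\partial g}{\partial y}$. Taken together, these two identities say
$$\frac{\partial f/\partial x}{\partial g/\partial x} = \frac{\partial f/\partial y}{\partial g/\partial y} = \frac{\partial f/\partial z}{\partial g/\partial z}$$
and we denote this common value by $\lambda$. We can rewrite this as
$$\frac{\partial f}{\partial x} = \lambda\frac{\partial g}{\partial x}, \qquad \frac{\partial f}{\partial y} = \lambda\frac{\partial g}{\partial y}, \qquad \frac{\partial f}{\partial z} = \lambda\frac{\partial g}{\partial z} \tag{6}$$
or more concisely as $\nabla f = \lambda \nabla g$.
Just as for functions of two variables, we can find the solutions by looking for the stationary points of the auxiliary function $F(x,y,z,\lambda) = f(x,y,z) - \lambda\, g(x,y,z)$.
Geometric aside. For functions of two variables, we interpreted the statement $\nabla f = \lambda \nabla g$ to mean that the curves $f(x,y) = c$ and $g(x,y) = 0$ just touched each other (they have parallel normal vectors) at the stationary point. In three variables we have a similar statement: since $\nabla f$ and $\nabla g$ are parallel, this means that the surfaces $f(x,y,z) = c$ and $g(x,y,z) = 0$ have the same tangent plane. The value of $\lambda$ itself may or may not be of interest.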
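To make the three-variable condition concrete, here is a small sketch on a hypothetical example that does not come from the notes: $f(x,y,z) = x + y + z$ on the unit sphere $g(x,y,z) = x^2 + y^2 + z^2 - 1 = 0$. The gradients are $\nabla f = (1,1,1)$ and $\nabla g = (2x, 2y, 2z)$, so they are parallel exactly when $x = y = z$, giving the candidates $\pm(1/\sqrt{3})(1,1,1)$. The code tests parallelism through the cross product and then samples random points on the sphere as a sanity check; all helper names are illustrative.

```python
import math
import random

# Hypothetical example (not from the notes): f = x + y + z on the unit sphere.
def grad_f(x, y, z):
    return (1.0, 1.0, 1.0)

def grad_g(x, y, z):
    return (2.0 * x, 2.0 * y, 2.0 * z)

def cross(u, v):
    """Cross product; it is the zero vector exactly when u and v are parallel."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

s = 1.0 / math.sqrt(3.0)
candidates = [(s, s, s), (-s, -s, -s)]

for p in candidates:
    c = cross(grad_f(*p), grad_g(*p))
    lam = grad_f(*p)[0] / grad_g(*p)[0]  # common ratio in grad f = lam * grad g
    print(f"point {p}: cross = {c}, lambda = {lam:.4f}, f = {sum(p):.4f}")

# Sanity check: random points on the sphere never give a larger value of f.
random.seed(0)
best = -math.inf
for _ in range(20000):
    v = [random.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(sum(t * t for t in v))
    best = max(best, sum(v) / norm)
print(f"largest sampled f = {best:.4f}, candidate value = {math.sqrt(3.0):.4f}")
```

At the candidate points the level surface $x + y + z = \pm\sqrt{3}$ is a plane tangent to the sphere, which is the "same tangent plane" statement of the geometric aside.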
Let’s now apply this method of Lagrange multipliers to the example given in the beginning of this section.
Find the least value of on the curve with equation .
The auxiliary equation is . Now find the partial derivatives:
and . So if is a stationary point, then
Thus , and so we have 3 cases: (i) (ii) and (iii) .
Case (i): .
Then, since , we have , and hence . However, plugging or into gives , and into gives , a contradiction.
Case (ii): .
Then
So . Now we can determine possible values of and using the final equation , that is, . We obtain
and hence . Thus we obtain two stationary points for : and . For both points, .
Case (iii): .
Then
So . Now
hence , which has no solution.
We have therefore found all of the stationary points of : the two points . We don’t yet know whether they are maxima or minima. An ad hoc way to see whether they are maxima or minima is: the curve passes through the point , for which . Thus must be the minimum (not maximum) value of . From a strict mathematical standpoint, this argument isn’t really good enough! As mentioned above, not all stationary points are maxima or minima: so perhaps it’s possible that there are some points on the curve for which , and other points for which ? However, for the purposes of this course, such ad hoc arguments will be considered acceptable. (We don’t want you to get bogged down in details of the nature of the stationary points.)
For those who are interested, there is (in this case) an easy trick to be really sure that is a minimum value: suppose there exist such that and . Then . But , hence this is impossible.
Consider an open-top box with side lengths and volume . Find the minimum possible surface area of the box.
First of all, let’s write the equations down:
There are three ways to find the minimum value of .
First method:
Since , we can replace by , hence . Now look for the stationary points of ; we have
hence there is one stationary point: . We get a value of for . To see that this is a minimum value, one could check the double derivatives of .
Second method:
The second method is based on the following useful result:
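Proposition: Let $a_1, \dots, a_n$ be positive numbers. Let
$$A = \frac{a_1 + a_2 + \dots + a_n}{n}, \qquad G = (a_1 a_2 \cdots a_n)^{1/n}.$$
Then $G \le A$. Equivalently, $a_1 a_2 \cdots a_n \le A^n$.
Proof: Index the numbers so that $a_1 \le a_2 \le \dots \le a_n$. Unless they are all equal, we have $a_1 < A < a_n$. Form a new set of numbers as follows: replace $a_1$ and $a_n$ by $A$ and $a_1 + a_n - A$. The sum (hence the arithmetic mean) is unchanged, since $A + (a_1 + a_n - A) = a_1 + a_n$. However, the product is increased, since
$$A\,(a_1 + a_n - A) - a_1 a_n = (A - a_1)(a_n - A) > 0.$$
Order the new set of numbers $a_1' \le a_2' \le \dots \le a_n'$. At least one of them is $A$. Unless they are all equal, repeat the process. The product is increased again, and at least two of the numbers are now $A$. Continue: after at most $n-1$ steps, the numbers are all equal to $A$, so their product (which is at least the original product $a_1 a_2 \cdots a_n$) is $A^n$. So we have shown that $a_1 a_2 \cdots a_n \le A^n$.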
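The replacement step in the proof can be run directly. The sketch below is only an illustration (the function name `smooth` and the sample input are assumptions): it uses exact rational arithmetic, repeatedly replaces the smallest and largest numbers by the mean $A$ and by (smallest + largest $- A$), and records the product after each step.

```python
from fractions import Fraction
from math import prod

def smooth(numbers):
    """Apply the replacement step from the proof until all numbers are equal.

    Returns the arithmetic mean and the list of products: the original
    product first, then the product after each step.
    """
    a = sorted(Fraction(v) for v in numbers)
    mean = sum(a) / len(a)
    products = [prod(a)]
    while a[0] != a[-1]:
        smallest, largest = a[0], a[-1]
        # Replace the extremes by the mean and by (smallest + largest - mean);
        # the sum of the list is unchanged.
        a[0], a[-1] = mean, smallest + largest - mean
        a.sort()
        products.append(prod(a))
    return mean, products

mean, products = smooth([1, 2, 3, 6])
print("mean:", mean)
print("products:", [int(p) for p in products])
print("steps:", len(products) - 1)
```

For the input 1, 2, 3, 6 the mean is 3 and the products are 36, 72, 81, with $81 = 3^4$ reached after two steps, within the bound of $n - 1 = 3$.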
Consider , , . These are positive numbers, hence their arithmetic mean is greater than their geometric mean:
Thus , and equality holds when (i.e. for ). This is by far the easiest of the three methods! It also has the advantage that we know instantly that is a minimum. However, this method can only be used in certain cases. Keep your eyes peeled for cases where it does work! Similarly, there are some cases where the Cauchy-Schwarz inequality quickly gives us a minimum (constrained) value. The method of Lagrange multipliers will also work, but will take more time and effort.
Third method:
We can use Lagrange multipliers to find the minimum value of . In this case we have the auxiliary function . Thus we would need to find the points where the four partial derivatives are zero. These partial derivatives are: , , and . Without going any further, you can probably already see that this will be more difficult than the first method, and much more arduous than the second! Lagrange multipliers are very important for cases where substitution isn’t possible, and neither Cauchy-Schwarz, nor the theorem on arithmetic and geometric means can be used.
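For readers who want to see the stationarity conditions checked numerically, here is a sketch under stated assumptions: the base of the box is $x \times y$, the height is $z$, so the open-top surface area is $xy + 2xz + 2yz$, and the volume is fixed at the hypothetical value $V = 32$ (chosen only for illustration, not taken from the text above). With $F = xy + 2xz + 2yz - \lambda(xyz - V)$, the point $x = y = 4$, $z = 2$, $\lambda = 1$ makes all four partial derivatives vanish; the code verifies this and compares the resulting area with the arithmetic-geometric mean bound $3(4V^2)^{1/3}$ and with a coarse grid search.

```python
V = 32.0  # hypothetical volume, chosen only for illustration

def surface(x, y, z):
    """Open-top box: base x*y plus four sides (assumes z is the height)."""
    return x * y + 2.0 * x * z + 2.0 * y * z

def grad_F(x, y, z, lam):
    """Partial derivatives of F = surface - lam * (x*y*z - V), worked out by hand."""
    return (
        y + 2.0 * z - lam * y * z,        # dF/dx
        x + 2.0 * z - lam * x * z,        # dF/dy
        2.0 * x + 2.0 * y - lam * x * y,  # dF/dz
        -(x * y * z - V),                 # dF/dlam
    )

candidate = (4.0, 4.0, 2.0, 1.0)  # (x, y, z, lam), found by hand for V = 32
print("grad F at candidate:", grad_F(*candidate))
print("surface area at candidate:", surface(4.0, 4.0, 2.0))
print("AM-GM lower bound:", 3.0 * (4.0 * V * V) ** (1.0 / 3.0))

# Coarse grid search over the base; the height is forced by x*y*z = V.
best = min(
    (surface(i / 10, j / 10, V / ((i / 10) * (j / 10))), i / 10, j / 10)
    for i in range(1, 101)
    for j in range(1, 101)
)
print("grid search (area, x, y):", best)
```

The three approaches agree on this instance, which matches the remark above: the Lagrange route works, but it takes the most bookkeeping.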