Press, Teukolsky, Vetterling, Flannery, Numerical Recipes in C
[Figure caption; figure not reproduced in this extraction: The function is evaluated at the parabola’s minimum, 4, which replaces point 3. A new parabola (dotted line) is drawn through points 1, 4, 2. The minimum of this parabola is at 5, which is close to the minimum of the function.]

Note, however, that (10.2.1) is as happy jumping to a parabolic maximum as to a minimum. No minimization scheme that depends solely on (10.2.1) is likely to succeed in practice. The exacting task is to invent a scheme that relies on a sure-but-slow technique, like golden section search, when the function is not cooperative, but that switches over to (10.2.1) when the function allows.
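Equation (10.2.1), cited throughout this passage, appears on a page that is not part of this excerpt. For reference, it is the standard inverse parabolic interpolation formula: the abscissa x of the extremum of the parabola through the three points (a, f(a)), (b, f(b)), (c, f(c)) is

    x = b - \frac{1}{2}\,\frac{(b-a)^2\,[f(b)-f(c)] - (b-c)^2\,[f(b)-f(a)]}{(b-a)\,[f(b)-f(c)] - (b-c)\,[f(b)-f(a)]}

which fails only when the three points are collinear, so that the denominator vanishes.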
The task is nontrivial for several reasons, including these: (i) The housekeeping needed to avoid unnecessary function evaluations in switching between the two methods can be complicated. (ii) Careful attention must be given to the “endgame,” where the function is being evaluated very near to the roundoff limit of equation (10.1.2). (iii) The scheme for detecting a cooperative versus noncooperative function must be very robust.

Brent’s method [1] is up to the task in all particulars. At any particular stage, it is keeping track of six function points (not necessarily all distinct), a, b, u, v, w, and x, defined as follows: the minimum is bracketed between a and b; x is the point with the very least function value found so far (or the most recent one in case of a tie); w is the point with the second least function value; v is the previous value of w; u is the point at which the function was evaluated most recently.
Also appearing in the algorithm is the point xm, the midpoint between a and b; however, the function is not evaluated there.

You can read the code below to understand the method’s logical organization. Mention of a few general principles here may, however, be helpful: Parabolic interpolation is attempted, fitting through the points x, v, and w. To be acceptable, the parabolic step must (i) fall within the bounding interval (a, b), and (ii) imply a movement from the best current value x that is less than half the movement of the step before last. This second criterion ensures that the parabolic steps are actually converging to something, rather than, say, bouncing around in some nonconvergent limit cycle.
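In brent below, these two criteria appear as one compound test, shown here in isolation. The trial parabolic step is carried as a ratio p/q, with q arranged to be nonnegative, and etemp holds the step before last; the parabolic step is rejected (and a golden section step taken instead) when

    if (fabs(p) >= fabs(0.5*q*etemp)   /* would move at least half the step before last */
            || p <= q*(a-x)            /* trial point x + p/q would fall at or below a  */
            || p >= q*(b-x))           /* trial point x + p/q would fall at or above b  */

Multiplying the interval test a < x + p/q < b through by the nonnegative q gives the division-free comparisons shown.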
In the worst possible case, where the parabolic steps are acceptable but useless, the method will approximately alternate between parabolic steps and golden sections, converging in due course by virtue of the latter. The reason for comparing to the step before last seems essentially heuristic: Experience shows that it is better not to “punish” the algorithm for a single bad step if it can make it up on the next one.

Another principle exemplified in the code is never to evaluate the function less than a distance tol from a point already evaluated (or from a known bracketing point).
The reason is that, as we saw in equation (10.1.2), there is simply no information content in doing so: the function will differ from the value already evaluated only by an amount of order the roundoff error. Therefore in the code below you will find several tests and modifications of a potential new point, imposing this restriction. This restriction also interacts subtly with the test for “doneness,” which the method takes into account.

A typical ending configuration for Brent’s method is that a and b are 2 × x × tol apart, with x (the best abscissa) at the midpoint of a and b, and therefore fractionally accurate to ±tol.

Indulge us a final reminder that tol should generally be no smaller than the square root of your machine’s floating-point precision (about 3 × 10⁻⁴ in single precision, 1.5 × 10⁻⁸ in double).

#include <math.h>
#include "nrutil.h"
#define ITMAX 100
#define CGOLD 0.3819660
#define ZEPS 1.0e-10
/* Here ITMAX is the maximum allowed number of iterations; CGOLD is the golden
   section fraction (3-sqrt(5))/2; ZEPS is a small number that protects against
   trying to achieve fractional accuracy for a minimum that happens to be
   exactly zero. */
#define SHFT(a,b,c,d) (a)=(b);(b)=(c);(c)=(d);

float brent(float ax, float bx, float cx, float (*f)(float), float tol,
    float *xmin)
/* Given a function f, and given a bracketing triplet of abscissas ax, bx, cx
   (such that bx is between ax and cx, and f(bx) is less than both f(ax) and
   f(cx)), this routine isolates the minimum to a fractional precision of
   about tol using Brent's method.
   The abscissa of the minimum is returned as xmin, and the minimum function
   value is returned as brent, the returned function value. */
{
    int iter;
    float a,b,d,etemp,fu,fv,fw,fx,p,q,r,tol1,tol2,u,v,w,x,xm;
    float e=0.0;                       /* Distance moved on the step before last. */

    a=(ax < cx ? ax : cx);             /* a and b must be in ascending order, */
    b=(ax > cx ? ax : cx);             /* but input abscissas need not be. */
    x=w=v=bx;                          /* Initializations... */
    fw=fv=fx=(*f)(x);
    for (iter=1;iter<=ITMAX;iter++) {  /* Main program loop. */
        xm=0.5*(a+b);
        tol2=2.0*(tol1=tol*fabs(x)+ZEPS);
        if (fabs(x-xm) <= (tol2-0.5*(b-a))) {   /* Test for done here. */
            *xmin=x;
            return fx;
        }
        if (fabs(e) > tol1) {          /* Construct a trial parabolic fit. */
            r=(x-w)*(fx-fv);
            q=(x-v)*(fx-fw);
            p=(x-v)*q-(x-w)*r;
            q=2.0*(q-r);
            if (q > 0.0) p = -p;
            q=fabs(q);
            etemp=e;
            e=d;
            if (fabs(p) >= fabs(0.5*q*etemp) || p <= q*(a-x) || p >= q*(b-x))
                d=CGOLD*(e=(x >= xm ? a-x : b-x));
            /* The above conditions determine the acceptability of the parabolic
               fit. Here we take the golden section step into the larger of
               the two segments. */
            else {
                d=p/q;                 /* Take the parabolic step. */
                u=x+d;
                if (u-a < tol2 || b-u < tol2)
                    d=SIGN(tol1,xm-x);
            }
        } else {
            d=CGOLD*(e=(x >= xm ? a-x : b-x));
        }
        u=(fabs(d) >= tol1 ? x+d : x+SIGN(tol1,d));
        fu=(*f)(u);          /* This is the one function evaluation per iteration. */
        if (fu <= fx) {      /* Now decide what to do with our function evaluation. */
            if (u >= x) a=x; else b=x;
            SHFT(v,w,x,u)    /* Housekeeping follows: */
            SHFT(fv,fw,fx,fu)
        } else {
            if (u < x) a=u; else b=u;
            if (fu <= fw || w == x) {
                v=w;
                w=u;
                fv=fw;
                fw=fu;
            } else if (fu <= fv || v == x || v == w) {
                v=u;
                fv=fu;
            }
        }                    /* Done with housekeeping. */
    }                        /* Back for another iteration. */
    nrerror("Too many iterations in brent");
    *xmin=x;                 /* Never get here. */
    return fx;
}
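The following minimal driver is ours, not the book’s; it illustrates the calling convention, assuming that brent, the bracketing routine mnbrak of §10.1, and the nrutil support files are compiled and linked in. The test function is an arbitrary quadratic chosen for illustration.

#include <stdio.h>

float brent(float ax, float bx, float cx, float (*f)(float), float tol,
    float *xmin);
void mnbrak(float *ax, float *bx, float *cx, float *fa, float *fb, float *fc,
    float (*func)(float));

float func(float x)                    /* the function to be minimized */
{
    return (x-1.0)*(x-1.0)+2.0;        /* minimum value 2.0 at x = 1.0 */
}

int main(void)
{
    float ax=0.0,bx=0.5,cx,fa,fb,fc,xmin,fmin;

    mnbrak(&ax,&bx,&cx,&fa,&fb,&fc,func);    /* expand two guesses to a bracket */
    fmin=brent(ax,bx,cx,func,1.0e-4,&xmin);  /* tol near sqrt of float precision */
    printf("minimum %g at x = %g\n",fmin,xmin);
    return 0;
}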
CITED REFERENCES AND FURTHER READING:

Brent, R.P. 1973, Algorithms for Minimization without Derivatives (Englewood Cliffs, NJ: Prentice-Hall), Chapter 5. [1]

Forsythe, G.E., Malcolm, M.A., and Moler, C.B. 1977, Computer Methods for Mathematical Computations (Englewood Cliffs, NJ: Prentice-Hall), §8.2.

10.3 One-Dimensional Search with First Derivatives

Here we want to accomplish precisely the same goal as in the previous section, namely to isolate a functional minimum that is bracketed by the triplet of abscissas (a, b, c), but utilizing an additional capability to compute the function’s first derivative as well as its value.

In principle, we might simply search for a zero of the derivative, ignoring the function value information, using a root finder like rtflsp or zbrent (§§9.2–9.3). It doesn’t take long to reject that idea: How do we distinguish maxima from minima? Where do we go from initial conditions where the derivatives on one or both of the outer bracketing points indicate that “downhill” is in the direction out of the bracketed interval?

We don’t want to give up our strategy of maintaining a rigorous bracket on the minimum at all times.
The only way to keep such a bracket is to update it using function (not derivative) information, with the central point in the bracketing triplet always that with the lowest function value. Therefore the role of the derivatives can only be to help us choose new trial points within the bracket.

One school of thought is to “use everything you’ve got”: Compute a polynomial of relatively high order (cubic or above) that agrees with some number of previous function and derivative evaluations. For example, there is a unique cubic that agrees with function and derivative at two points, and one can jump to the interpolated minimum of that cubic (if there is a minimum within the bracket). Suggested by Davidon and others, formulas for this tactic are given in [1]; one standard form is sketched below.
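For illustration only, here is that cubic step in one standard form (our sketch and our names, not the book’s routine): given function values f1, f2 and derivatives g1, g2 at two abscissas x1, x2, the abscissa of the interpolating cubic’s minimum is

#include <math.h>

/* Abscissa of the minimum of the unique cubic matching f1,g1 at x1 and f2,g2
   at x2. The caller must first check that the discriminant d1*d1-g1*g2 is
   nonnegative, i.e., that the cubic has a minimum at all. */
float cubicmin(float x1, float f1, float g1, float x2, float f2, float g2)
{
    float d1,d2;

    d1=g1+g2-3.0*(f1-f2)/(x1-x2);
    d2=(x2 > x1 ? 1.0 : -1.0)*sqrt(d1*d1-g1*g2);
    return x2-(x2-x1)*(g2+d2-d1)/(g2-g1+2.0*d2);
}

(As a check, feeding this routine f and f′ of any cubic reproduces that cubic’s minimum exactly.)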
We like to be more conservative than this. Once superlinear convergence sets in, it hardly matters whether its order is moderately lower or higher. In practical problems that we have met, most function evaluations are spent in getting globally close enough to the minimum for superlinear convergence to commence. So we are more worried about all the funny “stiff” things that high-order polynomials can do (cf. Figure 3.0.1b), and about their sensitivities to roundoff error.

This leads us to use derivative information only as follows: The sign of the derivative at the central point of the bracketing triplet (a, b, c) indicates uniquely whether the next test point should be taken in the interval (a, b) or in the interval (b, c). The value of this derivative and of the derivative at the second-best-so-far point are extrapolated to zero by the secant method (inverse linear interpolation), which by itself is superlinear of order 1.618. (The golden mean again: see [1], p. 57.)
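Concretely, in our notation rather than the book’s: if the derivative is d1 at the best abscissa x and d2 at the second-best abscissa w, the secant extrapolation to a zero of the derivative proposes the trial point

    u = x - d1*(x-w)/(d1-d2);          /* secant step toward f'(u) = 0 */

(any real implementation must of course guard against d1 == d2).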
We impose the same sort of restrictions on this new trial point as in Brent’s method. If the trial point must be rejected, we bisect the interval under scrutiny.

Yes, we are fuddy-duddies when it comes to making flamboyant use of derivative information in one-dimensional minimization. But we have met too many functions whose computed “derivatives” don’t integrate up to the function value and don’t accurately point the way to the minimum, usually because of roundoff errors, sometimes because of truncation error in the method of derivative evaluation.

You will see that the following routine is closely modeled on brent in the previous section.

#include <math.h>
#include "nrutil.h"
#define ITMAX 100
#define ZEPS 1.0e-10
#define MOV3(a,b,c, d,e,f) (a)=(d);(b)=(e);(c)=(f);

float dbrent(float ax, float bx, float cx, float (*f)(float),
    float (*df)(float), float tol, float *xmin)
/* Given a function f and its derivative function df, and given a bracketing
   triplet of abscissas ax, bx, cx [such that bx is between ax and cx, and
   f(bx) is less than both f(ax) and f(cx)], this routine isolates the minimum
   to a fractional precision of about tol using a modification of Brent's
   method that uses derivatives. */