Quadratic convergence is of no particular advantage to a program which must slalom down the length of a valley floor that twists one way and another (and another, and another, . . . – there are N dimensions!). Along the long direction, a quadratically convergent method is trying to extrapolate to the minimum of a parabola which just isn't (yet) there; while the conjugacy of the N − 1 transverse directions keeps getting spoiled by the twists. Sooner or later, however, we do arrive at an approximately ellipsoidal minimum (cf. equation 10.5.1 when b, the gradient, is zero).
Then, depending on how much accuracy we require, a method with quadratic convergence can save us several times N² extra line minimizations, since quadratic convergence doubles the number of significant figures at each iteration.

Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5). Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copying of machine-readable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books, diskettes, or CDROMs visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to trade@cup.cam.ac.uk (outside North America).

Powell, in 1964, showed that, for a quadratic form like (10.5.1), k iterations of the above basic procedure produce a set of directions ui whose last k members are mutually conjugate. Therefore, N iterations of the basic procedure, amounting to N(N + 1) line minimizations in all, will exactly minimize a quadratic form. Brent [1] gives proofs of these statements in accessible form. Unfortunately, there is a problem with Powell's quadratically convergent algorithm.
The procedure of throwing away, at each stage, u1 in favor of PN − P0 tends to produce sets of directions that "fold up on each other" and become linearly dependent. Once this happens, then the procedure finds the minimum of the function f only over a subspace of the full N-dimensional case; in other words, it gives the wrong answer. Therefore, the algorithm must not be used in the form given above.

There are a number of ways to fix up the problem of linear dependence in Powell's algorithm, among them:

1. You can reinitialize the set of directions ui to the basis vectors ei after every N or N + 1 iterations of the basic procedure.
This produces a serviceable method, which we commend to you if quadratic convergence is important for your application (i.e., if your functions are close to quadratic forms and if you desire high accuracy).

2. Brent points out that the set of directions can equally well be reset to the columns of any orthogonal matrix.
Rather than throw away the information on conjugate directions already built up, he resets the direction set to calculated principal directions of the matrix A (which he gives a procedure for determining). The calculation is essentially a singular value decomposition algorithm (see §2.6). Brent has a number of other cute tricks up his sleeve, and his modification of Powell's method is probably the best presently known. Consult [1] for a detailed description and listing of the program. Unfortunately it is rather too elaborate for us to include here.

3. You can give up the property of quadratic convergence in favor of a more heuristic scheme (due to Powell) which tries to find a few good directions along narrow valleys instead of N necessarily conjugate directions.
This is the method that we now implement. (It is also the version of Powell's method given in Acton [2], from which parts of the following discussion are drawn.)

10.5 Direction Set (Powell's) Methods in Multidimensions

The basic idea of our now-modified Powell's method is still to take PN − P0 as a new direction; it is, after all, the average direction moved after trying all N possible directions. For a valley whose long direction is twisting slowly, this direction is likely to give us a good run along the new long direction. The change is to discard the old direction along which the function f made its largest decrease. This seems paradoxical, since that direction was the best of the previous iteration. However, it is also likely to be a major component of the new direction that we are adding, so dropping it gives us the best chance of avoiding a buildup of linear dependence.

There are a couple of exceptions to this basic idea. Sometimes it is better not to add a new direction at all. Define

    f0 ≡ f(P0)        fN ≡ f(PN)        fE ≡ f(2PN − P0)        (10.5.7)

Here fE is the function value at an "extrapolated" point somewhat further along the proposed new direction. Also define ∆f to be the magnitude of the largest decrease along one particular direction of the present basic procedure iteration. (∆f is a positive number.) Then:

1. If fE ≥ f0, then keep the old set of directions for the next basic procedure, because the average direction PN − P0 is all played out.

2. If 2(f0 − 2fN + fE)[(f0 − fN) − ∆f]² ≥ (f0 − fE)²∆f, then keep the old set of directions for the next basic procedure, because either (i) the decrease along the average direction was not primarily due to any single direction's decrease, or (ii) there is a substantial second derivative along the average direction and we seem to be near the bottom of its minimum.

The following routine implements Powell's method in the version just described. In the routine, xi is the matrix whose columns are the set of directions ni; otherwise the correspondence of notation should be self-evident.

#include <math.h>
#include "nrutil.h"
#define TINY 1.0e-25    /* A small number. */
#define ITMAX 200       /* Maximum allowed iterations. */

void powell(float p[], float **xi, int n, float ftol, int *iter, float *fret,
    float (*func)(float []))
/* Minimization of a function func of n variables. Input consists of an initial
starting point p[1..n]; an initial matrix xi[1..n][1..n], whose columns contain
the initial set of directions (usually the n unit vectors); and ftol, the
fractional tolerance in the function value such that failure to decrease by
more than this amount on one iteration signals doneness. On output, p is set to
the best point found, xi is the then-current direction set, fret is the
returned function value at p, and iter is the number of iterations taken. The
routine linmin is used. */
{
    void linmin(float p[], float xi[], int n, float *fret,
        float (*func)(float []));
    int i,ibig,j;
    float del,fp,fptt,t,*pt,*ptt,*xit;

    pt=vector(1,n);
    ptt=vector(1,n);
    xit=vector(1,n);
    *fret=(*func)(p);
    for (j=1;j<=n;j++) pt[j]=p[j];      /* Save the initial point. */
    for (*iter=1;;++(*iter)) {
        fp=(*fret);
        ibig=0;
        del=0.0;                        /* Will be the biggest function decrease. */
        for (i=1;i<=n;i++) {            /* In each iteration, loop over all directions in the set. */
            for (j=1;j<=n;j++) xit[j]=xi[j][i];  /* Copy the direction, */
            fptt=(*fret);
            linmin(p,xit,n,fret,func);  /* minimize along it, */
            if (fptt-(*fret) > del) {   /* and record it if it is the largest decrease so far. */
                del=fptt-(*fret);
                ibig=i;
            }
        }
        if (2.0*(fp-(*fret)) <= ftol*(fabs(fp)+fabs(*fret))+TINY) {
            free_vector(xit,1,n);       /* Termination criterion. */
            free_vector(ptt,1,n);
            free_vector(pt,1,n);
            return;
        }
        if (*iter == ITMAX) nrerror("powell exceeding maximum iterations.");
        for (j=1;j<=n;j++) {            /* Construct the extrapolated point and the */
            ptt[j]=2.0*p[j]-pt[j];      /* average direction moved. Save the */
            xit[j]=p[j]-pt[j];          /* old starting point. */
            pt[j]=p[j];
        }
        fptt=(*func)(ptt);              /* Function value at extrapolated point. */
        if (fptt < fp) {
            t=2.0*(fp-2.0*(*fret)+fptt)*SQR(fp-(*fret)-del)-del*SQR(fp-fptt);
            if (t < 0.0) {
                linmin(p,xit,n,fret,func);  /* Move to the minimum of the new direction, */
                for (j=1;j<=n;j++) {        /* and save the new direction. */
                    xi[j][ibig]=xi[j][n];
                    xi[j][n]=xit[j];
                }
            }
        }
    }                                   /* Back for another iteration. */
}

Implementation of Line Minimization

Make no mistake, there is a right way to implement linmin: It is to use the methods of one-dimensional minimization described in §10.1–§10.3, but to rewrite the programs of those sections so that their bookkeeping is done on vector-valued points P (all lying along a given direction n) rather than scalar-valued abscissas x. That straightforward task produces long routines densely populated with "for(k=1;k<=n;k++)" loops. We do not have space to include such routines in this book.
Our linmin, which works just fine, is instead a kind of bookkeeping swindle. It constructs an "artificial" function of one variable called f1dim, which is the value of your function, say, func, along the line going through the point p in the direction xi. linmin calls our familiar one-dimensional routines mnbrak (§10.1) and brent (§10.3) and instructs them to minimize f1dim. linmin communicates with f1dim "over the head" of mnbrak and brent, through global (external) variables. That is also how it passes to f1dim a pointer to your user-supplied function.

The only thing inefficient about linmin is this: Its use as an interface between a multidimensional minimization strategy and a one-dimensional minimization routine results in some unnecessary copying of vectors hither and yon. That should not