Note the use of the stepredn constant, though possibly 0.1 is faster for
convergence. Following mechanism used to set stepsize initial value.}
  stepsize := 0.0;
  for i := 1 to n do
    if stepsize < stepredn*abs(B[i]) then stepsize := stepredn*abs(B[i]);
  if stepsize = 0.0 then stepsize := stepredn; {for safety}
  {STEP HJ2: Copy parameters into vector X}
  for i := 1 to n do X[i] := B[i];
  {STEP HJ3 not needed. In original code, parameters are entered into X
   and copied to B}
  fval := fminfn(n, B, Workdata, notcomp); {STEP HJ4}
  if notcomp then
  begin
    writeln('***FAILURE*** Function not computable at initial point');
    fail := true;
  end {safety stop for non-computable function}
  else {initial function computed -- proceed}
  begin {main portion of routine}
    writeln('Initial function value=',fval);
    for i := 1 to n do
    begin
      write(B[i]:10:5,' ');
      if (7 * (i div 7) = i) and (i<n) then writeln;
    end;
    writeln;
    fold := fval; Fmin := fval; {to save function value at 'old' base which
                    is also current best function value}
    while stepsize>intol do {STEP HJ9 is now here}
    begin {STEP HJ5 == Axial Search}
      {write('A');} {Indicator output}
      for i := 1 to n do {STEP AS1}
      begin {STEP AS2}
        temp := B[i]; B[i] := temp+stepsize; {save parameter, step 'forward'}
        fval := fminfn(n, B, Workdata, notcomp); ifn := ifn+1; {STEP AS3}
        if notcomp then fval := big; {to allow for non-computable function}
        if fval<Fmin then
          Fmin := fval {STEP AS4}
        else
        begin {STEP AS5}
          B[i] := temp-stepsize; {to step 'backward' if forward step
                    unsuccessful in reducing function value}
          fval := fminfn(n, B, Workdata, notcomp); ifn := ifn+1; {STEP AS6}
          if notcomp then fval := big; {for non-computable function}
          if fval<Fmin then {STEP AS7}
            Fmin := fval {STEP AS9 -- re-ordering of original algorithm}
          else {STEP AS8}
            B[i] := temp; {to reset the parameter value to original value}
        end; {else fval>=Fmin}
      end; {loop on i over parameters and end of Axial Search}
      if Fmin<fold then {STEP HJ6}
      begin {Pattern Move} {STEP PM1}
        {write('P');} {Indicator output}
        for i := 1 to n do {loop over parameters}
        begin {STEP PM2}
          temp := 2.0*B[i]-X[i]; {compute new base point component}
          X[i] := B[i]; B[i] := temp; {save current point and new base point}
        end; {loop on i -- STEP PM3}
        fold := Fmin; {to save new base point value}
      end {of Pattern Move -- try a new Axial Search}
      else
      begin
        samepoint := true; {initially assume points are the same}
        i := 1; {set loop counter start}
        repeat
          if B[i]<>X[i] then samepoint := false;
          i := i+1;
        until (not samepoint) or (i>n); {test for equality of points}
        if samepoint then
        begin {STEP HJ8}
          stepsize := stepsize*stepredn; {to reduce stepsize. The reduction
                    factor (0.2) should be chosen to reduce the stepsize
                    reasonably rapidly when the initial point is close to
                    the solution, but not so small that progress cannot
                    still be made towards a solution point which is not
                    nearby.}
          {writeln;} {Needed if indicator output ON}
          write('stepsize now ',stepsize:10,' Best fn value=',Fmin);
          writeln(' after ',ifn);
          for i := 1 to n do
          begin
            write(B[i]:10:5,' ');
            if (7 * (i div 7) = i) and (i<n) then writeln;
          end;
          writeln;
        end
        else {not the same point -- return to old base point}
        begin {STEP HJ7}
          for i := 1 to n do B[i] := X[i];
          {writeln;} {Needed if indicator output ON}
          writeln('Return to old base point');
        end; {reset basepoint}
      end; {if Fmin<fold}
    end; {while stepsize>intol} {STEP HJ10}
    writeln('Converged to Fmin=',Fmin,' after ',ifn,' evaluations');
    {Fmin has lowest function value, X[] has parameters.}
  end; {if notcomp on first function evaluation}
end; {alg27.pas == hjmin}

In the next two chapters, considerable attention is paid to calculating sets of
mutually conjugate directions.
Direct search techniques parallelling these developments have been attempted. The most successful is probably that of M J D Powell. This is discussed at length by Brent (1973). These algorithms are more elaborate than any in this monograph, though it is possible they could be simplified. Alternatively, numerical approximation of derivatives (§18.2) can be used in algorithms 21 or 22. Out of this range of methods I have chosen to present the Nelder-Mead procedure and Hooke and Jeeves method because I feel they are the most easily understood. There is no calculus or linear algebra to explain and for two-dimensional problems the progress of algorithm 19 can be clearly visualised.

The software diskette includes driver programs DR1920.PAS to allow the Nelder-Mead minimiser to be used in conjunction with algorithm 20.
Driver DR27.PAS will allow the Hooke and Jeeves minimiser to be used, but this driver does not invoke algorithm 20 since a major part of the minimiser is an axial search.

Chapter 15
DESCENT TO A MINIMUM I: VARIABLE METRIC ALGORITHMS

15.1. DESCENT METHODS FOR MINIMISATION

The Nelder-Mead algorithm is a direct search procedure in that it involves only function values. In the next few sections, methods will be considered which make use of the gradient of the function S(b), which will be called g:

   g_j = ∂S(b)/∂b_j        for j = 1, 2, . . . , n        (15.1)

evaluated at the point b. Descent methods all use the basic iterative step

   b' = b - kBg        (15.2)

where B is a matrix defining a transformation of the gradient and k is a step length. The simplest such algorithm, the method of steepest descents, was proposed by Cauchy (1848) for the solution of systems of nonlinear equations. This uses

   B = 1_n        (15.3)

and any step length k which reduces the function so that

   S(b') < S(b).        (15.4)

The principal difficulty with steepest descents is its tendency to hemstitch, that is, to criss-cross a valley on the function S(b) instead of following the floor of the valley to a minimum.
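The basic step is easy to state in code. The short program below is a minimal sketch, not one of the book's numbered algorithms: it applies step (15.2) with B = 1_n and halves a trial step length k until condition (15.4) holds. The quadratic test function and its gradient are illustrative choices only.

program steepdesc;
{Minimal sketch of steepest descents: step (15.2) with B = 1n, with the
 step length k halved until condition (15.4) is satisfied.  The test
 function fn and its gradient grad are illustrative examples only.}
const
  n = 2;
type
  rvector = array[1..n] of real;
var
  b, bnew, g : rvector;
  k, fb, fnew : real;
  iter, i : integer;

function fn(var x : rvector) : real;
begin {example function: an elongated quadratic valley}
  fn := x[1]*x[1] + 10.0*x[2]*x[2];
end;

procedure grad(var x : rvector; var gg : rvector);
begin {analytic gradient of the example function -- see eqn (15.1)}
  gg[1] := 2.0*x[1];
  gg[2] := 20.0*x[2];
end;

begin
  b[1] := 1.0; b[2] := 1.0;                      {starting point}
  iter := 0; fb := fn(b);
  repeat
    iter := iter + 1;
    grad(b, g);                                  {gradient at current point}
    k := 1.0;                                    {trial step length}
    repeat                                       {halve k until (15.4) holds}
      for i := 1 to n do bnew[i] := b[i] - k*g[i];  {b' = b - k g, B = 1n}
      fnew := fn(bnew);
      if fnew >= fb then k := 0.5*k;
    until (fnew < fb) or (k < 1.0e-15);
    if fnew < fb then
    begin                                        {accept the lower point}
      for i := 1 to n do b[i] := bnew[i];
      fb := fnew;
    end;
    writeln('iter ', iter:3, '  k = ', k:10, '  S = ', fb:14);
  until (k < 1.0e-15) or (iter >= 25);
end.

On such an elongated valley the corrections along the narrow axis tend to alternate in sign from one iteration to the next, which is the hemstitching behaviour just described.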
Kowalik and Osborne (1968, pp 34-9) discuss some of the reasons for this weakness, which is primarily that the search directions generated are not linearly independent. Thus a number of methods have been developed which aim to transform the gradient g so that the search directions generated in (15.2) are linearly independent or, equivalently, are conjugate to each other with respect to some positive definite matrix A.
In other words, if x_i and x_j are search directions, x_i and x_j are conjugate with respect to the positive definite matrix A if

   x_i^T A x_j = 0        for i ≠ j.        (15.5)

The conjugate gradients algorithm 22 generates such a set of search directions implicitly, avoiding the storage requirements of either a transformation matrix B or the previous search directions. The variable metric algorithm 21 uses a transformation matrix which is adjusted at each step to generate appropriate search directions.
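As a concrete check of condition (15.5), the short sketch below (again not one of the book's numbered algorithms) accumulates the quantity x_i^T A x_j for an illustrative positive definite matrix A and two trial directions; the values used are arbitrary examples.

program conjtest;
{Sketch: form t = xi^T A xj as in eqn (15.5) to test whether two
 directions are conjugate with respect to a positive definite matrix A.
 The matrix and directions are illustrative values only.}
const
  n = 2;
type
  rvector = array[1..n] of real;
  rmatrix = array[1..n, 1..n] of real;
var
  A : rmatrix;
  xi, xj : rvector;
  t : real;
  i, j : integer;
begin
  {an example positive definite matrix}
  A[1,1] := 2.0; A[1,2] := 1.0;
  A[2,1] := 1.0; A[2,2] := 3.0;
  {two directions to be tested}
  xi[1] := 1.0;  xi[2] := 0.0;
  xj[1] := -1.0; xj[2] := 2.0;       {chosen so that xi^T A xj = -2 + 2 = 0}
  t := 0.0;
  for i := 1 to n do
    for j := 1 to n do
      t := t + xi[i]*A[i,j]*xj[j];   {accumulate xi^T A xj}
  writeln('xi^T A xj = ', t:12);
  if abs(t) < 1.0e-10 then
    writeln('directions are conjugate with respect to A')
  else
    writeln('directions are NOT conjugate with respect to A');
end.

Algorithms 21 and 22 do not, of course, verify such products pairwise; the test is shown only to fix the meaning of conjugacy.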
There is, however, another way to think of this process. Consider the set of nonlinear equations formed by the gradient at a minimum

   g(b') = 0.        (15.6)

As in the one-dimensional root-finding problem, it is possible to seek such solutions via a linear approximation from the current point b as in equation (13.25), that is

   g(b') = g(b) + H(b)(b' - b)        (15.7)

where H(b) is the Hessian matrix

   H_jk(b) = ∂²S(b)/(∂b_j ∂b_k)        (15.8)

of second derivatives of the function to be minimised or first derivatives of the nonlinear equations.
For convex functions, H is at least non-negative definite. For the current purpose, suppose it is positive definite so that its inverse exists. Using the inverse together with equations (15.6) implies

   b' = b - H^(-1)(b) g(b)        (15.9)

which is Newton's method in n parameters. This is equivalent to equation (15.2) with

   B = H^(-1)        k = 1.        (15.10)

The step parameter k is rarely fixed, however, and usually some form of linear search is used (§13.2). While Newton's method may be useful for solving nonlinear-equation systems if the n-dimensional equivalents of the one-dimensional difficulties of the algorithm are taken care of, for minimisation problems it requires that second derivatives be computed. This means that n² second derivative evaluations, n first derivative evaluations and a matrix inverse are needed even before the linear search is attempted. Furthermore the chances of human error in writing the subprograms to compute these derivatives is very high; the author has found that most of the 'failures' of his algorithms have been due to such errors when only first derivatives were required.
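To show what equation (15.9) asks of the user in even the smallest nontrivial case, the following sketch (not one of the book's numbered algorithms) performs bare Newton steps on the familiar two-parameter Rosenbrock test function, chosen here for illustration. The gradient and Hessian are coded analytically, which is exactly the burden warned of above, and the 2 by 2 system H d = g is solved directly rather than by forming H^(-1); no line search or other safeguard is included.

program newtdemo;
{Sketch of the raw Newton step (15.9) for a two-parameter example,
 S(b) = 100(b2 - b1^2)^2 + (1 - b1)^2.  The gradient and Hessian are
 supplied analytically; the 2 by 2 system H d = g is solved by
 Cramer's rule.  No line search or other safeguard is used.}
var
  b1, b2, g1, g2 : real;
  h11, h12, h21, h22 : real;
  det, d1, d2, s : real;
  iter : integer;
begin
  b1 := -1.2; b2 := 1.0;                  {a standard starting point}
  for iter := 1 to 10 do
  begin
    {function, gradient and Hessian of the example S(b)}
    s   := 100.0*sqr(b2 - b1*b1) + sqr(1.0 - b1);
    g1  := -400.0*b1*(b2 - b1*b1) - 2.0*(1.0 - b1);
    g2  := 200.0*(b2 - b1*b1);
    h11 := 1200.0*b1*b1 - 400.0*b2 + 2.0;
    h12 := -400.0*b1;  h21 := h12;
    h22 := 200.0;
    det := h11*h22 - h12*h21;             {assume H is nonsingular here}
    d1  := (g1*h22 - h12*g2)/det;         {solve H d = g by Cramer's rule}
    d2  := (h11*g2 - h21*g1)/det;
    b1  := b1 - d1;  b2 := b2 - d2;       {Newton step: b' = b - H^(-1) g}
    writeln('iter ', iter:2, '  S = ', s:14, '  b = (', b1:10, ',', b2:10, ')');
  end;
end.

Even for n = 2 the Hessian code is easy to get wrong; for larger n the n² second-derivative entries and the linear solve quickly dominate the work, which is the point made above.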