Nash, Compact Numerical Methods for Computers
In exact arithmetic, (14.10) is acceptable for all points, and the author has in some implementations omitted the test for i = L. Some caution is warranted, however, since some machines can form a mean of two numbers which is not between those two numbers. Hence, the point b_L may be altered in the operations of formula (14.10).

Different contraction factors β and β' may be used in (14.8) and (14.10). In practice these, as well as α and γ, can be chosen to try to improve the rate of convergence of this procedure either for a specific class or for a wide range of problems. Following Nelder and Mead (1965), I have found the strategy

   α = 1      γ = 2      β' = β = 0·5                                (14.11)

to be effective.
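For readers who wish to experiment with these coefficients, the trial points generated under the choice (14.11) can be sketched in a few lines. The reflection, extension and contraction formulae below are the standard Nelder-Mead forms; the names b_C (centroid omitting the highest vertex) and b_H (highest vertex) follow the notation of the text, and the function name is merely illustrative.

```python
# Sketch of the Nelder-Mead trial points under the coefficient
# choice (14.11): alpha = 1, gamma = 2, beta = 0.5.
alpha, gamma, beta = 1.0, 2.0, 0.5

def trial_points(b_C, b_H):
    # reflection of b_H through the centroid b_C
    b_R = [(1.0 + alpha) * c - alpha * h for c, h in zip(b_C, b_H)]
    # extension along the line from b_C through b_R
    b_E = [gamma * r + (1.0 - gamma) * c for r, c in zip(b_R, b_C)]
    # contraction of b_H towards b_C
    b_K = [beta * h + (1.0 - beta) * c for h, c in zip(b_H, b_C)]
    return b_R, b_E, b_K

b_R, b_E, b_K = trial_points([0.0, 0.0], [1.0, 1.0])
print(b_R, b_E, b_K)   # [-1.0, -1.0] [-2.0, -2.0] [0.5, 0.5]
```

With α = 1 the reflected point lies as far beyond the centroid as b_H lies before it, and γ = 2 doubles that excursion.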
It should be noted, however, that the choice of these values is based on limited testing. In fact, every aspect of this procedure has been evolved heuristically based on a largely intuitive conception of the minimisation problem. As such there will always be functions which cannot be minimised by this method because they do not conform to the idealisation. Despite this, the algorithm is surprisingly robust and, if permitted to continue long enough, almost always finds the minimum.

The thorniest question concerning minimisation algorithms must, therefore, be addressed: when has the minimum been found? Nelder and Mead suggest using the 'standard error' of the function values

   test = [ Σ (S(b_i) − S̄)² / n ]^½     for i = 1, 2, . . ., (n+1)  (14.12)

where

   S̄ = Σ S(b_i) / (n+1)                 for i = 1, 2, . . ., (n+1). (14.13)

The procedure is taken to have converged when the test value falls below some preassigned tolerance.
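A minimal sketch of this test value in Python follows; the function name is hypothetical, and S_values holds the function values at the (n+1) vertices of the polytope.

```python
import math

def nm_standard_error(S_values):
    # 'Standard error' convergence test of Nelder and Mead,
    # equations (14.12)-(14.13).
    m = len(S_values)                   # m = n + 1 vertices
    n = m - 1
    S_bar = sum(S_values) / m           # mean function value, (14.13)
    return math.sqrt(sum((S - S_bar) ** 2 for S in S_values) / n)  # (14.12)

# On a flat region of the surface the vertex heights are nearly equal,
# so the test value is small even far from the minimum.
print(nm_standard_error([1.0, 2.0, 3.0]))   # 1.0
print(nm_standard_error([2.0, 2.0, 2.0]))   # 0.0
```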
In the statistical applications which interested Nelder and Mead, this approach is reasonable. However, the author has found this criterion to cause premature termination of the procedure on problems with fairly flat areas on the function surface. In a statistical context one might wish to stop if such a region were encountered, but presuming the minimum is sought, it seems logical to use the simpler test for equality between S(b_L) and S(b_H), that is, a test for equal height of all points in the simplex.

An additional concern on machines with low-precision arithmetic is that it is possible for a general contraction (14.10) not to reduce the simplex size. Therefore, it is advisable to compute some measure of the simplex size during the contraction to ensure a decrease in the simplex size, as there is no point in continuing if the contraction has not been effective.
A very simple measure is the sum

   size = Σ d(b_L, b_i)                 for i = 1, 2, . . ., (n+1)   (14.14)

where

   d(b_L, b_i) = Σ |b_Lj − b_ij|        for j = 1, 2, . . ., n.      (14.15)

Finally, it is still possible to converge at a point which is not the minimum. If, for instance, the (n+1) points of the simplex are all in one plane (which is a line in two dimensions), the simplex can only move in (n−1) directions in the n-dimensional space and may not be able to proceed towards the minimum. O'Neill (1971), in a FORTRAN implementation of the Nelder-Mead ideas, tests the function value at either side of the supposed minimum along each of the parameter axes. If any function value is found lower than the current supposed minimum, then the procedure is restarted.

The author has found the axial search to be useful in several cases in avoiding false convergence.
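The size measure (14.14)-(14.15) can be sketched as follows, taking the distance d to be the sum of absolute coordinate differences from the lowest vertex (an assumption consistent with the workspace arithmetic in Algorithm 19 below); the function name is illustrative.

```python
def simplex_size(vertices, L):
    # Sum over all vertices of the coordinate-wise absolute deviations
    # from the 'lowest' vertex b_L -- equations (14.14)-(14.15).
    b_L = vertices[L]
    return sum(sum(abs(bj - lj) for bj, lj in zip(b, b_L))
               for b in vertices)

verts = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
before = simplex_size(verts, 0)
# A general contraction (14.10) halves each vertex's offset from b_L,
# so in exact arithmetic the measure should also halve.
shrunk = [[(bj + lj) / 2.0 for bj, lj in zip(b, verts[0])] for b in verts]
after = simplex_size(shrunk, 0)
print(before, after)   # 2.0 1.0
```

Comparing the measure before and after the contraction is exactly the safeguard recommended in the text for low-precision arithmetic.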
For instance, in a set of 74 tests, six failures of the procedure were observed. This figure would have been 11 failures without the restart facility.

14.2. POSSIBLE MODIFICATIONS OF THE NELDER-MEAD ALGORITHM

Besides choices for α, β, β' and γ other than (14.11), there are many minor variations on the basic theme of Nelder and Mead. The author has examined several of these, mainly using the Rosenbrock (1960) test function of two parameters

   S(b) = 100 (b_2 − b_1²)² + (1 − b_1)²                             (14.16)

starting at the point (−1·2, 1).

(i) The function value S(b_C) can be computed at each iteration.
If S(b_C) < S(b_L), b_L is replaced by b_C. The rest of the procedure is unaffected by this change, which is effectively a contraction of the simplex. If there are more than two parameters, the computation of b_C can be repeated. In cases where the minimum lies within the current simplex, this modification is likely to permit rapid progress towards the minimum. Since, however, the simplex moves by means of reflection and expansion, the extra function evaluation is often unnecessary, and in tests run by the author the cost of this evaluation outweighed the benefit.

(ii) In the case that S(b_R) < S(b_L) the simplex is normally expanded by extension along the line (b_R − b_C).
If b_R is replaced by b_E, the formulae contained in the first two lines of equation (14.4) permit the expansion to be repeated. This modification suffers the same disadvantages as the previous one; the advantages of the repeated extension are not great enough (in fact do not occur often enough) to offset the cost of additional function evaluations.

(iii) Instead of movement of the simplex by reflection of b_H through b_C, one could consider extensions along the line (b_L − b_C), that is, from the 'low' vertex of the simplex. Simple drawings of the two-dimensional case show that this tends to stretch the simplex so that the points become coplanar, forcing restarts.
Indeed, a test of this idea produced precisely this behaviour.

(iv) For some sets of parameters b, the function may not be computable, or a constraint may be violated (if constraints are included in the problem). In such cases, a very large value may be returned for the function to prevent motion in the direction of forbidden points. Box (1965) has enlarged on this idea in his Complex Method, which uses more than (n+1) points in an attempt to prevent all the points collapsing onto the constraint.

(v) The portion of the algorithm for which modifications remain to be suggested is the starting (and restarting) of the procedure.
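Modification (iv) is easy to apply in practice by wrapping the objective function so that non-computable or constraint-violating points simply look very bad to the minimiser. The sketch below is illustrative only; the names and the particular large constant are hypothetical, not Nash's.

```python
import math

BIG = 1.0e35   # a very large value used to repel the simplex

def penalised(fn, feasible):
    # Wrap an objective fn: return BIG when a constraint is violated
    # or the function cannot be evaluated (modification (iv)).
    def wrapped(b):
        if not feasible(b):
            return BIG
        try:
            return fn(b)
        except (ValueError, OverflowError, ZeroDivisionError):
            return BIG
    return wrapped

# Example: sqrt is not computable for negative arguments, and we add
# an (arbitrary) upper-bound constraint b[0] <= 10.
S = penalised(lambda b: math.sqrt(b[0]), feasible=lambda b: b[0] <= 10.0)
print(S([4.0]), S([-1.0]), S([100.0]))   # 2.0 1e+35 1e+35
```

Because the simplex compares function values only by ordering, any sufficiently large constant serves; the polytope simply never moves a vertex into the forbidden region for long.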
Until now, little mention has been made of the manner in which the original simplex should be generated. Nelder and Mead (1965) performed a variety of tests using initial simplexes generated by equal step lengths along the parameter axes and various 'arrangements of the initial simplex.' The exact meaning of this is not specified. They found the rate of convergence to be influenced by the step length chosen to generate an initial simplex. O'Neill (1971) in his FORTRAN implementation permits the step along each parameter axis to be specified separately, which permits differences in the scale of the parameters to be accommodated by the program. On restarting, these steps are reduced by a factor of 1000. General rules on how step lengths should be chosen are unfortunately difficult to state.
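O'Neill's scheme of separate axial steps can be sketched as follows: the starting point is one vertex, and each of the n further vertices displaces one parameter by its own step. The Rosenbrock function (14.16) serves as the example; the step values chosen here are merely illustrative.

```python
def initial_polytope(b, steps):
    # Vertex 1 is the starting point; vertex i+1 displaces parameter i
    # by its own step, accommodating differently scaled parameters.
    vertices = [list(b)]
    for i, s in enumerate(steps):
        v = list(b)
        v[i] += s
        vertices.append(v)
    return vertices

def rosenbrock(b):
    # Test function (14.16): S(b) = 100 (b2 - b1^2)^2 + (1 - b1)^2
    return 100.0 * (b[1] - b[0] ** 2) ** 2 + (1.0 - b[0]) ** 2

verts = initial_polytope([-1.2, 1.0], steps=[0.1, 0.1])
print([rosenbrock(v) for v in verts])
# On restarting, O'Neill reduces each step by a factor of 1000:
restart_steps = [s / 1000.0 for s in [0.1, 0.1]]
```

Each starting step should appreciably alter the function value, so the spread of the printed values is itself a rough check on the choice of steps.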
Quite obviously any starting step should appreciably alter the function. In many ways this is an (n+1)-fold repetition of the necessity of good initial estimates for the parameters as in §12.2.

More recently other workers have tried to improve upon the Nelder-Mead strategies, for example Craig et al (1980). A parallel computer version reported by Virginia Torczon seems to hold promise for the solution of problems in relatively large numbers of parameters. Here we have been content to stay close to the original Nelder-Mead procedure, though we have simplified the method for ranking the vertices of the polytope, in particular the selection of the point b_N.

Algorithm 19.
A Nelder-Mead minimisation procedure

procedure nmmin(n: integer; {the number of parameters in the
                    function to be minimised}
         var Bvec, X: rvector; {the parameter values on
                    input (Bvec) and output (X) from minmeth}
         var Fmin: real; {'minimum' function value}
         Workdata: probdata; {user defined data area}
         var fail: boolean; {true if method has failed}
         var intol: real); {user-initialized convergence
                    tolerance; set negative on entry if it is not yet set}
{alg19.pas == Nelder Mead minimisation of a function of n parameters.

  Original method due to J. Nelder and R. Mead, Computer Journal,
  vol. 7, 1965, pp. 308-313.
  Modification as per Nash J and Walker-Smith M, Nonlinear Parameter
  Estimation: an Integrated System in BASIC, Marcel Dekker: New York,
  1987.

  Modifications are principally
  -- in the computation of the 'next to highest' vertex of the current
     polytope,
  -- in the verification that the shrink operation truly reduces the size
     of the polytope, and
  -- in the form of calculation of some of the search points.
  We further recommend an axial search to verify convergence.
  This can be called outside the present code. If placed in-line, the
  code can be restarted at STEP3.
  If space is at a premium, vector X is not needed except to return
  final values of parameters.

  Copyright 1988 J.C.Nash
}
const
  Pcol = 27; {Maxparm + 2 == maximum number of columns in polytope}
  Prow = 26; {Maxparm + 1 == maximum number of rows in polytope}
  alpha = 1.0; {reflection factor}
  beta = 0.5; {contraction and reduction factor}
  gamma = 2.0; {extension factor}
var
  action : string[15]; {description of action attempted on polytope. The
              program does not inform the user of the success of
              the attempt. However, the modifications to do
              this are straightforward.}
  C : integer; {pointer to the column in workspace P which stores the
              centroid of the polytope.
              C is set to n+2 here.}
  calcvert : boolean; {true if vertices to be calculated, as at start
              or after a shrink operation}
  convtol : real; {a convergence tolerance based on function
              value differences}
  f : real; {temporary function value}
  funcount : integer; {count of function evaluations}
  H : integer; {pointer to highest vertex in polytope}
  i, j : integer; {working integers}
  L : integer; {pointer to lowest vertex in polytope}
  notcomp : boolean; {non-computability flag}
  n1 : integer; {n+1}
  oldsize : real; {former size measure of polytope}
  P : array[1..Prow, 1..Pcol] of real; {polytope workspace}
  shrinkfail : boolean; {true if shrink has not reduced polytope size}
  size : real; {a size measure for the polytope}
  step : real; {stepsize used to build polytope}
  temp : real; {a temporary variable}
  trystep : real; {a trial stepsize}
  tstr : string[5]; {storage for function counter in string form
              to control display width}
  VH, VL, VN : real; {function values of 'highest', 'lowest' and
              'next' vertices}
  VR : real; {function value at Reflection}
begin
  writeln('Nash Algorithm 19 version 2 1988-03-17');
  writeln('Nelder Mead polytope direct search function minimiser');
  fail := false; {the method has not yet failed!}
  f := fminfn(n, Bvec, Workdata, notcomp); {initial fn calculation -- STEP 1}
  if notcomp then
  begin
    writeln('**** Function cannot be evaluated at initial parameters ****');
    fail := true; {method has failed -- cannot get started}
  end
  else
  begin {proceed with minimisation}
    writeln('Function value for initial parameters = ', f);
    if intol < 0.0 then intol := Calceps;
    funcount := 1; {function count initialized to 1}
    convtol := intol * (abs(f) + intol); {ensures small value relative to
        function value -- note that we can restart the procedure if
        this is too big due to a large initial function value.}
    writeln('Scaled convergence tolerance is ', convtol);
    n1 := n + 1; C := n + 2; P[n1, 1] := f; {STEP 2}
    for i := 1 to n do P[i, 1] := Bvec[i];
    {This saves the initial point as vertex 1 of the polytope.}
    L := 1; {We indicate that it is the 'lowest'
        vertex at the moment, so that its function value is not
        recomputed later in STEP 10}
    size := 0.0; {STEP 3}
    {STEP 4: build the initial polytope using a fixed step size}
    step := 0.0;