Indeed, the largest part of many procedures for minimising a function along a line by minimising an interpolant I(b) is concerned with ensuring that the interpolant has the proper shape to guarantee a minimum. Furthermore, I(b) cannot be used to extrapolate very far from the region in which it has been generated without risking, at the very least, unnecessary function evaluations which, after all, one is trying to avoid by using more of the information about the function S(b) than the rather conservative direct search methods above.

At this juncture we can consider combining the success-failure algorithm with inverse interpolation. One choice, made in algorithm 22, is to take the initial point in the success-failure procedure as b0. Then ‘successes’ lower the function value. Ultimately, however, a ‘failure’ will follow a ‘success’ unless b = b0 is minimal, even if the procedure begins with a string of ‘failures’.
Call the position at the last ‘failure’ b2, and the position at the ‘success’ which preceded it b1. (Note that b1 has to be the lowest point found so far.) Then S(b1) is less than either S(b0) or S(b2) and the three points define a V-shape which ensures that the interpolating parabola (13.5) will not be used to extrapolate.

Alternatively, as in algorithm 17 below, any of the following combinations of events lead to a V-shaped triple of points with b1 the lowest of the three:

(i) initial point, success, failure (b0, b1, b2)
(ii) success, success, failure (b0, b1, b2)
(iii) initial point, failure, failure (b1, b0, b2).

Consider then the three points defined by b0, b1, b2, where b1 occupies the middle position and where, using the notation

$$S_j = S(b_j) \qquad (13.7)$$

for brevity,

$$S_1 \le S_0 \qquad (13.8)$$

and

$$S_1 \le S_2. \qquad (13.9)$$

Excluding the exceptional cases that the function is flat or otherwise perverse, so that at least one of the conditions (13.8) or (13.9) is a strict inequality, the interpolating parabola will have its minimum between b0 and b2.
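A brief sketch of why this holds (an added illustration, not part of the original text), using the deviations x_j = b_j - b_1 defined in (13.11) below: the leading coefficient of the interpolating parabola is the second divided difference

$$C = \frac{\dfrac{S_2 - S_1}{x_2} - \dfrac{S_0 - S_1}{x_0}}{x_2 - x_0}.$$

Since b1 occupies the middle position we have x_0 < 0 < x_2, so (13.8) gives (S_0 - S_1)/x_0 \le 0 and (13.9) gives (S_2 - S_1)/x_2 \ge 0, with at least one strict. Hence C > 0, the parabola opens upward, and its vertex lies strictly between b0 and b2.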
Note now that we can measure all distances from b1, so that equations (13.5) can be rewritten

$$S_j = A + Bx_j + Cx_j^2 \qquad \text{for } j = 0, 1, 2 \qquad (13.10)$$

where

$$x_j = b_j - b_1 \qquad \text{for } j = 0, 1, 2. \qquad (13.11)$$

Since x_1 = 0, we have A = S_1 immediately. Equations (13.10) can then be solved by elimination to give

$$B = \frac{x_2^2(S_0 - S_1) - x_0^2(S_2 - S_1)}{x_0 x_2 (x_2 - x_0)} \qquad (13.12)$$

and

$$C = \frac{x_2(S_0 - S_1) - x_0(S_2 - S_1)}{x_0 x_2 (x_0 - x_2)}. \qquad (13.13)$$

(Note that the denominators differ only in their signs.) Hence the minimum of the parabola is found at

$$x^{*} = -\frac{B}{2C} = \frac{1}{2}\,\frac{x_2^2(S_0 - S_1) - x_0^2(S_2 - S_1)}{x_2(S_0 - S_1) - x_0(S_2 - S_1)}. \qquad (13.14)$$

The success-failure algorithm always leaves the step length equal to x2. The length x0 can be recovered if the steps from some initial point to the previous two evaluation points are saved. One of these points will be b1; the other is taken as b0. The expression on the right-hand side of equation (13.14) can be evaluated in a number of ways. In the algorithm below, both numerator and denominator have been multiplied by -1.
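To make (13.14) concrete, the following short Pascal program (an added illustration; the name parabmin and the test values are mine, not the book's) evaluates the sign-reversed form just described for a V-shaped triple.

program parab14;
{ Added sketch, not from the book: evaluating equation (13.14) with
  numerator and denominator multiplied by -1. }
function parabmin(x0, x2, s0, s1, s2 : real) : real;
var
  num, den : real;
begin
  num := x0*x0*(s2 - s1) - x2*x2*(s0 - s1);
  den := x0*(s2 - s1) - x2*(s0 - s1);
  parabmin := 0.5*num/den { a caller must guard against den = 0 }
end;
begin
  { V-shaped triple from S(b) = sqr(b-2) with b0 = 0, b1 = 1.5, b2 = 3,
    so x0 = -1.5, x2 = 1.5, S0 = 4, S1 = 0.25, S2 = 1 }
  writeln('x* = ', parabmin(-1.5, 1.5, 4.0, 0.25, 1.0):8:4)
end.

For a parabola the interpolant is exact, so the program prints x* = 0.5000, and b1 + x* = 2 is the true minimum.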
To find the minimum of a function of one parameter, several cycles of success-failure and parabolic inverse interpolation are usually needed. Note that algorithm 17 recognises that some functions are not computable at certain points b. (This feature has been left out of the program FMIN given by Forsythe et al (1977), and caused some failures of that program to minimise fairly simple functions in tests run by B Henderson and the author, though this comment reflects differences in design philosophy rather than weaknesses in FMIN.) Algorithm 17 continues to try to reduce the value of the computed function until (b+h) is not different from b in the machine arithmetic. This avoids the requirement for machine-dependent tolerances, but may cause the algorithm to execute indefinitely in environments where arithmetic is performed in extended-precision accumulators if a storage of (b+h) is not forced to shorten the number of digits carried.
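As an added illustration (the program and variable names are mine, not the book's), the following fragment performs this style of test: the step h is halved until the stored sum (b+h) no longer differs from b.

program eps17;
{ Added sketch of the machine-arithmetic convergence test used by
  algorithm 17; not taken from the book's code. }
var
  b, h, t : real;
begin
  b := 1.0;
  h := 1.0;
  repeat
    h := 0.5*h; { shrink the step }
    t := b + h  { the store into t forces (b+h) to working precision,
                  guarding against extended-precision accumulators }
  until t = b;  { (b+h) is no longer different from b }
  writeln('b+h indistinguishable from b for h = ', h)
end.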
In tests which I have run with B Henderson, algorithm 17 has always been more efficient in terms of both time and number of function evaluations than a linear search procedure based on that in algorithm 22. The reasons for retaining the simpler approach in algorithm 22 were as follows.

(i) A true minimisation along the line requires repeated cycles of success-failure/inverse interpolation. In algorithm 22 only one such cycle is used as part of a larger conjugate gradients minimisation of a function of several parameters. Therefore, it is important that the inverse interpolation not be performed until at least some progress has been made in reducing the function value, and the procedure used insists that at least one ‘success’ be observed before interpolation is attempted.

(ii) While one-dimensional trials and preliminary tests of algorithm 17-like cycles in conjugate gradients minimisation of a function of several parameters showed some efficiency gains were possible with this method, it was not possible to carry out the extensive set of comparisons presented in chapter 18 for the function minimisation algorithms due to the demise of the Data General NOVA; the replacement ECLIPSE uses a different arithmetic and operating system.
In view of the reasonable performance of algorithm 22, I decided to keep it in the collection of algorithms. On the basis of our experiences with the problem of minimising a function of one parameter, however, algorithm 17 has been chosen for linear search problems. A FORTRAN version of this algorithm performed competitively with the program FMIN due to Brent as given in Forsythe et al (1977) when several tests were timed on an IBM 370/168.

The choice of the step adjustment factors A1 and A2 to enlarge the step length or to reduce it and change its sign can be important in that poor choices will obviously cause the success-failure process to be inefficient. Systematic optimisation of these two parameters over the class of one-dimensional functions which may have to be minimised is not feasible, and one is left with the rather unsatisfactory situation of having to make a judgement from experience. Dixon (1972) mentions the choices (2, -0·25) and (3, -0·5). In the present application, however, where the success-failure steps are followed by inverse interpolation, I have found the set (1·5, -0·25) to be slightly more efficient, though this may merely reflect problems I have been required to solve.
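To illustrate how these factors act (an added example; the trace follows directly from the factors themselves), the fragment below follows the step st through three ‘successes’ and then a ‘failure’ with the set (1·5, -0·25).

program stepfac;
{ Added sketch of the step adjustment in the success-failure search;
  not the book's code. }
var
  st : real;
  i  : integer;
begin
  st := 1.0;
  for i := 1 to 3 do
  begin
    st := 1.5*st; { a 'success' enlarges the step: 1.5, 2.25, 3.375 }
    writeln('success: st = ', st:8:4)
  end;
  st := -0.25*st; { a 'failure' shrinks the step and reverses its sign }
  writeln('failure: st = ', st:9:5) { prints -0.84375 }
end.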
Algorithm 17. Minimisation of a function of one variable

procedure min1d(var bb : real; {initial value of argument of function
                  to be minimised, and resulting minimum position}
                var st : real; {initial and final step-length}
                var ifn : integer; {function evaluation counter}
                var fnminval : real {minimum function value on return});
{alg17.pas == One-dimensional minimisation of a function using
  success-failure search and parabolic inverse interpolation
  Copyright 1988 J.C.Nash
}
{No check is made that abs(st)>0.0. Algorithm will still converge.}
var
  a1, a2, fii, s0, s1, s2, tt0, tt1, tt2, x0, x1, x2, xii : real;
  notcomp, tripleok : boolean;
begin
  writeln('alg17.pas -- One dimensional function minimisation');
  {STEP 0 -- partly in procedure call}
  ifn := 0; {to initialize function evaluation count}
  a1 := 1.5; {to set the growth parameter of the success-failure search}
  a2 := -0.25; {to set the shrink parameter of the success-failure search}
  x1 := bb; {to start, we must have a current 'best' argument}
  notcomp := false; {is set TRUE when function cannot be computed. We set it
    here otherwise minimum or root of fn1d may be displayed}
  s0 := fn1d(x1, notcomp); ifn := ifn+1; {Compute the function at x1.}
  if notcomp then
  begin
    writeln('*** FAILURE *** Function cannot be computed at initial point');
    halt;
  end;
  repeat {Main minimisation loop}
    x0 := x1; {to save value of argument}
    bb := x0;
    x1 := x0+st; {to set second value of argument}
    s1 := fn1d(x1, notcomp); if notcomp then s1 := big; ifn := ifn+1;
    {Note mechanism for handling non-computability of the function.}
    tripleok := false; {As yet, we do not have a triple of points in a V.}
    if s1<s0 then
    begin {Here we can proceed to try to find s2 now in same direction.}
      repeat {success-failure search loop}
        st := st*a1; {increase the stepsize after a success}
        x2 := x1+st; {get next point in the series}
        s2 := fn1d(x2, notcomp); if notcomp then s2 := big; ifn := ifn+1;
        if s2<s1 then
        begin {another success}
          s0 := s1; s1 := s2; {In order to continue search,}
          x0 := x1; x1 := x2; {we copy over the points.}
          write('Success1');
        end
        else {'failure' after 'success' ==> V-shaped triple of points}
        begin
          tripleok := true; {to record existence of the triple}
          write('Failure1');
        end;
      until tripleok; {End of the success-failure search}
    end {s1<s0 on first pair of evaluations}
    else
    begin {s1>=s0 on first pair of evaluations in this major cycle, so we
      must look for a third point in the reverse search direction.}
      st := a2*st; {to reverse direction of search and reduce the step size}
      tt2 := s0; s0 := s1; s1 := tt2; {to swap function values}
      tt2 := x0; x0 := x1; x1 := tt2; {to swap arguments}
      repeat
        x2 := x1+st; {get the potential third point}
        s2 := fn1d(x2, notcomp); if notcomp then s2 := big; ifn := ifn+1;
        if s2<s1 then
        begin {success in reducing function -- keep going}
          s0 := s1; s1 := s2; x0 := x1; x1 := x2; {reorder points}
          st := st*a1; {increase the stepsize maintaining direction}
          write('Success2');
        end
        else
        begin {two failures in a row ensures a triple in a V}
          tripleok := true; write('Failure2');
        end;
      until tripleok; {End of success-failure search}
    end; {if s1<s0 for first pair of test points}
    {Now have a V of points (x0,s0), (x1,s1), (x2,s2).}
    writeln; writeln('Triple (', x0, ',', s0, ')');
    writeln('(', x1, ',', s1, ')'); writeln('(', x2, ',', s2, ')');
    tt0 := x0-x1; {to get deviation from best point found.
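The listing breaks off here at the start of the inverse interpolation step; the remainder of the procedure follows in the original. As an added sketch (not the book's code: the function fn1d, the constant big and the {$I} include mechanism are my assumptions), a driver for min1d might take the following form.

program dr17;
{ Hypothetical driver for algorithm 17; assumes the complete min1d
  procedure is available in the file alg17.pas. }
const
  big = 1.0e35; { large value used to flag non-computable function points }
var
  bb, st, fmin : real;
  ifn : integer;

function fn1d(b : real; var notcomp : boolean) : real;
begin
  notcomp := false; { sqr(b-2) is computable for every b }
  fn1d := sqr(b - 2.0)
end;

{$I alg17.pas} { include the full procedure min1d given above }

begin
  bb := 0.0; { initial argument }
  st := 1.0; { initial step length }
  min1d(bb, st, ifn, fmin);
  writeln('minimum ', fmin, ' at b = ', bb, ' after ', ifn, ' evaluations')
end.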