Algorithm 22. Function minimisation by conjugate gradients (cont.)

end
else {proceed with minimisation}
begin
  Fmin:=f; {save the best value so far}
  funcount:=1; {function count initialized to 1}
  gradcount:=0; {initialise gradient count}
  repeat {STEP 2: iteration for method -- terminates when no progress
          can be made, and search is a steepest descent}
    for i:=1 to n do
    begin
      t[i]:=0.0; {to zero step vector}
      c[i]:=0.0; {to zero 'last' gradient}
    end;
    cycle:=0; {STEP 3: main loop of cg process}
    oldstep:=1.0; {initially a full step}
    count:=0; {to ensure this is set < n}
    repeat {until one cg cycle complete}
      cycle:=cycle+1;
      writeln(gradcount,' ',funcount,' ',Fmin);
      write('parameters ');
      for i:=1 to n do
      begin
        write(Bvec[i]:10:5,' ');
        if (7 * (i div 7) = i) and (i<n) then writeln;
      end;
      writeln;
      gradcount:=gradcount+1; {STEP 4: count the gradient evaluation}
      fmingr(n, Bvec, Workdata, g);
      G1:=0.0; G2:=0.0; {STEP 5}
      for i:=1 to n do
      begin
        X[i]:=Bvec[i]; {save best parameters}
        case method of
          Fletcher_Reeves: begin {beta = g'g / c'c}
            G1:=G1+sqr(g[i]); G2:=G2+sqr(c[i]);
          end;
          Polak_Ribiere: begin {beta = g'(g-c) / c'c}
            G1:=G1+g[i]*(g[i]-c[i]); G2:=G2+sqr(c[i]);
          end;
          Beale_Sorenson: begin {beta = g'(g-c) / t'(g-c)}
            G1:=G1+g[i]*(g[i]-c[i]); G2:=G2+t[i]*(g[i]-c[i]);
          end;
        end; {case statement for method selection}
        c[i]:=g[i]; {save gradient}
      end; {loop on i}
      if G1>tol then {STEP 6: descent sufficient to proceed}
      begin {STEP 7: generate direction}
        if G2>0.0 then G3:=G1/G2 else G3:=1.0; {ensure G3 defined}
        gradproj:=0.0; {STEP 8}
        for i:=1 to n do
        begin
          t[i]:=t[i]*G3-g[i]; gradproj:=gradproj+t[i]*g[i];
        end;
        steplength:=oldstep; {STEP 9}
        {STEP 10: step along search direction}
        accpoint:=false; {don't have a good point yet}
        repeat {line search}
          count:=0; {to count unchanged parameters}
          for i:=1 to n do
          begin
            Bvec[i]:=X[i]+steplength*t[i];
            if (reltest+X[i])=(reltest+Bvec[i]) then count:=count+1;
          end; {loop on i}
          if count<n then {STEP 11: main convergence test}
          begin {STEP 12: can proceed with linear search}
            f:=fminfn(n, Bvec, Workdata, notcomp); {function calculation}
            funcount:=funcount+1;
            accpoint:=(not notcomp) and
                      (f<=Fmin+gradproj*steplength*acctol);
            {STEP 13: a point is acceptable only if the function is
             computable (not notcomp) and it satisfies the acceptable
             point criterion}
            if not accpoint then
            begin
              steplength:=steplength*stepredn; write('*');
            end;
          end; {compute and test for linear search}
        until (count=n) or accpoint; {end of loop for line search}
        if count<n then {STEP 14}
        begin {replacements for STEPS 15 onward}
          newstep:=2*((f-Fmin)-gradproj*steplength);
          {quadratic inverse interpolation}
          if newstep>0 then
          begin {calc interp}
            newstep:=-gradproj*sqr(steplength)/newstep;
            for i:=1 to n do
            begin
              Bvec[i]:=X[i]+newstep*t[i];
            end; {no check yet on change in parameters}
            Fmin:=f; {save new lowest point}
            f:=fminfn(n, Bvec, Workdata, notcomp);
            funcount:=funcount+1;
            if f<Fmin then
            begin
              Fmin:=f; write(' i< ');
            end
            else {reset to best Bvec}
            begin
              write(' i> ');
              for i:=1 to n do Bvec[i]:=X[i]+steplength*t[i];
            end;
          end; {interpolation}
        end; {if count < n}
      end; {if G1>tol}
      oldstep:=setstep*steplength; {a heuristic to prepare next iteration}
      if oldstep>1.0 then oldstep:=1.0;
      {with a limitation to prevent too large a step being taken; this
       strategy follows Nash & Walker-Smith in multiplying the best step
       by 1.7 and then limiting the resulting step to length = 1}
    until (count=n) or (G1<=tol) or (cycle=cyclimit);
    {this ends the cg cycle loop}
  until (cycle=1) and ((count=n) or (G1<=tol));
  {this is the convergence condition to end loop at STEP 2}
end; {begin minimisation}
writeln('Exiting from Alg22.pas conjugate gradients minimiser');
writeln(' ', funcount, ' function evaluations used');
writeln(' ', gradcount, ' gradient evaluations used');
end; {alg22.pas == cgmin}
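The inverse interpolation at STEP 14 is worth restating. Writing $s_0$ for steplength, and noting that gradproj holds the directional derivative $\mathbf{g}^{\mathrm{T}}\mathbf{t}$, the code fits a parabola

$$q(s) = F_{\min} + (\mathbf{g}^{\mathrm{T}}\mathbf{t})\,s + a s^2$$

through the base point (value $F_{\min}$, slope gradproj at $s = 0$) and the trial point ($q(s_0) = f$), which gives

$$a = \frac{(f - F_{\min}) - (\mathbf{g}^{\mathrm{T}}\mathbf{t})\,s_0}{s_0^2}.$$

When $a > 0$ (the test newstep > 0 in the listing), the parabola opens upward and has its minimum at

$$s^{*} = -\frac{\mathbf{g}^{\mathrm{T}}\mathbf{t}}{2a} = \frac{-(\mathbf{g}^{\mathrm{T}}\mathbf{t})\,s_0^2}{2\,[(f - F_{\min}) - (\mathbf{g}^{\mathrm{T}}\mathbf{t})\,s_0]}$$

which is exactly the value assigned to newstep by the two statements above.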
Example 16.1. Conjugate gradients minimisation

In establishing a tentative regional model of hog supply in Canada, Dr J J Jaffrelot wanted to reconcile the sum of supply by region for a given year (an annual figure) with the sum of quarterly supply figures (not reported by regions!). The required sum can, by attaching minus signs, be written

$$T = \sum_{j=1}^{n} w_j b_j$$

where the $b_j$ are the parameters which should give $T = 0$ when summed as shown with the weights $w$. Given $w_j$, $j = 1, 2, \ldots, n$, $T$ can easily be made zero since there are $(n-1)$ degrees of freedom in $b$. However, some degree of confidence must be placed in the published figures, which we shall call $p_j$, $j = 1, 2, \ldots, n$. Thus, we wish to limit each $b_j$ so that

$$|b_j - p_j| < d_j \qquad \text{for } j = 1, 2, \ldots, n$$

where $d_j$ is some tolerance. Further, we shall try to make $b$ close to $p$ by minimising the function

$$\phi(b) = 100\,T^2 + \sum_{j=1}^{n} (b_j - p_j)^2 .$$

The factor 100 is arbitrary. Note that this is in fact a linear least-squares problem, subject to the constraints above.
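If the objective function has the form just given, the gradient which must be supplied to the minimiser (through fmingr in algorithm 22) is, for $k = 1, 2, \ldots, n$,

$$\frac{\partial \phi}{\partial b_k} = 200\,T\,w_k + 2\,(b_k - p_k).$$

The gradient components G printed in the output below are consistent with this form: at the solution $b \approx p$, each G($k$) is approximately $200\,T\,w_k$, so the components attached to large weights dominate.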
However, the conjugate gradients method is quite well suited to the particular problem in 23 parameters which was presented, since it can easily incorporate the tolerances $d_j$ by declaring the function to be 'not computable' if the constraints are violated. (In this example they do not in fact appear to come into play.) The output below was produced on a Data General ECLIPSE operating in six hexadecimal digit arithmetic. Variable 1 is used to hold the values $p$, variable 2 to hold the tolerances $d$ and variable 3 to hold the weights $w$. The number of data points is reported to be 24 and a zero has been appended to each of the above-mentioned variables to accommodate the particular way in which the objective function was computed.
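To illustrate the 'not computable' device, the following is a minimal sketch of how the function and gradient routines might be coded for this problem. It assumes the objective form given above; the vector and problem-data types are those required by alg22.pas (here called rvector and probdata), and the fields p, d and w holding variables 1, 2 and 3 are hypothetical names. Jaffrelot's actual routines are not reproduced here.

function fminfn(n: integer; var Bvec: rvector; var Workdata: probdata;
                 var notcomp: boolean): real;
{sketch of the reconciliation objective 100*T^2 + sum((b-p)^2);
 the field names p, d, w in Workdata are hypothetical}
var
  i: integer;
  T, S: real;
begin
  notcomp:=false;
  T:=0.0; S:=0.0;
  for i:=1 to n do
  begin
    if abs(Bvec[i]-Workdata.p[i])>Workdata.d[i] then
      notcomp:=true; {declare 'not computable' when a tolerance is violated}
    T:=T+Workdata.w[i]*Bvec[i]; {the weighted sum which should be zero}
    S:=S+sqr(Bvec[i]-Workdata.p[i]); {keeps b close to p}
  end;
  fminfn:=100.0*sqr(T)+S;
end; {fminfn}

procedure fmingr(n: integer; var Bvec: rvector; var Workdata: probdata;
                 var g: rvector);
{sketch of the matching gradient: g[k] = 200*T*w[k] + 2*(b[k]-p[k])}
var
  i: integer;
  T: real;
begin
  T:=0.0;
  for i:=1 to n do T:=T+Workdata.w[i]*Bvec[i];
  for i:=1 to n do
    g[i]:=200.0*T*Workdata.w[i]+2.0*(Bvec[i]-Workdata.p[i]);
end; {fmingr}

Note that the calling sequences match those used in the listing above, that is, fminfn(n, Bvec, Workdata, notcomp) and fmingr(n, Bvec, Workdata, g).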
The starting values for the parameters b are the obvious choices, that is, the reported values p. The minimisation program and data for this problem used less than 4000 bytes (characters) of memory.

* NEW
* ENTER"JJJRUN"
* RUN
12 7 1978 9 15 2
NCG JULY 26 77
CG + SUCCESS FAILURE
DATA FILE NAME ? D16.12
# OF VARIABLES 3
# DATA POINTS 24
# OF PARAMETERS 23
ENTER VARIABLES
VARIABLE 1 - COMMENT - THE PUBLISHED VALUES P
167.85 .895 167.85 .895 -99.69
167.85 .895 -74.33 167.85 .895
-4.8 -1.03 -1 3.42 -65.155
-.73 -.12 -20.85 1.2 -2.85
31.6 -20.66 -8.55 0
VARIABLE 2 - COMMENT - THE TOLERANCES D
65.2 5.5E-02 65.2 5.5E-02 20
65.2 5.5E-02 19.9 65.2 5.5E-02
1.6 .36 .34 1.5 10.18
5.51 .26 9.57 .27 .56
14.7 3.9 4.8 0
VARIABLE 3 - COMMENT - THE WEIGHTS W
1 1309.67 1 1388.87 1
1 1377.69 1 1 1251.02
15 119.197 215 29.776 15
806.229 1260 23.62 2761 2075
29.776 33.4 51.58 0
ENTER STARTING VALUES FOR PARAMETERS
B( 1 )= 167.85
B( 2 )= .895
B( 3 )= 167.85
B( 4 )= .895
B( 5 )=-99.69
B( 6 )= 167.85
B( 7 )= .895
B( 8 )=-74.33
B( 9 )= 167.85
B( 10 )= .895
B( 11 )=-4.8
B( 12 )=-1.03
B( 13 )=-1
B( 14 )= 3.42
B( 15 )=-65.155
B( 16 )=-.73
B( 17 )=-.12
B( 18 )=-20.85
B( 19 )= 1.2
B( 20 )=-2.85
B( 21 )= 31.6
B( 22 )=-20.66
B( 23 )=-8.55
STEPSIZE= 1
0 1 772798
1 21 5.76721E-04
2 31 5.76718E-04
3 31 5.76718E-04
4 42 5.76716E-04
5 42 5.76716E-04
6 45 5.76713E-04
7 45 5.76713E-04
8 48 5.76711E-04
9 48 5.76711E-04
CONVERGED TO 5.76711E-04 # ITNS= 10 # EFES= 290
B( 1 )= 167.85     G( 1 )= .148611
B( 2 )= .900395    G( 2 )= 194.637
B( 3 )= 167.85     G( 3 )= .148611
B( 4 )= .900721    G( 4 )= 206.407
B( 5 )=-99.69      G( 5 )= .148626
B( 6 )= 167.85     G( 6 )= .148611
B( 7 )= .900675    G( 7 )= 204.746
B( 8 )=-74.33      G( 8 )= .148626
B( 9 )= 167.85     G( 9 )= .148611
B( 10 )= .900153   G( 10 )= 185.92
B( 11 )=-4.79994   G( 11 )= 2.22923
B( 12 )=-1.02951   G( 12 )= 17.7145
B( 13 )=-.999114   G( 13 )= 31.9523
B( 14 )= 3.42012   G( 14 )= 4.42516
B( 15 )=-65.1549   G( 15 )= 2.22924
B( 16 )=-.726679   G( 16 )= 119.818
B( 17 )=-.114509   G( 17 )= 157.255
B( 18 )=-20.8499   G( 18 )= 3.5103
B( 19 )= 1.21137   G( 19 )= 410.326
B( 20 )=-2.84145   G( 20 )= 308.376
B( 21 )= 31.6001   G( 21 )= 4.42516
B( 22 )=-20.6598   G( 22 )= 4.36376
B( 23 )=-8.54979   G( 23 )= 7.66536
# EVALS= 50
STOP AT 0911
* SIZE
USED: 3626 BYTES
LEFT: 5760 BYTES
*

An earlier solution to this problem, obtained using a Data General NOVA operating in 23 binary digit arithmetic, had identical values for the parameters B but quite different values for the gradient components G.