Press, Teukolsly, Vetterling, Flannery - Numerical Recipes in C, страница 102
Описание файла
PDF-файл из архива "Press, Teukolsly, Vetterling, Flannery - Numerical Recipes in C", который расположен в категории "". Всё это находится в предмете "численные методы" из 6 семестр, которые можно найти в файловом архиве . Не смотря на прямую связь этого архива с , его также можно найти и в других разделах. Архив можно найти в разделе "книги и методические указания", в предмете "численные методы и алгоритмы" в общих файлах.
Просмотр PDF-файла онлайн
Текст 102 страницы из PDF
Following Brodlie (in [2]), we will give thefollowing heuristic motivation of the procedure.Subtracting equation (10.7.4) at xi+1 from that same equation at xi givesxi+1 − xi = A−1 · (∇fi+1 − ∇fi )(10.7.6)where ∇fj ≡ ∇f(xj ). Having made the step from xi to xi+1 , we might reasonablywant to require that the new approximation Hi+1 satisfy (10.7.6) as if it wereactually A−1 , that is,xi+1 − xi = Hi+1 · (∇fi+1 − ∇fi )(10.7.7)We might also imagine that the updating formula should be of the form Hi+1 =Hi + correction.What “objects” are around out of which to construct a correction term? Mostnotable are the two vectors xi+1 − xi and ∇fi+1 − ∇fi ; and there is also Hi .There are not infinitely many natural ways of making a matrix out of these objects,especially if (10.7.7) must hold! One such way, the DFP updating formula, is(xi+1 − xi ) ⊗ (xi+1 − xi )(xi+1 − xi ) · (∇fi+1 − ∇fi )[Hi · (∇fi+1 − ∇fi )] ⊗ [Hi · (∇fi+1 − ∇fi )]−(∇fi+1 − ∇fi ) · Hi · (∇fi+1 − ∇fi )Hi+1 = Hi +(10.7.8)where ⊗ denotes the “outer” or “direct” product of two vectors, a matrix: The ijcomponent of u⊗v is ui vj .
(You might want to verify that 10.7.8 does satisfy 10.7.7.)428Chapter 10.Minimization or Maximization of FunctionsThe BFGS updating formula is exactly the same, but with one additional term,· · · + [(∇fi+1 − ∇fi ) · Hi · (∇fi+1 − ∇fi )] u ⊗ uwhere u is defined as the vector(xi+1 − xi )u≡(xi+1 − xi ) · (∇fi+1 − ∇fi )Hi · (∇fi+1 − ∇fi )−(∇fi+1 − ∇fi ) · Hi · (∇fi+1 − ∇fi )(10.7.9)(10.7.10)(You might also verify that this satisfies 10.7.7.)You will have to take on faith — or else consult [3] for details of — the “deep”result that equation (10.7.8), with or without (10.7.9), does in fact converge to A−1in N steps, if f is a quadratic form.Here now is the routine dfpmin that implements the quasi-Newton method, anduses lnsrch from §9.7.
As mentioned at the end of newt in §9.7, this algorithmcan fail if your variables are badly scaled.#include <math.h>#include "nrutil.h"#define ITMAX 200#define EPS 3.0e-8#define TOLX (4*EPS)#define STPMX 100.0Maximum allowed number of iterations.Machine precision.Convergence criterion on x values.Scaled maximum step length allowed inline searches.#define FREEALL free_vector(xi,1,n);free_vector(pnew,1,n); \free_matrix(hessin,1,n,1,n);free_vector(hdg,1,n);free_vector(g,1,n); \free_vector(dg,1,n);void dfpmin(float p[], int n, float gtol, int *iter, float *fret,float(*func)(float []), void (*dfunc)(float [], float []))Given a starting point p[1..n] that is a vector of length n, the Broyden-Fletcher-GoldfarbShanno variant of Davidon-Fletcher-Powell minimization is performed on a function func, usingits gradient as calculated by a routine dfunc.
The convergence requirement on zeroing thegradient is input as gtol. Returned quantities are p[1..n] (the location of the minimum),iter (the number of iterations that were performed), and fret (the minimum value of thefunction). The routine lnsrch is called to perform approximate line minimizations.{void lnsrch(int n, float xold[], float fold, float g[], float p[], float x[],float *f, float stpmax, int *check, float (*func)(float []));int check,i,its,j;float den,fac,fad,fae,fp,stpmax,sum=0.0,sumdg,sumxi,temp,test;float *dg,*g,*hdg,**hessin,*pnew,*xi;dg=vector(1,n);g=vector(1,n);hdg=vector(1,n);hessin=matrix(1,n,1,n);pnew=vector(1,n);xi=vector(1,n);fp=(*func)(p);Calculate starting function value and gra(*dfunc)(p,g);dient,for (i=1;i<=n;i++) {and initialize the inverse Hessian to thefor (j=1;j<=n;j++) hessin[i][j]=0.0;unit matrix.hessin[i][i]=1.0;xi[i] = -g[i];Initial line direction.sum += p[i]*p[i];}stpmax=STPMX*FMAX(sqrt(sum),(float)n);10.7 Variable Metric Methods in Multidimensions429for (its=1;its<=ITMAX;its++) {Main loop over the iterations.*iter=its;lnsrch(n,p,fp,g,xi,pnew,fret,stpmax,&check,func);The new function evaluation occurs in lnsrch; save the function value in fp for thenext line search.
It is usually safe to ignore the value of check.fp = *fret;for (i=1;i<=n;i++) {xi[i]=pnew[i]-p[i];Update the line direction,p[i]=pnew[i];and the current point.}test=0.0;Test for convergence on ∆x.for (i=1;i<=n;i++) {temp=fabs(xi[i])/FMAX(fabs(p[i]),1.0);if (temp > test) test=temp;}if (test < TOLX) {FREEALLreturn;}for (i=1;i<=n;i++) dg[i]=g[i];Save the old gradient,(*dfunc)(p,g);and get the new gradient.test=0.0;Test for convergence on zero gradient.den=FMAX(*fret,1.0);for (i=1;i<=n;i++) {temp=fabs(g[i])*FMAX(fabs(p[i]),1.0)/den;if (temp > test) test=temp;}if (test < gtol) {FREEALLreturn;}for (i=1;i<=n;i++) dg[i]=g[i]-dg[i]; Compute difference of gradients,for (i=1;i<=n;i++) {and difference times current matrix.hdg[i]=0.0;for (j=1;j<=n;j++) hdg[i] += hessin[i][j]*dg[j];}fac=fae=sumdg=sumxi=0.0;Calculate dot products for the denomifor (i=1;i<=n;i++) {nators.fac += dg[i]*xi[i];fae += dg[i]*hdg[i];sumdg += SQR(dg[i]);sumxi += SQR(xi[i]);}if (fac > sqrt(EPS*sumdg*sumxi)) {Skip update if fac not sufficiently posifac=1.0/fac;tive.fad=1.0/fae;The vector that makes BFGS different from DFP:for (i=1;i<=n;i++) dg[i]=fac*xi[i]-fad*hdg[i];for (i=1;i<=n;i++) {The BFGS updating formula:for (j=i;j<=n;j++) {hessin[i][j] += fac*xi[i]*xi[j]-fad*hdg[i]*hdg[j]+fae*dg[i]*dg[j];hessin[j][i]=hessin[i][j];}}}for (i=1;i<=n;i++) {Now calculate the next direction to go,xi[i]=0.0;for (j=1;j<=n;j++) xi[i] -= hessin[i][j]*g[j];}}and go back for another iteration.nrerror("too many iterations in dfpmin");FREEALL}430Chapter 10.Minimization or Maximization of FunctionsQuasi-Newton methods like dfpmin work well with the approximate lineminimization done by lnsrch.
The routines powell (§10.5) and frprmn (§10.6),however, need more accurate line minimization, which is carried out by the routinelinmin.Advanced Implementations of Variable Metric MethodsAlthough rare, it can conceivably happen that roundoff errors cause the matrix Hi tobecome nearly singular or non-positive-definite. This can be serious, because the supposedsearch directions might then not lead downhill, and because nearly singular Hi ’s tend to givesubsequent Hi ’s that are also nearly singular.There is a simple fix for this rare problem, the same as was mentioned in §10.4: In caseof any doubt, you should restart the algorithm at the claimed minimum point, and see if itgoes anywhere.
Simple, but not very elegant. Modern implementations of variable metricmethods deal with the problem in a more sophisticated way.Instead of building up an approximation to A−1 , it is possible to build up an approximationof A itself. Then, instead of calculating the left-hand side of (10.7.4) directly, one solvesthe set of linear equationsA · (xm − xi ) = −∇f (xi )(10.7.11)At first glance this seems like a bad idea, since solving (10.7.11) is a process of orderN 3 — and anyway, how does this help the roundoff problem? The trick is not to store A butrather a triangular decomposition of A, its Cholesky decomposition (cf.
§2.9). The updatingformula used for the Cholesky decomposition of A is of order N 2 and can be arranged toguarantee that the matrix remains positive definite and nonsingular, even in the presence offinite roundoff. This method is due to Gill and Murray [1,2] .CITED REFERENCES AND FURTHER READING:Dennis, J.E., and Schnabel, R.B. 1983, Numerical Methods for Unconstrained Optimization andNonlinear Equations (Englewood Cliffs, NJ: Prentice-Hall). [1]Jacobs, D.A.H. (ed.) 1977, The State of the Art in Numerical Analysis (London: Academic Press),Chapter III.1, §§3–6 (by K.
W. Brodlie). [2]Polak, E. 1971, Computational Methods in Optimization (New York: Academic Press), pp. 56ff. [3]Acton, F.S. 1970, Numerical Methods That Work; 1990, corrected edition<b>Текст обрезан, так как является слишком большим</b>.