Press, Teukolsky, Vetterling, Flannery - Numerical Recipes in C, page 101
The convergence tolerance on the function value is input as ftol. Returned quantities are p (the location of the minimum), iter (the number of iterations that were performed), and fret (the minimum value of the function). The routine linmin is called to perform line minimizations.

#include <math.h>
#include "nrutil.h"
#define ITMAX 200
#define EPS 1.0e-10
#define FREEALL free_vector(xi,1,n);free_vector(h,1,n);free_vector(g,1,n);

void frprmn(float p[], int n, float ftol, int *iter, float *fret,
    float (*func)(float []), void (*dfunc)(float [], float []))
{
    void linmin(float p[], float xi[], int n, float *fret,
        float (*func)(float []));
    int j,its;
    float gg,gam,fp,dgg;
    float *g,*h,*xi;

    g=vector(1,n);
    h=vector(1,n);
    xi=vector(1,n);
    fp=(*func)(p);                          /* Initializations. */
    (*dfunc)(p,xi);
    for (j=1;j<=n;j++) {
        g[j] = -xi[j];
        xi[j]=h[j]=g[j];
    }
    for (its=1;its<=ITMAX;its++) {          /* Loop over iterations. */
        *iter=its;
        linmin(p,xi,n,fret,func);
        /* Next statement is the normal return: */
        if (2.0*fabs(*fret-fp) <= ftol*(fabs(*fret)+fabs(fp)+EPS)) {
            FREEALL
            return;
        }
        fp= *fret;
        (*dfunc)(p,xi);
        dgg=gg=0.0;
        for (j=1;j<=n;j++) {
            gg += g[j]*g[j];
            /* dgg += xi[j]*xi[j]; */       /* This statement for Fletcher-Reeves. */
            dgg += (xi[j]+g[j])*xi[j];      /* This statement for Polak-Ribiere. */
        }
        if (gg == 0.0) {                    /* Unlikely. If gradient is exactly zero */
            FREEALL                         /* then we are already done. */
            return;
        }
        gam=dgg/gg;
        for (j=1;j<=n;j++) {
            g[j] = -xi[j];
            xi[j]=h[j]=g[j]+gam*h[j];
        }
    }
    nrerror("Too many iterations in frprmn");
}

Note on Line Minimization Using Derivatives

Kindly reread the last part of §10.5. We here want to do the same thing, but using derivative information in performing the line minimization.
The modified version of linmin, called dlinmin, and its required companion routine df1dim follow:

#include "nrutil.h"
#define TOL 2.0e-4                          /* Tolerance passed to dbrent. */

int ncom;                                   /* Global variables communicate with df1dim. */
float *pcom,*xicom,(*nrfunc)(float []);
void (*nrdfun)(float [], float []);

void dlinmin(float p[], float xi[], int n, float *fret, float (*func)(float []),
    void (*dfunc)(float [], float []))
Given an n-dimensional point p[1..n] and an n-dimensional direction xi[1..n], moves and resets p to where the function func(p) takes on a minimum along the direction xi from p, and replaces xi by the actual vector displacement that p was moved. Also returns as fret the value of func at the returned location p. This is actually all accomplished by calling the routines mnbrak and dbrent.
{
    float dbrent(float ax, float bx, float cx,
        float (*f)(float), float (*df)(float), float tol, float *xmin);
    float f1dim(float x);
    float df1dim(float x);
    void mnbrak(float *ax, float *bx, float *cx, float *fa, float *fb,
        float *fc, float (*func)(float));
    int j;
    float xx,xmin,fx,fb,fa,bx,ax;

    ncom=n;                                 /* Define the global variables. */
    pcom=vector(1,n);
    xicom=vector(1,n);
    nrfunc=func;
    nrdfun=dfunc;
    for (j=1;j<=n;j++) {
        pcom[j]=p[j];
        xicom[j]=xi[j];
    }
    ax=0.0;                                 /* Initial guess for brackets. */
    xx=1.0;
    mnbrak(&ax,&xx,&bx,&fa,&fx,&fb,f1dim);
    *fret=dbrent(ax,xx,bx,f1dim,df1dim,TOL,&xmin);
    for (j=1;j<=n;j++) {                    /* Construct the vector results to return. */
        xi[j] *= xmin;
        p[j] += xi[j];
    }
    free_vector(xicom,1,n);
    free_vector(pcom,1,n);
}

#include "nrutil.h"
extern int ncom;                            /* Defined in dlinmin. */
extern float *pcom,*xicom,(*nrfunc)(float []);
extern void (*nrdfun)(float [], float []);

float df1dim(float x)
{
    int j;
    float df1=0.0;
    float *xt,*df;

    xt=vector(1,ncom);
    df=vector(1,ncom);
    for (j=1;j<=ncom;j++) xt[j]=pcom[j]+x*xicom[j];
    (*nrdfun)(xt,df);
    for (j=1;j<=ncom;j++) df1 += df[j]*xicom[j];
    free_vector(df,1,ncom);
    free_vector(xt,1,ncom);
    return df1;
}

CITED REFERENCES AND FURTHER READING:

Polak, E.
1971, Computational Methods in Optimization (New York: Academic Press), §2.3. [1]
Jacobs, D.A.H. (ed.) 1977, The State of the Art in Numerical Analysis (London: Academic Press), Chapter III.1.7 (by K.W. Brodlie). [2]
Stoer, J., and Bulirsch, R. 1980, Introduction to Numerical Analysis (New York: Springer-Verlag), §8.7.

10.7 Variable Metric Methods in Multidimensions

The goal of variable metric methods, which are sometimes called quasi-Newton methods, is not different from the goal of conjugate gradient methods: to accumulate information from successive line minimizations so that N such line minimizations lead to the exact minimum of a quadratic form in N dimensions. In that case, the method will also be quadratically convergent for more general smooth functions. Both variable metric and conjugate gradient methods require that you are able to compute your function's gradient, or first partial derivatives, at arbitrary points.
The variable metric approach differs from the conjugate gradient in the way that it stores and updates the information that is accumulated. Instead of requiring intermediate storage on the order of N, the number of dimensions, it requires a matrix of size N × N.
Generally, for any moderate N, this is an entirely trivial disadvantage. On the other hand, there is not, as far as we know, any overwhelming advantage that the variable metric methods hold over the conjugate gradient techniques, except perhaps a historical one. Developed somewhat earlier, and more widely propagated, the variable metric methods have by now developed a wider constituency of satisfied users. Likewise, some fancier implementations of variable metric methods (going beyond the scope of this book, see below) have been developed to a greater level of sophistication on issues like the minimization of roundoff error, handling of special conditions, and so on.
We tend to use variable metric rather than conjugate gradient, but we have no reason to urge this habit on you.
Variable metric methods come in two main flavors. One is the Davidon-Fletcher-Powell (DFP) algorithm (sometimes referred to as simply Fletcher-Powell). The other goes by the name Broyden-Fletcher-Goldfarb-Shanno (BFGS). The BFGS and DFP schemes differ only in details of their roundoff error, convergence tolerances, and similar "dirty" issues which are outside of our scope [1,2]. However, it has become generally recognized that, empirically, the BFGS scheme is superior in these details.
We will implement BFGS in this section.
As before, we imagine that our arbitrary function f(x) can be locally approximated by the quadratic form of equation (10.6.1). We don't, however, have any information about the values of the quadratic form's parameters A and b, except insofar as we can glean such information from our function evaluations and line minimizations.
The basic idea of the variable metric method is to build up, iteratively, a good approximation to the inverse Hessian matrix A^{-1}, that is, to construct a sequence of matrices H_i with the property,

    lim_{i→∞} H_i = A^{-1}                                              (10.7.1)

Even better if the limit is achieved after N iterations instead of ∞.
The reason that variable metric methods are sometimes called quasi-Newton methods can now be explained.
Consider finding a minimum by using Newton's method to search for a zero of the gradient of the function. Near the current point x_i, we have to second order

    f(x) = f(x_i) + (x − x_i) · ∇f(x_i) + (1/2)(x − x_i) · A · (x − x_i)   (10.7.2)

so

    ∇f(x) = ∇f(x_i) + A · (x − x_i)                                     (10.7.3)

In Newton's method we set ∇f(x) = 0 to determine the next iteration point:

    x − x_i = −A^{-1} · ∇f(x_i)                                         (10.7.4)

The left-hand side is the finite step we need take to get to the exact minimum; the right-hand side is known once we have accumulated an accurate H ≈ A^{-1}.
The "quasi" in quasi-Newton is because we don't use the actual Hessian matrix of f, but instead use our current approximation of it.
This is often better than using the true Hessian. We can understand this paradoxical result by considering the descent directions of f at x_i. These are the directions p along which f decreases: ∇f · p < 0. For the Newton direction (10.7.4) to be a descent direction, we must have

    ∇f(x_i) · (x − x_i) = −(x − x_i) · A · (x − x_i) < 0                (10.7.5)

that is, A must be positive definite. In general, far from a minimum, we have no guarantee that the Hessian is positive definite. Taking the actual Newton step with the real Hessian can move us to points where the function is increasing in value. The idea behind quasi-Newton methods is to start with a positive definite, symmetric approximation to A (usually the unit matrix) and build up the approximating H_i's in such a way that the matrix H_i remains positive definite and symmetric. Far from the minimum, this guarantees that we always move in a downhill direction.
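The role of positive definiteness in (10.7.5) can be seen in a two-line computation. The diagonal Hessian and the gradient values below are made up for illustration: with an indefinite A the Newton direction can point uphill, while a positive definite A always gives ∇f · p < 0.

```c
#include <assert.h>

/* Directional derivative grad.p for the Newton direction p = -A^{-1} grad,
   using a diagonal A = diag(a11, a22) so the inverse is trivial.
   A negative return value means p is a descent direction. */
double newton_slope(double a11, double a22, double g1, double g2) {
    double p1 = -g1 / a11;      /* Newton direction p = -A^{-1} g */
    double p2 = -g2 / a22;
    return g1*p1 + g2*p2;       /* equals -(g1*g1/a11 + g2*g2/a22) */
}
```

For the indefinite case A = diag(1, −2) with gradient (0.1, 2), the slope is −0.01 + 2 = 1.99 > 0: the "Newton" step initially moves uphill, exactly the failure mode described above.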
Close to the minimum, the updating formula approaches the true Hessian and we enjoy the quadratic convergence of Newton's method.
When we are not close enough to the minimum, taking the full Newton step p even with a positive definite A need not decrease the function; we may move too far for the quadratic approximation to be valid. All we are guaranteed is that initially f decreases as we move in the Newton direction. Once again we can use the backtracking strategy described in §9.7 to choose a step along the direction of the Newton step p, but not necessarily all the way.
We won't rigorously derive the DFP algorithm for taking H_i into H_{i+1}; you can consult [3] for clear derivations.
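Although the derivation is deferred to [3], the standard DFP update itself is short: H_new = H + s sᵀ/(s·y) − (H y)(H y)ᵀ/(y·H y), where s is the step just taken and y is the resulting change in gradient. The sketch below is an illustration, not the book's routine: a made-up 2×2 quadratic, exact line minimizations standing in for linmin, and H started at the unit matrix as suggested above. On a quadratic with exact line searches, H reaches A^{-1} after N updates, the "even better" case following equation (10.7.1).

```c
#include <assert.h>
#include <math.h>

/* DFP on f(x) = 0.5 x'Ax - b'x with exact line searches, starting from H = I.
   After N = 2 updates H should equal A^{-1} (cf. equation 10.7.1). */
static double QA[2][2] = {{3.0, 1.0}, {1.0, 2.0}};
static double Qb[2] = {1.0, 1.0};

static void mv(double M[2][2], double v[2], double out[2]) {
    out[0] = M[0][0]*v[0] + M[0][1]*v[1];
    out[1] = M[1][0]*v[0] + M[1][1]*v[1];
}
static double dp(double u[2], double v[2]) { return u[0]*v[0] + u[1]*v[1]; }

/* Runs two DFP steps from the origin; leaves the final inverse-Hessian
   approximation in H. */
void dfp_quadratic(double H[2][2]) {
    double x[2] = {0.0, 0.0}, g[2], gnew[2], d[2], Ad[2], s[2], y[2], Hy[2];
    int i, j, k;

    H[0][0] = H[1][1] = 1.0;                        /* start from the unit matrix */
    H[0][1] = H[1][0] = 0.0;
    mv(QA, x, g); g[0] -= Qb[0]; g[1] -= Qb[1];     /* grad f = A x - b */
    for (k = 0; k < 2; k++) {
        double alpha, sy, yHy;
        d[0] = -(H[0][0]*g[0] + H[0][1]*g[1]);      /* quasi-Newton direction -H g */
        d[1] = -(H[1][0]*g[0] + H[1][1]*g[1]);
        mv(QA, d, Ad);
        alpha = -dp(g, d) / dp(d, Ad);              /* exact line minimization */
        for (j = 0; j < 2; j++) { s[j] = alpha*d[j]; x[j] += s[j]; }
        mv(QA, x, gnew); gnew[0] -= Qb[0]; gnew[1] -= Qb[1];
        for (j = 0; j < 2; j++) y[j] = gnew[j] - g[j];
        mv(H, y, Hy);
        sy = dp(s, y); yHy = dp(y, Hy);
        for (i = 0; i < 2; i++)                     /* DFP rank-two update */
            for (j = 0; j < 2; j++)
                H[i][j] += s[i]*s[j]/sy - Hy[i]*Hy[j]/yHy;
        g[0] = gnew[0]; g[1] = gnew[1];
    }
}
```

For QA above, A^{-1} = [[0.4, −0.2], [−0.2, 0.6]], and that is what the two updates produce, to machine precision.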