It also has an additional possibility of being ignorably small in practice: The term multiplying the second derivative in equation (15.5.7) is [y_i − y(x_i; a)]. For a successful model, this term should just be the random measurement error of each point. This error can have either sign, and should in general be uncorrelated with the model. Therefore, the second derivative terms tend to cancel out when summed over i.

Inclusion of the second-derivative term can in fact be destabilizing if the model fits badly or is contaminated by outlier points that are unlikely to be offset by compensating points of opposite sign. From this point on, we will always use as the definition of α_kl the formula

α_kl = Σ_{i=1}^{N} (1/σ_i²) [∂y(x_i; a)/∂a_k] [∂y(x_i; a)/∂a_l]                (15.5.11)

Levenberg-Marquardt Method

Marquardt has put forth an elegant method, related to an earlier suggestion of Levenberg, for varying smoothly between the extremes of the inverse-Hessian method (15.5.9) and the steepest descent method (15.5.10). His first insight concerns the scale of the constant of proportionality in the steepest-descent step (15.5.10): χ² is nondimensional, and β_k has the dimensions of 1/a_k, so that constant must have the dimensions of a_k². Scan the components of [α] and you see that there is only one obvious quantity with these dimensions, and that is 1/α_kk, the reciprocal of the diagonal element.
So that must set the scale of the constant. But that scale might itself be too big. So let's divide the constant by some (nondimensional) fudge factor λ, with the possibility of setting λ ≫ 1 to cut down the step. In other words, replace equation (15.5.10) by

δa_l = β_l / (λ α_ll)        or        λ α_ll δa_l = β_l                (15.5.12)

It is necessary that α_ll be positive, but this is guaranteed by definition (15.5.11) — another reason for adopting that equation.

Marquardt's second insight is that equations (15.5.12) and (15.5.9) can be combined if we define a new matrix α′ by the following prescription

α′_jj ≡ α_jj (1 + λ)
α′_jk ≡ α_jk        (j ≠ k)                (15.5.13)

and then replace both (15.5.12) and (15.5.9) by

Σ_{l=1}^{M} α′_kl δa_l = β_k                (15.5.14)

When λ is very large, the matrix α′ is forced into being diagonally dominant, so equation (15.5.14) goes over to be identical to (15.5.12). On the other hand, as λ approaches zero, equation (15.5.14) goes over to (15.5.9).

Given an initial guess for the set of fitted parameters a, the recommended Marquardt recipe is as follows (a code sketch of this loop appears below):

• Compute χ²(a).
• Pick a modest value for λ, say λ = 0.001.
• (†) Solve the linear equations (15.5.14) for δa and evaluate χ²(a + δa).
• If χ²(a + δa) ≥ χ²(a), increase λ by a factor of 10 (or any other substantial factor) and go back to (†).
• If χ²(a + δa) < χ²(a), decrease λ by a factor of 10, update the trial solution a ← a + δa, and go back to (†).
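To make the recipe concrete, here is a minimal self-contained sketch in C, fitting the two-parameter model y(x; a) = a1 exp(a2 x) to synthetic data. The routine names (model, mrqeval), the toy model, and the tolerances are illustrative assumptions, not the book's routines; α and β are accumulated as in (15.5.11), the diagonal is augmented as in (15.5.13), and the 2×2 system (15.5.14) is solved by Cramer's rule.

    #include <math.h>
    #include <stdio.h>

    /* Model y(x;a) = a1*exp(a2*x) and its derivatives with respect to a1, a2. */
    static double model(double x, const double a[2], double dyda[2])
    {
        double e = exp(a[1]*x);
        dyda[0] = e;              /* dy/da1 */
        dyda[1] = a[0]*x*e;       /* dy/da2 */
        return a[0]*e;
    }

    /* Build chi-square, the vector beta, and the matrix alpha of (15.5.11). */
    static double mrqeval(const double x[], const double y[], const double sig[],
                          int n, const double a[2], double alpha[2][2], double beta[2])
    {
        int i, j, k;
        double chisq = 0.0, dyda[2];
        for (j = 0; j < 2; j++) { beta[j] = 0.0; for (k = 0; k < 2; k++) alpha[j][k] = 0.0; }
        for (i = 0; i < n; i++) {
            double ymod = model(x[i], a, dyda);
            double sig2i = 1.0/(sig[i]*sig[i]);
            double dy = y[i] - ymod;
            for (j = 0; j < 2; j++) {
                beta[j] += dy*dyda[j]*sig2i;
                for (k = 0; k < 2; k++) alpha[j][k] += dyda[j]*dyda[k]*sig2i;
            }
            chisq += dy*dy*sig2i;
        }
        return chisq;
    }

    int main(void)
    {
        /* Synthetic data from y = 2*exp(-1.5*x); unit standard deviations. */
        double x[6] = {0.0, 0.2, 0.4, 0.6, 0.8, 1.0};
        double y[6], sig[6];
        double a[2] = {1.0, -1.0};                 /* initial guess for (a1, a2) */
        double alpha[2][2], beta[2], lambda = 0.001;
        int i, j, k, iter;

        for (i = 0; i < 6; i++) { y[i] = 2.0*exp(-1.5*x[i]); sig[i] = 1.0; }
        double chisq = mrqeval(x, y, sig, 6, a, alpha, beta);

        for (iter = 0; iter < 100; iter++) {
            /* Augment the diagonal as in (15.5.13); solve the 2x2 system (15.5.14). */
            double a00 = alpha[0][0]*(1.0 + lambda), a11 = alpha[1][1]*(1.0 + lambda);
            double a01 = alpha[0][1], a10 = alpha[1][0];
            double det = a00*a11 - a01*a10;
            double da[2];
            da[0] = (beta[0]*a11 - a01*beta[1])/det;
            da[1] = (a00*beta[1] - a10*beta[0])/det;

            double atry[2] = { a[0] + da[0], a[1] + da[1] };
            double alphat[2][2], betat[2];
            double chitry = mrqeval(x, y, sig, 6, atry, alphat, betat);

            if (chitry < chisq) {                  /* success: accept step, decrease lambda */
                double improve = chisq - chitry;
                lambda *= 0.1;
                chisq = chitry;
                for (j = 0; j < 2; j++) {
                    a[j] = atry[j];
                    beta[j] = betat[j];
                    for (k = 0; k < 2; k++) alpha[j][k] = alphat[j][k];
                }
                if (improve < 1.0e-10) break;      /* negligible decrease: stop iterating */
            } else {
                lambda *= 10.0;                    /* failure: keep a, increase lambda */
            }
        }
        printf("fitted a1 = %g, a2 = %g, chi-square = %g\n", a[0], a[1], chisq);
        return 0;
    }

For more than a couple of parameters one would solve (15.5.14) with a general linear solver such as gaussj, which is exactly what the routine mrqmin given later in this section does.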
Also necessary is a condition for stopping. Iterating to convergence (to machine accuracy or to the roundoff limit) is generally wasteful and unnecessary since the minimum is at best only a statistical estimate of the parameters a. As we will see in §15.6, a change in the parameters that changes χ² by an amount ≪ 1 is never statistically meaningful.

Furthermore, it is not uncommon to find the parameters wandering around near the minimum in a flat valley of complicated topography. The reason is that Marquardt's method generalizes the method of normal equations (§15.4), hence has the same problem as that method with regard to near-degeneracy of the minimum.
Outright failure by a zero pivot is possible, but unlikely. More often, a small pivot will generate a large correction which is then rejected, the value of λ being then increased. For sufficiently large λ the matrix [α′] is positive definite and can have no small pivots. Thus the method does tend to stay away from zero pivots, but at the cost of a tendency to wander around doing steepest descent in very un-steep degenerate valleys.

These considerations suggest that, in practice, one might as well stop iterating on the first or second occasion that χ² decreases by a negligible amount, say either less than 0.01 absolutely or (in case roundoff prevents that being reached) some fractional amount like 10⁻³. Don't stop after a step where χ² increases: That only shows that λ has not yet adjusted itself optimally.
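The stopping rule just stated is simple to encode; the following helper is an illustrative sketch (its name and exact thresholds are assumptions, not part of the Numerical Recipes routines):

    /* Returns nonzero when the latest successful step decreased chi-square by a
       negligible amount: less than 0.01 absolutely, or less than a fractional 1.0e-3. */
    static int chisq_step_negligible(float ochisq, float chisq)
    {
        float dchi = ochisq - chisq;     /* positive for a downhill step */
        return dchi < 0.01 || dchi < 1.0e-3*ochisq;
    }

In practice one would require such a test to succeed on one or two consecutive downhill steps before making the final λ = 0 call, as in the driver sketched at the end of this section.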
Once the acceptable minimum has been found, one wants to set λ = 0 and compute the matrix

[C] ≡ [α]⁻¹                (15.5.15)

which, as before, is the estimated covariance matrix of the standard errors in the fitted parameters a (see next section).
The following pair of functions encodes Marquardt's method for nonlinear parameter estimation. Much of the organization matches that used in lfit of §15.4. In particular the array ia[1..ma] must be input with components one or zero corresponding to whether the respective parameter values a[1..ma] are to be fitted for or held fixed at their input values, respectively.

The routine mrqmin performs one iteration of Marquardt's method. It is first called (once) with alamda < 0, which signals the routine to initialize. alamda is set on the first and all subsequent calls to the suggested value of λ for the next iteration; a and chisq are always given back as the best parameters found so far and their χ². When convergence is deemed satisfactory, set alamda to zero before a final call. The matrices alpha and covar (which were used as workspace in all previous calls) will then be set to the curvature and covariance matrices for the converged parameter values. The arguments alpha, a, and chisq must not be modified between calls, nor should alamda be, except to set it to zero for the final call.
When an uphill step is taken, chisq and a are given back with their input (best) values, but alamda is set to an increased value.

The routine mrqmin calls the routine mrqcof for the computation of the matrix [α] (equation 15.5.11) and vector β (equations 15.5.6 and 15.5.8). In turn mrqcof calls the user-supplied routine funcs(x,a,y,dyda), which for input values x ≡ x_i and a ≡ a calculates the model function y ≡ y(x_i; a) and the vector of derivatives dyda ≡ ∂y/∂a_k.

    #include "nrutil.h"

    void mrqmin(float x[], float y[], float sig[], int ndata, float a[], int ia[],
        int ma, float **covar, float **alpha, float *chisq,
        void (*funcs)(float, float [], float *, float [], int), float *alamda)
    /* Levenberg-Marquardt method, attempting to reduce the value χ² of a fit between a set of
    data points x[1..ndata], y[1..ndata] with individual standard deviations sig[1..ndata],
    and a nonlinear function dependent on ma coefficients a[1..ma]. The input array ia[1..ma]
    indicates by nonzero entries those components of a that should be fitted for, and by zero
    entries those components that should be held fixed at their input values. The program
    returns current best-fit values for the parameters a[1..ma], and χ² = chisq. The arrays
    covar[1..ma][1..ma], alpha[1..ma][1..ma] are used as working space during most iterations.
    Supply a routine funcs(x,a,yfit,dyda,ma) that evaluates the fitting function yfit, and its
    derivatives dyda[1..ma] with respect to the fitting parameters a at x. On the first call
    provide an initial guess for the parameters a, and set alamda<0 for initialization (which
    then sets alamda=.001). If a step succeeds chisq becomes smaller and alamda decreases by a
    factor of 10. If a step fails alamda grows by a factor of 10. You must call this routine
    repeatedly until convergence is achieved. Then, make one final call with alamda=0, so that
    covar[1..ma][1..ma] returns the covariance matrix, and alpha the curvature matrix.
    (Parameters held fixed will return zero covariances.) */
    {
        void covsrt(float **covar, int ma, int ia[], int mfit);
        void gaussj(float **a, int n, float **b, int m);
        void mrqcof(float x[], float y[], float sig[], int ndata, float a[],
            int ia[], int ma, float **alpha, float beta[], float *chisq,
            void (*funcs)(float, float [], float *, float [], int));
        int j,k,l;
        static int mfit;
        static float ochisq,*atry,*beta,*da,**oneda;

        if (*alamda < 0.0) {                     /* Initialization. */
            atry=vector(1,ma);
            beta=vector(1,ma);
            da=vector(1,ma);
            for (mfit=0,j=1;j<=ma;j++)
                if (ia[j]) mfit++;
            oneda=matrix(1,mfit,1,1);
            *alamda=0.001;
            mrqcof(x,y,sig,ndata,a,ia,ma,alpha,beta,chisq,funcs);
            ochisq=(*chisq);
            for (j=1;j<=ma;j++) atry[j]=a[j];
        }
        for (j=1;j<=mfit;j++) {                  /* Alter linearized fitting matrix, by */
            for (k=1;k<=mfit;k++) covar[j][k]=alpha[j][k];  /* augmenting diagonal elements. */
            covar[j][j]=alpha[j][j]*(1.0+(*alamda));
            oneda[j][1]=beta[j];
        }
        gaussj(covar,mfit,oneda,1);              /* Matrix solution. */
        for (j=1;j<=mfit;j++) da[j]=oneda[j][1];
        if (*alamda == 0.0) {                    /* Once converged, evaluate covariance matrix. */
            covsrt(covar,ma,ia,mfit);
            covsrt(alpha,ma,ia,mfit);            /* Spread out alpha to its full size too. */
            free_matrix(oneda,1,mfit,1,1);
            free_vector(da,1,ma);
            free_vector(beta,1,ma);
            free_vector(atry,1,ma);
            return;
        }
        for (j=0,l=1;l<=ma;l++)                  /* Did the trial succeed? */
            if (ia[l]) atry[l]=a[l]+da[++j];
        mrqcof(x,y,sig,ndata,atry,ia,ma,covar,da,chisq,funcs);
        if (*chisq < ochisq) {                   /* Success, accept the new solution. */
            *alamda *= 0.1;
            ochisq=(*chisq);
            for (j=1;j<=mfit;j++) {
                for (k=1;k<=mfit;k++) alpha[j][k]=covar[j][k];
                beta[j]=da[j];
            }
            for (l=1;l<=ma;l++) a[l]=atry[l];
        } else {                                 /* Failure, increase alamda and return. */
            *alamda *= 10.0;
            *chisq=ochisq;
        }
    }

Notice the use of the routine covsrt from §15.4.
This is merely for rearranging the covariance matrix covar into the order of all ma parameters. The above routine also makes use of
    #include "nrutil.h"

    void mrqcof(float x[], float y[], float sig[], int ndata, float a[], int ia[],
        int ma, float **alpha, float beta[], float *chisq,
        void (*funcs)(float, float [], float *, float [], int))
    /* Used by mrqmin to evaluate the linearized fitting matrix alpha and vector beta,
    and to calculate χ². */
    {
        int i,j,k,l,m,mfit=0;
        float ymod,wt,sig2i,dy,*dyda;

        dyda=vector(1,ma);
        for (j=1;j<=ma;j++)
            if (ia[j]) mfit++;
        for (j=1;j<=mfit;j++) {                  /* Initialize (symmetric) alpha, beta. */
            for (k=1;k<=j;k++) alpha[j][k]=0.0;
            beta[j]=0.0;
        }
        *chisq=0.0;
        for (i=1;i<=ndata;i++) {                 /* Summation loop over all data. */
            (*funcs)(x[i],a,&ymod,dyda,ma);
            sig2i=1.0/(sig[i]*sig[i]);
            dy=y[i]-ymod;
            for (j=0,l=1;l<=ma;l++) {
                if (ia[l]) {
                    wt=dyda[l]*sig2i;
                    for (j++,k=0,m=1;m<=l;m++)
                        if (ia[m]) alpha[j][++k] += wt*dyda[m];
                    beta[j] += dy*wt;
                }
            }
            *chisq += dy*dy*sig2i;               /* And find χ². */
        }
        for (j=2;j<=mfit;j++)                    /* Fill in the symmetric side. */
            for (k=1;k<j;k++) alpha[k][j]=alpha[j][k];
        free_vector(dyda,1,ma);
    }

Example

The following function fgauss is an example of a user-supplied function funcs.
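The fgauss listing itself lies beyond this excerpt. To show how the calling sequence described above fits together, here is a hedged sketch of a driver with a stand-in single-Gaussian funcs; the routine name gfunc, the synthetic data, and the simple convergence test are illustrative assumptions, while vector, matrix, free_vector, free_matrix, and mrqmin are the Numerical Recipes routines used above.

    #include <math.h>
    #include <stdio.h>
    #include "nrutil.h"

    /* Illustrative user-supplied funcs: a single Gaussian y = a1*exp(-((x-a2)/a3)^2),
       returning the model value in *y and the derivatives dy/da[k] in dyda[1..3]. */
    void gfunc(float x, float a[], float *y, float dyda[], int na)
    {
        float arg = (x - a[2])/a[3];
        float ex = exp(-arg*arg);
        *y = a[1]*ex;
        dyda[1] = ex;                           /* dy/da1 */
        dyda[2] = 2.0*a[1]*ex*arg/a[3];         /* dy/da2 */
        dyda[3] = 2.0*a[1]*ex*arg*arg/a[3];     /* dy/da3 */
    }

    int main(void)
    {
        int i, itst = 0, ma = 3, ndata = 20, ia[4] = {0,1,1,1};
        float alamda = -1.0, chisq, ochisq;
        float *x = vector(1,ndata), *y = vector(1,ndata), *sig = vector(1,ndata);
        float a[4] = {0.0, 4.0, 2.0, 1.5};       /* initial guess for (a1,a2,a3) */
        float **covar = matrix(1,ma,1,ma), **alpha = matrix(1,ma,1,ma);

        for (i = 1; i <= ndata; i++) {           /* synthetic data: a1=5, a2=2.5, a3=1 */
            x[i] = 0.25*i;
            y[i] = 5.0*exp(-(x[i]-2.5)*(x[i]-2.5));
            sig[i] = 0.1;
        }

        mrqmin(x,y,sig,ndata,a,ia,ma,covar,alpha,&chisq,gfunc,&alamda);  /* initialize */
        do {    /* iterate until chi-square stops improving by more than a negligible amount */
            ochisq = chisq;
            mrqmin(x,y,sig,ndata,a,ia,ma,covar,alpha,&chisq,gfunc,&alamda);
            if (ochisq - chisq < 0.01) itst++;   /* simplistic; a careful driver would also */
            else itst = 0;                       /* guard against stopping on early failures */
        } while (itst < 2);
        alamda = 0.0;                            /* final call: fill covar with covariances */
        mrqmin(x,y,sig,ndata,a,ia,ma,covar,alpha,&chisq,gfunc,&alamda);

        printf("a1=%g a2=%g a3=%g chisq=%g\n", a[1], a[2], a[3], chisq);
        for (i = 1; i <= ma; i++) printf("sigma(a%d)=%g\n", i, sqrt(covar[i][i]));

        free_matrix(alpha,1,ma,1,ma); free_matrix(covar,1,ma,1,ma);
        free_vector(sig,1,ndata); free_vector(y,1,ndata); free_vector(x,1,ndata);
        return 0;
    }

After the final alamda = 0 call, the square roots of the diagonal elements of covar estimate the standard errors of the fitted parameters, as discussed in the next section.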















