Shampine, Allen, Pruess, Fundamentals of Numerical Computing (page 29)
4.1 BISECTION, NEWTON’S METHOD, AND THE SECANT RULE

Then from (4.5) the correction

    x_{i+1} - x_i = -f(x_i)/f'(x_i)

is large because the tangent is nearly horizontal. Thus, a reasonably good guess for a root leads to a much worse approximation. Also, notice that if x_i is much larger than 1, then

    x_{i+1} = x_i - (x_i^20 - 1)/(20 x_i^19) ≈ (19/20) x_i.

To the same degree of approximation,

    x_{i+1} - 1 ≈ (19/20)(x_i - 1),

which says that we creep back to the root at 1 at a rate considerably slower than bisection. What is happening here is that the roots of this equation are the roots of unity. The 20 simple roots lie on a circle of radius 1 in the complex plane.
The roots are well separated, but when “seen” from as far away as 26000, they appear to form a root of multiplicity 20, as argued earlier in this chapter. Newton’s method converges linearly with constant 19/20 to a root of multiplicity 20, and that is exactly what is observed when the iterates are far from the roots.

Much is made of the quadratic convergence of Newton’s method, but it is quadratically convergent only for simple roots. Even for simple roots, this example shows that quadratic convergence is observed only when “sufficiently” close to a root.
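This behavior is easy to reproduce numerically. The sketch below assumes the equation is x^20 - 1 = 0 (consistent with the 20 roots of unity described above) and starts at 26000, the distance mentioned in the text; the iteration count is an illustrative choice.

```python
# Newton's method on f(x) = x**20 - 1, started far from the roots.
def newton(f, fp, x, steps):
    xs = [x]
    for _ in range(steps):
        x = x - f(x) / fp(x)   # correction from (4.5)
        xs.append(x)
    return xs

f = lambda x: x**20 - 1.0
fp = lambda x: 20.0 * x**19

xs = newton(f, fp, 26000.0, 250)
print(xs[1] / xs[0])   # ~19/20: linear convergence while far from the roots
print(xs[-1])          # quadratic convergence takes over near the root at 1
```

Far out, each step multiplies the iterate by almost exactly 19/20; only in the last few of the roughly 200 steps needed to creep back near 1 does quadratic convergence appear.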
And, of course, when “too” close to a root, finite precision arithmetic affects the rate of convergence.

Let us now consider the behavior of Newton’s method and the secant rule at limiting precision. Figure 1.2 shows an interval of machine-representable numbers about α on which computed values of the function vary erratically in sign and magnitude. These represent the smallest values the computed f(x) can assume when formed in the working precision, and quite frequently they have no digits in agreement with the true f(x). For a simple root, |f'(α)| is not zero, and if the root is not ill-conditioned, the derivative is not small.
As a consequence, the computed value of the first derivative ordinarily has a few digits that are correct. It then follows that the correction to x_i computed by Newton’s method is very small at limiting precision and the next iterate stays near the root even if it moves away because f(x_i) has the wrong sign. This is like bisection and is what is meant by the term “stable at limiting precision.” The secant rule behaves differently. The correction to the current iterate,

    x_{i+1} - x_i = -f(x_i) (x_i - x_{i-1}) / (f(x_i) - f(x_{i-1})),

has unpredictable values at limiting precision. Clearly it is possible that the next iterate lie far outside the interval of limiting precision.

There is another way to look at the secant rule that is illuminating.
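A minimal secant iteration can be sketched as follows; the test function x^2 - 2, the tolerance, and the iteration limit are illustrative choices, not from the text.

```python
# Sketch of the secant rule; safeguards here are minimal and illustrative.
def secant(f, x0, x1, tol=1e-12, maxit=50):
    for _ in range(maxit):
        f0, f1 = f(x0), f(x1)
        if f1 == f0:               # correction below would divide by zero
            break
        # Correction to the current iterate, as discussed above:
        dx = -f1 * (x1 - x0) / (f1 - f0)
        x0, x1 = x1, x1 + dx
        if abs(dx) <= tol * abs(x1):
            break
    return x1

root = secant(lambda x: x * x - 2.0, 1.0, 2.0)
print(root)   # ~1.41421356...
```

At limiting precision both f(x_i) and the difference f(x_i) - f(x_{i-1}) vary erratically, so the computed correction dx is unpredictable, which is the instability described above.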
One approach to finding a root α of f(x) is to interpolate several values y_i = f(x_i) by a polynomial P(x) and then approximate α by a root of this polynomial. The secant rule is the case of linear interpolation. Higher order interpolation provides a more accurate approximation to f(x), so it is plausible that it would lead to a scheme with a higher rate of convergence. This turns out to be true, although only the increase from linear to quadratic interpolation might be thought worth the trouble. The scheme based on quadratic interpolation is called Muller’s method.
Muller’s method is a little more trouble than the secant rule, because it involves computing the roots of a quadratic, and it converges somewhat faster. There are some practical differences. For all methods based on interpolation by a polynomial of degree higher than 1, there is a question of which root to take as the next iterate x_{i+1}.
To get convergence, the root closest to x_i should be used. An important difference between Muller’s method and the secant rule is due to the possibility of a real quadratic polynomial having complex roots. Even with real iterates and a real function, Muller’s method might produce a complex iterate. If complex roots are interesting, this is a virtue, but if only real roots are desired, it is a defect. The secant rule can be used to compute complex roots, but it will not leave the real line spontaneously like Muller’s method. The MATHCAD documentation points out that its code, which is based on the secant rule, can be used to compute the roots of a complex-valued function by starting with a guess that is complex.

An alternative to direct interpolation is inverse interpolation.
This approach is based on interpolating the inverse function f⁻¹(y) of y = f(x). To prevent confusion with the reciprocal of f(x), the inverse function will be denoted here by G(y). We assume that we have at our disposal only a procedure for evaluating f(x). However, each value f(x_i) = y_i provides a value of the inverse function because by definition x_i = G(y_i). Finding a root of f(x) corresponds to evaluating the inverse function: a root α satisfies f(α) = 0, hence α = G(0). This is a familiar task that we solve in a familiar way. We are able to evaluate a function G(y) at certain points y_i and we wish to approximate the value G(0).
This is done by approximating G(y) with a polynomial interpolant P(y) and then evaluating P(0) ≈ G(0) = α. Of course, it is easy to interpolate G(y) by whatever degree polynomial we want. However, as with direct interpolation, most of the improvement to the rate of convergence is gained on going to quadratic interpolation. An interesting fact left to an exercise is that the method derived from linear inverse interpolation is the same as that derived from linear direct interpolation, namely the secant rule.
Examination of Figure 4.4 helps in understanding this. On the other hand, quadratic direct and inverse interpolation are quite different. For one thing, quadratic inverse interpolation cannot produce a complex iterate when the function and the previous iterates are real.

Inverse interpolation is attractive because of its simplicity. Unfortunately, there is a fundamental difficulty: f might not have an inverse on the interval of interest. This is familiar from the trigonometric functions. For instance, the function y = sin(x) does not have an inverse for all x. To invert the relationship with x = arcsin(y) the argument x is restricted to an interval on which sin(x) is monotone.
In a plot like Figure 4.2, the inverse of f is found by “turning the picture on its side.” Only on an interval where f(x) is monotone does the inverse function exist as a single-valued function. At a simple root α, f'(α) ≠ 0, so there is an interval containing α on which f(x) is monotone and G(y) exists. So, the usual kind of result is obtained. If we can start close enough to a simple root, there is an inverse function and we can compute the root with inverse interpolation.
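One step of quadratic inverse interpolation can be sketched directly from this description: interpolate G(y) through three data points in Lagrange form and evaluate the interpolant at y = 0. The test function x^2 - 2, the starting points, and the iteration count below are illustrative choices.

```python
# One step of quadratic inverse interpolation: evaluate at y = 0 the
# quadratic that interpolates the inverse function G at y0, y1, y2.
def inverse_quadratic_step(f, x0, x1, x2):
    y0, y1, y2 = f(x0), f(x1), f(x2)
    # Lagrange form of P(y) evaluated at y = 0:
    return (x0 * y1 * y2 / ((y0 - y1) * (y0 - y2))
          + x1 * y0 * y2 / ((y1 - y0) * (y1 - y2))
          + x2 * y0 * y1 / ((y2 - y0) * (y2 - y1)))

f = lambda x: x * x - 2.0
x0, x1, x2 = 1.0, 1.5, 2.0
for _ in range(5):
    # Keep the three most recent iterates as the data for the next step.
    x0, x1, x2 = x1, x2, inverse_quadratic_step(f, x0, x1, x2)
print(x2)   # ~sqrt(2)
```

Note that the iterates stay real whenever the function and the starting points are real: evaluating P(0) involves no root extraction, in contrast to Muller’s method.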
When some distance from a root or when the root is multiple, there may be serious difficulties with inverse interpolation because then the function does not have an inverse on the relevant interval.

With the exception of bisection, the methods we have studied are guaranteed to converge only when sufficiently close to a zero of a function that is sufficiently smooth. This is rather unsatisfactory when we have no idea about where the roots are. On the other hand, often we are interested in the root closest to a specific value. It is by no means certain that the methods will converge from this value to the nearest root since that depends on just how close the value is to the root, but it is a useful characteristic of the methods. In contrast, if the initial bracket given a bisection code contains several roots, the code might locate any one of them.
The technique of continuation is useful when it is hard to find a starting guess good enough to get convergence to a particular root, or to any root. Many problems depend on a parameter λ and it may be that zeros can be computed easily for some values of the parameter. The family x exp(-x) - γ = 0 is an example. Solutions are desired for values γ > 0, but it is obvious that α = 0 is a root when γ = 0. It is generally the case, although not always, that the roots α(λ) depend continuously on λ.
The idea of continuation is to solve a sequence of problems for values of λ ranging from one for which the problem is solved easily to the desired value of the parameter. This may not be just an artifice; you may actually want solutions for a range of parameter values. Roots obtained with a value λ' are used as guesses for the next value λ''. If the next value is not too different, the guesses will be good enough to obtain convergence.
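For the family x exp(-x) - γ = 0 mentioned above, continuation can be sketched as follows. The target value γ = 0.3 and the ten equal steps are illustrative choices (γ must be below 1/e ≈ 0.37 for a real root to exist), and the Newton tolerances are arbitrary.

```python
import math

# Newton's method with a simple relative convergence test (illustrative).
def newton(f, fp, x, tol=1e-12, maxit=50):
    for _ in range(maxit):
        dx = -f(x) / fp(x)
        x += dx
        if abs(dx) <= tol * (1.0 + abs(x)):
            return x
    return x

target = 0.3    # desired gamma; must be less than 1/e
x = 0.0         # alpha(0) = 0 is the known root for gamma = 0
for k in range(1, 11):
    gamma = target * k / 10.0
    f  = lambda t, g=gamma: t * math.exp(-t) - g
    fp = lambda t: (1.0 - t) * math.exp(-t)
    x = newton(f, fp, x)     # root for the previous gamma is the guess
print(x)   # smallest positive root of x*exp(-x) = 0.3
```

Starting each solve from the previous root keeps the iteration on the branch through α(0) = 0, which is how the smallest positive root is obtained.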
In the case of the example, the smallest positive root is desired, so starting with α(0) = 0 should result in the desired root α(γ). When there is no obvious parameter in f(x) = 0, one can be introduced artificially. A common embedding of the problem in a family is 0 = F(x,λ) = f(x) + (λ - 1) f(x_0). By construction, x_0 is a root of this equation for λ = 0 and the original equation is obtained for λ = 1. Another embedding is 0 = F(x,λ) = λ f(x) + (1 - λ)(x - x_0), which is also to be started with the root x_0 for λ = 0 and a sequence of problems solved for λ increasing to 1.

A virtue of bisection is that it is easy to decide when an approximate root is good enough.
The convergence of Newton’s method, the secant rule, quadratic inverse interpolation, and the like cannot be judged in this way. Many codes use the size of the residual for this purpose, but this is hazardous for reasons already studied. Superlinear convergence provides another way to decide convergence. When the iterate x_i is sufficiently close to a root α, superlinear convergence implies that the next iterate is much closer,

    |α - x_{i+1}| << |α - x_i|.

Because of this, the error of x_i can be approximated by

    α - x_i ≈ x_{i+1} - x_i.

This is computationally convenient for if the estimated error is too large, x_{i+1} is available for another iteration.
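A convergence test of this kind can be sketched as follows; the tolerance and the test equation x^3 - 2 = 0 are arbitrary illustrative choices. The error of x_i is estimated by x_{i+1} - x_i, and the better approximation x_{i+1} is the value returned.

```python
# Convergence test based on superlinear convergence: estimate the error of
# x_i by x_{i+1} - x_i; report x_{i+1} (local extrapolation) on success.
def newton_with_estimate(f, fp, x, tol=1e-10, maxit=50):
    for _ in range(maxit):
        x_next = x - f(x) / fp(x)
        err_est = x_next - x              # estimates alpha - x_i
        if abs(err_est) <= tol * (1.0 + abs(x_next)):
            return x_next                 # local extrapolation
        x = x_next
    return x

root = newton_with_estimate(lambda t: t**3 - 2.0,
                            lambda t: 3.0 * t * t, 1.5)
print(root)   # ~cube root of 2
```

Because convergence is superlinear, the reported x_{i+1} is typically far more accurate than the tolerance that x_i just satisfied.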
In the case of Newton’s method, this estimate of the error is

    α - x_i ≈ x_{i+1} - x_i = -f(x_i)/f'(x_i).

This estimate might be described as a natural scaling of the residual f(x_i). If x_i passes a convergence test based on superlinear convergence, it is assumed that x_{i+1} is a rather better approximation to α, so why not use it as the answer? Reporting x_{i+1} as the answer is called local extrapolation.

EXERCISES

4.1 The residual of an alleged root r of F(x) = 0 is F(r). One often sees the statement that a residual is “small,” so the approximate root must be “good.” Is this reliable? What role does scaling play?

4.2 How are simple and multiple roots distinguished graphically? Interpret graphically how well the roots are determined.