This is seen by expanding f(x) in a Taylor series about α to obtain

    f(x) = f(α) + f'(α)(x - α) + ··· + f^(m-1)(α)(x - α)^(m-1)/(m - 1)! + f^(m)(ξ)(x - α)^m/m!,

where ξ lies between x and α. Using (4.3), this simplifies to

    f(x) = (x - α)^m f^(m)(ξ)/m!.    (4.4)

If we take g(x) = f^(m)(ξ)/m!, then g(α) = f^(m)(α)/m! ≠ 0. We shall always assume that f(x) is sufficiently smooth near α that we can use (4.4) instead of the basic definition (4.2) and, in particular, that roots are of integer multiplicity.
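As a concrete check of these definitions, consider f(x) = (x - 1)^2 (x + 2), an example of ours rather than the book's. The root α = 1 has multiplicity m = 2: in the form (4.2), g(x) = x + 2 with g(1) = 3 ≠ 0, while in the form (4.3), f(1) = f'(1) = 0 and f''(1) = 6 ≠ 0.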
According to the definition of a root α, the graph of f(x) touches the x axis at α (Figure 4.1). For a root of multiplicity m, the function f^(m)(x) does not change sign near α because it is continuous and f^(m)(α) ≠ 0. This observation and the expression (4.4) show that if m is even, f(x) is tangent to the x axis at α but does not cross there, and that if m is odd, f(x) crosses the axis at α.
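For example (ours, not the book's), f(x) = (x - 1)^2 is tangent to the axis at its double root x = 1 and does not cross, while f(x) = (x - 1)^3 crosses there.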
A family of problems taken up in Case Study 4 has the form 0 = f(x) = F(x) - γ for a parameter γ > 0. Specifically, F(x) = x exp(-x), and a representative value of γ is 0.07. Curve sketching as in an introductory calculus course is often used to locate roots and determine their multiplicity. For this family, f(x) → -γ < 0 as x → ∞ and f(x) → -∞ as x → -∞. It is seen from the first derivative, f'(x) = (1 - x) exp(-x), that f is strictly increasing for x < 1 and strictly decreasing for x > 1. At the extremum, f(1) = e^(-1) - γ is positive for γ = 0.07. Also, f(0) = -γ is negative. These facts and the continuity of f tell us that when γ = 0.07, there are exactly two roots and both are simple. One is in (0,1) and the other is greater than 1. In general, for a root of f(x) to be multiple, f'(x) must vanish at the root.
So, wherever the function is strictly increasing or strictly decreasing, any root it might have must be simple. For the family of functions, the fact that f'(x) = 0 only at x = 1 means that this is the only point where the function might have a multiple root. It is easily seen that there is a multiple root only when γ = e^(-1), and the root then is of multiplicity 2 (a double root).
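This analysis is easy to confirm numerically. The following short sketch is ours, not the book's; it uses Python and scipy.optimize.brentq (an assumption, since the text does not use these tools) to refine the two bracketed roots for γ = 0.07:

    import math
    from scipy.optimize import brentq

    gamma = 0.07
    f = lambda x: x * math.exp(-x) - gamma

    # Curve sketching gave the brackets: f(0) = -gamma < 0 < f(1), and
    # f(x) -> -gamma < 0 as x -> infinity, so f(10) < 0 as well.
    r1 = brentq(f, 0.0, 1.0)    # simple root in (0, 1)
    r2 = brentq(f, 1.0, 10.0)   # simple root greater than 1
    print(r1, r2)               # roughly 0.0755 and 4.06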
An approximate root z that results in a computed value f(z) = 0 is not unusual, especially when it approximates a multiple root α. After all, the aim is to find a z that makes f(z) vanish. When the root is of multiplicity m, f(z) ≈ (z - α)^m g(α). Some numbers will help us to understand this. For a root of high multiplicity like m = 10, an approximation of modest accuracy like z = α + 10^(-4) leads to f(z) ≈ 10^(-40) g(α). Then if |g(α)| < 1, the function f(z) underflows in IEEE single precision arithmetic.
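The underflow is easy to demonstrate. Here is a minimal sketch of ours in Python, with numpy (our assumption) standing in for IEEE single precision arithmetic:

    import numpy as np

    # Smallest positive *normal* single precision number, about 1.2e-38.
    print(np.finfo(np.float32).tiny)

    # Residual size for multiplicity m = 10 and error |z - alpha| = 1e-4:
    r = np.float32(1e-4) ** 10
    print(r)                      # about 1e-40, already below the normal range

    # With a modest factor |g(alpha)| < 1 it can be flushed to zero outright:
    print(r * np.float32(1e-6))   # 0.0 in single precision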
As we shall see, standard methods are not as effective for multiple roots as they are for simple roots. To understand the performance of codes based on these methods, it is necessary to appreciate that roots that are close together "look" like multiple roots. Suppose that f(x) has the two simple roots α1 ≈ α2. The basic definition and a little argument show that f(x) = (x - α1)(x - α2) G(x) for a G(x) that does not vanish at either root. This expression can be rewritten as

    f(x) = (x - α1)^2 G(x) [1 + (α1 - α2)/(x - α1)].

For x far from the roots in the sense that |x - α1| >> |α2 - α1|, the pair of simple roots "looks" like a double root because

    f(x) ≈ (x - α1)^2 G(x).

[Figure 4.1: f(x) = tan x for 0 < x < 10.]

A concept from the theory of complex variables related to that of a root of multiplicity m is a pole of multiplicity m. If we can write

    F(x) = G(x)/(x - α)^m,

where G(α) ≠ 0, then we say that α is a pole of F(x) of multiplicity m. It is easy to see that if α is a root of f(x) of multiplicity m, then it is a pole of F(x) = 1/f(x) of the same multiplicity, and vice versa.
A familiar example is tan(x) = sin(x)/cos(x), plotted in Figure 4.1. This function has a root where sin(x) vanishes and a pole where cos(x) vanishes. Functions change sign at poles of odd multiplicity, just as they do at roots of odd multiplicity.

One difficulty in computing a root of f(x) = 0 is deciding when an approximation z is good enough. The residual f(z) seems an obvious way to assess the quality of an approximate root.
MATHCAD does exactly this. It accepts z as a root when |f(z)| < TOL, with TOL = 10^(-3) by default. The trouble with a residual test is that there is no obvious measure of scale. Multiple roots present difficulties because the function is nearly flat in a considerable interval about the root. The issue is related to the conditioning of a root, but also to the way we set up the equation.
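To see how flat the function can be, take f(x) = (x - 1)^10, an example of ours: the default test |f(z)| < 10^(-3) accepts every z with |z - 1| < (10^(-3))^(1/10) ≈ 0.5, so z = 1.4 passes the residual test even though its error is 0.4.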
When we formulate a problem, we select a scaling. This may be no more than choosing a system of units, but often we use the fact that any zero of f(x) is a zero of F(x) = g(x) f(x). Introducing a scaling g(x) can make quite a difference. For instance, the two problems sin(x) = 0 and F(x) = 10^(-38) sin(x) = 0 are mathematically equivalent, but the second is badly scaled because forming F(z) for even a moderately good approximate root z will result in underflow in single precision IEEE arithmetic. Often we scale problems without giving any special thought to the matter, but a good scale can be quite helpful.
It is quite a useful device for dealing with real or apparent singularities. The function f(x) = sin(x)/x is perfectly well behaved at x = 0 (it is analytic), but it has an apparent singularity there and some care is needed in its evaluation. This can be circumvented by calculating the roots of the scaled function F(x) = x f(x). It must be kept in mind that, as with this example, F(x) has all the roots of f(x), but it might pick up additional roots from g(x).
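In practice the care amounts to switching to a short Taylor series near zero. A minimal Python sketch of ours (the function name is our own) shows both the careful evaluation and the scaled alternative:

    import math

    def sinc_like(x):
        """Evaluate f(x) = sin(x)/x, guarding the apparent singularity at 0."""
        if abs(x) < 1e-4:
            # sin(x)/x = 1 - x^2/6 + x^4/120 - ...; two terms suffice here,
            # since the first neglected term is below double precision roundoff.
            return 1.0 - x * x / 6.0
        return math.sin(x) / x

    # The scaled function F(x) = x*f(x) = sin(x) needs no guard at all,
    # but the scaling g(x) = x contributes the extra root x = 0.
    F = math.sin

    print(sinc_like(0.0), sinc_like(1e-8), F(0.0))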
A more substantial example is furnished by an equation to be solved in an exercise. That function has a simple pole at all the points where cos(x) = cos(π/10) and an apparent singularity at x = 0. Scaling it with a g(x) that removes both, such as g(x) = x (cos(x) - cos(π/10)), makes computing the roots more straightforward.

Sometimes a natural measure of scale is supplied by a coefficient in the equation. An example is provided by the family of problems f(x) = F(x) - γ with γ > 0. Just as when solving linear equations, the residual r = f(z) = F(z) - γ can be used in a backward error analysis. Obviously z is the exact solution of the problem 0 = F(x) - γ′, where γ′ = γ + r. If |r| is small compared to |γ|, then z is the exact solution of a problem close to the given problem. For such problems we have a reasonable way to specify how small the residual ought to be.

4.1 BISECTION, NEWTON'S METHOD, AND THE SECANT RULE

[Figure 4.2: Graphical interpretation of roots.]

If a continuous function f(x) has opposite signs at points x = B and x = C, then it has at least one zero in the interval with endpoints B and C.
The method of bisection (or binary search) is based on this fact. If f(B) f(C) < 0, the function f(x) is evaluated at the midpoint M = (B + C)/2 of the interval. If f(M) = 0, a zero has been found. Otherwise, either f(B) f(M) < 0 or f(M) f(C) < 0. In the first case there is at least one zero between M and B, as in Figure 4.2, and in the second case there is at least one zero between C and M. In this way an interval containing a root is found that has half the length of the original interval. The procedure is repeated until a root is located to whatever accuracy is desired.

In algorithmic form we have the bisection method:

    until |B - C| is sufficiently small or f(M) = 0 begin
        M := (B + C)/2
        if f(B) f(M) < 0 then
            C := M
        else
            B := M
    end until.

Example 4.1. When f(x) = x^2 - 2, the equation (4.1) has the simple root √2 = 1.414.... For B = 0, C = 6, the bisection method produces a table of iterates [table not recovered; in it, 0.16(+01) means 0.16 × 10^1; the sketch below regenerates the computation]. Note the erratic behavior of the error, although the interval width |B - C| is halved at each step.
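A runnable Python version of this computation, our sketch rather than the book's program, reproduces the behavior just described:

    import math

    # Example 4.1: f(x) = x^2 - 2 on [B, C] = [0, 6]; simple root sqrt(2).
    f = lambda x: x * x - 2.0
    B, C = 0.0, 6.0
    root = math.sqrt(2.0)   # known answer, used only for the error column

    while abs(B - C) > 1e-6:
        M = (B + C) / 2.0                    # midpoint, as in the pseudocode
        if f(M) == 0.0:
            break
        print("M = %10.7f   error = %8.1e" % (M, abs(M - root)))
        if f(B) * f(M) < 0.0:                # at least one zero in [B, M]
            C = M
        else:                                # at least one zero in [M, C]
            B = M

    # The first line shows M = 3.0 with error 1.6 = 0.16(+01); thereafter the
    # error decreases erratically even though |B - C| is halved at every step.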
Bisection is often presented in programming books in this manner because it is a numerical algorithm that is both simple and useful. A more penetrating study of the method will make some points important to understanding many methods for computing zeros, points that we require as we develop an algorithm that attempts to get the best from several methods.

An interval [B,C] with f(B) f(C) < 0 is called a bracket. A graphical interpretation tells us somewhat more than just that f(x) has a root in the interval. Zeros of even multiplicity between B and C do not cause a sign change and zeros of odd multiplicity do. If there were an even number of zeros of odd multiplicity between B and C, the sign changes would cancel out and f would have the same sign at both ends. Thus, if f(B) f(C) < 0, there must be an odd number of zeros of odd multiplicity and possibly
some zeros of even multiplicity between B and C. If we agree to count the number of zeros according to their multiplicity (i.e., a zero of multiplicity m counts as m zeros), then we see that there is an odd number of zeros between B and C.

A careful implementation of bisection takes into account a number of matters raised in Chapter 1. There is a test for values of f that are exactly zero; the test for a change of sign is not programmed as a test of f(B) f(C) < 0 because of the potential for underflow of the product; and the midpoint is computed as M = B + (C - B)/2 because it is just as easy to compute and more accurate than M = (B + C)/2.
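A sketch of ours incorporating these precautions in Python (function and variable names are our own):

    import math

    def careful_bisect(f, B, C, tol):
        """Bisection with the precautions above: test for exact zeros, compare
        signs instead of forming the product f(B)*f(C) (which can underflow),
        and form the midpoint as B + (C - B)/2."""
        fB, fC = f(B), f(C)
        if fB == 0.0:
            return B
        if fC == 0.0:
            return C
        if math.copysign(1.0, fB) == math.copysign(1.0, fC):
            raise ValueError("f(B) and f(C) must differ in sign")
        while abs(B - C) > tol:
            M = B + (C - B) / 2.0            # more accurate than (B + C)/2
            fM = f(M)
            if fM == 0.0:                    # exact zero found
                return M
            if math.copysign(1.0, fB) != math.copysign(1.0, fM):
                C, fC = M, fM                # sign change on [B, M]
            else:
                B, fB = M, fM                # sign change on [M, C]
        return B + (C - B) / 2.0

    print(careful_bisect(lambda x: x * x - 2.0, 0.0, 6.0, 1e-12))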
We often try to find an approximate root z for which f(z) is as small as possible. In attempting this, the finite word length of the computer must be taken into account and so must the details of the procedure for evaluating f. Eventually even the sign of the computed value may be incorrect. This is what is meant by limiting precision. Figure 1.2 shows the erratic size and sign of function values when the values are so small that the discrete nature of the floating point number system becomes important. If a computed function value has the wrong sign because the argument is very close to a root, it may happen that the bracket selected in bisection does not contain a root. Even so, the approximations computed thereafter will stay in the neighborhood of the root.