Heath - Scientific Computing (523150), page 46
Safe methods, such as bisection, on the other hand, are slow and therefore costly. Which should one choose? A solution to this dilemma is provided by hybrid methods that combine features of both types of methods. For example, one could use a rapidly convergent method but maintain a bracket around the solution. If the next approximate solution given by the rapid algorithm falls outside the bracketing interval, one would fall back on a safe method, such as bisection, for one iteration.
Then one can try the rapid method again on a smaller interval with a greater chance of success. Ultimately, the fast convergence rate should prevail. This approach seldom does worse than the slow method and usually does much better.

A popular implementation of such a hybrid approach was originally developed by Dekker and van Wijngaarden and later improved by Brent.
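The safeguarded strategy just described — try a fast step, check it against a maintained bracket, and fall back on bisection when it escapes — can be sketched in a few lines. This is a simplified illustration only, not Brent's actual algorithm; the function name and tolerances are chosen for the example:

```python
def hybrid_root(f, a, b, tol=1e-12, max_iter=100):
    """Find a root of f in [a, b], where f(a) and f(b) have opposite signs.

    Tries a secant step through the two most recent iterates; whenever
    that step falls outside the current bracketing interval, falls back
    on one bisection step. The bracket is maintained at every iteration,
    so the method is as safe as bisection but usually much faster.
    """
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    x0, f0, x1, f1 = a, fa, b, fb              # points defining the secant
    for _ in range(max_iter):
        if f1 != f0:
            x = x1 - f1 * (x1 - x0) / (f1 - f0)   # fast (secant) step
        else:
            x = 0.5 * (a + b)
        if not (a < x < b):                    # fast step left the bracket:
            x = 0.5 * (a + b)                  # fall back on bisection
        fx = f(x)
        if abs(fx) < tol or b - a < tol:
            return x
        if fa * fx < 0:                        # keep the sign change bracketed
            b, fb = x, fx
        else:
            a, fa = x, fx
        x0, f0, x1, f1 = x1, f1, x, fx
    return x
```

For example, `hybrid_root(lambda x: x*x - 2.0, 1.0, 2.0)` converges to √2. Brent's method refines this basic pattern, replacing the secant step with inverse quadratic interpolation and handling tolerances more carefully.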
This method, which is found in a number of subroutine libraries, combines the safety of bisection with the faster convergence of inverse quadratic interpolation. By avoiding Newton's method, derivatives of the function are not required. A careful implementation must address a number of potential pitfalls in floating-point arithmetic, such as underflow, overflow, or an unrealistically tight user-supplied error tolerance.

5.2.8 Zeros of Polynomials

Thus far we have discussed methods for finding a single zero of an arbitrary function in one dimension.
For the special case of a polynomial p(x) of degree n, one often may need to find all n of its zeros, which may be complex even if the coefficients are real. Several approaches are available:

• Use one of the methods we have discussed, such as Newton's method or Muller's method, to find a single root x1 (keeping in mind that the root may be complex), then consider the deflated polynomial p(x)/(x − x1) of degree one less. Repeat until all roots have been found.
It is a good idea to go back and refine each root using the original polynomial p(x) to avoid contamination due to rounding error in the deflated polynomials.

• Form the companion matrix of the given polynomial and use an eigenvalue routine to compute all of its eigenvalues (see Section 4.2.1). This method is reliable but relatively inefficient in both work and storage.

• Use a method designed specifically for finding all the roots of a polynomial.
Some of these methods are based on classical techniques for isolating the roots of a polynomial in a region of the complex plane, typically a union of discs, and then refining it in a manner similar in spirit to bisection until the roots have been localized as accurately as desired. Like bisection, such methods are guaranteed to work but are only linearly convergent. More rapidly convergent methods are available, however, such as that of Jenkins and Traub [136, 137], which is probably the most effective method available for finding all of the roots of a polynomial.

The first two of these approaches are relatively simple to implement since they make use of other software for the primary subtasks.
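For instance, the companion-matrix approach takes only a few lines when an eigenvalue routine is available. A minimal sketch (using numpy's eigenvalue routine; the balancing and scaling used by production codes are omitted):

```python
import numpy as np

def companion_roots(coeffs):
    """All roots of the monic polynomial
        p(x) = x^n + c[n-1]*x^(n-1) + ... + c[1]*x + c[0]
    via the eigenvalues of its companion matrix.
    coeffs = [c0, c1, ..., c_{n-1}] (low order first).
    """
    n = len(coeffs)
    C = np.zeros((n, n))
    C[1:, :-1] = np.eye(n - 1)         # ones on the subdiagonal
    C[:, -1] = -np.asarray(coeffs)     # last column: negated coefficients
    return np.linalg.eigvals(C)        # eigenvalues = roots (possibly complex)
```

For example, `companion_roots([2.0, -3.0])` returns the roots 1 and 2 of x² − 3x + 2 = (x − 1)(x − 2). The O(n²) storage for the matrix and the O(n³) eigenvalue computation are what make this approach relatively inefficient for high-degree polynomials.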
The third approach is rather complicated, but fortunately good software implementations are available.

5.3 Systems of Nonlinear Equations

We now consider nonlinear equations in more than one dimension. The multidimensional case is much more difficult than the scalar case for a variety of reasons:

• A much wider range of behavior is possible, so that a theoretical analysis of the existence and number of solutions is much more complex.

• There is no simple way, in general, to guarantee convergence to the correct solution or to bracket the solution to produce an absolutely safe method.

• Computational overhead increases rapidly with the dimension of the problem.

Example 5.11 Systems of Nonlinear Equations.
Consider the system of nonlinear equations in two dimensions
$$f(x) = \begin{bmatrix} x_1^2 - x_2 + \gamma \\ -x_1 + x_2^2 + \gamma \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix},$$
where γ is a given parameter. Each of the two component equations defines a parabola, and any point where the two parabolas intersect is a solution to the system. Depending on the particular value for γ, this system can have either zero, one, two, or four solutions, as illustrated in Fig. 5.8.

[Figure 5.8: Some systems of nonlinear equations. Four panels show the intersecting parabolas for γ = 0.5, γ = 0.25, γ = −0.5, and γ = −1.0.]

5.3.1 Fixed-Point Iteration

Just as for one dimension, a system of nonlinear equations can be converted into a fixed-point problem, so we now briefly consider the multidimensional case. If $g: \mathbb{R}^n \to \mathbb{R}^n$, then a fixed-point problem for g is to find an n-vector x such that
$$x = g(x).$$
The corresponding fixed-point iteration is simply
$$x_{k+1} = g(x_k),$$
given some starting vector $x_0$.

In one dimension, we saw that the convergence (and convergence rate) of fixed-point iteration is determined by |g′(x∗)|, where x∗ is the solution. In higher dimensions the analogous condition is
$$\rho(G(x^*)) < 1,$$
where G(x) denotes the Jacobian matrix of g evaluated at x,
$$\{G(x)\}_{ij} = \frac{\partial g_i(x)}{\partial x_j},$$
and ρ denotes the spectral radius, which is defined to be the maximum modulus of the eigenvalues of the matrix (see Section 4.1).
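The iteration itself takes only a few lines. A minimal sketch, with a hypothetical 2×2 linear map in the demo chosen so that the spectral radius of its Jacobian (here simply the matrix A) is below one:

```python
import numpy as np

def fixed_point(g, x0, tol=1e-10, max_iter=500):
    """Multidimensional fixed-point iteration x_{k+1} = g(x_k).

    Converges, if started close enough, when the spectral radius of the
    Jacobian of g at the solution is less than one.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_new = np.asarray(g(x), dtype=float)
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# Demo (hypothetical example): g(x) = A x + b, where the spectral radius
# of A is about 0.57 < 1, so the iteration contracts to the fixed point
# x* = (I - A)^{-1} b.
A = np.array([[0.5, 0.1],
              [0.2, 0.3]])
b = np.array([1.0, 2.0])
x_star = fixed_point(lambda x: A @ x + b, [0.0, 0.0])
```

For a linear map the Jacobian of g is the constant matrix A, so the spectral-radius condition can be checked once and for all; for nonlinear g it must hold at the solution.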
If the foregoing condition is satisfied, then the fixed-point iteration converges if started close enough to the solution. (Note that testing this condition does not necessarily require computing the eigenvalues, since ρ(A) ≤ ‖A‖ for any matrix A and any matrix norm subordinate to a vector norm; see Exercise 4.11.) As with scalar systems, the smaller the spectral radius, the faster the convergence rate. In particular, if $G(x^*) = O$, the zero matrix, then the convergence rate is at least quadratic. We will next see that Newton's method is a systematic way of selecting g so that this happens.

5.3.2 Newton's Method

Many one-dimensional methods do not generalize directly to n dimensions. The most popular and powerful method that does generalize is Newton's method, which in n dimensions has the form
$$x_{k+1} = x_k - J_f(x_k)^{-1} f(x_k),$$
where $J_f(x)$ is the Jacobian matrix of f,
$$\{J_f(x)\}_{ij} = \frac{\partial f_i(x)}{\partial x_j}.$$
In practice, we do not explicitly invert $J_f(x_k)$ but instead solve the linear system
$$J_f(x_k)\, s_k = -f(x_k),$$
then take as the next iterate
$$x_{k+1} = x_k + s_k.$$
In this sense, Newton's method replaces a system of nonlinear equations with a system of linear equations, but since the solutions of the two systems are not identical in general, the process must be repeated until the approximate solution is as accurate as desired.

Example 5.12 Newton's Method. We illustrate Newton's method by solving the nonlinear system
$$f(x) = \begin{bmatrix} x_1 + 2x_2 - 2 \\ x_1^2 + 4x_2^2 - 4 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix},$$
for which the Jacobian matrix is given by
$$J_f(x) = \begin{bmatrix} 1 & 2 \\ 2x_1 & 8x_2 \end{bmatrix}.$$
If we take $x_0 = [\,1 \;\; 2\,]^T$, then
$$f(x_0) = \begin{bmatrix} 3 \\ 13 \end{bmatrix} \quad\text{and}\quad J_f(x_0) = \begin{bmatrix} 1 & 2 \\ 2 & 16 \end{bmatrix}.$$
Solving the system
$$\begin{bmatrix} 1 & 2 \\ 2 & 16 \end{bmatrix} s_0 = \begin{bmatrix} -3 \\ -13 \end{bmatrix}$$
gives $s_0 = [\,-1.83 \;\; -0.58\,]^T$, and hence
$$x_1 = x_0 + s_0 = \begin{bmatrix} -0.83 \\ 1.42 \end{bmatrix}, \quad f(x_1) = \begin{bmatrix} 0 \\ 4.72 \end{bmatrix}, \quad J_f(x_1) = \begin{bmatrix} 1 & 2 \\ -1.67 & 11.3 \end{bmatrix}.$$
Solving the system
$$\begin{bmatrix} 1 & 2 \\ -1.67 & 11.3 \end{bmatrix} s_1 = \begin{bmatrix} 0 \\ -4.72 \end{bmatrix}$$
gives $s_1 = [\,0.64 \;\; -0.32\,]^T$, and hence
$$x_2 = x_1 + s_1 = \begin{bmatrix} -0.19 \\ 1.10 \end{bmatrix}, \quad f(x_2) = \begin{bmatrix} 0 \\ 0.83 \end{bmatrix}, \quad J_f(x_2) = \begin{bmatrix} 1 & 2 \\ -0.38 & 8.76 \end{bmatrix}.$$
Iterations continue until convergence to the solution $x^* = [\,0 \;\; 1\,]^T$.

We can determine the convergence rate of Newton's method in n dimensions by differentiating the corresponding fixed-point operator (assuming it is smooth) and evaluating the resulting Jacobian matrix at the solution $x^*$:
$$g(x) = x - J_f(x)^{-1} f(x),$$
$$G(x^*) = I - \Big(J_f(x^*)^{-1} J_f(x^*) + \sum_{i=1}^{n} f_i(x^*)\, H_i(x^*)\Big) = O,$$
where $H_i(x)$ denotes a component matrix of the derivative of $J_f(x)^{-1}$ (which is a tensor). Thus, the convergence rate of Newton's method for solving a nonlinear system is normally quadratic, provided that the Jacobian matrix $J_f(x^*)$ is nonsingular, but the algorithm may have to be started close to the solution in order to converge.

The arithmetic overhead per iteration for Newton's method in n dimensions can be substantial:

• Computing the Jacobian matrix, either in closed form or by finite differences, requires the equivalent of n² scalar function evaluations for a dense problem (i.e., if every component function of f depends on every variable). Computation of the Jacobian may be much cheaper if it is sparse or has some special structure. Another alternative that may be cheaper for computing derivatives is automatic differentiation (see Section 8.7.2).

• Solving the linear system by Gaussian elimination costs O(n³) arithmetic operations, again assuming the Jacobian matrix is dense.

5.3.3 Secant Updating Methods

The high cost per iteration of Newton's method and its finite difference variants has led to the development of methods, analogous to the one-dimensional secant method, that gradually build up an approximation to the Jacobian based on successive iterates and function values without explicitly evaluating derivatives.
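One classic scheme of this type is Broyden's method, which corrects a running Jacobian approximation B by a rank-one update chosen so that the secant condition holds for the most recent step. A minimal sketch, which assumes a reasonable initial approximation B0 is supplied (in the example below, the exact Jacobian at the starting point):

```python
import numpy as np

def broyden(f, x0, B0, tol=1e-10, max_iter=100):
    """Broyden's secant updating method for f(x) = 0 (a sketch).

    B approximates the Jacobian; after each step s with function change
    y = f(x + s) - f(x), B receives a rank-one correction chosen so that
    the secant condition B_new @ s = y holds. No derivative evaluations
    are required after the initial guess B0.
    """
    x = np.asarray(x0, dtype=float)
    B = np.asarray(B0, dtype=float)
    fx = f(x)
    for _ in range(max_iter):
        s = np.linalg.solve(B, -fx)            # quasi-Newton step
        x = x + s
        f_new = f(x)
        if np.linalg.norm(f_new) < tol:
            return x
        y = f_new - fx
        B += np.outer(y - B @ s, s) / (s @ s)  # rank-one secant update
        fx = f_new
    return x
```

Applied to the system of Example 5.12, starting from $x_0 = [\,1 \;\; 2\,]^T$ with B0 equal to the Jacobian there, the iteration converges to the solution $[\,0 \;\; 1\,]^T$, though typically in somewhat more iterations than Newton's method since the update only approximates fresh Jacobian information.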