All we require of an approximate inverse C for A (and for the convergence of the corresponding fixed-point iteration) is that I - CA have some matrix norm less than one. A matrix B may well fail to have norm less than one in the particular norm one happens to compute and still satisfy ||B|| < 1 in some other matrix norm, for example, ||B||_1 = 0.9 < 1. This makes it important to find ways of telling whether ||B|| < 1 in some matrix norm without having to try out all possible matrix norms. The following theorem provides such a way (in principle).

Theorem 5.2  Let ρ(B) be the spectral radius of the matrix B, i.e.,

    ρ(B) = max{ |λ| : λ is an eigenvalue of B }

Then there exists, for any ε > 0, a vector norm for which the associated matrix norm for B satisfies ||B|| < ρ(B) + ε.

Since, conversely, ρ(B) ≤ ||B|| for every matrix norm, we conclude that C is an approximate inverse for A if and only if ρ(I - CA) < 1.
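As a quick numerical check of this criterion, one can compute ρ(I - CA) directly from the eigenvalues. The short Python sketch below does this for a made-up 3 × 3 matrix A and a made-up choice of C (neither is taken from the text):

```python
import numpy as np

# Made-up illustration: is C an approximate inverse of A in the sense above?
A = np.array([[4.0, 1.0, 1.0],
              [1.0, 4.0, 1.0],
              [1.0, 1.0, 4.0]])
C = np.diag(1.0 / np.diag(A))             # one simple choice: invert only the diagonal of A

B = np.eye(3) - C @ A                     # B = I - CA
rho = np.abs(np.linalg.eigvals(B)).max()  # spectral radius rho(B)

print("rho(I - CA) =", rho)               # 0.5 < 1, so the fixed-point iteration converges
print("||B||_inf   =", np.linalg.norm(B, np.inf))  # any matrix norm is >= rho(B)
```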
Further, the smaller the spectral radius of I - CA is, the faster, ultimately, the convergence of the fixed-point iteration is apt to be. This can also be seen by observing that the error e^(m) := ξ - x^(m) in the mth iterate of the fixed-point iteration

    x^(m+1) = x^(m) + C(b - Ax^(m))

for the solution ξ of Aξ = b satisfies

    e^(m+1) = (I - CA) e^(m)        all m

hence

    e^(m) = B^m e^(0)        m = 0, 1, 2, . . . ,  with B = I - CA

This shows the sequence e^(0), e^(1), e^(2), . . . of errors to be of a form discussed in Chap. 4 [see (4.62) through (4.67)] in connection with the power method. We stated there that the corresponding normalized sequence

    e^(m)/||e^(m)||        m = 0, 1, 2, . . .

usually converges to an eigenvector of B = I - CA belonging to an absolutely largest eigenvalue λ of B, i.e., one with |λ| = ρ(B). Thus, eventually, the error is reduced at each iteration step by a factor ρ(B) and no faster, in general.
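This error behavior is easy to observe numerically. The following Python sketch (using the same made-up system as above, with an arbitrary right-hand side) runs the fixed-point iteration and prints the error-reduction factors ||e^(m+1)||/||e^(m)||, which settle down near ρ(B):

```python
import numpy as np

# Watch the error-reduction factor approach rho(B), B = I - CA (made-up system).
A = np.array([[4.0, 1.0, 1.0],
              [1.0, 4.0, 1.0],
              [1.0, 1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
C = np.diag(1.0 / np.diag(A))

xi = np.linalg.solve(A, b)                 # exact solution, used only to measure the error
rho = np.abs(np.linalg.eigvals(np.eye(3) - C @ A)).max()

x = np.zeros(3)                            # x^(0) = 0
err_old = np.linalg.norm(xi - x)
for m in range(12):
    x = x + C @ (b - A @ x)                # x^(m+1) = x^(m) + C(b - Ax^(m))
    err = np.linalg.norm(xi - x)
    print(m + 1, err, err / err_old)       # third column tends to rho(B) = 0.5
    err_old = err
```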
We now discuss specific examples of fixed-point iteration for linear systems. One such example is the iterative improvement discussed in the preceding chapter. To recall, one computes the residual r^(m) = b - Ax^(m) for the mth approximate solution x^(m); then, using the triangular factorization of A calculated during elimination, one finds the (approximate) solution y^(m) of the linear system Ay = r^(m) and, adding y^(m) to x^(m), obtains the better (so one hopes) approximate solution x^(m+1) = x^(m) + y^(m). The vector y^(m) is in general not the (exact) solution of Ay = r^(m). This is partially due to rounding errors during forward- and back-substitution. But the major contribution to the error in y^(m) can be shown to come, usually, from inaccuracies in the computed triangular factorization PLU for A, that is, from the fact that PLU is only an approximation to A. If we ignore rounding errors during forward- and back-substitution, we have

    y^(m) = (PLU)^(-1) r^(m)

hence

    x^(m+1) = x^(m) + (PLU)^(-1) (b - Ax^(m))

This shows iterative improvement to be a special case of fixed-point iteration, with C = (PLU)^(-1), the inverse of the computed triangular factorization PLU for A.
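The following Python sketch illustrates this iteration (it assumes SciPy is available; the random test matrix is made up, and computing the factorization in single precision merely stands in for the inaccuracies in PLU just discussed):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

# Iterative improvement: refine x using an inexact factorization of A.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 50)) + 50.0 * np.eye(50)   # made-up, well-conditioned matrix
b = rng.standard_normal(50)

lu, piv = lu_factor(A.astype(np.float32))   # PLU computed in low precision: only approximates A
lu = lu.astype(np.float64)

x = lu_solve((lu, piv), b)                  # x^(0): one forward- and back-substitution
for m in range(5):
    r = b - A @ x                           # residual r^(m) = b - A x^(m)
    y = lu_solve((lu, piv), r)              # approximate solution of A y = r^(m)
    x = x + y                               # x^(m+1) = x^(m) + y^(m)
    print(m, np.linalg.norm(r))             # residual norms shrink until roundoff takes over
```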
But for certain classes of matrices A, a matrix C satisfying (i) and (ii) of Algorithm 5.5 can be found with far less computational effort than it takes to calculate the triangular factorization for A. For a linear system with such a coefficient matrix, it then becomes more economical to dispense with elimination and to calculate the solution directly by Algorithm 5.5.

To discuss the two most common choices for C, we write the coefficient matrix A = (a_ij) as the sum

    A = L + D + U

of a strictly lower-triangular matrix L, a diagonal matrix D, and a strictly upper-triangular matrix U, with

    L = (l_ij),  l_ij = a_ij for i > j,  l_ij = 0 otherwise
    D = diag(a_11, . . . , a_nn)
    U = (u_ij),  u_ij = a_ij for i < j,  u_ij = 0 otherwise

Further, we assume that all diagonal entries of A are nonzero; i.e., we assume that D is invertible.
If this is not so at the outset, we first rearrange the equations so that this condition is satisfied; this can always be done if A is invertible (see Exercise 4.7-5).

In the Jacobi iteration, or method of simultaneous displacements, one chooses C = D^(-1), as discussed in Example 5.7.
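For concreteness, here is a short Python sketch of the splitting A = L + D + U and of Jacobi iteration built from it (the 3 × 3 system is a made-up example, not the book's Example 5.7):

```python
import numpy as np

# Jacobi iteration written via the splitting A = L + D + U (made-up system).
A = np.array([[4.0, 1.0, 1.0],
              [1.0, 4.0, 1.0],
              [1.0, 1.0, 4.0]])
b = np.array([6.0, 6.0, 6.0])               # exact solution is (1, 1, 1)

L = np.tril(A, -1)                           # strictly lower-triangular part of A
D = np.diag(np.diag(A))                      # diagonal part of A
U = np.triu(A, 1)                            # strictly upper-triangular part of A
Dinv = np.diag(1.0 / np.diag(A))             # C = D^(-1)

x = np.zeros(3)
for m in range(20):
    x = Dinv @ (b - (L + U) @ x)             # x^(m+1) = D^(-1)(b - (L + U) x^(m))
print(x)                                     # close to (1, 1, 1) after 20 steps
```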
If Jacobi iteration converges, the diagonal part D of A is a good enough approximation to A to give

    ρ(I - D^(-1)A) < 1

But in this circumstance, one would expect the lower-triangular part L + D of A to be an even better approximation to A; that is, one would expect to have

    ρ(I - (L + D)^(-1)A) ≤ ρ(I - D^(-1)A)

Fixed-point iteration with C^(-1) = L + D would then seem a faster convergent iteration than the Jacobi method. Although this is not true in general, it is true for various classes of matrices A, for example, when A is strictly row-diagonally dominant, or when A is tridiagonal (and more generally, when A is block-tridiagonal with diagonal matrices as its diagonal blocks), or when A has positive diagonal entries and nonpositive off-diagonal entries.

Fixed-point iteration with C^(-1) = L + D is called Gauss-Seidel iteration, or the method of successive displacements. In this method, one has

    x^(m+1) = x^(m) + (L + D)^(-1) (b - Ax^(m))

or

    (L + D) x^(m+1) = b - Ux^(m)

or

    D x^(m+1) = b - Lx^(m+1) - Ux^(m)

giving the formulas

    x_i^(m+1) = ( b_i - Σ_{j<i} a_ij x_j^(m+1) - Σ_{j>i} a_ij x_j^(m) ) / a_ii        i = 1, . . . , n

Evidently, we can calculate the ith entry of x^(m+1) once we know the entries x_1^(m+1), . . . , x_(i-1)^(m+1).

Algorithm 5.6: Gauss-Seidel iteration  Given the linear system Ax = b of order n whose coefficient matrix A = (a_ij) has all diagonal entries nonzero.

    Calculate the entries of B = (b_ij) and of c = (c_i) by
        b_ij = -a_ij / a_ii (j ≠ i),  b_ii = 0,  c_i = b_i / a_ii        all i and j
    Pick x^(0), for example, x^(0) = 0
    For m = 1, 2, . . . , until satisfied, do:
        For i = 1, . . . , n, do:
            x_i^(m) = Σ_{j<i} b_ij x_j^(m) + Σ_{j>i} b_ij x_j^(m-1) + c_i

If some matrix norm of the iteration matrix I - (L + D)^(-1)A = -(L + D)^(-1)U is less than one, then the sequence x^(0), x^(1), . . . so generated converges to the solution of the given system.
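A direct rendering of Algorithm 5.6 in Python might look as follows (the small test system is made up, not the book's Example 5.7):

```python
import numpy as np

# Gauss-Seidel iteration, following Algorithm 5.6 (made-up test system).
def gauss_seidel(A, b, x0=None, tol=1e-10, maxiter=100):
    n = len(b)
    B = -A / np.diag(A)[:, None]          # b_ij = -a_ij / a_ii for j != i
    np.fill_diagonal(B, 0.0)              # b_ii = 0
    c = b / np.diag(A)                    # c_i = b_i / a_ii
    x = np.zeros(n) if x0 is None else np.array(x0, dtype=float)
    for m in range(maxiter):
        x_prev = x.copy()                 # previous iterate, kept only for the stopping test
        for i in range(n):
            # entries 1, ..., i-1 already hold their new values (successive displacements)
            x[i] = B[i, :i] @ x[:i] + B[i, i + 1:] @ x_prev[i + 1:] + c[i]
        if np.linalg.norm(x - x_prev, np.inf) < tol:
            break
    return x

A = np.array([[4.0, 1.0, 1.0],
              [1.0, 4.0, 1.0],
              [1.0, 1.0, 4.0]])
b = np.array([6.0, 6.0, 6.0])
print(gauss_seidel(A, b))                 # approaches the exact solution (1, 1, 1)
```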
The vectors x^(1), x^(2), x^(3) resulting from Gauss-Seidel iteration applied to the linear system of Example 5.7 are listed in Table 5.1. Note that, for this example, Gauss-Seidel iteration converges much faster than does Jacobi iteration. After three steps, the accuracy is already better than that obtained at the end of six steps of Jacobi iteration.

In Jacobi iteration, the entries of x^(m) are used only in the calculation of the next iterate x^(m+1), while in Gauss-Seidel iteration, each entry of x^(m) is already used in the calculation of all succeeding entries of x^(m); hence the names simultaneous displacement and successive displacement. In particular, Jacobi iteration requires that two iterates be kept in memory, while Gauss-Seidel iteration requires only one vector.

Gauss-Seidel iteration can be shown to converge if the coefficient matrix A is strictly (row) diagonally dominant. It also converges if A is positive definite, i.e., if A is real symmetric and, for all nonzero vectors y,

    y^T A y > 0

Finally, from among the many acceleration techniques available for speeding up the convergence of fixed-point iteration, we mention successive overrelaxation, or SOR, in which one overshoots the change from x^(m) to x^(m+1) proposed by Gauss-Seidel iteration.
Thus, instead of taking

    x_i^(m+1) = ( b_i - Σ_{j<i} a_ij x_j^(m+1) - Σ_{j>i} a_ij x_j^(m) ) / a_ii        all i

as in Gauss-Seidel iteration, one overshoots and takes

    x_i^(m+1) = x_i^(m) + ω [ ( b_i - Σ_{j<i} a_ij x_j^(m+1) - Σ_{j>i} a_ij x_j^(m) ) / a_ii - x_i^(m) ]        all i

with ω (> 1) the overrelaxation parameter. It is possible, though not very illuminating, to write the resulting iteration explicitly in the form

    x^(m+1) = x^(m) + C_ω (b - Ax^(m))        with C_ω^(-1) = L + D/ω

The corresponding iteration matrix is I - C_ω A. In theory, the overrelaxation parameter ω is to be chosen so that ρ(I - C_ω A) is as small as possible. This is, of course, a more difficult task than solving the linear system Aξ = b in the first place. But one may have to solve such a linear system for many right-hand sides (and only to a certain accuracy), in which case it would pay to obtain a “good” ω by experiment. Also, for certain matrices A occurring in the numerical solution of standard partial differential equations, one can express ρ(I - C_ω A) in terms of the spectral radius of the iteration matrix I - D^(-1)A of Jacobi iteration, and thus make qualitative statements about the optimal choice of ω. The typical choice for ω is between 1.2 and 1.6.
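A Python sketch of SOR along these lines is shown below (the test system is made up, and ω = 1.25 is merely an illustrative value, not an optimal one):

```python
import numpy as np

# SOR: overshoot the Gauss-Seidel update for each component by the factor omega.
def sor(A, b, omega, sweeps=50):
    n = len(b)
    x = np.zeros(n)
    for m in range(sweeps):
        for i in range(n):
            # the value Gauss-Seidel iteration would assign to x_i
            gs = (b[i] - A[i, :i] @ x[:i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
            x[i] = x[i] + omega * (gs - x[i])   # omega = 1 recovers Gauss-Seidel
    return x

A = np.array([[4.0, 1.0, 1.0],
              [1.0, 4.0, 1.0],
              [1.0, 1.0, 4.0]])
b = np.array([6.0, 6.0, 6.0])
print(sor(A, b, omega=1.25))                    # approaches the exact solution (1, 1, 1)
```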
As pointed out earlier, iterative methods are usually applied to large linear systems with a sparse coefficient matrix. For sparse matrices, the number of nonzero entries is small, and hence the number of arithmetic operations to be performed per step is small. Moreover, iterative methods are less vulnerable to the growth of round-off error: only the size of the roundoff generated in a single iteration is important.
On the other hand, iterative methods will not always converge, and even when they do converge, they may require a prohibitively large number of iterations. For large systems, the total number of iterations required for convergence to four or five places may be of the order of several hundred.

The idea underlying Jacobi and Gauss-Seidel iteration is that of relaxation, and this idea makes good sense also in the context of a general system

    f(ξ) = 0

of n nonlinear equations in n unknowns. In its simplest form, one assumes the equations so ordered that it is possible to solve the ith equation

    f_i(x_1, . . . , x_n) = 0

for the ith unknown to get the equivalent equation

    x_i = g_i(x_1, . . . , x_n)

Then, given an approximation x to ξ, one attempts to improve its ith component by changing it to g_i(x_1, . . . , x_n). The term “relaxation” for this procedure is due to Southwell.
In effect, the current guess x for the solution is the exact solution of the related system

    f_i(y) - r_i = 0        i = 1, . . . , n,  with r_i := f_i(x)

where the error terms r_i are brought in to force the system to have x as its solution. In relaxation, the ith component of the current guess is then improved by letting it find its new (relaxed) level in response to the removal of the forcing term r_i in equation i.

Relaxation is usually carried out Gauss-Seidel fashion, i.e., the new value of the ith component is immediately used in the subsequent improvement of the other components. Further, one goes through all the equations in some systematic fashion, changing all components of x. Each such run-through constitutes a sweep.

There are many useful variants of the basic relaxation idea.
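A sweep of this basic kind is easy to sketch in Python; in the made-up two-equation example below, each equation has already been solved for “its own” unknown, so the functions g_1 and g_2 are hypothetical stand-ins for the g_i above:

```python
import numpy as np

# Gauss-Seidel-style relaxation sweeps for a made-up nonlinear system:
#   x_1 = g_1(x) = 0.5 cos(x_2),   x_2 = g_2(x) = 0.5 sin(x_1)
g = [lambda x: 0.5 * np.cos(x[1]),
     lambda x: 0.5 * np.sin(x[0])]

x = np.array([0.0, 0.0])
for sweep in range(20):                 # each pass through all equations is one sweep
    for i in range(2):
        x[i] = g[i](x)                  # the new value is used at once in later equations
print(x)                                # converges to the solution of this small system
```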