For this reason, a preconditioner is usually used with the conjugate gradient method, so that the preconditioned matrix M⁻¹A has a much smaller condition number than A, and hence the convergence rate is greatly improved. The foregoing estimate is conservative, however, and the algorithm may do much better than this. For example, if the matrix A has only m distinct eigenvalues, then theoretically the conjugate gradient method converges in at most m iterations. Thus, the detailed convergence behavior depends on the entire spectrum of A, not just on its extreme eigenvalues, and in practice the convergence is often superlinear.

11.5.7 Multigrid Methods

We have seen that stationary iterative methods often have very poor rates of convergence. There are various techniques for accelerating convergence, but these methods generally remain impractical for problems of realistic size.
These convergence results are asymptotic, however, and such methods may initially make rapid progress before eventually settling into the slow asymptotic phase. In particular, many stationary iterative methods tend to reduce the high-frequency (i.e., oscillatory) components of the error rapidly but reduce the low-frequency (i.e., smooth) components of the error much more slowly, which produces the poor asymptotic rate of convergence (see Computer Problems 11.13 and 12.14 for examples). For this reason, such methods are sometimes called smoothers. This observation provides the motivation for multigrid methods, which we now outline very briefly.

The notions of smooth or oscillatory components of the error are relative to the mesh on which the solution is defined. In particular, a component that appears smooth on a fine grid may appear oscillatory when sampled on a coarser grid. If we apply a smoother on the coarser grid, then we may make rapid progress in reducing this (now oscillatory) component of the error.
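To make this smoothing behavior concrete, here is a minimal sketch in Python with NumPy. The routine name weighted_jacobi, the choice of weighted Jacobi as the smoother, the mode indices, and the grid size are illustrative assumptions, not prescribed by the text. It applies a few sweeps of the smoother to the one-dimensional model problem and compares how quickly a smooth error mode and an oscillatory error mode are damped.

```python
import numpy as np

def weighted_jacobi(A, b, x, num_sweeps=3, omega=2.0/3.0):
    """A few sweeps of weighted (damped) Jacobi: x <- x + omega * D^{-1} (b - A x)."""
    d = np.diag(A)                      # diagonal of A
    for _ in range(num_sweeps):
        x = x + omega * (b - A @ x) / d
    return x

# 1-D model problem: tridiagonal matrix from the standard second difference,
# with n interior points on a grid of mesh size h = 1/(n+1).
n = 63
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

# Two error modes: a smooth (low-frequency) one and an oscillatory (high-frequency) one.
j = np.arange(1, n + 1)
smooth_err = np.sin(1 * np.pi * j / (n + 1))      # k = 1: smooth mode
oscill_err = np.sin(48 * np.pi * j / (n + 1))     # k = 48: oscillatory mode

# For the homogeneous system A e = 0 the iterate *is* the error, so we can
# watch directly how each mode is damped by a few sweeps of the smoother.
for name, e0 in [("smooth", smooth_err), ("oscillatory", oscill_err)]:
    e3 = weighted_jacobi(A, np.zeros(n), e0.copy(), num_sweeps=3)
    print(f"{name:12s} mode: reduced by factor {np.linalg.norm(e3)/np.linalg.norm(e0):.3f}")
```

In this setting the oscillatory mode is typically reduced by more than two orders of magnitude after only three sweeps, while the smooth mode is left almost unchanged, which is precisely the behavior that multigrid methods exploit.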
After a few iterations of the smoother, the results can then be interpolated back to the fine grid to produce a solution in which the low-frequency components of the error have been reduced. It may then be desirable to use a few more iterations of the smoother on the fine grid to ensure that the high-frequency components of the error are still small. The net result is an approximate solution on the fine grid for which both the high-frequency and low-frequency components of the error are reduced, and then the process can be repeated, if desired, until some convergence criterion is met.

This idea can be extended to multiple levels of grids, so that error components of various frequencies can be reduced rapidly, each at the appropriate level. Transition from a finer grid to a coarser grid involves restriction (sometimes called injection), whereas transition from a coarser grid to a finer grid involves interpolation (sometimes called prolongation).

If x̂ is an approximate solution to Ax = b, with residual r = b − Ax̂, then the error e = x − x̂ satisfies the equation Ae = r.
Thus, in improving the approximate solution we can work with just this "residual equation" involving the error and the residual, rather than the solution and original right-hand side. One advantage of the residual equation is that zero is a reasonable starting guess for its solution. A two-grid method then takes the following form:

1. On the fine grid, use a few iterations of a smoother to compute an approximate solution x̂ for the system Ax = b.
2. Compute the residual r = b − Ax̂.
3. Restrict the residual to the coarse grid.
4. On the coarse grid, use a few iterations of a smoother on the residual equation to obtain a coarse-grid approximation to the error.
5. Interpolate the coarse-grid correction to the fine grid to obtain an improved approximate solution on the fine grid.
6. Apply a few iterations of a smoother to the corrected solution on the fine grid.

A multigrid method results from recursion in Step 4; that is, the coarse-grid correction is itself improved by using a still coarser grid, and so on down to some bottom level. The computations become cheaper as one moves to coarser and coarser grids because the systems become successively smaller.
In particular, a direct method may be feasible on the coarsest grid if the system is small enough.

There are many possible strategies for cycling through the various grid levels, the most common of which are depicted schematically in Fig. 11.9. The V-cycle starts with the finest grid and goes down through successive levels to the coarsest grid and then back up again to the finest grid. To get more benefit from the coarser grids, where computations are cheaper, the W-cycle zigzags among the lower-level grids before moving back up to the finest grid. Full multigrid starts at the coarsest level, where a good initial solution is easier to come by (perhaps by direct solution), then bootstraps this solution up through the grid levels, ultimately reaching the finest grid.

[Figure 11.9: Cycling schemes for multigrid methods — schematic of the grid levels (fine to coarse) visited by the V-cycle, the W-cycle, and full multigrid.]
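As a rough illustration of Steps 1–6 and of the recursion that turns the two-grid scheme into a multigrid method, the following sketch implements a simple V-cycle for the one-dimensional model problem −u″ = f with homogeneous boundary conditions. It is written in Python with NumPy; the function names (poisson_1d, smooth, restrict, prolong, v_cycle), the weighted-Jacobi smoother, the full-weighting restriction, and the linear-interpolation prolongation are illustrative choices rather than anything prescribed in the text, and dense matrices are used purely for brevity.

```python
import numpy as np

def poisson_1d(n):
    """Standard 1-D Poisson matrix (1/h^2) * tridiag(-1, 2, -1), with h = 1/(n+1)."""
    h = 1.0 / (n + 1)
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

def smooth(A, b, x, sweeps=3, omega=2.0/3.0):
    """A few sweeps of weighted Jacobi (the 'smoother')."""
    d = np.diag(A)
    for _ in range(sweeps):
        x = x + omega * (b - A @ x) / d
    return x

def restrict(r):
    """Full-weighting restriction from a fine grid (n points) to a coarse grid ((n-1)//2 points)."""
    return 0.25 * r[0:-2:2] + 0.5 * r[1:-1:2] + 0.25 * r[2::2]

def prolong(ec, n_fine):
    """Linear interpolation (prolongation) from the coarse grid back to the fine grid."""
    ef = np.zeros(n_fine)
    ef[1::2] = ec                                 # coarse points coincide with every other fine point
    ec_pad = np.concatenate(([0.0], ec, [0.0]))   # zero boundary values
    ef[0::2] = 0.5 * (ec_pad[:-1] + ec_pad[1:])   # interpolate the in-between points
    return ef

def v_cycle(n, b, x):
    """One V-cycle for the 1-D model problem (Steps 1-6, with Step 4 done recursively)."""
    A = poisson_1d(n)
    x = smooth(A, b, x)                  # Step 1: pre-smoothing
    r = b - A @ x                        # Step 2: residual
    if n <= 3:                           # coarsest grid: small enough to solve directly
        return x + np.linalg.solve(A, r)
    rc = restrict(r)                     # Step 3: restrict residual to the coarse grid
    nc = (n - 1) // 2
    ec = v_cycle(nc, rc, np.zeros(nc))   # Step 4: coarse-grid correction, by recursion
    x = x + prolong(ec, n)               # Step 5: interpolate correction to the fine grid
    return smooth(A, b, x)               # Step 6: post-smoothing

# Model problem: -u'' = pi^2 sin(pi x) on (0,1), u(0) = u(1) = 0, so u(x) = sin(pi x).
n = 63
xgrid = np.arange(1, n + 1) / (n + 1)
b = np.pi**2 * np.sin(np.pi * xgrid)
A = poisson_1d(n)

x = np.zeros(n)
for cycle in range(8):
    x = v_cycle(n, b, x)
    print(f"cycle {cycle + 1}: residual norm = {np.linalg.norm(b - A @ x):.2e}")
```

On this small example the residual norm typically drops by roughly an order of magnitude per cycle, essentially independently of the grid size, which is the behavior described next. (The dense matrices here are only for brevity; a realistic implementation would exploit the sparse tridiagonal structure so that each level costs work proportional to its number of unknowns.)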
By exploiting the strengths of the underlying iterative smoothers and avoiding their weaknesses, multigrid methods are capable of extraordinarily good performance. In particular, at each level the smoother reduces the oscillatory component of the error very rapidly and at a rate that is independent of the mesh size h (since only a few iterations of the smoother, often only one, are performed at each level).
Since all components of the error appear oscillatory at some level, it follows that the convergence rate of the entire multigrid scheme should be rapid and independent of the mesh size, which is in stark contrast to the other iterative methods we have considered. Moreover, the cost of an entire cycle of multigrid is only a modest multiple of the cost of a single sweep on the finest grid (the grids shrink geometrically from level to level, so in two dimensions, for example, the total work per cycle is bounded by roughly 1 + 1/4 + 1/16 + ··· = 4/3 times the work on the finest grid). As a result, multigrid methods are among the most powerful methods available for solving sparse linear systems arising from PDEs and are capable of converging to within the truncation error of the discretization at a cost comparable to fast direct methods, although the latter are much less broadly applicable.

11.6 Comparison of Methods

Now that we have examined both direct and iterative methods in some detail, we can summarize their relative advantages and disadvantages for solving linear systems:
• Direct methods require no initial estimate for the solution, but they take no advantage of it if a good estimate happens to be available.
• Direct methods are good at producing high accuracy, but they take no advantage if only low accuracy is needed.
• Iterative methods are often dependent on special properties, such as the matrix being symmetric positive definite, and are subject to very slow convergence for badly conditioned systems. Direct methods are more robust in both senses.
• Iterative methods usually require less work if convergence is rapid but often require the computation or estimation of various parameters or preconditioners to accelerate convergence, which at least partially offsets this advantage.
• Iterative methods do not require explicit storage of the matrix entries and hence are good when the matrix can be produced easily on demand or is most easily implemented as a linear operator.
• Iterative methods are less readily embodied in standard software packages, since the best representation of the matrix is often problem-dependent and "hard-coded" in an application program, whereas direct methods employ more standard storage schemes.

To make a more quantitative comparison, Table 11.3 shows the order of magnitude of the computational cost for solving a discretized elliptic boundary value problem in two or three dimensions (2-D or 3-D) by each of the methods we have discussed (and also a few methods that we have not discussed).
These results should be taken only as a rough guide, as they depend on several assumptions:
• The discretization is by a finite difference scheme on a regular grid (k × k in two dimensions, k × k × k in three dimensions) with mesh size h = 1/k. For the divide-and-conquer methods, k is assumed to be a power of two (and all logarithms are base two).
• The resulting matrix is symmetric, positive definite, and sparse, with a constant number of nonzeros per row and a condition number that is O(1/h²).
• For the iterative methods that depend on various parameters, optimal values are known and used in all cases.
• For the band Cholesky method, the bandwidth is O(k) in two dimensions and O(k²) in three dimensions.
• For the sparse Cholesky method, an optimal nested dissection ordering is used.
• For the preconditioned conjugate gradient method, the preconditioner reduces the condition number to O(1/h).
• The iterative methods are iterated to convergence within the truncation error of the discretization, i.e., until the initial error is reduced by a factor of h².

In interpreting these results, several caveats should be kept in mind:
• We have omitted the proportionality constants. In theory these are irrelevant asymptotically, but they may matter a great deal for a specific problem of interest, even quite a large one. Also, the value of the proportionality constant for a given method depends on the specific discretization used.
• The methods listed are not equally applicable. For the Poisson equation on the unit square with the standard five-point difference scheme, for example, all of the foregoing assumptions hold and all of the methods listed are applicable.
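To see how such asymptotic estimates arise from the assumptions above, consider a rough back-of-the-envelope calculation for the two-dimensional case with n = k² unknowns; this is only an illustration of the kind of reasoning involved, not a reproduction of the entries in Table 11.3. For the Jacobi method, the error is reduced by a factor of roughly 1 − O(h²) per iteration, so reducing it by a factor of h² requires O(k² log k) iterations; at O(n) = O(k²) work per iteration, the total cost is O(k⁴ log k). For the conjugate gradient method, the number of iterations grows like the square root of the condition number, giving O(k log k) iterations and a total cost of O(k³ log k); a preconditioner that reduces the condition number to O(1/h) lowers the iteration count to O(√k log k) and the total cost to O(k²√k log k).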