Higham - Accuracy and Stability of Numerical Algorithms, page 48
From (11.17) we obtain a bound in which the key scaling quantity is treated as in Problem 11.1. Finally, writing g and h for the two quantities defined in the proof, the resulting expression is approximately bounded by u^2 (h(g + n + 1) + 2(g + n + 2)^2 (1 + uh)^2 cond(A^{-1})), and requiring this quantity not to exceed γ_{n+1} leads to the result.

Theorem 11.4 says that as long as A is not too ill conditioned, |A||x| is not too badly scaled (σ(A, x) is not too large), and the solver is not too unstable (h is not too large), then w_{|A|,|b|} < 2γ_{n+1} after one step of iterative refinement. Note that the term γ_{n+1} in (11.20) comes from the error bound for evaluation of the residual, so this bound for w is about the smallest we could expect to prove.

Let us apply Theorem 11.4 to GE with or without pivoting.
If there is pivoting, assume (without loss of generality) that no interchanges are required. Theorem 9.4 shows that we can take W = |L̂||Û|, where L̂ and Û are the computed LU factors of A. To apply Theorem 11.4 we use A ≈ L̂Û, which shows that we can take for h a multiple of the growth factor-type term || |L̂||Û| ||_∞ / ||A||_∞. Without pivoting this term is unbounded, but with partial pivoting it cannot exceed 2^n and is typically O(n) [1019, 1990].

We can conclude that, for GE with partial pivoting (GEPP), one step of iterative refinement will usually be enough to yield a small componentwise relative backward error as long as A is not too ill conditioned and |A||x| is not too badly scaled. Without pivoting the same holds true with the added proviso that the computation of the original LU factorization must not be too unstable.

Table 11.1. w_{|A|,|b|} values for A = orthog(25).

These results for GE are very similar to those of Skeel [920, 1980].
The main differences are that Skeel's analysis covers an arbitrary number of refinement steps with residuals computed in single or double precision, his analysis is specific to GE, and his results involve σ(A, x) rather than σ(A, x̂).

One interesting problem remains: to reconcile Theorem 11.4 with Theorem 11.2. Under the conditions of Theorem 11.4 the componentwise relative backward error is small after one step of iterative refinement, so the forward error is certainly bounded by a multiple of cond(A, x)u. How can this be shown (for GE) using the analysis of §11.1? An explanation is nontrivial; see Problem 11.2.

We will see applications of Theorems 11.3 and 11.4 to other types of linear equation solver in Chapters 18, 19, and 21.

Tables 11.1–11.3 show the performance of fixed precision iterative refinement for GE without pivoting, GEPP, and Householder QR factorization (see §18.6).
The matrices are from the Test Matrix Toolbox (see Appendix E), and may be summarized as follows: clement(n) is tridiagonal with zero diagonal entries; orthog(n) is a symmetric and orthogonal matrix; and gfpp(n) is a matrix for which the growth factor for GEPP is maximal. In each case the right-hand side b was chosen as a random vector from the uniform distribution on [0, 1].
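An experiment of this flavour is easy to reproduce. The sketch below is illustrative only: it codes GEPP directly and uses a generic badly scaled random matrix rather than the Toolbox matrices, and it reports the componentwise relative backward error w_{|A|,|b|}(x) = max_i |b − Ax|_i / (|A||x| + |b|)_i after each refinement step.

```python
import numpy as np

def lu_pp(A):
    """LU factorization with partial pivoting, P A = L U.
    Returns the packed LU factors and the row permutation."""
    LU = A.astype(float).copy()
    n = LU.shape[0]
    piv = np.arange(n)
    for k in range(n - 1):
        p = k + np.argmax(np.abs(LU[k:, k]))   # partial pivoting
        LU[[k, p]] = LU[[p, k]]
        piv[[k, p]] = piv[[p, k]]
        LU[k+1:, k] /= LU[k, k]
        LU[k+1:, k+1:] -= np.outer(LU[k+1:, k], LU[k, k+1:])
    return LU, piv

def lu_solve(LU, piv, b):
    """Solve using the packed factors: forward then back substitution."""
    x = b[piv].astype(float)
    n = x.size
    for k in range(1, n):                      # L has unit diagonal
        x[k] -= LU[k, :k] @ x[:k]
    for k in range(n - 1, -1, -1):
        x[k] = (x[k] - LU[k, k+1:] @ x[k+1:]) / LU[k, k]
    return x

def comp_backward_error(A, x, b):
    """w_{|A|,|b|}(x) = max_i |b - Ax|_i / (|A||x| + |b|)_i."""
    r = np.abs(b - A @ x)
    return np.max(r / (np.abs(A) @ np.abs(x) + np.abs(b)))

def refine(A, b, maxiter=5):
    """One LU factorization, then fixed precision iterative refinement."""
    LU, piv = lu_pp(A)
    x = lu_solve(LU, piv, b)
    ws = [comp_backward_error(A, x, b)]
    u = np.finfo(float).eps / 2                # unit roundoff
    for _ in range(maxiter):
        r = b - A @ x                          # residual in working precision
        x = x + lu_solve(LU, piv, r)           # reuse the factors
        ws.append(comp_backward_error(A, x, b))
        if ws[-1] < u:
            break
    return x, ws

rng = np.random.default_rng(0)
n = 25
A = rng.standard_normal((n, n)) * np.logspace(0, 8, n)  # badly scaled columns
b = rng.uniform(size=n)
x, ws = refine(A, b)
print(ws)   # w for the initial solution and each refined iterate
```

With GEPP the initial w is usually already modest, and a single refinement step typically drives it down to the order of u, consistent with the behaviour described in the text.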
We report the componentwise relative backward errors for the initial solution and the refined iterates; refinement was terminated when w_{|A|,|b|} < u. GEPP performs as predicted by both our and Skeel's analyses. In fact, iterative refinement converges in one step even when θ(A, x) := cond(A^{-1})σ(A, x) exceeds u^{-1} in the examples reported and in most others we have tried. GE also achieves a small componentwise relative backward error, but can require more than one refinement step, even when θ(A, x) is small.

Table 11.2. w_{|A|,|b|} values for A = clement(50).

Table 11.3. w_{|A|,|b|} values for A = gfpp(50).

11.3. Notes and References

Wilkinson [1088, 1963] gave a detailed analysis of iterative refinement in a kind of scaled fixed point arithmetic called block-floating arithmetic. Moler [765, 1967] extended the analysis to floating point arithmetic. Very readable analyses of iterative refinement are given in the books by Forsythe and Moler [396, 1967, §22] and Stewart [941, 1973, §4.5].

As we mentioned in §9.10, as early as 1948 Wilkinson had written a program for the ACE to do GEPP and iterative refinement.
Other early implementations of iterative refinement are in a code for the University of Illinois' ILLIAC by Snyder [932, 1955], the Algol code of McKeeman [745, 1962], and the Algol codes in the Handbook [138, 1966], [729, 1966]. Some of the machines for which these codes were intended could accumulate inner products in extended precision, and so were well suited to mixed precision iterative refinement.

Interest in fixed precision iterative refinement was sparked by two papers that appeared in the late 1970s. Jankowski and Wozniakowski [610, 1977] proved that an arbitrary linear equation solver is made normwise backward stable by the use of fixed precision iterative refinement, as long as the solver is not too unstable to begin with and A is not too ill conditioned.
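This phenomenon is easy to observe. The following sketch (illustrative only; the matrix and tolerances are ad hoc) uses GE without pivoting as a deliberately unstable solver: a tiny (1,1) pivot causes large element growth, yet a few fixed precision refinement steps restore a normwise backward error η(x) = ||b − Ax||_∞ / (||A||_∞ ||x||_∞ + ||b||_∞) at roundoff level.

```python
import numpy as np

def solve_ge_no_pivot(A, b):
    """Gaussian elimination WITHOUT pivoting: deliberately unstable
    when a small pivot is met (large multipliers, large growth)."""
    n = A.shape[0]
    U = A.astype(float).copy()
    y = b.astype(float).copy()
    for k in range(n - 1):
        m = U[k+1:, k] / U[k, k]          # huge multipliers for a tiny pivot
        U[k+1:, k:] -= np.outer(m, U[k, k:])
        y[k+1:] -= m * y[k]
    x = np.empty(n)
    for k in range(n - 1, -1, -1):        # back substitution
        x[k] = (y[k] - U[k, k+1:] @ x[k+1:]) / U[k, k]
    return x

def eta(A, x, b):
    """Normwise backward error ||b - Ax|| / (||A|| ||x|| + ||b||)."""
    r = b - A @ x
    return (np.linalg.norm(r, np.inf) /
            (np.linalg.norm(A, np.inf) * np.linalg.norm(x, np.inf)
             + np.linalg.norm(b, np.inf)))

A = np.array([[1e-10, 1.0],
              [1.0,   1.0]])   # well conditioned, but the (1,1) pivot is tiny
b = np.array([1.0, 2.0])

x = solve_ge_no_pivot(A, b)
etas = [eta(A, x, b)]
for _ in range(3):             # fixed precision refinement, reusing the solver
    d = solve_ge_no_pivot(A, b - A @ x)
    x = x + d
    etas.append(eta(A, x, b))

print(etas)   # the backward error drops by many orders of magnitude
```

The unrefined solution is far from backward stable, yet refinement repairs it, in line with the Jankowski and Wozniakowski result: the solver is unstable but not "too unstable", and A is well conditioned.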
Skeel [920, 1980] analysed iterative refinement for GEPP and showed that one step of refinement yields a small componentwise relative backward error, as long as cond(A^{-1})σ(A, x) is not too large.

The analysis in §11.1 extends existing results in the literature. The analysis in §11.2 is from Higham [549, 1991].

The quantity σ(A, x) appearing in Theorem 11.4 can be interpreted as follows. Consider a linear system Ax = b for which (|A||x|)_i = 0 for some i. While the componentwise relative backward error w_{|A|,|b|}(x) of the exact solution x is zero, an arbitrarily small change to a component x_j for which a_ij ≠ 0 yields w_{|A|,|b|}(x + ∆x) ≥ 1. Therefore solving Ax = b to achieve a small componentwise relative backward error can be regarded as an ill-posed problem when |A||x| has a zero component.
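The scenario just described can be made concrete with a short sketch (the numbers are illustrative; the small component of |A||x| is taken as tiny rather than exactly zero so that σ(A, x) remains finite):

```python
import numpy as np

def comp_backward_error(A, x, b):
    """Componentwise relative backward error w_{|A|,|b|}(x)."""
    r = np.abs(b - A @ x)
    return np.max(r / (np.abs(A) @ np.abs(x) + np.abs(b)))

def sigma(A, x):
    """sigma(A, x) = max_i (|A||x|)_i / min_i (|A||x|)_i."""
    s = np.abs(A) @ np.abs(x)
    return s.max() / s.min()

A = np.eye(2)
x = np.array([1e-12, 1.0])   # (|A||x|)_1 is nearly zero, so sigma is huge
b = A @ x                    # b constructed so that x is an exact solution

w_exact = comp_backward_error(A, x, b)        # zero residual -> w = 0
dx = np.array([1e-10, 0.0])                   # tiny perturbation of x_1
w_pert = comp_backward_error(A, x + dx, b)    # w jumps to order 1

print(sigma(A, x), w_exact, w_pert)
```

Although the perturbation is only of size 10^{-10}, the componentwise backward error jumps from zero to order 1; a large σ(A, x) flags exactly this situation.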
The quantity σ(A, x) reflects this ill-posedness because it is large when |A||x| has a relatively small component.

For a lucid survey of both fixed and mixed precision iterative refinement and their applications, see Björck [111, 1990]. For particular applications of fixed precision iterative refinement, see Govaerts and Pryce [475, 1990] and Jankowski and Wozniakowski [611, 1985].

By increasing the precision from one refinement iteration to the next it is possible to compute solutions to arbitrarily high accuracy, an idea first suggested by Stewart in an exercise [941, 1973, pp. 206–207].
For algorithms, see Kielbasinski [656, 1981] and Smoktunowicz and Sokolnicka [931, 1984].

There are a number of practical issues to attend to when implementing iterative refinement. Mixed precision iterative refinement cannot be implemented in a portable way when the working precision is already the highest precision supported by a compiler. This is the main reason why iterative refinement is not supported in LINPACK. (The LINPACK manual lists a subroutine that implements mixed precision iterative refinement for single precision data, but it is not part of LINPACK [307, 1979, pp. 1.8–1.10].) For either form of refinement, a copy of the matrix A needs to be kept in order to form the residual, and this necessitates an extra n^2 elements of storage. A convergence test for terminating the refinement is needed. In addition to revealing when convergence has been achieved, it must signal lack of (sufficiently fast) convergence, which may certainly be experienced when A is very ill conditioned. In the LAPACK driver xGESVX, fixed precision iterative refinement is terminated if the componentwise relative backward error w = w_{|A|,|b|} satisfies

1. w < u,
2. w has not decreased by a factor of at least 2 during the current iteration, or
3. five iterations have been performed.

These criteria were chosen to be robust in the face of different BLAS implementations and machine arithmetics. In an implementation of mixed precision iterative refinement it is more natural to test for convergence of the sequence of iterates, with a test such as ||x̂_{i+1} − x̂_i|| / ||x̂_{i+1}|| ≤ u (see, e.g., Forsythe and Moler [396, 1967, p. 65]). However, if A is so ill conditioned that Theorem 11.1 is not applicable, the sequence of iterates could converge to a vector other than the solution.
This behaviour is very unlikely, and Kahan [626, 1966] quotes a "prominent figure in the world of error-analysis" as saying "Anyone unlucky enough to encounter this sort of calamity has probably already been run over by a truck."

A by-product of extended precision iterative refinement is an estimate of the condition number. Since the error decreases by a factor approximately η on each iteration (Theorem 11.1), the relative change made to x on the first iteration should be about η, which yields an order-of-magnitude estimate of the condition number. Now that reliable and inexpensive condition estimators are available (Chapter 14) this rough estimate is less important.

An unusual application of iterative refinement is to fault-tolerant computing. Boley et al.
[132, 1994] propose solving Ax = b by GEPP or QR factorization, performing one step of fixed precision iterative refinement and then testing whether the a priori residual bound in Theorem 11.4 is satisfied. If the bound is violated then a hardware fault may have occurred and special action is taken.

11.3.1. LAPACK

Iterative refinement is carried out by routines whose names end -RFS, and these routines are called by the expert drivers (name ending -SVX). Iterative refinement is available for all the standard matrix types except triangular matrices, for which the original computed solution already has a componentwise relative backward error of order u. As an example, the expert driver xGESVX uses LU factorization with partial pivoting and fixed precision iterative refinement to solve a general system of linear equations with multiple right-hand sides, and the refinement is actually carried out by the routine xGERFS.

Problems

11.1.
Show that the bound used in the proof of Theorem 11.4 holds, where σ = max_i |x_i| / min_i |x_i|.

11.2. Use the analysis of §11.1 to show that, under the conditions of Theorem 11.4, the forward error is bounded by a multiple of cond(A, x)u for GEPP after one step of fixed precision iterative refinement.

11.3. Investigate empirically the size of || |L||L^{-1}| ||_∞ for L from GEPP.

11.4. (Demmel and Higham [291, 1992]) Suppose GEPP with fixed precision iterative refinement is applied to the multiple right-hand side system AX = B, and that refinement of the columns of X is done "in parallel": R = B − AX, AD = R, Y = X + D. What can be said about the stability of the process if R is computed by conventional multiplication but the second step is done using a fast multiplication technique for which only (12.3) holds?

11.5.
(RESEARCH PROBLEM) Is one step of fixed precision iterative refinement sufficient to produce a componentwise relative backward error of order u for Cholesky factorization applied to a symmetric positive definite system Ax = b, assuming cond(A^{-1})σ(A, x) is not too large? Answer the same question for the diagonal pivoting method with partial pivoting applied to a symmetric system Ax = b.

Chapter 12

Block LU Factorization

Block algorithms are advantageous for at least two important reasons.
First, they work with blocks of data having b^2 elements,
performing O(b^3) operations.
The O(b) ratio of work to storage means that
processing elements with an O(b) ratio of
computing speed to input/output bandwidth can be tolerated.
Second, these algorithms are usually rich in matrix multiplication.
This is an advantage because
nearly every modern parallel machine is good at matrix multiplication.

—ROBERT S.