Higham - Accuracy and Stability of Numerical Algorithms (523152), страница 31
Текст из файла (страница 31)
(Theoretically it is an exact bound!)The reason is that we cannot compute r orexactly. In place of rwe computeand(7.26)Therefore a strict bound. and one that should be used in practice in place of(7.25), is(7.27)This forward error bound is estimated and returned by LAPACK‘s xyyRFSroutines. For details on how this is done without computing A -1, see Chapter 14.The LAPACK linear equation solvers estimate only one condition number:the standard condition number κ1 (A) (or, rather, its reciprocal, referred to asrcond), which is returned by the xyyCON routines.7.8.
Perturbation Theory by CalculusThe perturbation results in this book are all derived algebraically, without anyuse of derivatives. Calculus can also be used to derive perturbation bounds,often in a straight forward fashion.7.9 N OTESANDR EFERENCES145As a simple example, consider a linear system A(t)z(t) = b(t), whereand x(t), b(t)are assumed to be continuously differentiablefunctions of t. Differentiating givesor, dropping the t arguments,Taking norms, we obtainThis bound shows that κ(A) is a key quantity in measuring the sensitivity ofa linear system. A componentwise bound could have been obtained just aseasily.We normally express perturbations of the data in the formTo use the calculus framework we can take A(0) as the original matrix A andwritebut the perturbation boundthen becomes a first-order one.The calculus technique is a useful addition to the armoury of the erroranalyst (it is used by Golub and Van Loan [470, 1989], for example), but thealgebraic approach is preferable for deriving rigorous perturbation bounds ofthe standard forms.7.9.
Notes and ReferencesThis chapter draws on the survey paper Higham [558, 1994].Theorem 7.3 is due to Oettli and Prager [802, 196 4], and predates thenormwise backward error result Theorem 7.1 of Rigal and Gaches [873, 1967].In addition to Theorem 7.1, Rigal and Gaches give a more general resultbased on norms of blocks that includes Theorems 7.3 and 7.1 as special cases.Theorem 7.1 is also obtained by Kovarik [672, 1976].Theorems 7.1 and 7.3 both remain valid when A is rectangular. Componentwise backward error for rectangular A was considered by Oettli, Prager,and Wilkinson [803, 1965], but their results are subsumed by those of Oettliand Prager [802, 1964] and Rigal and Gaches [873, 1967].For a linear system Ax = b subject to componentwise perturbations, Oettli [801, 1965] shows how linear programming can be used to obtain boundson the components of x when all solutions of the perturbed system lie in thesame orthant.
Cope and Rust [244, 1979] extend this approach by showing, ingeneral, how to bound all the solutions that lie in a given orthant. This type146P ERTURBATIO N T HEORYFORLINEAR S YSTEMSof analysis can also be found in the book by Kuperman [681, 1971], whichincludes an independent derivation of Theorem 7.3. See also Hartfiel [505,1 9 8 0 ].Theorem 7.4 is a straightforward generalization of a result of Skeel [919,1979 , Thms. 2.1 and 2.2].
It is clear from Bauer’s comments in [80, 19 66]that the bound (7.9), with E = |A| and f = |b|, was known to him, thoughhe does not state the bound. This is the earliest reference we know in whichcomponentwise analysis is used to derive forward perturbation bounds.Theorem 7.8 is from Bauer [79, 1963]. Bauer actually states that equalityholds in (7.21) for any A, but his proof of equality is valid only when |A - 1||A|and |A||A- 1| have positive Perron vectors. Businger [168, 1968] proves thata sufficient condition for the irreducibility condition of Theorem 7.8 to hold(which, of course, implies the positivity of the Perron vectors) is that there donot exist permutations P and Q such that PAQ is in block triangular form.Theorem 7.7 is from Stewart and Sun [954, 1990, Thm.
4.3.5].Further results on scaling to minimize the condition number κ(A) are givenby Forsythe and Straus [386, 1955], Bauer [81, 1969], Golub and Varah [465,1974], McCarthy and Strang [742, 1974], Shapiro [913, 1982]. [914, 1985], [915,1991], and Watson [1067, 1991].Chan and Foulser [193, 1988] introduce the idea of “effective conditioning”for linear systems, which takes into account the projections of b onto the rangespace of A. See Problem 7.5, and for an application to partial differentialequations see Christiansen and Hansen [208, 1994].For an example of how definitions of numerical stability for linear equation solvers can be extended to incorporate structure in the problem, seeBunch [162, 1987].An interesting application of linear system perturbation analysis is toMarkov chains. A discrete-time Markov chain can be represented by a squarematrix P, where pij is the probability of a transition from state i to state j .Since state i must lead to some other state.and these conditionscan be writ ten in matrix vector form asPe = e.(7.28)A nonnegative matrix satisfying (7.28) is called a stochastic matrix.
Theinitial state of the Markov chain can be defined by a vector zT , where zi ,denotes the probability that the ith state of the chain is occupied. Then thestate of the chain at the next time unit is given by zTP. The steady state orstationary vector of the chain is given byAn important question is the sensitivity of the individual components of thesteady-state vector to perturbations in P. This is investigated, for example.147PROBLEMSby Ipsen and Meyer [605, 1994], who measure the perturbation matrix normwise, and by O’Cinneide [800, 1993], who measures the perturbation matrixcomponentwise.
For a matrix-oriented development of Markov chain theorysee Berman and Plemmons [94, 1994].It is possible to develop probabilistic perturbation theory for linear systemsand other problems by making assumptions about the statistical distributionof the perturbations. We do not consider this approach here (though see Problem 7.13), but refer the interested reader to the papers by Fletcher [376, 1985],Stewart [948, 1990], and Weiss, Wasilkowski, Wozniakowski, and Shub [1073,19 86].Problems7.1. Under the conditions of Theorem 7.4, show thatHence derive a first-order bound for |xi - yi |/|xi |.7.2.
Let Ax = b, whereShow that for any vector y and anysubordinate matrix norm,where the residual r = b - Ay. Interpret this result.7.3. Prove (7.13) and deduce (7.12).7.4. Letwhere D =cond(H) <be symmetric positive definite and let A = DHD,(this is the scaling used in Corollary 7.6). Show that< n cond(H).7.5.
(Chan and Foulser [193, 1988]) Lethave the SVD A =and define the projection matrix Pk :=wherewhere Uk = U(:,n + 1 - k:n). Show that if Ax = b and A(x + ∆x) =(b + ∆b) thenWhat is the interpretation of this result?7.6. (a) For the choice of tolerances E = |A|eeT, f = |b|, corresponding to arow-wise backward error, show thatP ERTURBATION T HEORY148(b) For E = eeT|A| and f =ward error, show thatFORLINEAR S YSTEMScorresponding to a columnwise back-7.7. Show that7.8. Letbe nonsingular. A componentwise condition number forthe problem of computing cTx, where Ax = b, can be defined byObtain an explicit formula for x E,f (A , x). Show that xE,f(A, x) > 1 ifE = |A| or f = |b|.
Derive the corresponding normwise condition numberin which the constraints are ||∆A||2 <and ||∆b|| 2 <7.9. (Bauer [79, 1963]) Let A, B, Cpositive elements then(a) Prove that if B and C havewhere= {diag(d i ) : di > 0, i = 1:n}. (Hint: consider D 1 = diag(x1 ) - 1and D 2 = diag(Cx1), where x1 > 0 is a right Perron vector of BC: BCx1 =p (B C) x1 . )(b) Deduce that if |A| and |A - 1 | have positive entries, then(c) Show that for any nonsingular A,(d) Strengthen (b) by showing that for any nonsingular A such that|A||A- 1 | and |A- 1 ||A|| are irreducible,PROBLEMS(e) What can you deduce about2-norms?149for the 1- and7.10.
(Bauer [80, 1966, p. 413], Rohn [876, 1989]) We can modify the definition of µE(A) in (7.22) by measuring ∆X componentwise relative to X,giving7.11. Letbe symmetric and let y be an approximate solution toAx = b. If y has a small backward error, so that y solves a nearby system. doesit follow that y solves a nearby symmetric system? This problem answers thequestion for both the normwise and componentwise relative backward errors.(a) (Bunch, Demmel, and Van Loan [163, 1989]) Show that if (A+G)y = bthen there exists H = HT such that (A + H)y = b with ||H|| 2 < ||G||2 and||H|| F <(This result does not require A = AT.)(b) (Smoktunowicz [930, 1995]) Show that if A is symmetric and diagonallythen there exists H = HT suchdominant and (A + G)y = b with |G| <that (A + H)y = b with |H| <(For a general symmetric A there maynot exist such an H, as is easily shown by a 2 × 2 example [527, 1992].)(c) (Smoktunowicz [930, 1995]) Show that if A is symmetric positive definite and (A + G)y = b with |G| <then there exists H = H T such that(A + H)y = b with |H| <7.12.