Higham - Accuracy and Stability of Numerical Algorithms (523152), страница 75
Текст из файла (страница 75)
Let A, BB||2 < 1, thenIf rank(A) = rank(B) and η = ||A+||2||A–Proof. Let r = rank(A). A standard result on the perturbation of singularvalues givesthat is,which gives the result on rearranging.Lemma 19.8. Let A, BIf rank(A) = rank(B) thenProof. We haveThe result then follows from the (nontrivial) equality ||PA(I – PB)||2 = ||PB(I–PA)||2; see, for example, Stewart [943, 1977, Thin. 2.3] or Stewart and Sun408T HE LEAST S QUARES P ROBLEM[954, 1990, Lem.
3.3.5], where proofs that use the CS decomposition aregiven.Proof of Theorem 19.1. Let B := A + AA. We have, with r = b – Ax,y – z = B+(b + ∆b) – z = B+(r + Ax + ∆b) – x= B + (r + Bx – ∆Ax + ∆b) – x= B + (r – ∆Ax + ∆b) – (I – B+B)x= B + ( r – ∆Ax + ∆b),(19.23)since B has full rank. NowB + r = B + (BB + )r = B + P B r = B + P B (I – P A )r.(19.24)Applying Lemmas 19.7 and 19.8, we obtain(19.25)Similarly,(19.26)The bound for ||x – y||2 /||x||2 is obtained by using inequalities (19.25) and(19.26) in (19.23).Turning to the residual, using (19.23) we find thats – r = ∆b + B(x – y) – ∆Ax= ∆b – BB+(r – ∆Ax + ∆b) – ∆Ax= (I – BB + )(∆b – ∆Ax) – BB+r.Since ||I – BB+||2 = min{l, m – n },Using (19.24), Lemma 19.8, and ||BB+||2 = 1, we obtain19.9 N OTESANDR EFERENCES409HenceFor the attainability, see Wedin [1069, 1973, 6].Note that, as the proof has shown, Wedin’s theorem actually holds without any restriction on m and n, provided we define x = A + b and y =(A+ ∆A)+(b + ∆b) when m < n (in which case r = 0).
We consider underdetermined systems in detail in the next chapter. The original version ofWedin’s theorem also requires only rank(A ) = rank(A + ∆A) and not that Ahave full rank.19.9. Notes and ReferencesThe most comprehensive and up to date treatment of the LS problem is thebook by Björck [116, 1996], which is an updated and expanded version of [112,1990]. It treats many aspects not considered here, including rank-deficient,weighted, and constrained problems. An early book devoted to numericalaspects of the LS problem was written by Lawson and Hanson [695, 1974],who, together with Stewart [94 1, 1973], were the first to present error analysisfor the LS problem in textbook form.The history of the LS problem is described in the statistically orientedbook by Farebrother [363, 1988].The pseudo-inverse A+ underlies the theory of the LS problem, since theLS solution can be expressed as x = A+b.
An excellent reference for perturbation theory of the pseudo-inverse is Stewart and Sun [954, 1990, 3.3]. Theliterature on pseudo-inverses is vast, as evidenced by the annotated bibliography of Nashed and Rail [786, 1976], which contains 1,776 references publishedup to 1976.Normwise perturbation theory for the LS problem was developed by various authors in the 1960s and 1970s. The earliest analysis was by Golub andWilkinson [466, 1966], who gave a first-order bound and were the first to recognize the potential k2 (A)2 sensitivity.
A nonasymptotic perturbation boundwas given by Björck [107, 1967], who worked from the augmented system.An early set of numerical experiments on the Householder, Gram-Schmidt,and normal equations methods for solving the LS problem was presented byJordan [618, 1968]; this paper illustrates the incomplete understanding ofperturbation theory and error analysis for the LS problem at that time.van der Sluis [1041, 1975] presents a geometric approach to LS perturbation theory and gives lower bounds for the effect of worst-case perturbations.Golub and Van Loan [470, 1989, Thin.
5.3.1] give a first-order analogue ofTheorem 19.1 expressed in terms of the angle θ between b and range(A ) instead of the residual r.410T HE LEAST S QUARES P ROBLEMWei [1072, 1990] gives a normwise perturbation result for the LS problemwith a rank deficient A that allows rank(A + ∆A) > rank(A).Componentwise perturbation bounds of the form in Theorem 19.2 werefirst derived by Björck in 1988 and variations have been given by Arioli, Duff,and de Rijk [25, 1989], Björck [113, 1991], and Higham [542, 1990].Higham [542, 1990] examined the famous test problem from Longley [711,1967]—a regression problem which has a notoriously ill-conditioned 16 x 7coefficient matrix with k2 (A) 5 x 10 9 .
The inequality (19.8) was foundto give tight bounds for the effect of random componentwise relative perturbations of the problem generated in experiments of Beaton, Rubin, andBarone [86, 1976]. Thus componentwise perturbation bounds are potentiallyuseful in regression analysis as an alternative to the existing statistically basedtechniques.The tools required for a direct proof of the normwise backward error resultin Theorem 19.3 are developed in Wilkinson’s book The Algebraic EigenvalueProblem [1089, 1965].
Results of this form were derived informally by Goluband Wilkinson (assuming the use of extended precision inner products) [466,1966], stated by Wilkinson [1090, 1965, p. 93] and Stewart [941, 1973], andproved by Lawson and Hanson [695, 1974, Chap. 16].The idea of using QR factorization to solve the LS problem was mentionedin passing by Householder [586, 1958].
Golub [463, 1965] worked out thedetails, using Householder QR factorization, and this method is sometimescalled “Golub’s method”. In the same paper, Golub suggested the form ofiterative refinement described at the start of 19.5 (which is implemented ina procedure by Businger and Golub [167, 1965]), and showed how to use QRfactorization to solve an LS problem with a linear constraint Bx = c.It was Björck [106, 1967] who first recognized that iterative refinementshould be applied to the augmented system for best results, and he gave adetailed rounding error analysis for the use of a QR factorization computed bythe Householder or MGS methods.
Björck and Golub [118, 1967] give an Algolcode for computation and refinement of the solution to an LS problem witha linear constraint; they use Householder transformations, while Björck [108,1968] gives a similar code baaed on the Gram–Schmidt method. In [109, 1978],Björck dispels some misconceptions of statisticians about (mixed precision)iterative refinement for the LS problem; he discusses standard refinementtogether with two versions of refinement based on the seminormal equations.Error analysis for solution of the LS problem by the classical GramSchmidt method with reorthogonalization is given by Abdelmalek [2, 1971],who obtains a forward error bound as good as that for a backward stablemethod.Higharn and Stewart [569, 1987] compare the normal equations methodwith the QR factorization method, with emphasis on aspects relevant to regression problems in statistics.19.9 N OTESANDR EFERENCES411Foster [398, 1991] proposes a class of methods for solving the LS problemthat are intermediate between the normal equations method and the MGSmethod, and that can be viewed as block MGS algorithms.The most general analysis of QR factorization methods for solving the LSand related problems is by Björck and Paige [120, 1994], who consider anaugmented system with an arbitrary right-hand side (see Problem 20.1) andprove a number of subtle stability results.Theorem 19.4 and the following analysis are from Higham [549, 1991].Arioli, Duff, and de Rijk [25, 1989] investigate the application of fixed precision iterative refinement to large, sparse LS problems, taking the basic solverto be the block LDLT factorization code MA27 [329, 1982] from the HarwellSubroutine Library (applied to the augmented system); in particular, theyuse scaling of the form (19.17).
Björck [114, 1992] determines, via an erroranalysis for solution of the augmented system by block LDLT factorization, achoice of α in (19.17) that minimizes a bound on the forward error.The idea of implementing iterative refinement with a precision that increases on each iteration (see the Notes and References to Chapter 11) can beapplied to the LS problem; see Gluchowska and Smoktunowicz [453, 1990].The use of SNE was first suggested by Kahan, in the context of iterativerefinement, as explained by Golub and Wilkinson [466, 1966].Stewart [945, 1977] discusses the problem of finding the normwise backward error for the LS problem and offers some backward perturbations thatare candidates for being of minimal norm. The problem is also discussed byHigham [542, 1990]. Componentwise backward error for the LS problem hasbeen investigated by Arioli, Duff, and de Rijk [25, 1989], Björck [113, 1991],and Higham [542, 1990].Theorem 19.5 has been extended to the multiple right-hand side LS problem by Sun [976, 1996].Lemma 19.6 is from a book byand Schwetlick, which hasbeen published in German [658, 1988] and Polish [659, 1992] editions, butnot in English.