We can then write, using Lemma 18.1,

    v̂ = v + ∆v,   ||∆v||_2 ≤ γ_cm,    (18.3)

where, as required for the next two results, the dimension is now m. Here, we have introduced the generic constant

    γ_cm = cmu / (1 − cmu),

in which c denotes a small integer constant whose exact value is unimportant. We will make frequent use of γ_cm in the rest of this chapter, because it is not worthwhile to evaluate the integer constants in our bounds explicitly.
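As an illustration of the kind of constant manipulation involved (our own worked instance of the relations collected in Lemma 3.3, not a display from the text), with γ_k = ku/(1 − ku) we have

    \[
      \gamma_j + \gamma_k + \gamma_j\gamma_k
        = \frac{1}{(1-ju)(1-ku)} - 1
        \le \frac{1}{1-(j+k)u} - 1
        = \gamma_{j+k},
    \]

so a term such as γ_cm + γ_m + γ_cm γ_m can simply be absorbed into a single γ_c'm with c' = c + 1.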
Because we are not "chasing constants" we can afford to be somewhat cavalier and freely use inequalities of the kind collected in Lemma 3.3, such as γ_cm + γ_m + γ_cm γ_m ≤ γ_c'm, where we signify by the prime that the constant c on the right-hand side is different from that on the left.

The next result describes the application of a Householder matrix to a vector, and is the basis of all the subsequent analysis. In the applications of interest P is defined as in Lemma 18.1, but we will allow P to be an arbitrary Householder matrix. Thus v is an arbitrary, normalized vector, and the only assumption we make is that the computed v̂ satisfies (18.3).

Lemma 18.2. Let b ∈ R^m and consider the computation of y = P̂b = (I − v̂v̂^T)b, where the computed Householder vector v̂ satisfies (18.3). The computed ŷ satisfies

    ŷ = (P + ∆P)b,   ||∆P||_F ≤ γ_cm,

where P = I − vv^T.

Proof.
(Cf. the proof of Lemma 3.8.) We have ŷ = fl(b − v̂ fl(v̂^Tb)), where the computed inner product satisfies fl(v̂^Tb) = v̂^T(b + ∆b) with |∆b| ≤ γ_m|b|. Hence the intermediate vector ŵ = fl(v̂ fl(v̂^Tb)) satisfies ŵ = v̂v̂^Tb + ∆w with |∆w| ≤ γ_cm|v̂||v̂^T||b|. Then, using (18.3) to write v̂ = v + ∆v and absorbing the errors from the subtraction b − ŵ, we obtain ŷ = (P + ∆P)b with ||∆P||_F ≤ γ_cm, as required.

Next, we consider a sequence of Householder transformations applied to a matrix. Again, each Householder matrix is arbitrary and need have no connection to the matrix to which it is being applied. In the cases of interest, the Householder matrices P_k have the form (18.2), and so are of ever-decreasing effective dimension, but to exploit this property would not lead to any significant improvement in the bounds.

For the remaining results in this section, we make the (reasonable) assumption that an inequality of the form (18.4) holds, where r is the number of Householder transformations and it is implicit that G and H denote nonnegative matrices.
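Before turning to sequences of transformations, the single-vector computation y = (I − vv^T)b analysed in Lemma 18.2 can be made concrete with a short NumPy sketch (our own illustration: the helper names house and apply_house are not from the text, and house normalizes its output so that v^Tv = 2, matching P = I − vv^T):

    import numpy as np

    def house(x):
        # Householder vector v with v'v = 2 such that (I - vv')x is a multiple of e_1.
        sigma = np.linalg.norm(x)
        v = x.astype(float)
        v[0] += np.copysign(sigma, x[0])   # choose the sign to avoid cancellation
        return v / np.linalg.norm(v) * np.sqrt(2.0)

    def apply_house(v, b):
        # Compute (I - vv')b as b - v(v'b), without forming the matrix.
        return b - v * (v @ b)

    rng = np.random.default_rng(0)
    x = rng.standard_normal(8)
    b = rng.standard_normal(8)
    v = house(x)

    print(apply_house(v, x)[1:])                       # ~0: entries below the first are annihilated
    P = np.eye(8) - np.outer(v, v)
    print(np.linalg.norm(apply_house(v, b) - P @ b))   # tiny: both are rounding-level perturbations of Pb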
Lemma 18.3. Consider the sequence of transformations

    A_{k+1} = P_k A_k,   k = 1:r,

where A_1 = A ∈ R^{m×n} and P_k = I − v_k v_k^T is a Householder matrix. Assume that the transformations are performed using computed Householder vectors v̂_k that satisfy (18.3). The computed matrix Â_{r+1} satisfies

    Â_{r+1} = Q^T(A + ∆A),    (18.5)

where Q^T = P_r P_{r−1} . . . P_1 and ∆A satisfies the normwise and componentwise bounds

    ||∆A||_F ≤ rγ_cm ||A||_F,   |∆A| ≤ mrγ_cm G|A|,   ||G||_F = 1.

(In fact, we can take G = m^{−1}ee^T, where e = [1, 1, . . . , 1]^T.) In the special case n = 1, so that A ≡ a, we have

    â_{r+1} = (Q + ∆Q)^T a,   ||∆Q||_F ≤ rγ_cm.

Proof.
First, we consider the jth column of A, a_j, which undergoes the transformations

    a_j^(k+1) = P_k a_j^(k),   k = 1:r,   a_j^(1) = a_j,

each computed as in Lemma 18.2. By Lemma 18.2 we have

    â_j^(r+1) = (P_r + ∆P_r) . . . (P_1 + ∆P_1) a_j,

where each ∆P_k depends on j and satisfies ||∆P_k||_F ≤ γ_cm. Using Lemma 3.6 we obtain

    â_j^(r+1) = Q^T(a_j + ∆a_j),   ||∆a_j||_2 ≤ rγ_cm ||a_j||_2,    (18.6)

using Lemma 3.1 and assumption (18.4). Hence ∆A in (18.5) satisfies ||∆A||_F ≤ rγ_cm ||A||_F. Now, since ||a_j||_2 ≤ ||a_j||_1 = e^T|a_j|, from (18.6) we have

    |∆a_j| ≤ ||∆a_j||_2 e ≤ rγ_cm ||a_j||_2 e ≤ rγ_cm ee^T|a_j| = mrγ_cm G|a_j|,

where ||G||_F = 1 (since ||ee^T||_F = m). Finally, if n = 1, so that A is a column vector, then (as in the proof of Lemma 18.2) we can rewrite (18.5) as â_{r+1} = (Q + ∆Q)^T a, where ||∆Q||_F ≤ rγ_cm.

Note that the componentwise bound for ∆A in Lemma 18.3 does not imply the normwise one, because of the extra factor m in the componentwise bound. This is a nuisance, because it means we have to state both bounds in this and other analyses.

We now apply Lemma 18.3 to the computation of the QR factorization of a matrix.
Theorem 18.4. Let R̂ be the computed upper trapezoidal QR factor of A ∈ R^{m×n} (m ≥ n) obtained via the Householder QR algorithm. Then there exists an orthogonal Q ∈ R^{m×m} such that

    A + ∆A = QR̂,

where ||∆A||_F ≤ nγ_cm ||A||_F and |∆A| ≤ mnγ_cm G|A|, with ||G||_F = 1. The matrix Q is given explicitly as Q = (P_n P_{n−1} . . . P_1)^T, where P_k is the Householder matrix that corresponds to the exact application of the kth step of the algorithm to Â_k.

Proof. This is virtually a direct application of Lemma 18.3, with P_k defined as the Householder matrix that produces zeros below the diagonal in the kth column of the computed matrix Â_k. One subtlety is that the elements of R̂ below the diagonal are not computed explicitly, but are instead set to zero. However, it is easy to see that the conclusions of Lemmas 18.2 and 18.3 are still valid in these circumstances; the essential reason is that the elements of the product ∆P b in Lemma 18.2 that correspond to elements that are zeroed by the Householder matrix P are forced to be zero, and hence we can set the corresponding rows of ∆P to zero too, without compromising the bound on ||∆P||_F.

Finally, we consider use of the QR factorization to solve a linear system. Given a QR factorization of a nonsingular matrix A ∈ R^{n×n}, a linear system Ax = b can be solved by forming Q^Tb and then solving Rx = Q^Tb. From Theorem 18.4, the computed R̂ is guaranteed to be nonsingular if nγ_cn ||A||_F < σ_min(A), that is, if A is not too ill conditioned. We give only componentwise bounds.
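Before stating the result for linear systems, here is a NumPy sketch of the Householder QR algorithm itself (our own illustration, not the book's code; the names house and householder_qr are ours). It reduces A column by column, accumulates Q explicitly, and checks the backward error A + ∆A = QR̂ described by Theorem 18.4:

    import numpy as np

    def house(x):
        # Householder vector v with v'v = 2 such that (I - vv')x is a multiple of e_1.
        sigma = np.linalg.norm(x)
        v = x.astype(float)
        v[0] += np.copysign(sigma, x[0])
        return v / np.linalg.norm(v) * np.sqrt(2.0)

    def householder_qr(A):
        # Return Q (m-by-m orthogonal, accumulated explicitly) and R (m-by-n upper trapezoidal).
        m, n = A.shape
        R = A.astype(float)
        Q = np.eye(m)
        for k in range(n):
            v = np.zeros(m)
            v[k:] = house(R[k:, k])
            R -= np.outer(v, v @ R)    # apply P_k = I - vv' to the working matrix
            Q -= np.outer(Q @ v, v)    # accumulate Q = P_1 P_2 ... P_n = (P_n ... P_1)^T
            R[k+1:, k] = 0.0           # set the zeroed elements explicitly, as in the text
        return Q, R

    rng = np.random.default_rng(1)
    A = rng.standard_normal((50, 20))
    Q, R = householder_qr(A)
    print(np.linalg.norm(Q.T @ Q - np.eye(50)))                   # Q is orthogonal to working accuracy
    print(np.linalg.norm(A - Q @ R) / np.linalg.norm(A, 'fro'))   # backward error at rounding level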
Theorem 18.5. Let A ∈ R^{n×n} be nonsingular. Suppose we solve the system Ax = b with the aid of a QR factorization computed by the Householder algorithm. The computed x̂ satisfies

    (A + ∆A)x̂ = b + ∆b,

where

    |∆A| ≤ n^2 γ_cn G|A|,   |∆b| ≤ n^2 γ_cn G|b|,   ||G||_F = 1.

Proof. By Theorem 18.4, the computed upper triangular factor R̂ satisfies A + ∆A_1 = QR̂ with |∆A_1| ≤ n^2 γ_cn G_1|A| and ||G_1||_F = 1. By Lemma 18.3, the computed transformed right-hand side satisfies ĉ = Q^T(b + ∆b) with |∆b| ≤ n^2 γ_cn G_2|b|. Importantly, the same orthogonal matrix Q appears in the equations involving R̂ and ĉ. By Theorem 8.5, the computed solution x̂ to the triangular system R̂x = ĉ satisfies

    (R̂ + ∆R̂)x̂ = ĉ,   |∆R̂| ≤ γ_n|R̂|.

Premultiplying by Q yields

    Q(R̂ + ∆R̂)x̂ = Qĉ = b + ∆b,

that is, (A + ∆A_1 + Q∆R̂)x̂ = b + ∆b. Using |Q∆R̂| ≤ γ_n|Q||Q^T|(|A| + |∆A_1|), we obtain the stated bounds with ∆A = ∆A_1 + Q∆R̂, where G ≥ G_1 and ||G||_F = 1.

The proof of Theorem 18.5 naturally leads to a result in which b is perturbed. However, we can easily modify the proof so that only A is perturbed: the trick is to use Lemma 18.3 to write

    ĉ = (Q + ∆Q)^T b,   ||∆Q||_F ≤ nγ_cn,

and to premultiply by (Q + ∆Q)^{−T} instead of Q in the middle of the proof. This leads to the result

    (A + ∆A)x̂ = b,    (18.7)

with ∆A bounded in the same manner as in Theorem 18.5. An interesting application of Theorem 18.5 is to iterative refinement, as explained in §18.6.
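A minimal NumPy sketch of this solution process (our own illustration; np.linalg.qr calls LAPACK's Householder-based factorization, and the back substitution for Rx = Q^Tb is written out explicitly):

    import numpy as np

    def solve_via_qr(A, b):
        # Solve Ax = b by Householder QR: form Q'b, then back-substitute Rx = Q'b.
        Q, R = np.linalg.qr(A)
        c = Q.T @ b
        n = A.shape[0]
        x = np.zeros(n)
        for i in range(n - 1, -1, -1):   # back substitution
            x[i] = (c[i] - R[i, i+1:] @ x[i+1:]) / R[i, i]
        return x

    rng = np.random.default_rng(2)
    n = 100
    A = rng.standard_normal((n, n))
    b = rng.standard_normal(n)
    x = solve_via_qr(A, b)

    # Small normwise relative residual, consistent with the backward stability
    # described by Theorem 18.5.
    print(np.linalg.norm(b - A @ x) / (np.linalg.norm(A, 'fro') * np.linalg.norm(x)))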
18.4. Aggregated Householder Transformations

In Chapter 12 we noted that LU factorization algorithms can be partitioned so as to express the bulk of the computation as matrix-matrix operations (level-3 BLAS). For computations with Householder transformations the same goal can be achieved by aggregating the transformations. This technique is widely used in LAPACK.

One form of aggregation is the "WY" representation of Bischof and Van Loan [105, 1987].
This involves representing the product Q_r = P_r P_{r−1} . . . P_1 of r Householder transformations P_i = I − v_i v_i^T (where v_i^T v_i = 2) in the form

    Q_r = I + W_r Y_r^T,   W_r, Y_r ∈ R^{m×r}.

This can be done using the recurrence (18.8).

Using the WY representation, a partitioned QR factorization can be developed as follows. Partition A ∈ R^{m×n} as

    A = [A_1, B],   A_1 ∈ R^{m×r},   B ∈ R^{m×(n−r)},    (18.9)

and compute the Householder QR factorization of A_1, so that P_r P_{r−1} . . . P_1 A_1 is upper trapezoidal.
The product P_r P_{r−1} . . . P_1 = I + W_r Y_r^T is accumulated using (18.8), as the P_i are generated, and then B is updated according to

    B ← (I + W_r Y_r^T)B = B + W_r(Y_r^T B),

which involves only level-3 BLAS operations. The process is now repeated on the last m − r rows of B.

When considering numerical stability, two aspects of the WY representation need investigating: its construction and its application. For the construction, we need to show that the computed Ŵ_r and Ŷ_r satisfy conditions (18.10) and (18.11), which require, roughly, that I + Ŵ_r Ŷ_r^T be close to an orthogonal matrix and that Ŵ_r and Ŷ_r be bounded in norm, for modest constants d_1, d_2, and d_3.
Now consider the recurrence (18.8): the updating step in it is essentially a standard multiplication by a Householder matrix, albeit with less opportunity for rounding errors. It follows from Lemma 18.3 that the near orthogonality of the computed product of Householder matrices is inherited by Q̂_r = I + Ŵ_r Ŷ_r^T; one of the norm conditions in (18.11) follows similarly, and the other is trivial.
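To make the aggregation concrete, here is a NumPy sketch (our own illustration; the function names are ours, and the particular recurrence coded is one standard variant that need not coincide in appearance with the book's (18.8)). It accumulates W and Y with P_r . . . P_1 = I + WY^T, applies the aggregated transformation to a block B as in (18.9) using matrix-matrix products, and checks the near orthogonality of I + WY^T discussed above:

    import numpy as np

    def house(x):
        # Householder vector with v'v = 2, so that P = I - vv' zeros x below its first entry.
        sigma = np.linalg.norm(x)
        v = x.astype(float)
        v[0] += np.copysign(sigma, x[0])
        return v / np.linalg.norm(v) * np.sqrt(2.0)

    def wy_accumulate(A1):
        # QR-reduce the m-by-r block A1, returning R1 and W, Y with P_r ... P_1 = I + W Y^T.
        m, r = A1.shape
        R1 = A1.astype(float)
        W = np.zeros((m, 0))
        Y = np.zeros((m, 0))
        for k in range(r):
            v = np.zeros(m)
            v[k:] = house(R1[k:, k])
            R1 -= np.outer(v, v @ R1)      # R1 <- P_{k+1} R1
            # Q_{k+1} = P_{k+1} Q_k = I + [W, -v][Y, Q_k^T v]^T
            y_new = v + Y @ (W.T @ v)      # Q_k^T v = (I + W Y^T)^T v
            W = np.column_stack([W, -v])
            Y = np.column_stack([Y, y_new])
        return R1, W, Y

    rng = np.random.default_rng(3)
    m, r, nB = 60, 12, 30
    A1 = rng.standard_normal((m, r))
    B = rng.standard_normal((m, nB))

    R1, W, Y = wy_accumulate(A1)
    Q = np.eye(m) + W @ Y.T                      # aggregated P_r ... P_1
    print(np.linalg.norm(Q.T @ Q - np.eye(m)))   # near orthogonality, cf. (18.10)/(18.12)
    print(np.linalg.norm(Q @ A1 - R1))           # Q A1 reproduces the reduced block
    B_new = B + W @ (Y.T @ B)                    # level-3 style update B <- Q B
    print(np.linalg.norm(B_new - Q @ B))         # agrees with explicit multiplication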
Note that the condition (18.10) implies (18.12), that is, that Q̂_r = I + Ŵ_r Ŷ_r^T is close to an exactly orthogonal matrix (see Problem 18.13).

Next we consider the application of the WY representation. Suppose we form the update Q̂_r B = B + Ŵ_r(Ŷ_r^T B) for the B in (18.9). Analysing this level-3 BLAS-based computation using (18.12) and the very general assumption (12.3) on matrix multiplication (for the 2-norm), it is straightforward to show that the computed update satisfies (18.13): it is an exact orthogonal update of a perturbation of B, where the norm of the perturbation is bounded in terms of the error constants for the level-3 BLAS.

Two conclusions can be drawn. First, algorithms that employ the WY representation with conventional level-3 BLAS are as stable as the corresponding point algorithms. Second, the use of fast BLAS3 for applying the updates affects stability only through the constants in the backward error bounds.
The same conclusions apply to the more storage-efficient compact WY representation of Schreiber and Van Loan [905, 1989], and the variation of Puglisi [848, 1992].

18.5. Givens Rotations

Another way to compute the QR factorization is with Givens rotations. A Givens rotation (or plane rotation) G(i, j, θ) ∈ R^{m×m} is equal to the identity matrix except that

    g_ii = c,   g_ij = s,   g_ji = −s,   g_jj = c,
where c = cos θ and s = sin θ. The multiplication y = G(i, j, θ)x rotates x through θ radians clockwise in the (i, j) plane; see Figure 18.2. Algebraically,

    y_i = cx_i + sx_j,   y_j = −sx_i + cx_j,   y_k = x_k (k ≠ i, j),

and so y_j = 0 if

    c = x_i/(x_i^2 + x_j^2)^{1/2},   s = x_j/(x_i^2 + x_j^2)^{1/2}.    (18.14)

Givens rotations are therefore useful for introducing zeros into a vector one at a time. Note that there is no need to work out the angle θ, since c and s in (18.14) are all that are needed to apply the rotation.

Figure 18.2. Givens rotation, y = G(i, j, θ)x.
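For concreteness, here is a minimal NumPy sketch of constructing c and s via (18.14) and applying the rotation to zero one element (our own illustration; the function name givens is ours):

    import numpy as np

    def givens(xi, xj):
        # c, s from (18.14), so that [c s; -s c] applied to (xi, xj) zeros the second entry.
        r = np.sqrt(xi**2 + xj**2)
        return (1.0, 0.0) if r == 0 else (xi / r, xj / r)

    x = np.array([3.0, -1.0, 4.0, 2.0])
    i, j = 1, 3                       # zero x[3] against x[1]
    c, s = givens(x[i], x[j])
    y = x.copy()
    y[i] =  c * x[i] + s * x[j]
    y[j] = -s * x[i] + c * x[j]
    print(y)                          # y[3] is now zero (up to roundoff)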
In practice, we would scale the computation to avoid overflow (cf. §25.8).

To compute the QR factorization, Givens rotations are used to eliminate the elements below the diagonal in a systematic fashion. Various choices and orderings of rotations can be used; a natural one, sketched in code below, works through a generic 4 × 3 matrix by eliminating the subdiagonal elements one column at a time. The operation count for Givens QR factorization of a general m × n matrix (m ≥ n) is 3n^2(m − n/3) flops, which is 50% more than that for Householder QR factorization. The main use of Givens rotations is to operate on structured matrices: for example, to compute the QR factorization of a tridiagonal or Hessenberg matrix, or to carry out delicate zeroing in updating or downdating problems [470, 1989, §12.6].
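The following NumPy sketch (our own illustration; the elimination order shown is one natural choice, and np.hypot supplies the overflow-avoiding scaling mentioned above) computes a Givens QR factorization of a 4 × 3 matrix and checks A = QR:

    import numpy as np

    def givens(xi, xj):
        # c, s from (18.14); np.hypot scales internally to avoid overflow (cf. §25.8).
        r = np.hypot(xi, xj)
        return (1.0, 0.0) if r == 0 else (xi / r, xj / r)

    def givens_qr(A):
        # QR of an m-by-n matrix (m >= n) by Givens rotations, eliminating the
        # subdiagonal entries column by column, from the bottom of each column up.
        R = A.astype(float)
        m, n = A.shape
        Q = np.eye(m)
        for k in range(n):
            for i in range(m - 1, k, -1):              # zero R[i, k] against R[i-1, k]
                c, s = givens(R[i-1, k], R[i, k])
                G = np.array([[c, s], [-s, c]])
                R[[i-1, i], :] = G @ R[[i-1, i], :]    # apply the rotation to rows i-1, i
                Q[:, [i-1, i]] = Q[:, [i-1, i]] @ G.T  # accumulate Q so that A = Q R
                R[i, k] = 0.0                          # clean the annihilated entry
        return Q, R

    rng = np.random.default_rng(4)
    A = rng.standard_normal((4, 3))                    # the generic 4-by-3 case from the text
    Q, R = givens_qr(A)
    print(R)                                           # upper trapezoidal
    print(np.linalg.norm(A - Q @ R), np.linalg.norm(Q.T @ Q - np.eye(4)))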
Error analysis for Givens rotations is similar to that for Householder matrices, but a little easier. We omit the (straightforward) proof of the first result.

Lemma 18.6. Let a Givens rotation G(i, j, θ) be constructed according to (18.14). The computed ĉ and ŝ satisfy

    ĉ = c(1 + ε_c),   ŝ = s(1 + ε_s),    (18.15)

where |ε_c| and |ε_s| are bounded by a small constant multiple of the unit roundoff.

Lemma 18.7. Let x ∈ R^m and consider the computation of y = Ĝ_ij x, where Ĝ_ij is a computed Givens rotation in the (i, j) plane for which ĉ and ŝ satisfy (18.15). The computed ŷ satisfies

    ŷ = (G_ij + ∆G_ij)x,

where ||∆G_ij||_F is a small multiple of the unit roundoff and G_ij is an exact Givens rotation based on c and s in (18.15). All the rows of ∆G_ij except the ith and jth are zero.

Proof.