Heath - Scientific Computing (523150), page 28
In effect, we will rotate a so that it is aligned with the first coordinate axis. Then its second component will become zero. The previous equation can be rewritten as

$$\begin{bmatrix} a_1 & a_2 \\ a_2 & -a_1 \end{bmatrix} \begin{bmatrix} c \\ s \end{bmatrix} = \begin{bmatrix} \alpha \\ 0 \end{bmatrix}.$$

We can now perform Gaussian elimination on this system to obtain the triangular system

$$\begin{bmatrix} a_1 & a_2 \\ 0 & -a_1 - a_2^2/a_1 \end{bmatrix} \begin{bmatrix} c \\ s \end{bmatrix} = \begin{bmatrix} \alpha \\ -\alpha a_2/a_1 \end{bmatrix}.$$

Back-substitution then gives

$$s = \frac{\alpha a_2}{a_1^2 + a_2^2}, \qquad c = \frac{\alpha a_1}{a_1^2 + a_2^2}.$$

Finally, the requirement that $c^2 + s^2 = 1$, or $\alpha = \sqrt{a_1^2 + a_2^2}$, implies that

$$c = \frac{a_1}{\sqrt{a_1^2 + a_2^2}}, \qquad s = \frac{a_2}{\sqrt{a_1^2 + a_2^2}}.$$

As with Householder transformations, unnecessary overflow or underflow can be avoided by appropriate scaling.
If $|a_1| > |a_2|$, then we can work with the tangent of the angle of rotation, $t = s/c = a_2/a_1$, so that the cosine and sine are given by

$$c = 1/\sqrt{1 + t^2}, \qquad s = c \cdot t.$$

If $|a_2| > |a_1|$, on the other hand, then we can use the analogous formulas involving the cotangent $\tau = c/s = a_1/a_2$, obtaining

$$s = 1/\sqrt{1 + \tau^2}, \qquad c = s \cdot \tau.$$

In either case, we can avoid squaring any magnitude larger than 1.
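As a concrete sketch, the scale-safe computation of $c$ and $s$ might look like the following (the function name `givens` is our own, not the book's):

```python
import math

def givens(a1, a2):
    """Compute c, s, and alpha for a Givens rotation mapping (a1, a2) to
    (alpha, 0), using the tangent/cotangent scaling described above so that
    no magnitude larger than 1 is ever squared.  (Illustrative sketch.)"""
    if a2 == 0.0:
        return 1.0, 0.0, a1
    if a1 == 0.0:
        return 0.0, 1.0, a2
    if abs(a1) > abs(a2):
        t = a2 / a1                      # tangent of the angle, |t| < 1
        c = 1.0 / math.sqrt(1.0 + t * t)
        s = c * t
    else:
        tau = a1 / a2                    # cotangent, |tau| <= 1
        s = 1.0 / math.sqrt(1.0 + tau * tau)
        c = s * tau
    return c, s, c * a1 + s * a2         # alpha: first component after rotation
```

Note that $-s\,a_1 + c\,a_2 = 0$ by construction, so the second component is annihilated exactly in exact arithmetic.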
Note that the angle of rotation need not be determined explicitly, as only its cosine and sine are actually needed.

Example 3.6 Givens Rotation. To illustrate the construction just described, we determine a Givens rotation that annihilates the second component of the vector

$$a = \begin{bmatrix} 4 \\ 3 \end{bmatrix}.$$

For this problem, we can safely compute the cosine and sine directly, obtaining

$$c = \frac{a_1}{\sqrt{a_1^2 + a_2^2}} = \frac{4}{5} = 0.8 \qquad \text{and} \qquad s = \frac{a_2}{\sqrt{a_1^2 + a_2^2}} = \frac{3}{5} = 0.6,$$

or, equivalently, we can use the tangent $t = a_2/a_1 = 3/4 = 0.75$ to obtain

$$c = \frac{1}{\sqrt{1 + (0.75)^2}} = \frac{1}{1.25} = 0.8 \qquad \text{and} \qquad s = c \cdot t = (0.8)(0.75) = 0.6.$$

Thus, the rotation is given by

$$G = \begin{bmatrix} c & s \\ -s & c \end{bmatrix} = \begin{bmatrix} 0.8 & 0.6 \\ -0.6 & 0.8 \end{bmatrix}.$$

To confirm that the rotation performs as expected, we compute

$$Ga = \begin{bmatrix} 0.8 & 0.6 \\ -0.6 & 0.8 \end{bmatrix} \begin{bmatrix} 4 \\ 3 \end{bmatrix} = \begin{bmatrix} 5 \\ 0 \end{bmatrix},$$

which shows that the zero pattern of the result is correct and that the norm is preserved. Note that the value of the angle of rotation, which in this case is about 36.87 degrees, does not enter directly into the computation and need not be determined explicitly. We have seen how to design a plane rotation to annihilate a given component of a vector in two dimensions.
To annihilate a selected component of a vector in n dimensions, we can apply the same technique by rotating the target component, say j, with another component, say i. The two selected components of the vector are used as before to determine the appropriate 2 × 2 rotation matrix, which is then embedded as a 2 × 2 submatrix in rows and columns i and j of the n-dimensional identity matrix I, as illustrated here for the case n = 5, i = 2, j = 4:

$$\begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & c & 0 & s & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & -s & 0 & c & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \\ a_5 \end{bmatrix} = \begin{bmatrix} a_1 \\ \alpha \\ a_3 \\ 0 \\ a_5 \end{bmatrix}.$$

Using a sequence of such Givens rotations, we can selectively and systematically annihilate entries of a matrix A to reduce the matrix to upper triangular form.
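In practice the n × n matrix above is never formed: only components i and j of the vector change. A minimal sketch of this (the function name and the example vector are our own) is:

```python
import math

def apply_givens(v, i, j, c, s):
    """Apply the plane rotation embedded in rows/columns i and j (1-based)
    of the identity to vector v, in place.  Only components i and j change;
    the full n-by-n matrix product is never formed."""
    vi, vj = v[i - 1], v[j - 1]
    v[i - 1] = c * vi + s * vj
    v[j - 1] = -s * vi + c * vj

# The n = 5, i = 2, j = 4 case above, with an example vector of our own:
# rotation determined from components a2 = 4, a4 = 3, so c = 0.8, s = 0.6.
a = [1.0, 4.0, 2.0, 3.0, 5.0]
apply_givens(a, 2, 4, 0.8, 0.6)
```

After the call, component 2 holds $\alpha = 5$, component 4 is zero, and the other components and the norm of the vector are unchanged.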
The only restriction on the order in which we annihilate entries is that we should avoid reintroducing nonzero values into matrix entries that have previously been annihilated, but this can be accomplished by a number of different orderings. Once again, the product of all of the rotations is itself an orthogonal matrix that gives us the desired QR factorization. A straightforward implementation of the Givens method for solving general linear least squares problems requires about 50 percent more work than the Householder method. It also requires more storage, since each rotation requires two numbers, c and s, to define it (and hence the zeroed entry $a_{ij}$ does not suffice for storage).
These work and storage disadvantages can be overcome to make the Givens method competitive with the Householdermethod, but at the cost of a more complicated implementation. Therefore, the Givensmethod is generally reserved for situations in which its greater selectivity is of paramountimportance, such as when the matrix is sparse or when some particular pattern of existingzeros must be maintained.As with Householder transformations, the matrix Q need not be formed explicitly because multiplication by the successive rotations produces the same effect as multiplicationby Q.
If Q is needed explicitly for some other reason, however, then it can be computed by multiplying each rotation in sequence times a matrix that is initially the identity matrix I.

Example 3.7 Givens QR Factorization. We illustrate Givens QR factorization by using it to solve the quadratic polynomial data-fitting problem in Example 3.2, with

$$A = \begin{bmatrix} 1 & -1.0 & 1.0 \\ 1 & -0.5 & 0.25 \\ 1 & 0.0 & 0.0 \\ 1 & 0.5 & 0.25 \\ 1 & 1.0 & 1.0 \end{bmatrix}, \qquad b = \begin{bmatrix} 1.0 \\ 0.5 \\ 0.0 \\ 0.5 \\ 2.0 \end{bmatrix}.$$

We can annihilate the (5,1) entry of A using a Givens rotation based on the fourth and fifth entries of the first column. The appropriate rotation is given by c = 0.707, s = 0.707. Applying this rotation $G_1$ to A and b yields

$$G_1 A = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0.707 & 0.707 \\ 0 & 0 & 0 & -0.707 & 0.707 \end{bmatrix} A = \begin{bmatrix} 1 & -1.0 & 1.0 \\ 1 & -0.5 & 0.25 \\ 1 & 0.0 & 0.0 \\ 1.414 & 1.061 & 0.884 \\ 0 & 0.354 & 0.530 \end{bmatrix}$$

and

$$G_1 b = \begin{bmatrix} 1.0 \\ 0.5 \\ 0.0 \\ 1.768 \\ 1.061 \end{bmatrix}.$$

We next annihilate the (4,1) entry using a Givens rotation based on the third and fourth entries of the first column.
The appropriate rotation is given by c = 0.577, s = 0.816. Applying this rotation $G_2$ yields

$$G_2 G_1 A = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0.577 & 0.816 & 0 \\ 0 & 0 & -0.816 & 0.577 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} G_1 A = \begin{bmatrix} 1 & -1.0 & 1.0 \\ 1 & -0.5 & 0.25 \\ 1.732 & 0.866 & 0.722 \\ 0 & 0.612 & 0.510 \\ 0 & 0.354 & 0.530 \end{bmatrix}$$

and

$$G_2 G_1 b = \begin{bmatrix} 1.0 \\ 0.5 \\ 1.443 \\ 1.020 \\ 1.061 \end{bmatrix}.$$

We continue up the first column in this manner until all of its subdiagonal entries have been annihilated. We then proceed similarly to the second and third columns, eventually producing the upper triangular matrix and transformed right-hand side

$$Q^T A = \begin{bmatrix} 2.236 & 0 & 1.118 \\ 0 & 1.581 & 0 \\ 0 & 0 & 0.935 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \qquad Q^T b = \begin{bmatrix} 1.789 \\ 0.632 \\ 1.336 \\ 0.338 \\ 0 \end{bmatrix},$$

where $Q^T$ is the product of all of the Givens rotations used.
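The whole sweep can be sketched programmatically (a stdlib-only Python illustration of the technique, not the book's code): each column is traversed bottom-up, rotating adjacent rows exactly as in the example, with b transformed alongside.

```python
import math

def givens_qr_solve(A, b):
    """Least squares via Givens QR: annihilate subdiagonal entries column by
    column with adjacent-row rotations (bottom-up, as in the example above),
    then back-substitute on the leading n-by-n triangle."""
    m, n = len(A), len(A[0])
    A = [row[:] for row in A]            # work on copies
    b = b[:]
    for k in range(n):                   # for each column...
        for i in range(m - 1, k, -1):    # ...annihilate entries bottom-up
            a1, a2 = A[i - 1][k], A[i][k]
            r = math.hypot(a1, a2)
            if r == 0.0:
                continue
            c, s = a1 / r, a2 / r
            for j in range(k, n):        # rotate rows i-1 and i of A
                t1, t2 = A[i - 1][j], A[i][j]
                A[i - 1][j] = c * t1 + s * t2
                A[i][j] = -s * t1 + c * t2
            t1, t2 = b[i - 1], b[i]      # and of b
            b[i - 1] = c * t1 + s * t2
            b[i] = -s * t1 + c * t2
    x = [0.0] * n                        # back-substitution on the triangle
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - sum(A[i][j] * x[j] for j in range(i + 1, n))) / A[i][i]
    return x

# The data-fitting problem of Example 3.7:
A = [[1, -1.0, 1.0], [1, -0.5, 0.25], [1, 0.0, 0.0], [1, 0.5, 0.25], [1, 1.0, 1.0]]
b = [1.0, 0.5, 0.0, 0.5, 2.0]
x = givens_qr_solve(A, b)
```

This is a sketch for clarity; a production code would store each rotation compactly rather than looping over whole rows, as discussed below.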
We can now solve the upper triangular system by back-substitution to obtain

$$x = \begin{bmatrix} 0.086 \\ 0.400 \\ 1.429 \end{bmatrix}.$$

3.4.6 Gram-Schmidt Orthogonalization

Another method for computing the QR factorization is the Gram-Schmidt orthogonalization process, which you may have seen in a calculus or linear algebra course. Given two vectors $a_1$ and $a_2$, we can determine two orthonormal vectors $q_1$ and $q_2$ that span the same subspace by orthogonalizing one of the given vectors against the other, as shown in Fig. 3.3. This process can be extended to an arbitrary number of vectors $a_k$ (up to the dimension of the space) by orthogonalizing each successive vector against all of the previous ones, giving the classical Gram-Schmidt orthogonalization procedure:

Figure 3.3: One step of Gram-Schmidt orthogonalization.

for k = 1 to n
    q_k = a_k
    for j = 1 to k − 1
        r_{jk} = q_j^T a_k
        q_k = q_k − r_{jk} q_j
    end
    r_{kk} = ‖q_k‖_2
    q_k = q_k / r_{kk}
end

If we take the $a_k$ to be the columns of the matrix A, then the resulting $q_k$ are the columns of Q and the $r_{jk}$ are the entries of the upper triangular matrix R in the QR factorization of A. Unfortunately, the classical Gram-Schmidt procedure requires separate storage for A, Q, and R because the original $a_k$ are needed in the inner loop, and hence the $q_k$ cannot overwrite the columns of A.
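A direct transcription of the classical procedure (a stdlib-only Python sketch; the function name is ours) makes the separate-storage point concrete: Q is built up alongside A, whose original columns are still read in the inner loop.

```python
import math

def classical_gram_schmidt(cols):
    """Classical Gram-Schmidt on a list of columns (each a list of floats).
    Returns Q (orthonormal columns) and upper triangular R with A = QR.
    The original columns are read in the inner loop, so Q cannot
    overwrite them in place."""
    n, m = len(cols), len(cols[0])
    Q = []
    R = [[0.0] * n for _ in range(n)]
    for k in range(n):
        qk = cols[k][:]                                  # q_k starts as a_k
        for j in range(k):
            R[j][k] = sum(Q[j][i] * cols[k][i] for i in range(m))  # r_jk = q_j^T a_k
            qk = [qk[i] - R[j][k] * Q[j][i] for i in range(m)]
        R[k][k] = math.sqrt(sum(x * x for x in qk))      # r_kk = ||q_k||_2
        Q.append([x / R[k][k] for x in qk])
    return Q, R
```

For example, orthogonalizing the columns $[1,1,0]^T$ and $[1,0,1]^T$ yields orthonormal $q_1$, $q_2$ with $a_2 = r_{12} q_1 + r_{22} q_2$.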
This shortcoming can be alleviated, however, if we orthogonalize each chosen vector in turn against all of the subsequent vectors, in effect generating the upper triangular matrix R by rows rather than by columns. This rearrangement of the computation is known as modified Gram-Schmidt orthogonalization:

for k = 1 to n
    r_{kk} = ‖a_k‖_2
    q_k = a_k / r_{kk}
    for j = k + 1 to n
        r_{kj} = q_k^T a_j
        a_j = a_j − r_{kj} q_k
    end
end

We have continued to write the $a_k$ and $q_k$ separately for clarity, but now they can in fact share the same storage.
(A programmer would have formulated the algorithm this way in the first place.) Unfortunately, separate storage for Q and R is still required, a disadvantage compared with the Householder method, for which Q and R can share the space formerly occupied by A. On the other hand, Gram-Schmidt provides an explicit representation for Q, which, if desired, would require additional storage with the Householder method. In addition to requiring less storage than the classical procedure, an added bonus of modified Gram-Schmidt is that it is also numerically superior to classical Gram-Schmidt: the two procedures are mathematically equivalent, but in finite-precision arithmetic the classical procedure tends to lose orthogonality among the computed $q_k$. The modified procedure also permits the use of column pivoting to deal with possible rank deficiency (see Section 3.4.8).
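The loss of orthogonality is easy to demonstrate. The following stdlib-only sketch (function names and the nearly dependent test matrix, a Läuchli-type example, are our own) runs both procedures on three columns that differ only by a tiny perturbation $\varepsilon$; classical Gram-Schmidt produces grossly non-orthogonal $q_k$, while the modified procedure stays orthogonal to near machine precision.

```python
import math

def cgs(cols):
    """Classical Gram-Schmidt: every inner product uses the ORIGINAL a_k."""
    Q = []
    for a in cols:
        q = a[:]
        for p in Q:
            r = sum(pi * ai for pi, ai in zip(p, a))     # r = q_j^T a_k
            q = [qi - r * pi for qi, pi in zip(q, p)]
        nrm = math.sqrt(sum(x * x for x in q))
        Q.append([x / nrm for x in q])
    return Q

def mgs(cols):
    """Modified Gram-Schmidt: once q_k is fixed, the remaining columns are
    immediately purged of their q_k component (R is generated by rows)."""
    A = [a[:] for a in cols]
    Q = []
    for k in range(len(A)):
        nrm = math.sqrt(sum(x * x for x in A[k]))
        q = [x / nrm for x in A[k]]
        Q.append(q)
        for j in range(k + 1, len(A)):
            r = sum(qi * ai for qi, ai in zip(q, A[j]))
            A[j] = [ai - r * qi for ai, qi in zip(A[j], q)]
    return Q

def max_nonorthogonality(Q):
    """Largest |q_i^T q_j| over all pairs i != j."""
    return max(abs(sum(a * b for a, b in zip(Q[i], Q[j])))
               for i in range(len(Q)) for j in range(i))

# Nearly linearly dependent columns (Lauchli-type matrix, our own test case)
eps = 1.0e-8
cols = [[1.0, eps, 0.0, 0.0],
        [1.0, 0.0, eps, 0.0],
        [1.0, 0.0, 0.0, eps]]
```

On this example classical Gram-Schmidt yields $|q_2^T q_3| \approx 0.5$, while modified Gram-Schmidt keeps all such inner products near $10^{-8}$ or below.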