Example 4.13 Rayleigh Quotient. For Example 4.11 using the power method, the value of the Rayleigh quotient at each iteration is shown next.

      k        x_k^T        ||y_k||_∞    x_k^T A x_k / x_k^T x_k
      0    0.000   1.0                          1.500
      1    0.333   1.0        1.500             1.800
      2    0.600   1.0        1.667             1.941
      3    0.778   1.0        1.800             1.985
      4    0.882   1.0        1.889             1.996
      5    0.939   1.0        1.941             1.999
      6    0.969   1.0        1.970             2.000

Thus, the Rayleigh quotient converges to the dominant eigenvalue, 2, faster than the successive approximations produced by the power method alone.
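The computation in this example can be reproduced with a short NumPy sketch of the power method with the Rayleigh quotient evaluated at each iterate. The 2 × 2 matrix and starting vector below are assumptions chosen only to be consistent with the tabulated values (dominant eigenvalue 2); they are not quoted from Example 4.11.

```python
import numpy as np

def power_method_with_rayleigh(A, x0, iters=6):
    """Normalized power method; report the Rayleigh quotient at each iterate."""
    x = np.array(x0, dtype=float)
    for k in range(iters + 1):
        rq = x @ A @ x / (x @ x)           # Rayleigh quotient x^T A x / x^T x
        print(f"k={k}  x={x}  Rayleigh quotient={rq:.3f}")
        y = A @ x                          # power iteration step
        x = y / np.max(np.abs(y))          # normalize by the infinity norm
    return x

# Assumed data, consistent with the table above (eigenvalues 2 and 1).
A = np.array([[1.5, 0.5],
              [0.5, 1.5]])
power_method_with_rayleigh(A, [0.0, 1.0])
```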
4.3.8 Rayleigh Quotient Iteration

Given an approximate eigenvector, the Rayleigh quotient yields a very good estimate for the corresponding eigenvalue. Conversely, inverse iteration converges very rapidly to an eigenvector if an approximate eigenvalue is used as shift, with a single iteration often sufficing. It is natural, therefore, to combine these two ideas in the Rayleigh quotient iteration

    σ_k = x_k^T A x_k / x_k^T x_k,
    (A − σ_k I) y_{k+1} = x_k,
    x_{k+1} = y_{k+1} / ||y_{k+1}||_∞,

starting from a given nonzero vector x_0. This iteration scheme is especially effective for symmetric matrices and usually converges very rapidly.

On the other hand, using a different shift at each iteration means that the matrix must be refactored each time to solve the linear system, so that the cost per iteration is relatively high unless the matrix has some special form that makes the factorization easy. In general, the power method, inverse iteration, and Rayleigh quotient iteration show the expected trade-off, with faster convergence coming at the expense of more work per iteration.

Rayleigh quotient iteration also works for complex matrices, for which the transpose is replaced by conjugate transpose, and the Rayleigh quotient becomes x^H A x / x^H x.
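Expressed in code, one step of the iteration is a Rayleigh quotient evaluation followed by a single shifted linear solve. The NumPy sketch below is an illustration only, with an assumed matrix and starting vector; it also makes the refactorization cost visible, since the shifted matrix is handed to a dense solver anew at every step.

```python
import numpy as np

def rayleigh_quotient_iteration(A, x0, iters=3):
    """Rayleigh quotient iteration: shift by the current Rayleigh quotient."""
    x = np.array(x0, dtype=float)
    n = A.shape[0]
    for k in range(iters):
        sigma = x @ A @ x / (x @ x)                       # sigma_k = x_k^T A x_k / x_k^T x_k
        try:
            y = np.linalg.solve(A - sigma * np.eye(n), x) # (A - sigma_k I) y_{k+1} = x_k
        except np.linalg.LinAlgError:
            break                                         # shift hit an eigenvalue exactly
        x = y / np.max(np.abs(y))                         # x_{k+1} = y_{k+1} / ||y_{k+1}||_inf
        print(f"k={k+1}  x={x}  sigma={sigma:.3f}")
    return x

# Assumed data: the same 2 x 2 matrix as before and an illustrative starting vector.
A = np.array([[1.5, 0.5],
              [0.5, 1.5]])
rayleigh_quotient_iteration(A, [0.807, 0.397])
```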
Example 4.14 Rayleigh Quotient Iteration. Using the same matrix as our previous examples and a randomly chosen starting vector x_0, Rayleigh quotient iteration converges to the accuracy shown in only two iterations:

      k        x_k^T          σ_k
      0    0.807   0.397    1.896
      1    0.924   1.000    1.998
      2    1.000   1.000    2.000

4.3.9 Lanczos Method for Symmetric Matrices

The power method produces a sequence of vectors, each of which is a successively better approximation to an eigenvector. At any point in the process, however, the approximation is based on a single vector, which spans a one-dimensional subspace. A better approximation should result if we compute the best approximation to an eigenvector over an entire subspace of higher dimension. The Rayleigh-Ritz procedure is a method for doing just that.

Let A be an n × n symmetric matrix, and let S be an n × m matrix, n ≥ m, whose columns span a subspace of dimension m.
Orthogonalize the columns of S (see Section 3.4), if necessary, to obtain an n × m matrix Q with orthonormal columns spanning the same subspace. Form the m × m symmetric matrix B = Q^T A Q. Denote the eigenvalues and corresponding eigenvectors of B by γ_i and y_i, respectively, and let z_i = Q y_i, i = 1, . . . , m. Then it can be shown that the γ_i and z_i, which are called Ritz values and Ritz vectors, respectively, are the best possible approximations to eigenvalue-eigenvector pairs of A over the subspace spanned by S. One must still compute the eigenvalues of B, but if m ≪ n, this problem should be much easier.
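A direct transcription of this procedure is short; the following NumPy sketch (an illustration, not code from the text) orthonormalizes S by a QR factorization, forms B = Q^T A Q, and maps the eigenvectors of B back to Ritz vectors of A. The random test matrix and subspace are assumptions.

```python
import numpy as np

def rayleigh_ritz(A, S):
    """Ritz values and Ritz vectors of symmetric A over the subspace spanned by S."""
    Q, _ = np.linalg.qr(S)            # orthonormal columns spanning span(S)
    B = Q.T @ A @ Q                   # m x m projected symmetric matrix
    gamma, Y = np.linalg.eigh(B)      # eigenpairs of B
    return gamma, Q @ Y               # Ritz values gamma_i and Ritz vectors z_i = Q y_i

# Assumed example: random symmetric A and a random 4-dimensional subspace.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 20))
A = (A + A.T) / 2
S = rng.standard_normal((20, 4))
gamma, Z = rayleigh_ritz(A, S)
print("Ritz values:", gamma)
```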
So how can we obtain a suitable subspace? The answer is that we can use the Krylov subspace spanned by the sequence of vectors

    x, Ax, A^2 x, . . . , A^{m−1} x,

where x is any nonzero starting vector. Note that this is just the sequence of vectors generated by the power method, which means that we will obtain the best eigenvalue-eigenvector approximation over the entire subspace spanned by all of the iterates, rather than using only the last vector in the sequence.
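As a brute-force illustration (an assumption, not the text's code), one can build the Krylov vectors explicitly, orthonormalize them with a QR factorization, and project as in the Rayleigh-Ritz sketch above; each vector is rescaled as it is generated, which leaves the subspace unchanged but keeps the basis well conditioned. The Lanczos recurrence described next accomplishes the same thing far more economically.

```python
import numpy as np

def krylov_basis(A, x, m):
    """Columns spanning the Krylov subspace of x, Ax, ..., A^(m-1) x."""
    S = np.empty((len(x), m))
    S[:, 0] = x / np.linalg.norm(x)
    for j in range(1, m):
        v = A @ S[:, j - 1]                   # next power-method iterate
        S[:, j] = v / np.linalg.norm(v)       # rescaling does not change the span
    return S

# Assumed random symmetric test matrix.
rng = np.random.default_rng(1)
A = rng.standard_normal((100, 100))
A = (A + A.T) / 2
Q, _ = np.linalg.qr(krylov_basis(A, rng.standard_normal(100), m=10))
ritz = np.linalg.eigvalsh(Q.T @ A @ Q)        # Ritz values over the Krylov subspace
evals = np.linalg.eigvalsh(A)
print("extreme Ritz values:", ritz[0], ritz[-1])
print("extreme eigenvalues:", evals[0], evals[-1])
```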
Orthogonalization of an arbitrary set of vectors would be very expensive, but for the Krylov sequence, it can be shown that the successive orthogonal vectors satisfy a three-term recurrence, so that each new vector need be orthogonalized only against the previous two, rather than all of the previous vectors (which means that they need not be saved).
Thus, in this case the m × m matrix Q^T A Q is tridiagonal, and we denote it by T_m. As m increases, the eigenvalues of T_m become increasingly better approximations to the extreme (largest and smallest) eigenvalues of A.

The ideas we have just outlined (using the Rayleigh-Ritz approximation over the Krylov subspace and taking advantage of the resulting three-term recurrence) form the basis for the Lanczos method for computing eigenvalues and eigenvectors of symmetric matrices. Beginning with an arbitrary nonzero starting vector r_0, and taking β_0 = ||r_0||_2 and q_0 = o (the zero vector), the following steps are repeated for k = 1, . . . , m:

1. q_k = r_{k−1} / β_{k−1}.
2. u_k = A q_k.
3. r_k = u_k − β_{k−1} q_{k−1}.
4. α_k = q_k^T r_k.
5. r_k = r_k − α_k q_k.
6. β_k = ||r_k||_2.

The α_k, k = 1, . . . , m, and β_k, k = 1, . . . , m − 1, are the diagonal and subdiagonal entries, respectively, of the symmetric tridiagonal matrix T_m. If at any point β_k = 0, then the algorithm appears to break down, but in that case an invariant subspace has already been identified (i.e., the Ritz values and vectors are already exact at that point). Note that the algorithm as just stated does not produce the eigenvalues and eigenvectors directly but rather the tridiagonal matrix T_m, whose eigenvalues and eigenvectors must then be computed by some other method to obtain the Ritz values and vectors.
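A direct transcription of steps 1-6 into NumPy might look like the sketch below (an illustration, not the text's code). It performs no reorthogonalization, and it accesses the matrix only through a matrix-vector product routine, since that is all the recurrence requires; the diagonal test matrix at the end is an assumption.

```python
import numpy as np

def lanczos(matvec, r0, m):
    """Steps 1-6 above: return T_m (tridiagonal) and the Lanczos vectors q_1..q_m."""
    n = len(r0)
    Q = np.zeros((n, m + 1))                   # column 0 holds q_0 = o
    alpha, beta = np.zeros(m), np.zeros(m + 1)
    beta[0] = np.linalg.norm(r0)               # beta_0 = ||r_0||_2
    r = np.array(r0, dtype=float)
    j = 0
    for k in range(1, m + 1):
        Q[:, k] = r / beta[k - 1]              # 1. q_k = r_{k-1} / beta_{k-1}
        u = matvec(Q[:, k])                    # 2. u_k = A q_k
        r = u - beta[k - 1] * Q[:, k - 1]      # 3. r_k = u_k - beta_{k-1} q_{k-1}
        alpha[k - 1] = Q[:, k] @ r             # 4. alpha_k = q_k^T r_k
        r = r - alpha[k - 1] * Q[:, k]         # 5. r_k = r_k - alpha_k q_k
        beta[k] = np.linalg.norm(r)            # 6. beta_k = ||r_k||_2
        j = k
        if beta[k] == 0.0:                     # invariant subspace found: Ritz pairs exact
            break
    T = np.diag(alpha[:j]) + np.diag(beta[1:j], 1) + np.diag(beta[1:j], -1)
    return T, Q[:, 1:j + 1]

# Assumed test problem: diagonal matrix with eigenvalues 1, ..., 29.
A = np.diag(np.arange(1.0, 30.0))
T, _ = lanczos(lambda v: A @ v, np.ones(29), m=10)
print("extreme Ritz values:", np.linalg.eigvalsh(T)[[0, -1]])
```

The Ritz values are then obtained as the eigenvalues of the small tridiagonal matrix T_m, as noted above.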
In principle, if the foregoing algorithm were run until m = n, then the resulting tridiagonal matrix would be orthogonally similar to A. In practice, unfortunately, rounding error causes a loss of orthogonality that invalidates this expectation. This problem can be overcome by reorthogonalizing the vectors as needed, but the expense of doing so can be substantial. Alternatively, one can ignore the problem, in which case the algorithm still produces good eigenvalue approximations, but multiple copies of some eigenvalues may be generated, which can be a nuisance to say the least.
In any case, there are better ways to tridiagonalize a matrix (e.g., Householder's method) than running the Lanczos algorithm for n steps. The great virtue of the Lanczos method is its ability to produce good approximations to the extreme eigenvalues with m ≪ n, often on the order of √n. Moreover, the algorithm requires only one matrix-vector multiplication by A per step and very little auxiliary storage, so it is ideally suited to large sparse matrices, unlike methods that alter the entries of A.

Example 4.15 Lanczos Method. The behavior of the Lanczos method is illustrated in Fig. 4.2, where the algorithm is applied to a matrix of order 29 whose eigenvalues are 1, . . . , 29. The iteration count is plotted on the vertical axis, and the corresponding Ritz values are on the horizontal axis. At each iteration k, the points (γ_i, k), i = 1, . . . , k, are plotted. We see that the extreme eigenvalues are closely approximated by Ritz values after only a few iterations, but the interior eigenvalues take much longer to appear. For this small matrix with well-separated eigenvalues, the Ritz values are identical to the eigenvalues after 29 iterations, as theory predicts, but for more realistic problems this cannot be relied upon owing to rounding error. Moreover, running the algorithm for a full n iterations may not be feasible if n is very large.
The main point, however, is the relatively rapid convergence to the extreme eigenvalues, which is typical of the Lanczos method in general.

The Lanczos method most quickly produces approximate eigenvalues near the ends of the spectrum. If eigenvalues are needed in the middle of the spectrum, say, near the value σ, then the algorithm can be applied to the matrix (A − σI)^{−1}, assuming that it is practical to solve systems of the form (A − σI)x = y. Such a “shift-and-invert” strategy enables much more rapid convergence to interior eigenvalues, since they correspond to extreme eigenvalues of the new matrix.
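One possible realization of the shift-and-invert strategy is sketched below: factor A − σI once, wrap the resulting solves in a linear operator, and hand that operator to a Lanczos-type eigensolver (here SciPy's ARPACK-based eigsh, an assumed tool rather than anything prescribed by the text); the Ritz values θ of (A − σI)^{−1} are then mapped back to eigenvalues σ + 1/θ of A.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve
from scipy.sparse.linalg import LinearOperator, eigsh

def eigenvalues_near(A, sigma, k=3):
    """Approximate the k eigenvalues of symmetric A closest to sigma."""
    n = A.shape[0]
    lu = lu_factor(A - sigma * np.eye(n))                    # factor A - sigma*I once, reuse every step
    op = LinearOperator((n, n), matvec=lambda v: lu_solve(lu, v))
    theta = eigsh(op, k=k, which='LM', return_eigenvectors=False)
    return sigma + 1.0 / theta                               # map Ritz values of (A - sigma I)^{-1} back

# Assumed example: interior eigenvalues of the diagonal test matrix near sigma = 14.3.
A = np.diag(np.arange(1.0, 30.0))
print(eigenvalues_near(A, sigma=14.3))
```

For a large sparse matrix one would use a sparse factorization instead of the dense one shown here; eigsh can also perform the shift-and-invert transformation internally when given its sigma argument.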
A generalization of the Lanczos method to nonsymmetric matrices, known as the Arnoldi method, reduces the input matrix to Hessenberg form rather than tridiagonal form. Several software packages that implement the Lanczos and Arnoldi methods are available.

[Figure 4.2: Convergence of Ritz values to eigenvalues in the Lanczos method. Horizontal axis: Ritz values; vertical axis: iteration.]

4.3.10 Spectrum-Slicing Methods for Symmetric Matrices

Another family of methods is based on counting eigenvalues. For real symmetric matrices, there are various methods for determining the number of eigenvalues that are less than a given real number σ.
By systematically choosing various values for σ (slicing the spectrum at σ) and monitoring the resulting count, any eigenvalue can be isolated as accurately as desired. We sketch such methods briefly here.

Let A be a real symmetric matrix.
The inertia of A is a triple of integers consisting of the numbers of positive, negative, and zero eigenvalues. A congruence transformation has the form S A S^T, where S is any nonsingular matrix. Unless S^T = S^{−1} (i.e., S is orthogonal), a congruence is not a similarity transformation and hence does not preserve the eigenvalues of A. However, by Sylvester's Law of Inertia, a congruence transformation does preserve the inertia of A, i.e., the numbers of positive, negative, and zero eigenvalues are invariant under congruences.

If we can find a congruence transformation that makes the inertia easy to determine, then we can apply it to the matrix A − σI to determine the numbers of eigenvalues to the right or left of σ. An obvious candidate is the LDL^T factorization discussed in Section 2.5.2, where D is a matrix whose inertia is easily determined. By computing the LDL^T factorization, and hence the inertia, of A − σI for any desired value of σ, individual eigenvalues can be isolated as accurately as desired using an interval bisection technique (see Section 5.2.1).
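The following sketch shows how such a spectrum-slicing bisection might be organized, using SciPy's LDL^T routine (scipy.linalg.ldl) as the congruence; the routine choice and the dense test setup are assumptions, not part of the text. Here count_below(σ) returns the number of eigenvalues of A less than σ, and bisection on σ then isolates the k-th smallest eigenvalue to any desired tolerance.

```python
import numpy as np
from scipy.linalg import ldl

def count_below(A, sigma):
    """Number of eigenvalues of symmetric A less than sigma, via the inertia of A - sigma*I."""
    _, D, _ = ldl(A - sigma * np.eye(A.shape[0]))   # P(A - sigma I)P^T = L D L^T
    return int(np.sum(np.linalg.eigvalsh(D) < 0))   # congruence preserves the inertia

def kth_smallest_eigenvalue(A, k, lo, hi, tol=1e-10):
    """Bisection on sigma; the interval [lo, hi] must bracket the k-th smallest eigenvalue."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_below(A, mid) >= k:
            hi = mid                                 # at least k eigenvalues lie below mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Assumed example: random symmetric matrix, third smallest eigenvalue.
rng = np.random.default_rng(2)
A = rng.standard_normal((8, 8))
A = (A + A.T) / 2
print(kth_smallest_eigenvalue(A, k=3, lo=-10.0, hi=10.0))
print(np.linalg.eigvalsh(A)[2])                      # check against a dense eigensolver
```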
Another spectrum-slicing method for computing individual eigenvalues is based on the Sturm sequence property of symmetric matrices. Let A be a symmetric matrix and let p_k(σ) denote the determinant of the leading principal submatrix of order k of A − σI. Then the zeros of p_k(σ) strictly separate (i.e., are interleaved with) those of p_{k−1}(σ). Furthermore, the number of agreements in sign of successive members of the sequence p_k(σ), for k = 1, . . .