For the general case, the overall cost is roughly 10n^3 operations if only the eigenvalues are needed, and about 25n^3 operations if the eigenvectors are also desired.

4.3 Methods for Computing Selected Eigenvalues

4.3.1 Power Method

The QR and Jacobi methods are designed to compute all of the eigenvalues of a matrix and consequently require a great deal of work.
In practice, one may need only one or a few eigenvalues and corresponding eigenvectors. The simplest method for computing a single eigenvalue and eigenvector of a matrix is the power method, which in effect takes successively higher powers of the matrix times an initial starting vector. Assume that the matrix has a unique eigenvalue λ1 of maximum modulus, with corresponding eigenvector u1. Then, starting from a given nonzero vector x0, the iteration scheme

    xk = A xk−1

converges to a multiple of u1, the eigenvector corresponding to the dominant eigenvalue λ1.

To see why, we first express the starting vector x0 as a linear combination, x0 = Σ_{i=1}^n αi ui, where the ui are eigenvectors of A.
We then have

    xk = A xk−1 = A^2 xk−2 = · · · = A^k x0
       = A^k Σ_{i=1}^n αi ui = Σ_{i=1}^n αi A^k ui = Σ_{i=1}^n λi^k αi ui
       = λ1^k ( α1 u1 + Σ_{i=2}^n (λi/λ1)^k αi ui ).

Since |λi/λ1| < 1 for i > 1, successively higher powers go to zero, leaving only the component corresponding to u1.

Example 4.10 Power Method. In the sequence of vectors produced by the power method, the ratio of the values of a given component of xk from one iteration to the next converges to the dominant eigenvalue λ1. For example, if

    A = [ 1.5  0.5 ]
        [ 0.5  1.5 ]

and x0 = [ 0 1 ]^T, then we obtain the following sequence.

    k      x_k^T             Ratio
    0      0.0     1.0
    1      0.5     1.5       1.500
    2      1.5     2.5       1.667
    3      3.5     4.5       1.800
    4      7.5     8.5       1.889
    5     15.5    16.5       1.941
    6     31.5    32.5       1.970
    7     63.5    64.5       1.985
    8    127.5   128.5       1.992

The sequence of vectors xk is converging to a multiple of the eigenvector [ 1 1 ]^T, and the ratio of successive iterates for each component is converging to the corresponding eigenvalue, 2, which we saw in Example 4.1 is indeed the largest eigenvalue of this matrix.

In practice the power method usually works, but it can fail for any of a number of reasons:

• The starting vector may have no component in the dominant eigenvector u1 (i.e., α1 = 0). This possibility is not a problem in practice, because rounding error usually introduces such a component in any case.

• For a real matrix and starting vector, the iteration can never converge to a complex vector.

• There may be more than one eigenvalue having the same (maximum) modulus, in which case the iteration may converge to a vector that is a linear combination of the corresponding eigenvectors.

4.3.2 Normalization

Geometric growth of the components at each iteration risks eventual overflow (or underflow if the dominant eigenvalue is less than 1 in magnitude), so normalizing the approximate eigenvector at each iteration is preferable, say, by requiring its largest component to have modulus 1.
This step gives the iteration scheme

    yk = A xk−1,
    xk = yk / ‖yk‖∞.

With this normalization, ‖yk‖∞ → |λ1|, and xk → u1 / ‖u1‖∞.

Example 4.11 Power Method with Normalization. Repeating the previous example with this normalized scheme, we get the following sequence:

    k      x_k^T             ‖yk‖∞
    0    0.000    1.0
    1    0.333    1.0        1.500
    2    0.600    1.0        1.667
    3    0.778    1.0        1.800
    4    0.882    1.0        1.889
    5    0.939    1.0        1.941
    6    0.969    1.0        1.970
    7    0.984    1.0        1.985
    8    0.992    1.0        1.992

The eigenvalue estimates have not changed, but now the approximate eigenvector is normalized at each iteration, thereby avoiding geometric growth or shrinkage of its components.
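The normalized scheme is easy to express in code. The following is a minimal NumPy sketch (the function name, tolerance, and iteration limit are my own choices, not from the text) that reproduces the iteration of Example 4.11 for the 2 × 2 matrix above.

```python
import numpy as np

def power_method(A, x0, maxiter=50, tol=1e-6):
    """Normalized power iteration.  Returns ||y_k||_inf, which converges to
    |lambda_1|, and the iterate x_k, scaled so its largest component is 1."""
    x = np.asarray(x0, dtype=float)
    lam = 0.0
    for _ in range(maxiter):
        y = A @ x                            # y_k = A x_{k-1}
        lam_new = np.linalg.norm(y, np.inf)  # ||y_k||_inf  ->  |lambda_1|
        x = y / lam_new                      # x_k = y_k / ||y_k||_inf
        if abs(lam_new - lam) < tol:
            break
        lam = lam_new
    return lam_new, x

A = np.array([[1.5, 0.5],
              [0.5, 1.5]])
lam, v = power_method(A, [0.0, 1.0])
print(lam, v)   # approaches 2 and a multiple of [1, 1], as in Example 4.11
```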
4.3.3 Geometric Interpretation

The behavior of the power method is depicted geometrically in Fig. 4.1. The eigenvectors of the example matrix are shown by dashed arrows. The initial vector

    x0 = [ 0 1 ]^T = ( [ −1 1 ]^T + [ 1 1 ]^T ) / 2

contains equal components in the two eigenvectors. Repeated multiplication by the matrix A, however, causes the component in the first eigenvector (corresponding to the larger eigenvalue, 2) to dominate, and hence the sequence of vectors converges to that eigenvector.
Figure 4.1: Geometric interpretation of the power method.

4.3.4 Shifts

The convergence rate of the power method depends on the ratio |λ2/λ1|, where λ2 is the eigenvalue having second-largest modulus: the smaller this ratio, the faster the convergence. It may be possible to choose a shift, A − σI, such that

    |λ2 − σ| / |λ1 − σ| < |λ2| / |λ1|,

and thus convergence is accelerated.
Of course, the shift must then be added to the result to obtain the eigenvalue of the original matrix. In our earlier example, for instance, if we pick a shift of σ = 1 (which is equal to the other eigenvalue), then the ratio becomes zero and the method converges in a single iteration. In general, we would not be able to make such a fortuitous choice, but such shifts can still be extremely useful in some contexts, as we will see later.
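To make the effect of such a shift concrete, here is a short sketch of my own (not from the text), which runs the normalized power iteration on the shifted matrix A − σI with σ = 1 for the example above and adds the shift back to the eigenvalue estimate.

```python
import numpy as np

A = np.array([[1.5, 0.5],
              [0.5, 1.5]])
sigma = 1.0                       # shift equal to the other eigenvalue
B = A - sigma * np.eye(2)         # iterate with the shifted matrix A - sigma*I
x = np.array([0.0, 1.0])
for k in range(3):
    y = B @ x
    est = np.linalg.norm(y, np.inf)
    x = y / est
    print(k + 1, est + sigma, x)  # add the shift back to recover an eigenvalue of A
# x is exactly the eigenvector [1, 1] after the first step, and the
# eigenvalue estimate settles at 2.0 on the next step.
```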
4.3.5 Deflation

Suppose that an eigenvalue λ1 and corresponding eigenvector x1 for a matrix A have been computed. We now consider how to compute additional eigenvalues λ2, . . . , λn of A, if needed, by a process called deflation, which effectively removes the known eigenvalue.

Let H be any nonsingular matrix such that Hx1 = αe1, a scalar multiple of the first column of the identity matrix I (for example, an appropriate Householder transformation is a good choice for H). Then the similarity transformation determined by H transforms A into the form

    H A H^{-1} = [ λ1  b^T ]
                 [  0   B  ],

where B is a matrix of order n − 1 having eigenvalues λ2, . . . , λn.
Thus, we can work with B to compute the next eigenvalue λ2. Moreover, if y2 is an eigenvector of B corresponding to λ2, then

    x2 = H^{-1} [ α  ]
                [ y2 ],    where α = b^T y2 / (λ2 − λ1),

is an eigenvector corresponding to λ2 for the original matrix A, provided λ1 ≠ λ2. This process can be repeated to find additional eigenvalues and eigenvectors, as needed.
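A minimal NumPy sketch of this similarity-transformation deflation follows (the helper name and the way the Householder reflector is built are my own choices, not from the text); for a reflector H we have H^{-1} = H, which keeps the transformation simple.

```python
import numpy as np

def deflate(A, x1):
    """Householder deflation: build a reflector H with H x1 = alpha*e1 and
    return b^T and B from the block form H A H^{-1} = [[lambda1, b^T], [0, B]]."""
    x1 = np.asarray(x1, dtype=float)
    v = x1.copy()
    v[0] += np.copysign(np.linalg.norm(x1), x1[0])     # Householder vector
    H = np.eye(len(x1)) - 2.0 * np.outer(v, v) / (v @ v)
    T = H @ A @ H                  # H is symmetric and orthogonal, so H^{-1} = H
    return T[0, 1:], T[1:, 1:]     # b^T and the deflated (n-1) x (n-1) block B

A = np.array([[1.5, 0.5],
              [0.5, 1.5]])
b, B = deflate(A, np.array([1.0, 1.0]))   # known eigenpair: lambda1 = 2, x1 = [1, 1]
print(B)   # 1 x 1 block whose eigenvalue is the remaining eigenvalue, 1.0
```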
An alternative approach to deflation is to let v1 be any vector such that v1^T x1 = λ1. Then the matrix A − x1 v1^T has eigenvalues 0, λ2, . . . , λn. Possible choices for v1 include the following (a brief numerical check appears after the list):

• v1 = λ1 x1, if A is symmetric and x1 is normalized so that ‖x1‖2 = 1

• v1 = λ1 y1, where y1 is the corresponding left eigenvector (i.e., A^T y1 = λ1 y1) normalized so that y1^T x1 = 1

• v1 = A^T ek, if x1 is normalized so that ‖x1‖∞ = 1 and the kth component of x1 is 1
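Here is the check, a sketch of my own (not from the text) using the first choice, v1 = λ1 x1, for the symmetric example matrix.

```python
import numpy as np

A = np.array([[1.5, 0.5],
              [0.5, 1.5]])
lam1 = 2.0
x1 = np.array([1.0, 1.0]) / np.sqrt(2.0)   # normalized so that ||x1||_2 = 1
v1 = lam1 * x1                             # symmetric choice: v1 = lambda1 * x1
A_deflated = A - np.outer(x1, v1)          # rank-one update A - x1 v1^T
print(np.linalg.eigvals(A_deflated))       # eigenvalues are now 0 and 1
```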
4.3.6 Inverse Iteration

For some applications, the smallest eigenvalue of a matrix is required rather than the largest. We can make use of the fact that the eigenvalues of A^{-1} are the reciprocals of those of A, and hence the smallest eigenvalue of A is the reciprocal of the largest eigenvalue of A^{-1}. We therefore use the inverse iteration scheme

    A yk = xk−1,
    xk = yk / ‖yk‖∞,

which is equivalent to the power method applied to A^{-1}. Of course, the inverse of A is not computed explicitly. Instead the system of linear equations is solved at each iteration, perhaps by LU factorization, which need be done only once. Inverse iteration converges to the eigenvector corresponding to the smallest eigenvalue of A. The eigenvalue obtained is the dominant eigenvalue of A^{-1}, and hence its reciprocal is the smallest eigenvalue of A in modulus.

As before, a shifting strategy (working with A − σI for some scalar σ) can greatly improve convergence.
For this reason, inverse iteration is particularly useful for computing the eigenvector corresponding to an approximate eigenvalue that has already been computed by some other means, because it converges very rapidly when applied to the matrix A − λI, where λ is an approximate eigenvalue. Inverse iteration is also useful for computing the eigenvalue of a matrix closest to a given value β, for if β is used as the shift, then the desired eigenvalue corresponds to the smallest eigenvalue of the shifted matrix.
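The following sketch (my own, not from the text) implements shifted inverse iteration, factoring A − σI once with SciPy's LU routines and reusing the factorization at every step. For simplicity the eigenvalue estimate is taken as σ + 1/‖yk‖∞, which recovers the magnitude |λ − σ|; that suffices for the example below, where the nearest eigenvalue lies above the shift.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def inverse_iteration(A, x0, sigma=0.0, maxiter=50, tol=1e-10):
    """Shifted inverse iteration: repeatedly solve (A - sigma*I) y = x and
    normalize.  Converges to the eigenvector whose eigenvalue is nearest sigma."""
    n = A.shape[0]
    lu, piv = lu_factor(A - sigma * np.eye(n))   # factor once, reuse each step
    x = np.asarray(x0, dtype=float)
    lam = np.inf
    for _ in range(maxiter):
        y = lu_solve((lu, piv), x)               # solve (A - sigma*I) y = x
        norm = np.linalg.norm(y, np.inf)
        x = y / norm
        lam_new = sigma + 1.0 / norm             # |lambda - sigma| added to the shift
        if abs(lam_new - lam) < tol:
            break
        lam = lam_new
    return lam_new, x

A = np.array([[1.5, 0.5],
              [0.5, 1.5]])
lam, v = inverse_iteration(A, [0.0, 1.0])        # sigma = 0: smallest eigenvalue of A
print(lam, v)   # approaches 1.0 and a multiple of [-1, 1], as in Example 4.12
```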
Example 4.12 Inverse Iteration. As an illustration of inverse iteration, we apply it to our previous example to compute the smallest eigenvalue, obtaining the sequence

    k       x_k^T             ‖yk‖∞
    0     0.000    1.0
    1    −0.333    1.0        0.750
    2    −0.600    1.0        0.833
    3    −0.778    1.0        0.900
    4    −0.882    1.0        0.944
    5    −0.939    1.0        0.971
    6    −0.969    1.0        0.985

which is converging to the eigenvector [ −1 1 ]^T corresponding to the dominant eigenvalue of A^{-1}, which is the same as the eigenvector corresponding to the smallest eigenvalue of A. The approximate eigenvalue is converging to 1, which is its own reciprocal in this case.

4.3.7 Rayleigh Quotient

If one is given an approximate eigenvector x for a real matrix A, determining the best estimate for the corresponding eigenvalue λ can be considered as an n × 1 linear least squares approximation problem

    x λ ≈ A x.

From the normal equation x^T x λ = x^T A x, we see that the least squares solution is given by

    λ = x^T A x / x^T x.

The latter quantity, known as the Rayleigh quotient, has many useful properties.
Forexample, it can be used to accelerate the convergence of an iterative method such as thepower method, since at iteration k the Rayleigh quotient xTk Axk /xTk xk gives a betterapproximation to an eigenvalue than that provided by the basic method alone.4.3.