$$
J_{pq}(\theta) =
\begin{pmatrix}
1 &        &             &        &            &        &   \\
  & \ddots &             &        &            &        &   \\
  &        & \cos\theta  & \cdots & \sin\theta &        &   \\
  &        & \vdots      & \ddots & \vdots     &        &   \\
  &        & -\sin\theta & \cdots & \cos\theta &        &   \\
  &        &             &        &            & \ddots &   \\
  &        &             &        &            &        & 1
\end{pmatrix}
\qquad (3.8.3)
$$

The four trigonometric entries sit at the intersections of rows $p$ and $q$ with columns $p$ and $q$ (so $J_{pp} = J_{qq} = \cos\theta$, $J_{pq} = \sin\theta$, $J_{qp} = -\sin\theta$), and every other entry agrees with the identity matrix.

Not only is $J_{pq}(\theta)$ an orthogonal matrix, there is a reasonably pleasant way to picture its action on $n$-dimensional space. Since the $2 \times 2$ matrix of (3.8.1) is the familiar rotation of the plane through an angle $\theta$, we can say that the matrix $J_{pq}(\theta)$ carries out a special kind of rotation of $n$-dimensional space, namely one in which a certain plane, the plane of the $p$th and $q$th coordinates, is rotated through the angle $\theta$, and the remaining coordinates are all left alone. Hence $J_{pq}(\theta)$ carries out a two-dimensional rotation of $n$-dimensional space. These matrices of Jacobi turn out to be useful in a host of numerical algorithms for the eigenproblem.
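For concreteness, here is a small Maple illustration (ours, not part of the original notes) that builds $J_{pq}(\theta)$ for the sample values $n = 4$, $p = 2$, $q = 4$, $\theta = \pi/6$ and checks the orthogonality claim; all the variable names here are our own choices.

    with(LinearAlgebra):
    n:=4: p:=2: q:=4: theta:=Pi/6:
    J:=Matrix(IdentityMatrix(n)):           # mutable copy of the n x n identity
    J[p,p]:=cos(theta): J[p,q]:=sin(theta):
    J[q,p]:=-sin(theta): J[q,q]:=cos(theta):
    map(simplify, J . Transpose(J));        # returns the identity matrix, so J is orthogonal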
The first application that we'll make of them will be to real symmetric matrices, but later we'll find that the same two-dimensional rotations will play important roles in the solution of non-symmetric problems as well.

First, let's see how they can help us with symmetric matrices. What we propose to do is the following. If a real symmetric matrix $A$ is given, we will determine $p$, $q$, and the angle $\theta$ in such a way that the matrix $JAJ^T$ is a little bit more diagonal (whatever that means!) than $A$ is. It turns out that this can always be done, at any rate unless $A$ is already diagonal, so we will have the germ of a numerical procedure for computing eigenvalues and eigenvectors.

Indeed, suppose we have found out how to determine such an angle $\theta$, and let's then see what the whole process would look like.
Starting with $A$, we would find $p$, $q$, and $\theta$, and then the matrix $JAJ^T$ is somehow a little more diagonal than $A$ was. Now $JAJ^T$ is still a symmetric matrix (try to transpose it and see what happens), so we can do it again. After finding another $p$, $q$ and $\theta$ we will have $J''J'A(J''J')^T$ a bit "more diagonal," and so forth.

Now suppose that after some large number of repetitions of this process we find that the current matrix is very diagonal indeed, so that perhaps, aside from roundoff error, it is a diagonal matrix $D$.
Then we will know that

$$D = (\text{product of all $J$'s used})\, A\, (\text{product of all $J$'s used})^T. \qquad (3.8.4)$$

If we let $P$ denote the product of all $J$'s used, then we have $PAP^T = D$, so the columns of $P^T$ (that is, the rows of $P$) will be the (approximate) eigenvectors of $A$, and the diagonal elements of $D$ will be its eigenvalues. The matrix $P$ will automatically be an orthogonal matrix, since it is the product of such matrices, and the product of orthogonal matrices is always orthogonal (proof?).
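That "(proof?)" invitation is a one-liner; a sketch of the verification we have in mind (filled in here, not spelled out in the notes) for two orthogonal matrices $P$ and $Q$, which extends to any finite product by induction, is

$$(PQ)(PQ)^T = PQ\,Q^TP^T = P\,I\,P^T = PP^T = I.$$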
That, at any rate, is the main idea of Jacobi's method (he introduced it in order to study planetary orbits!). Let's now fill in the details.

First we'll define what we mean by "more diagonal." For any square, real matrix $A$, let $\mathrm{Od}(A)$ denote the sum of the squares of the off-diagonal entries of $A$. From now on, instead of "$B$ is more diagonal than $A$," we'll be able to say $\mathrm{Od}(B) < \mathrm{Od}(A)$, which is much more professional.

Now we claim that if $A$ is a real symmetric matrix, and it is not already a diagonal matrix, then we can find $p$, $q$ and $\theta$ such that $\mathrm{Od}(J_{pq}(\theta)AJ_{pq}(\theta)^T) < \mathrm{Od}(A)$. We'll do this by a very direct computation of the elements of $JAJ^T$ (we'll need them anyway for the computer program), and then we will be able to see what the new value of $\mathrm{Od}$ is.

So fix $p$, $q$ and $\theta$. Then by direct multiplication of the matrix in (3.8.3) by $A$ we find that

$$(JA)_{ij} = \begin{cases}
(\cos\theta)A_{pj} + (\sin\theta)A_{qj} & \text{if } i = p \\
-(\sin\theta)A_{pj} + (\cos\theta)A_{qj} & \text{if } i = q \\
a_{ij} & \text{otherwise.}
\end{cases} \qquad (3.8.5)$$

Then after one more multiplication, this time on the right by the transpose of the matrix in (3.8.3), we find that

$$(JAJ^T)_{ij} = \begin{cases}
CA_{ip} + SA_{iq} & \text{if } i \notin \{p,q\},\ j = p \text{ or } i = p,\ j \notin \{p,q\} \\
-SA_{ip} + CA_{iq} & \text{if } i \notin \{p,q\},\ j = q \text{ or } i = q,\ j \notin \{p,q\} \\
C^2A_{pp} + 2SCA_{pq} + S^2A_{qq} & \text{if } i = j = p \\
S^2A_{pp} - 2SCA_{pq} + C^2A_{qq} & \text{if } i = j = q \\
CS(A_{qq} - A_{pp}) + (C^2 - S^2)A_{pq} & \text{if } i = p,\ j = q \text{ or } i = q,\ j = p \\
a_{ij} & \text{otherwise.}
\end{cases} \qquad (3.8.6)$$

In (3.8.6) we have written $C$ for $\cos\theta$ and $S$ for $\sin\theta$.
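As a quick sanity check on (3.8.6) (our verification; the notes leave it to the reader), the $i = j = p$ entry follows from (3.8.5) together with the symmetry $A_{qp} = A_{pq}$:

$$(JAJ^T)_{pp} = C(JA)_{pp} + S(JA)_{pq} = C(CA_{pp} + SA_{qp}) + S(CA_{pq} + SA_{qq}) = C^2A_{pp} + 2SCA_{pq} + S^2A_{qq}.$$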
Now we are going to choose the angle $\theta$ so that the elements $A_{pq}$ and $A_{qp}$ are reduced to zero, assuming that they were not already zero. To do this we refer to the formula in (3.8.6) for $A_{pq}$, equate it to zero, and then solve for $\theta$. The result is

$$\tan 2\theta = \frac{2A_{pq}}{A_{pp} - A_{qq}} \qquad (3.8.7)$$

and we will choose the value of $\theta$ that lies between $-\frac{\pi}{4}$ and $\frac{\pi}{4}$.
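The computation behind (3.8.7), filled in here for completeness, applies the double-angle identities $2CS = \sin 2\theta$ and $C^2 - S^2 = \cos 2\theta$ to the $i = p,\ j = q$ entry of (3.8.6):

$$CS(A_{qq} - A_{pp}) + (C^2 - S^2)A_{pq} = 0 \iff \tfrac{1}{2}\sin 2\theta\,(A_{qq} - A_{pp}) = -\cos 2\theta\,A_{pq} \iff \tan 2\theta = \frac{2A_{pq}}{A_{pp} - A_{qq}}.$$

(When $A_{pp} = A_{qq}$ we take $\theta = \pm\pi/4$, since then $\cos 2\theta$ must vanish.)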
With this value of $\theta$, we will have reduced one single off-diagonal element of $A$ to zero in the new symmetric matrix $JAJ^T$.

The full Jacobi algorithm consists in repeatedly executing these plane rotations, each time choosing the largest off-diagonal element $A_{pq}$ and annihilating it by the choice (3.8.7) of $\theta$. After each rotation, $\mathrm{Od}(A)$ will be a little smaller than it was before. We will prove that $\mathrm{Od}(A)$ converges to zero.

It is important to note that a plane rotation that annihilates $A_{pq}$ may "revive" some other $A_{rs}$ that was set to zero by an earlier plane rotation. Hence we should not think of the zero as "staying put."

Let's now see exactly what happens to $\mathrm{Od}(A)$ after a single rotation. If we sum the squares of all of the off-diagonal elements of $JAJ^T$ using the formulas (3.8.6), but remembering that the new $A_{pq} = 0$, then it's quite easy to check that the new sum of squares is exactly equal to the old sum of squares minus the squares of the two entries $A_{pq}$ and $A_{qp}$ that were reduced to zero (the key point is that, since $C^2 + S^2 = 1$, we have $(CA_{ip} + SA_{iq})^2 + (-SA_{ip} + CA_{iq})^2 = A_{ip}^2 + A_{iq}^2$ for each $i \notin \{p,q\}$, so the off-diagonal entries in rows and columns $p$ and $q$ outside the $2 \times 2$ block contribute the same total as before). Hence we have

Theorem 3.8.1 Let $A$ be an $n \times n$ real, symmetric matrix that is not diagonal.
If $A_{pq} \neq 0$ for some $p \neq q$, then we can choose $\theta$ as in equation (3.8.7) so that if $J = J_{pq}(\theta)$ then

$$\mathrm{Od}(JAJ^T) = \mathrm{Od}(A) - 2A_{pq}^2 < \mathrm{Od}(A). \qquad (3.8.8)$$

3.9 Convergence of the Jacobi method

We have now described the fundamental operation of the Jacobi algorithm, namely the plane rotation in $n$-dimensional space that sends a real symmetric matrix $A$ into $JAJ^T$, and we have explicit formulas for the new matrix elements.
There are still a number of quite substantive points to discuss before we will be able to assemble an efficient program for carrying out the method. However, in line with the philosophy that it is best to break up large programs into small, manageable chunks, we are now ready to prepare the first module of the Jacobi program.

What we want is to be able to execute the rotation through an angle $\theta$, according to the formulas (3.8.6) of the previous section. This could be accomplished by a single subroutine that would take the symmetric matrix $A$ and the sine and cosine of the rotation angle, and execute the operation (3.8.6).

If we keep later applications in mind, then the best choice for a module will be one that will, on demand, multiply a given not-necessarily-symmetric matrix on the left by $J$ or on the right by $J^T$, depending on the call. This is one of the situations we referred to earlier where the most universal choice of subroutine will not be the most economical one in every application, but we will get a lot of mileage out of this routine!

Hence, suppose we are given:
1. an $n \times n$ real matrix $A$,
2. the sine $S$ and cosine $C$ of a rotation angle,
3. the plane $[p, q]$ of the rotation, and
4. a parameter option that will equal 1 if we want to do $JA$, and 2 if we want $AJ^T$.

The procedure will be called

    Procedure rotate(s,c,p,q,option);

What the procedure will do is exactly this. If called with option = 1, it will multiply $A$ on the left by $J$, according to the formulas (3.8.5), and exit.
If option = 2, it will multiply $A$ on the right by $J^T$. The Maple procedure is as follows:

    rotate:=proc(s,c,p,q,opt) local j,temp;
    global A,n;
      if opt=1 then
        # opt = 1: replace A by J.A; only rows p and q change (formulas (3.8.5))
        for j from 1 to n do
          temp:=evalf(c*A[p,j]+s*A[q,j]);   # new row-p entry, saved so the old A[p,j] survives
          A[q,j]:=-s*A[p,j]+c*A[q,j];       # new row-q entry uses the old A[p,j]
          A[p,j]:=temp;
        od
      else
        # opt = 2: replace A by A.J^T; only columns p and q change
        for j from 1 to n do
          temp:=c*A[j,p]+s*A[j,q];
          A[j,q]:=-s*A[j,p]+c*A[j,q];
          A[j,p]:=temp;
        od
      fi;
      RETURN()
    end:

To carry out one iteration of the Jacobi method, we will have to call rotate twice, once with option = 1 and then with option = 2 (a driver sketch appears at the end of this section).

The amount of computational labor that is done by this module is $O(n)$ per call, since only two lines of the matrix are affected by its operation.

Next, let's prove that the results of applying one rotation after another do in fact converge to a diagonal matrix.

Theorem 3.9.1 Let $A$ be a real symmetric matrix. Suppose we follow the strategy of searching for the off-diagonal element of largest absolute value, choosing $\theta$ so as to zero out that element by carrying out a Jacobi rotation on $A$, and then repeating the whole process on the resulting matrix, etc.
Then the sequence of matrices that is thereby obtained approaches a diagonal matrix $D$.

Proof. At a certain stage of the iteration, let $A_{pq}$ denote the off-diagonal element of largest absolute value. Since the maximum of any set of numbers is at least as big as the average of that set (proof?), it follows that the maximum of the squares of the off-diagonal elements of $A$ is at least as big as the average square of an off-diagonal element.
The average square is equal to the sum of the squares of the off-diagonal elements divided by the number of such elements, i.e., divided by $n(n-1)$. Hence the average square is exactly $\mathrm{Od}(A)/(n(n-1))$, and therefore

$$A_{pq}^2 \ge \frac{\mathrm{Od}(A)}{n(n-1)}. \qquad (3.9.1)$$

Now the effect of a single rotation of the matrix is to reduce $\mathrm{Od}(A)$ by $2A_{pq}^2$, so equation (3.8.8) yields

$$\mathrm{Od}(JAJ^T) = \mathrm{Od}(A) - 2A_{pq}^2 \le \mathrm{Od}(A) - \frac{2\,\mathrm{Od}(A)}{n(n-1)} = \left(1 - \frac{2}{n(n-1)}\right)\mathrm{Od}(A). \qquad (3.9.2)$$

Hence a single rotation will reduce $\mathrm{Od}(A)$ by a multiplicative factor of $1 - 2/(n(n-1))$ at least. Since this factor is less than 1, it follows that the sum of squares of the off-diagonal entries approaches zero as the number of plane rotations grows without bound, completing the proof.

The proof told us even more, since it produced a quantitative estimate of the rate at which $\mathrm{Od}(A)$ approaches zero.
Indeed, after $r$ rotations, the sum of squares will have dropped to at most

$$\left(1 - \frac{2}{n(n-1)}\right)^r \mathrm{Od}(\text{original } A). \qquad (3.9.3)$$

If we put $r = n(n-1)/2$, then we see that $\mathrm{Od}(A)$ has dropped by at least a factor of (approximately) $e$. Hence, after doing an average of one rotation per off-diagonal element, the function $\mathrm{Od}(A)$ is no more than $1/e$ times its original value. After doing an average of, say, $t$ rotations per off-diagonal element (i.e., $tn(n-1)/2$ rotations), the function $\mathrm{Od}(A)$ will have dropped to about $e^{-t}$ times its original value.
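To see how the pieces fit together, here is a sketch (our construction, not from the notes) of one full iteration of the strategy in Theorem 3.9.1, built on the rotate procedure above; the name jacobi_step is ours, and the bare equality test on $A_{pp}$ and $A_{qq}$ is a crude guard that production code would soften to a tolerance.

    # One Jacobi iteration, assuming the global n x n symmetric
    # Matrix A and the rotate procedure given earlier.
    jacobi_step:=proc() local i,j,p,q,big,theta,s,c;
    global A,n;
      # find the off-diagonal element of largest absolute value
      big:=0: p:=1: q:=2:
      for i from 1 to n do
        for j from 1 to n do
          if i<>j and abs(A[i,j])>big then
            big:=abs(A[i,j]); p:=i; q:=j;
          fi;
        od;
      od;
      # choose theta between -Pi/4 and Pi/4 by (3.8.7); Pi/4 if App = Aqq
      if A[p,p]=A[q,q] then theta:=evalf(Pi/4);
      else theta:=evalf(arctan(2*A[p,q]/(A[p,p]-A[q,q]))/2);
      fi;
      s:=evalf(sin(theta)): c:=evalf(cos(theta)):
      rotate(s,c,p,q,1);   # A := J.A
      rotate(s,c,p,q,2);   # A := A.J^T
      RETURN()
    end:

By Theorem 3.8.1, each such call reduces $\mathrm{Od}(A)$ by $2A_{pq}^2$, so by (3.9.3) repeated calls drive $A$ toward diagonal form at the geometric rate just derived.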