Equation (2.13) can be rewritten

   y_{j+1} = 7jh^3 + [2 - h^2/(1 + j^2 h^2)] y_j - y_{j-1}.                (19.14)

TABLE 19.1. Comparison of three compact methods for solving Froberg's differential equation by finite difference approximation.

   Order of     Algorithm 24 and     Algorithm 24 and     Shooting
   problem n    equation (19.12)     equation (19.13)     method

   Largest deviation from true solution
    4           2·62E-6              2·88E-5              1·97E-6
   10           8·34E-6              1·02E-2              1·14E-5
   50           2·03E-4              0·930                2·03E-4

   Approximate time for the three problems (min)
                0·005                0·028                0·003

   All computations performed in six hexadecimal digit arithmetic on an IBM 370/168 computer.

Thus, since y_0 is fixed as zero by (2.9), if the value y_1 is known, all the points of the solution can be computed.
But (2.10) requires

   y_{n+1} = 2                                                             (19.15)

thus we can consider the difference

   f(y_1) = y_{n+1} - 2                                                    (19.16)

to generate a root-finding problem. This process is called a shooting method since we aim at the value of y_{n+1} desired by choosing y_1. Table 19.1 compares the three methods suggested for n = 4, 10 and 50. The main comparison is between the values found for the deviation from the true solution

   y(x) = x + x^3                                                          (19.17)

or

   y(x_j) = jh(1 + j^2 h^2).                                               (19.18)

It is to be noted that the use of the conjugate gradients method with the normal equations (19.13) is unsuccessful, since we have unfortunately increased the ill conditioning of the equations in the manner discussed in chapter 5. The other two methods offer comparable accuracy, but the shooting method, applying algorithm 18 to find the root of equation (19.16) starting with the interval [0, 0·5], is somewhat simpler and faster. In fact, it could probably be solved using a trial-and-error method to find the root on a pocket calculator.
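To make the shooting calculation concrete, here is a minimal Python sketch (not a transcription of the book's program): it advances the recurrence (19.14) from a trial y_1 on the mesh x_j = jh, h = 1/(n+1), and uses plain bisection on [0, 0·5] in place of the root-finder of algorithm 18 to drive the mismatch (19.16) to zero. The function names are illustrative assumptions.

def shoot(y1, n):
    """Advance recurrence (19.14) from y_0 = 0 and a trial y_1; return y_{n+1}."""
    h = 1.0 / (n + 1)
    y_prev, y_curr = 0.0, y1
    for j in range(1, n + 1):
        y_next = 7.0 * j * h**3 + (2.0 - h**2 / (1.0 + (j * h)**2)) * y_curr - y_prev
        y_prev, y_curr = y_curr, y_next
    return y_curr

def mismatch(y1, n):
    """The difference (19.16) in the right-hand boundary condition y_{n+1} = 2."""
    return shoot(y1, n) - 2.0

def bisect(fn, lo, hi, tol=1e-12):
    """Simple bisection on [lo, hi], standing in for algorithm 18."""
    flo = fn(lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        fmid = fn(mid)
        if flo * fmid <= 0.0:
            hi = mid
        else:
            lo, flo = mid, fmid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

for n in (4, 10, 50):
    h = 1.0 / (n + 1)
    y1 = bisect(lambda t: mismatch(t, n), 0.0, 0.5)
    print(n, y1, abs(y1 - h * (1.0 + h * h)))   # compare with y(x_1) from (19.18)

In exact arithmetic the recurrence (19.14) reproduces the true solution (19.18) once the correct y_1 is found, so the deviations printed (and those in table 19.1) mainly reflect the rounding behaviour of the arithmetic used.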
Example 19.2. Surveying-data fitting

The output below illustrates the solution of a linear least-squares problem of the type described in example 2.4. No weighting of the observations is employed here, though in practice one would probably weight each height-difference observation by some factor inversely related to the distance of the observation position from the points whose height difference is measured. The problem given here was generated by artificially perturbing the differences between the heights b = (0, 100, 121, 96)^T.
The quantities G printed are the residuals of the normal equations.

RUN
SURVEYING LEAST SQUARES
# OF POINTS? 4
# OF OBSERVATIONS? 5
HEIGHT DIFF BETWEEN? 1 AND? 2 =? -99.99
HEIGHT DIFF BETWEEN? 2 AND? 3 =? -21.03
HEIGHT DIFF BETWEEN? 3 AND? 4 =? 24.98
HEIGHT DIFF BETWEEN? 1 AND? 3 =? -121.02
HEIGHT DIFF BETWEEN? 2 AND? 4 =? 3.99
B( 1 )=-79.2575   G= 2.61738E-6
B( 2 )= 20.7375   G=-2.26933E-6
B( 3 )= 41.7575   G=-6.05617E-6
B( 4 )= 16.7625   G=-5.73596E-6
DIFF( 1 )=-99.995
DIFF( 2 )=-21.02
DIFF( 3 )= 24.995
DIFF( 4 )=-121.015
DIFF( 5 )= 3.97501
# MATRIX PRODUCTS= 4
HEIGHT FROM B(1)=0
2   99.995
3   121.015
4   96.02

The software diskette contains the data file EX24LS1.CNM which, used with the driver DR24LS.PAS, will execute this example.
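The fit above can be reproduced with any least-squares routine that returns the minimum-norm solution of this singular problem, which is also the solution the conjugate gradients iteration reaches when started from zero. Below is a Python/NumPy sketch of that reconstruction; it is not the book's program, and the variable names are mine.

import numpy as np

pairs = [(1, 2), (2, 3), (3, 4), (1, 3), (2, 4)]          # observation pairs as above
d = np.array([-99.99, -21.03, 24.98, -121.02, 3.99])      # observed height differences

A = np.zeros((len(pairs), 4))
for k, (i, j) in enumerate(pairs):
    A[k, i - 1] = 1.0                                      # each observation is b_i - b_j
    A[k, j - 1] = -1.0

b, *_ = np.linalg.lstsq(A, d, rcond=None)                  # minimum-norm least-squares fit
print("B:", b)                                             # -79.2575, 20.7375, 41.7575, 16.7625
print("fitted differences:", A @ b)                        # the DIFF values in the output above
print("heights from B(1)=0:", b - b[0])                    # 0, 99.995, 121.015, 96.02
print("normal-equation residuals:", A.T @ (A @ b - d))     # the quantities G, here near zero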
As a test of the method of conjugate gradients (algorithm 24) in solving least-squares problems of the above type, a number of examples were generated using all possible combinations of n heights. These heights were produced using a pseudo-random-number generator which produces numbers in the interval (0, 1). All m = n(n-1)/2 height differences were then computed and perturbed by pseudo-random values formed from the output of the above-mentioned generator minus 0·5, scaled to some range externally specified. Therefore, if S1 is the scale factor for the heights, S2 the scale factor for the perturbation and the function RND(X) gives a number in (0, 1), heights are computed using S1*RND(X) and perturbations on the height differences using S2*[RND(X) - 0·5]. Table 19.2 gives a summary of these calculations.
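A sketch of this test generation, in Python with rng.random() standing in for RND(X) and a generic minimum-norm least-squares solve in place of algorithm 24 (the names and the seed are illustrative assumptions), might look as follows; the two printed variances correspond to the two variance columns of table 19.2.

import numpy as np

def make_problem(n, S1, S2, seed=0):
    rng = np.random.default_rng(seed)
    heights = S1 * rng.random(n)                                  # S1*RND(X)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]   # all m = n(n-1)/2 pairs
    A = np.zeros((len(pairs), n))
    d = np.zeros(len(pairs))
    for k, (i, j) in enumerate(pairs):
        A[k, i], A[k, j] = 1.0, -1.0
        d[k] = heights[i] - heights[j] + S2 * (rng.random() - 0.5)   # S2*[RND(X) - 0.5]
    return A, d, heights

A, d, true_heights = make_problem(n=10, S1=1000.0, S2=0.1)
b, *_ = np.linalg.lstsq(A, d, rcond=None)            # minimum-norm least-squares fit
known = A @ true_heights                             # 'known' height differences
print(np.var(d - known), np.var(A @ b - known))      # perturbation variance and variance of
                                                     # the computed height differences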
It is important that the convergence tolerance be scaled to the problem. Thus I have used

   tol = n*S1*S1*eps*eps

where eps is the machine precision. The points to be noted in the solutions are as follows:
(i) the rapid convergence in terms of matrix-vector products needed to arrive at a solution (recall that the problem is singular since no 'zero' of height has been specified), and
(ii) the reduction in the variance of the computed height differences from their known values.
All the computations were performed on a Data General NOVA operating in 23-bit binary arithmetic.
TABLE 19.2. Surveying-data fitting by algorithm 24.

    n   m=n(n-1)/2   Matrix     Height     Perturbation   Perturbation   Variance of computed   Variance
                     products   scale S1   scale S2       variance†      height differences†    reduction factor
    4        6          3          100       0·01           9·15E-6          1·11E-6                0·12
    4        6          2          100       1              9·15E-2          1·11E-2                0·12
    4        6          2        10000       1              9·15E-2          1·11E-2                0·12
   10       45          3         1000       0·1            9·25E-4          2·02E-4                0·22
   20      190          3         1000       0·1            8·37E-4          6·96E-5                0·08
   20      190          3         1000       100            836·6            69·43                  0·08
   20      190          5         1000       2000           334631           26668·7                0·08

† From 'known' values.
19.3. INVERSE ITERATION BY ALGORITHM 24

The algorithm just presented, since it solves systems of linear equations, can be employed to perform inverse iteration on symmetric matrix eigenproblems via either of the schemes (9.10) or (9.12), that is, the ordinary and generalised symmetric matrix eigenproblems. The only difficulties arise because the matrix

   A' = A - sB                                                             (19.19)

where s is some shift, is not positive definite. (Only the generalised problem will be treated.) In fact, it is bound to be indefinite if an intermediate eigenvalue is sought. Superficially, this can be rectified immediately by solving the least-squares problem

   (A')^T (A') y_i = (A')^T B x_i                                          (19.20)

in place of (9.12a).
However, as with the Froberg problem of example 19.1, this is done at the peril of worsening the condition of the problem. Since Hestenes (1975) has pointed out that the conjugate gradients method may work for indefinite systems (it is simply no longer supported by a convergence theorem), we may be tempted to proceed with inverse iteration via conjugate gradients for any real symmetric problem.
Ruhe and Wiberg (1972) warn against allowing too large an increase in the norm of y in a single step of algorithm 24, and present techniques for coping with the situation. Of these, the one recommended amounts only to a modification of the shift.
However, since Ruhe and Wiberg were interested in refining eigenvectors already quite close to exact, I feel that an ad hoc shift may do just as well if a sudden increase in the size of the vector y, that is, a large step length k, is observed.
Thus my suggestion for solution of the generalised symmetric matrix eigenvalue problem by inverse iteration using the conjugate gradients algorithm 24 is as follows.
(i) Before each iteration, the norm (any norm will do) of the residual vector

   r = (A - eB)x                                                           (19.21)

should be computed and this norm compared to some user-defined tolerance as a convergence criterion.
While this is less stringent than the test made at STEPs 14and 15 of algorithm 10, it provides a constant running check of the closeness of thecurrent trial solution (e,x) to an eigensolution. Note that a similar calculationcould be performed in algorithm 10 but would involve keeping copies of thematrices A and B in the computer memory. It is relatively easy to incorporate arestart procedure into inverse iteration so that tighter tolerances can be enteredwithout discarding the current approximate solution.
Furthermore, by using b = 0 as the starting vector in algorithm 24 at each iteration and only permitting n conjugate gradient steps or less (by deleting STEP 12 of the algorithm), the matrix-vector multiplication of STEP 1 of algorithm 24 can be made implicit in the computation of residuals (19.21) since

   c = Bx.                                                                 (19.22)

Note that the matrix H to be employed at STEP 5 of the algorithm is

   H = (A - sB) = A'.                                                      (19.23)

(ii) To avoid too large an increase in the size of the elements of b, STEP 7 of algorithm 24 should include a test of the size of the step-length parameter k.
I use the test

   If ABS(k) > 1/SQR(eps), then . . .

where eps is the machine precision, to permit the shift s to be altered by the user. I remain unconvinced that satisfactory simple automatic methods yet exist to calculate the adjustment to the shift without risking convergence to an eigensolution other than that desired. The same argument applies against using the Rayleigh quotient to provide a value for the shift s. However, since the Rayleigh quotient is a good estimate of the eigenvalue (see §10.2, p 100), it is a good idea to compute it.
(iii) In order to permit the solution b of

   Hb = (A - sB)b = Bx = c                                                 (19.24)

to be used to compute the eigenvalue approximation

   e = s + 1/b_m                                                           (19.25)

where b_m is the largest element in magnitude in b (but keeps its sign!), I use the infinity norm (9.14) in computing the next iterate x from b.
To obtain an eigenvector normalised so that

   x^T B x = 1                                                             (19.26)

one has only to form

   x^T c = x^T B x.                                                        (19.27)

At the point where the norm of r is compared to the tolerance to determine if the algorithm has converged, c is once again available from the computation (19.21). The residual norm ||r|| should be divided by (x^T B x)^½ if x is normalised by this quantity.
Since only the solution procedure for the linear equations arising in inverse iteration has been changed from algorithm 10 (apart from the convergence criterion), the method outlined in the above suggestions will not converge any faster than a more conventional approach to inverse iteration.
Indeed, for problems which are ill conditioned with respect to solution via the conjugate gradients algorithm 24, which nominally requires that the coefficient matrix H be non-negative definite, the present method may take many more iterations. Nevertheless, despite the handicap that the shift s cannot be too close to the eigenvalue, or the solution vector b will 'blow up' and upset the conjugate gradients process, inverse iteration performed in the manner described does provide a tool for finding or improving eigensolutions of large matrices on a small computer. A sketch drawing together suggestions (i) to (iii) is given below; examples are given at the end of §19.4.
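The following is a minimal Python sketch of one way to realise suggestions (i) to (iii); it is not algorithm 24 itself. A plain conjugate gradients loop stands in for that algorithm, the step-length test of (ii) simply abandons the solve so that the user can alter the shift, and the iteration cap of 50 passes and the dense-matrix setting are illustrative assumptions.

import numpy as np

def cg_solve(H, c, n_steps, k_limit):
    """At most n_steps conjugate gradient steps on H b = c, starting from b = 0.
    Returns (b, flag); flag is True when a step length |k| exceeds k_limit,
    mimicking the test of suggestion (ii)."""
    b = np.zeros_like(c)
    r = c.copy()                          # residual c - H b for b = 0
    t = r.copy()                          # first search direction
    for _ in range(n_steps):
        Ht = H @ t
        denom = t @ Ht
        if denom == 0.0:
            break
        k = (r @ r) / denom               # step-length parameter k
        if abs(k) > k_limit:              # step too large: the shift should be altered
            return b, True
        b = b + k * t
        r_new = r - k * Ht
        t = r_new + ((r_new @ r_new) / (r @ r)) * t
        r = r_new
    return b, False

def cg_inverse_iteration(A, B, s, x, tol, itmax=50):
    eps = np.finfo(float).eps
    H = A - s * B                         # the possibly indefinite matrix (19.23)
    x = x / x[np.argmax(np.abs(x))]       # start with unit infinity norm
    e = s
    for _ in range(itmax):
        c = B @ x                         # (19.22), reused in the tests below
        r = A @ x - e * c                 # residual (19.21)
        if np.linalg.norm(r) / np.sqrt(x @ c) < tol:
            break                         # convergence test of suggestion (i)
        b, too_big = cg_solve(H, c, len(x), 1.0 / np.sqrt(eps))
        if too_big:
            raise RuntimeError("large step length: alter the shift s and restart")
        m = np.argmax(np.abs(b))          # largest element in magnitude, sign kept
        e = s + 1.0 / b[m]                # eigenvalue approximation (19.25)
        x = b / b[m]                      # next iterate, infinity-norm normalised
    return e, x / np.sqrt(x @ (B @ x))    # eigenvector scaled so that x^T B x = 1

With the shift s placed near, but not too near, the eigenvalue sought and any rough starting vector, a few passes normally suffice; if the step-length test fires, the shift should be moved further from the eigenvalue before restarting, since no automatic adjustment is attempted here.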
19.4. EIGENSOLUTIONS BY MINIMISING THE RAYLEIGH QUOTIENT

Consider two symmetric matrices A and B where B is also positive definite. The Rayleigh quotient defined by

   R = x^T A x / x^T B x                                                   (19.28)

then takes on its stationary values (that is, the values at which the partial derivatives with respect to the components of x are zero) at the eigensolutions of

   Ax = eBx.                                                               (2.63)

In particular, the maximum and minimum values of R are the extreme eigenvalues of the problem (2.63). This is easily seen by expanding

   x = Σ_{j=1}^{n} c_j φ_j                                                 (19.29)

where φ_j is the jth eigenvector corresponding to the eigenvalue e_j and the c_j are the coefficients of the expansion.
Then we have

   R = ( Σ_{j=1}^{n} c_j^2 e_j ) / ( Σ_{j=1}^{n} c_j^2 )                   (19.30)

provided the eigenvectors are B-orthonormal, that is, φ_i^T B φ_j = 1 for i = j and 0 otherwise. If

   e_1 > e_2 > . . . > e_n                                                 (19.31)

then the minimum value of R is e_n and occurs when x is proportional to φ_n. The maximum value is e_1. Alternatively, this value can be obtained via minimisation of -R. Furthermore, if B is the identity, then minimising the Rayleigh quotient

   R' = x^T (A - k 1_n)^2 x / x^T x                                        (19.32)

will give the eigensolution having its eigenvalue closest to k.
While any of the general methods for minimising a function may be applied to this problem, concern for storage requirements suggests the use of the conjugate gradients procedure. Unfortunately, the general-purpose algorithm 22 may converge only very slowly.
This is due (a) to the inaccuracy of the linear search, and (b) to loss of conjugacy between the search directions t_j, j = 1, 2, . . . , n. Both these problems are exacerbated by the fact that the Rayleigh quotient is homogeneous of degree zero, which means that it takes the same value for any vector Cx, where C is some non-zero constant. This causes the Hessian of the Rayleigh quotient to be singular, thus violating the conditions normally required for the conjugate gradients algorithm.
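To make these definitions concrete, the short Python sketch below evaluates the Rayleigh quotient (19.28), its gradient in the standard form 2(Ax - RBx)/(x^T B x) (the quantity a minimiser such as algorithm 22 would be given), its invariance under scaling of x, and the shifted quotient (19.32). The small matrices and the value of k are made up for the illustration, and NumPy's eigh supplies reference eigensolutions only.

import numpy as np

def rayleigh(A, B, x):
    return (x @ (A @ x)) / (x @ (B @ x))

def rayleigh_gradient(A, B, x):
    R = rayleigh(A, B, x)
    return 2.0 * (A @ x - R * (B @ x)) / (x @ (B @ x))

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
B = np.eye(3)

e, V = np.linalg.eigh(A)                    # reference eigensolutions (B is the identity here)
x = V[:, 0]                                 # eigenvector of the smallest eigenvalue
print(rayleigh(A, B, x), e[0])              # the minimum of R equals the smallest eigenvalue
print(np.linalg.norm(rayleigh_gradient(A, B, x)))    # gradient is (numerically) zero there
print(rayleigh(A, B, 7.0 * x))              # same value for Cx: homogeneous of degree zero

k = 2.5                                     # shifted quotient (19.32) with B = 1_n
M = A - k * np.eye(3)
Rp = lambda v: (v @ (M @ (M @ v))) / (v @ v)
print(min(Rp(V[:, j]) for j in range(3)), min((e - k) ** 2))   # both pick the e closest to k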