Nash - Compact Numerical Methods for Computers
The computation in exact arithmetic is given for comparison.

TABLE 5.1. Results from formula (5.21).

            Exact          Truncated       Rounded
(a1)²       1000000        100000 * 10     100000 * 10
(a2)²       1004004        100400 * 10     100400 * 10
(a3)²       1008016        100801 * 10     100802 * 10
(a4)²       1016064        101606 * 10     101606 * 10
sum         4028084        402807 * 10     402808 * 10
sum/4       1007021        100701 * 10     100702 * 10
-ā²         -1007012·25    -100701 * 10    -100701 * 10
var(a)      8·75           0               1 * 10 = 10

An added note of caution. In this computation all the operations are correctly rounded or truncated. Many computers are not so fastidious with their arithmetic.
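The effect shown in table 5.1 is easy to reproduce. The sketch below (Python, not the code used to prepare the table; six-digit arithmetic is simulated by rounding every intermediate result to six significant figures) applies the one-pass formula and the deviation-from-mean form to the same data.

```python
# A minimal sketch (not the code behind table 5.1) of the cancellation shown
# there: the one-pass formula  var(a) = (sum of a_i^2)/n - (mean of a)^2
# loses its figures when the data have a large mean, while the two-pass
# deviation-from-mean form does not.

def round6(x):
    """Round x to six significant decimal digits."""
    return float(f"{x:.6g}")

def var_one_pass(a, fl=lambda x: x):
    """Sum of squares over n minus squared mean, each step passed through fl."""
    n = len(a)
    s = 0.0
    for ai in a:
        s = fl(s + fl(ai * ai))
    mean = fl(sum(a) / n)
    return fl(fl(s / n) - fl(mean * mean))

def var_deviation(a, fl=lambda x: x):
    """Average squared deviation from the mean, each step passed through fl."""
    n = len(a)
    mean = fl(sum(a) / n)
    s = 0.0
    for ai in a:
        d = fl(ai - mean)
        s = fl(s + fl(d * d))
    return fl(s / n)

a = [1000, 1002, 1004, 1008]      # the data behind table 5.1
print(var_one_pass(a))             # 8.75 in exact arithmetic
print(var_one_pass(a, round6))     # 10.0 with six-digit rounding, as in the table
print(var_deviation(a, round6))    # 8.75: the deviation form is unaffected
```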
By using data in deviation form, this difficulty is avoided. However, there is a second difficulty in using the cross-products matrix which is inherent in its formation. Consider the least-squares problem defined by the matrix A and right-hand side b displayed in the original text. In six-digit rounded computation this produces a cross-products matrix which is singular since the first two columns or rows are identical. If we use deviations from means (and drop the constant column) a singular matrix still results. For instance, on a Data General NOVA minicomputer using a 23-bit binary mantissa (between six and seven decimal digits), the A matrix formed from deviation-from-mean data gives a cross-products matrix which, as printed by the machine, is again singular.

However, by means of the singular-value decomposition given by algorithm 1, the same machine computes the singular values of A (not A^T A) as

2·17533, 1·12603 and 1E-5.

Since the ratio of the smallest to the largest of these singular values is only slightly larger than the machine precision (2^-22 ≈ 2·38419E-7), it is reasonable to presume that the tolerance q in equation (3.37) should be set to some value between 1E-5 and 1·12603.
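To make the role of the tolerance concrete, the sketch below (an illustration in Python with made-up data, not algorithm 1 itself) shows the usual way such a q is applied: singular values not larger than q are treated as exact zeros when the least-squares solution is assembled.

```python
# Sketch of a truncated-SVD least-squares solution with tolerance q.
# Names (A, b, q) are illustrative; the data are not those of the text.
import numpy as np

def svd_least_squares(A, b, q=0.0):
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # reciprocals of the singular values, with those <= q suppressed
    inv_s = np.array([1.0 / si if si > q else 0.0 for si in s])
    x = Vt.T @ (inv_s * (U.T @ b))
    r = b - A @ x
    return x, float(r @ r)

# a nearly rank-deficient matrix: two singular values are of order 1E-7
A = np.array([[1.0, 1.0 + 1e-7, 2.0],
              [1.0, 1.0,        2.0],
              [1.0, 1.0,        2.0 + 1e-7],
              [1.0, 1.0,        2.0]])
b = np.array([1.0, 2.0, 3.0, 4.0])
for q in (0.0, 1e-5):
    x, rss = svd_least_squares(A, b, q)
    print(q, x, rss)   # with q = 0 the tiny singular values are inverted and x blows up
```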
Such a choice of q leads to a computed least-squares solution with a residual sum of squares

r^T r = 1·68955E-5.

With the tolerance q = 0, the computed solution has

r^T r = 1·68956E-4.

(In exact arithmetic it is not possible for the sum of squares with q = 0 to exceed that for a larger tolerance.)

When using the singular-value decomposition one could choose to work with deviations from means or to scale the data in some way, perhaps using columns which are deviations from means scaled to have unit variance. This will then prevent 'large' data from swamping 'small' data. Scaling of equations has proved a difficult and somewhat subjective issue in the literature (see, for instance, Dahlquist and Björck 1974, p 181ff).

Despite these cautions, I have found the solutions to least-squares problems obtained by the singular-value decomposition approach to be remarkably resilient to the omission of scaling and the subtraction of means.
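One possible form of the scaling just described, replacing each column by its deviations from the column mean divided by the column standard deviation, is sketched below (Python; the data and function names are illustrative and not taken from the book). Any solution computed from the scaled matrix must of course be transformed back to the original variables.

```python
# Sketch: standardise the columns of A before taking its SVD, so that 'large'
# columns do not swamp 'small' ones.
import numpy as np

def standardise_columns(A):
    means = A.mean(axis=0)
    devs = A - means                       # deviations from column means
    scales = devs.std(axis=0, ddof=0)      # column standard deviations
    scales[scales == 0.0] = 1.0            # leave constant columns alone
    return devs / scales, means, scales

A = np.array([[1000.0, 0.001],
              [1002.0, 0.004],
              [1004.0, 0.002],
              [1008.0, 0.003]])
B, means, scales = standardise_columns(A)
print(np.linalg.svd(B, compute_uv=False))  # singular values of the scaled matrix
```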
As a final example of the importance of using decomposition methods for least-squares problems, consider the data (Nash and Lefkovitch 1976), which depend on a parameter α and are displayed in the original text. This is a regression through the origin and can be shown to have an exact solution with a zero residual sum of squares. If we wish to use a method which only scans the data once, that is, explicit residuals are not computed, then solution of the normal equations allows the residual sum of squares to be computed via

r^T r = b^T b - b^T A x.          (5.23)

Alternatively, algorithm 4 can be used to form the sum of squares by means of the uncorrelated residuals (4.30).

The following solutions were found using a Hewlett-Packard 9830 desk calculator (machine precision equal to 1E-11, but all arrays in the examples stored in split precision equal to 1E-5):

(i) Conventional regression performed by using the Choleski decomposition (§7.1) to solve the normal equations gave
    (a) for α = 8, a solution with r^T r = 4·22E-4
    (b) for α = 64, a solution with r^T r = 0·046709.

(ii) Algorithm 4 gave
    (a) for α = 8, a solution with r^T r = 0
    (b) for α = 64, a solution with r^T r = 0.
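The danger in formula (5.23) is the subtraction of two nearly equal quantities. The sketch below (Python, with single precision standing in for short-precision array storage and made-up data rather than the Nash and Lefkovitch set) shows the one-pass value losing its figures while the explicitly computed residuals do not.

```python
# Contrast formula (5.23), r'r = b'b - b'Ax, with the residual sum of squares
# formed from explicit residuals, for a consistent (zero-residual) problem.
import numpy as np

rng = np.random.default_rng(1)
A = rng.uniform(1.0, 2.0, size=(20, 3)).astype(np.float32)
x_true = np.array([1.0, 2.0, 3.0], dtype=np.float32)
b = A @ x_true                              # consistent data: exact residual is zero

x = np.linalg.lstsq(A.astype(np.float64), b.astype(np.float64), rcond=None)[0]
x = x.astype(np.float32)

rss_formula = float(b @ b - b @ (A @ x))    # (5.23) in single precision
r = b - A @ x                               # explicit residuals, same precision
rss_explicit = float(r @ r)
print(rss_formula, rss_explicit)            # the one-pass value is dominated by rounding error
```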
Since the first edition of this book appeared, several authors have considered problems associated with the formation of the sum of squares and cross-products matrix, in particular the question of collinearity. See, for example, Nash (1979b) and Stewart (1987).

Chapter 6

LINEAR EQUATIONS - A DIRECT APPROACH

6.1. INTRODUCTION

So far we have been concerned with solving linear least-squares problems. Now the usually simpler problem of linear equations will be considered. Note that a program designed to solve least-squares problems will give solutions to linear equations. The residual sum of squares must be zero if the equations are consistent. While this is a useful way to attack sets of equations which are suspected to possess singular coefficient matrices, since the singular-value decomposition permits such to be identified, in general the computational cost will be too high. Therefore this chapter will examine a direct approach to solving systems of linear equations. This is a variant of the elimination method taught to students in secondary schools, but the advent of automatic computation has changed only its form, showing its substance to be solid.
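As a small illustration of the remark above (Python, with made-up data), a least-squares routine applied to a consistent square system recovers the solution of the equations with residuals at the level of rounding error.

```python
# A least-squares solver used on a consistent set of linear equations.
import numpy as np

A = np.array([[2.0, 1.0, 1.0],
              [1.0, 3.0, 2.0],
              [1.0, 0.0, 0.0]])
x_true = np.array([1.0, -2.0, 3.0])
b = A @ x_true                              # consistent right-hand side

x = np.linalg.lstsq(A, b, rcond=None)[0]
print(x)                                    # recovers x_true
print(b - A @ x)                            # residuals are at rounding level
```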
6.2. GAUSS ELIMINATION

Let us now look at this approach to the solution of equations (2.2). First note that if A is upper-triangular so that A_{ij} = 0 if i > j, then it is easy to solve for x by a back-substitution, that is

x_n = b_n / A_{nn}          (6.1)

x_{n-1} = (b_{n-1} - A_{n-1,n} x_n) / A_{n-1,n-1}          (6.2)

and generally

x_i = (b_i - \sum_{j=i+1}^{n} A_{ij} x_j) / A_{ii}.          (6.3)

These equations follow directly from equations (2.2) and the supposed upper- or right-triangular structure of A. The Gauss elimination scheme uses this idea to find solutions to simultaneous linear equations by constructing the triangular form

R x = f          (6.4)

from the original equations.
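Equations (6.1) to (6.3) translate directly into code; a sketch (Python, not the book's published program) is:

```python
# Back-substitution for an upper-triangular system R x = f, following (6.1)-(6.3).
import numpy as np

def back_substitute(R, f):
    n = len(f)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):             # last unknown first
        x[i] = (f[i] - R[i, i + 1:] @ x[i + 1:]) / R[i, i]
    return x

R = np.array([[2.0, 1.0, -1.0],
              [0.0, 3.0,  2.0],
              [0.0, 0.0,  4.0]])
f = np.array([2.0, 10.0, 8.0])
print(back_substitute(R, f))                   # [1. 2. 2.]; check that R @ x equals f
```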
Note that each of the equations (2.2), that is

\sum_{k=1}^{n} A_{ik} x_k = b_i          for i = 1, 2, . . . , n          (6.5)

can be multiplied by an arbitrary quantity without altering the validity of the equation; if we are not to lose any information this quantity must be non-zero. Furthermore, the sum of any two equations of (6.5) is also an equation of the set (6.5). Multiplying the first equation (i.e. that for i = 1) by

m_{i1} = A_{i1} / A_{11}          (6.6)

and subtracting it from the ith equation gives new equations

\sum_{k=1}^{n} \tilde{A}_{ik} x_k = \tilde{b}_i          for i = 2, 3, . . . , n          (6.7)

where

\tilde{A}_{ik} = A_{ik} - m_{i1} A_{1k}          (6.8)

and

\tilde{b}_i = b_i - m_{i1} b_1.          (6.9)

But

\tilde{A}_{i1} = A_{i1} - (A_{i1} / A_{11}) A_{11} = 0          (6.10)

so that we have eliminated all but the first element of column 1 of A.
This process can now be repeated with new equations 2, 3, . . . , n to eliminate all but the first two elements of column 2. The element A_{12} is unchanged because equation 1 is not a participant in this set of eliminations. By performing (n - 1) such sets of eliminations we arrive at an upper-triangular matrix R. This procedure can be thought of as an ordered sequence of multiplications by elementary matrices.
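Carried to completion, the eliminations followed by the back-substitution of equations (6.1)-(6.3) give a complete solver. A sketch (Python, without the pivoting discussed later in the chapter) is:

```python
# Gauss elimination as described so far: row i has m_ij = A_ij / A_jj times
# row j subtracted from it, column by column, leaving an upper-triangular R and
# modified right-hand side f; back-substitution then solves R x = f.
import numpy as np

def gauss_solve(A, b):
    R = A.astype(float).copy()
    f = b.astype(float).copy()
    n = len(f)
    for j in range(n - 1):                 # eliminate below the diagonal of column j
        for i in range(j + 1, n):
            m = R[i, j] / R[j, j]          # assumes R[j, j] is not zero (no pivoting)
            R[i, j:] -= m * R[j, j:]
            f[i] -= m * f[j]
    x = np.zeros(n)                        # back-substitution on R x = f
    for i in range(n - 1, -1, -1):
        x[i] = (f[i] - R[i, i + 1:] @ x[i + 1:]) / R[i, i]
    return x

A = np.array([[2.0, 1.0, -1.0],
              [-3.0, -1.0, 2.0],
              [-2.0, 1.0, 2.0]])
b = np.array([8.0, -11.0, -3.0])
print(gauss_solve(A, b))                   # [ 2.  3. -1.]
```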
The elementary matrix which eliminates A_{ij} will be denoted M_{ij} and is defined by

M_{ij} = 1_n - m_{ij} E_{ij}          (6.11)

where

m_{ij} = A_{ij} / A_{jj}          (6.12)

(the elements in A are all current, not original, values), 1_n is the unit matrix of order n, and E_{ij} is the matrix having 1 in the position ij and zeros elsewhere, that is

(E_{ij})_{rk} = \delta_{ir} \delta_{jk}          (6.13)

which uses the Kronecker delta, \delta_{ir} = 1 for i = r and \delta_{ir} = 0 otherwise.
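A short numerical check of definitions (6.11)-(6.13) (Python, illustrative data, not the book's code) is:

```python
# Build M_ij = 1_n - m_ij * E_ij and confirm that pre-multiplying by it wipes
# out the element A_ij, which is the elimination step described above.
import numpy as np

def elementary_eliminator(A, i, j):
    n = A.shape[0]
    E = np.zeros((n, n))
    E[i, j] = 1.0                          # (6.13): E_ij has a single 1 in position ij
    m = A[i, j] / A[j, j]                  # (6.12): multiplier from current elements
    return np.eye(n) - m * E               # (6.11)

A = np.array([[2.0, 1.0, -1.0],
              [-3.0, -1.0, 2.0],
              [-2.0, 1.0, 2.0]])
M21 = elementary_eliminator(A, 1, 0)       # zero-based indices for i = 2, j = 1
print(M21 @ A)                             # the element in row 2, column 1 is now zero
```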
The effect of M_{ij} when pre-multiplying a matrix A is to replace the ith row with the difference between the ith row and m_{ij} times the jth row, that is, if

A' = M_{ij} A          (6.14)

then

A'_{rk} = A_{rk}          for r ≠ i          (6.15)

A'_{ik} = A_{ik} - m_{ij} A_{jk}          (6.16)

with k = 1, 2, . . . , n. Since A_{jk} = 0 for k < j, for computational purposes one need only use k = j, (j + 1), . . . , n. Thus

M_{n,n-1} · · · M_{n2} · · · M_{32} M_{n1} · · · M_{21} A          (6.17)

= L^{-1} A          (6.18)

gives the triangular matrix in question. The choice of symbol

L^{-1} = M_{n,n-1} · · · M_{n2} · · · M_{32} M_{n1} · · · M_{21}          (6.19)

is deliberate, since the matrix product is lower-triangular by virtue of the lower-triangular nature of each M_{ij} and the lemmas below which will be stated without proof (see, for instance, Mostow and Sampson 1969, p 226).
Lemma 1. The product of two lower-/upper-triangular matrices is also lower-/upper-triangular.

Lemma 2. The inverse of a lower-/upper-triangular matrix is also lower-/upper-triangular.

By virtue of lemma 2, the matrix L (the inverse of L^{-1}) is lower-triangular. The elements of L are given by

L_{ij} = 0          for j > i
       = 1          for j = i
       = m_{ij}     for j < i          (6.20)

This is proved simply by forming

L^{-1} L = 1          (6.21)

using (6.19) and (6.20).
Note that (6.18) implies that

A = LR          (6.22)

which is a triangular decomposition of the matrix A. It permits us to rewrite the original equations

A x = LR x = b          (6.23)

as

R x = L^{-1} A x = L^{-1} b = f.          (6.24)

Because we can retain the triangular structure by writing the unit matrix

1 = D D^{-1}          (6.25)

where D is a non-singular diagonal matrix so that

A = LR = L D D^{-1} R = L'R'          (6.26)

the Gauss elimination is not the only means to obtain a triangular decomposition of A.
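The decomposition is easily verified numerically: storing the multipliers m_ij during elimination gives L directly by (6.20), and the product LR reproduces A. A sketch (Python, illustrative data, not the book's code) is:

```python
# The multipliers stored during elimination are the below-diagonal elements of
# L, per (6.20); the product LR then reproduces A, per (6.22).
import numpy as np

def lu_by_elimination(A):
    n = A.shape[0]
    R = A.astype(float).copy()
    L = np.eye(n)                          # (6.20): unit diagonal
    for j in range(n - 1):
        for i in range(j + 1, n):
            m = R[i, j] / R[j, j]          # multiplier m_ij
            L[i, j] = m                    # (6.20): L_ij = m_ij for j < i
            R[i, j:] -= m * R[j, j:]
    return L, R

A = np.array([[2.0, 1.0, -1.0],
              [-3.0, -1.0, 2.0],
              [-2.0, 1.0, 2.0]])
L, R = lu_by_elimination(A)
print(np.allclose(L @ R, A))               # True: A = LR as in (6.22)
```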
In fact, the Gauss elimination procedure as it has been explained so far is unsatisfactory for computation in finite arithmetic because the m_{ij} are computed from current elements of the matrix, that is, those elements which have been computed as a result of eliminating earlier elements in the scheme. Recall that

m_{ij} = A'_{ij} / A'_{jj}          (6.12)

using the prime to denote current values of the matrix elements, that is, those values which have resulted from eliminating elements in columns 1, 2, . . . , (j - 1). If A'_{jj} is zero, we cannot proceed, and 'small' values of A'_{jj} are quite likely to occur during subtractions involving digit cancellations, so that multipliers m_{ij} that are large and inaccurate are possible. However, we can ensure that multipliers m_{ij} are all less than one in magnitude by permuting the rows of A' (and hence A) so that the largest of |A'_{ij}| for i = j, (j + 1), . . .