. . . , 1 − ελ_N)        (2.5.15)

where all the λ_i's are positive. Evidently any ε satisfying 0 < ε < 2/(max_i λ_i) will give ‖R‖ < 1. It is not difficult to show that the optimal choice for ε, giving the most rapid convergence for equation (2.5.11), is

    ε = 2/(max_i λ_i + min_i λ_i)        (2.5.16)

Rarely does one know the eigenvalues of A^T · A in equation (2.5.16). Pan and Reif derive several interesting bounds, which are computable directly from A. The following choices guarantee the convergence of B_n as n → ∞,

    ε ≤ 1/(Σ_{j,k} a_{jk}^2)    or    ε ≤ 1/(max_i Σ_j |a_{ij}| × max_j Σ_i |a_{ij}|)        (2.5.17)

The latter expression is truly a remarkable formula, which Pan and Reif derive by noting that the vector norm in equation (2.5.12) need not be the usual L_2 norm, but can instead be either the L_∞ (max) norm, or the L_1 (absolute value) norm. See their work for details.

Another approach, with which we have had some success, is to estimate the largest eigenvalue statistically, by calculating s_i ≡ |A · v_i|^2 for several unit vectors v_i with randomly chosen directions in N-space. The largest eigenvalue λ can then be bounded by the maximum of 2 max_i s_i and 2N Var(s_i)/µ(s_i), where Var and µ denote the sample variance and mean, respectively.
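The following fragment shows how little machinery this estimate requires. It is only an illustrative sketch, not one of our library routines: the function name, the use of ordinary zero-based double arrays (a[i][j] = a_ij), the use of rand() for the random directions, and the number of trials are all choices of the example.

#include <stdlib.h>
#include <math.h>

/* Illustrative sketch (not an NR routine): bound the largest eigenvalue of
   A^T.A statistically by sampling s = |A.v|^2 for random unit vectors v and
   returning max(2*max s, 2*N*Var(s)/mean(s)), as described in the text.
   Here a is a plain 0-based n-by-n array of doubles, a[i][j] = a_ij. */
double estimate_lambda_max(double **a, int n, int ntrials)
{
    int t, i, j;
    double smax = 0.0, ssum = 0.0, ssum2 = 0.0;
    double mean, var, bound1, bound2;
    double *v  = (double *) malloc(n * sizeof(double));
    double *av = (double *) malloc(n * sizeof(double));

    for (t = 0; t < ntrials; t++) {
        double norm = 0.0, s = 0.0;
        /* pick a random direction in N-space and normalize it to a unit vector */
        for (j = 0; j < n; j++) {
            v[j] = 2.0 * rand() / (double) RAND_MAX - 1.0;
            norm += v[j] * v[j];
        }
        norm = sqrt(norm);
        for (j = 0; j < n; j++) v[j] /= norm;
        /* s = |A.v|^2 */
        for (i = 0; i < n; i++) {
            av[i] = 0.0;
            for (j = 0; j < n; j++) av[i] += a[i][j] * v[j];
            s += av[i] * av[i];
        }
        if (s > smax) smax = s;
        ssum  += s;
        ssum2 += s * s;
    }
    mean = ssum / ntrials;
    var  = ssum2 / ntrials - mean * mean;   /* simple (biased) variance estimate */
    bound1 = 2.0 * smax;
    bound2 = 2.0 * n * var / mean;
    free(av);
    free(v);
    return bound1 > bound2 ? bound1 : bound2;
}

One would then take ε somewhat smaller than 2 divided by the returned bound, since the bound overestimates the largest eigenvalue.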
CITED REFERENCES AND FURTHER READING:

Johnson, L.W., and Riess, R.D. 1982, Numerical Analysis, 2nd ed. (Reading, MA: Addison-Wesley), §2.3.4, p. 55.

Golub, G.H., and Van Loan, C.F. 1989, Matrix Computations, 2nd ed. (Baltimore: Johns Hopkins University Press), p. 74.

Dahlquist, G., and Bjorck, A. 1974, Numerical Methods (Englewood Cliffs, NJ: Prentice-Hall), §5.5.6, p. 183.

Forsythe, G.E., and Moler, C.B. 1967, Computer Solution of Linear Algebraic Systems (Englewood Cliffs, NJ: Prentice-Hall), Chapter 13.

Ralston, A., and Rabinowitz, P. 1978, A First Course in Numerical Analysis, 2nd ed. (New York: McGraw-Hill), §9.5, p. 437.

Pan, V., and Reif, J. 1985, in Proceedings of the Seventeenth Annual ACM Symposium on Theory of Computing (New York: Association for Computing Machinery). [1]
2.6 Singular Value Decomposition

There exists a very powerful set of techniques for dealing with sets of equations or matrices that are either singular or else numerically very close to singular. In many cases where Gaussian elimination and LU decomposition fail to give satisfactory results, this set of techniques, known as singular value decomposition, or SVD, will diagnose for you precisely what the problem is. In some cases, SVD will not only diagnose the problem, it will also solve it, in the sense of giving you a useful numerical answer, although, as we shall see, not necessarily "the" answer that you thought you should get.

SVD is also the method of choice for solving most linear least-squares problems. We will outline the relevant theory in this section, but defer detailed discussion of the use of SVD in this application to Chapter 15, whose subject is the parametric modeling of data.

SVD methods are based on the following theorem of linear algebra, whose proof is beyond our scope: Any M × N matrix A whose number of rows M is greater than or equal to its number of columns N can be written as the product of an M × N column-orthogonal matrix U, an N × N diagonal matrix W with positive or zero elements (the singular values), and the transpose of an N × N orthogonal matrix V:

    A = U · [diag(w_1, w_2, . . . , w_N)] · V^T        (2.6.1)

(Here A and U are M × N, while the diagonal matrix of singular values and V^T are both N × N.) The matrices U and V are each orthogonal in the sense that their columns are orthonormal,

    Σ_{i=1}^{M} U_{ik} U_{in} = δ_{kn}        1 ≤ k ≤ N, 1 ≤ n ≤ N        (2.6.2)

    Σ_{j=1}^{N} V_{jk} V_{jn} = δ_{kn}        1 ≤ k ≤ N, 1 ≤ n ≤ N        (2.6.3)

or as a tableau,

    U^T · U = V^T · V = 1        (2.6.4)

Since V is square, it is also row-orthonormal, V · V^T = 1.

The SVD decomposition can also be carried out when M < N.
In this case the singular values w_j for j = M + 1, . . . , N are all zero, and the corresponding columns of U are also zero. Equation (2.6.2) then holds only for k, n ≤ M.

The decomposition (2.6.1) can always be done, no matter how singular the matrix is, and it is "almost" unique. That is to say, it is unique up to (i) making the same permutation of the columns of U, elements of W, and columns of V (or rows of V^T), or (ii) forming linear combinations of any columns of U and V whose corresponding elements of W happen to be exactly equal.
An important consequence of the permutation freedom is that for the case M < N, a numerical algorithm for the decomposition need not return zero w_j's for j = M + 1, . . . , N; the N − M zero singular values can be scattered among all positions j = 1, 2, . . . , N.

At the end of this section, we give a routine, svdcmp, that performs SVD on an arbitrary matrix A, replacing it by U (they are the same shape) and giving back W and V separately. The routine svdcmp is based on a routine by Forsythe et al. [1], which is in turn based on the original routine of Golub and Reinsch, found, in various forms, in [2-4] and elsewhere. These references include extensive discussion of the algorithm used.
As much as we dislike the use of black-box routines, we are going to ask you to accept this one, since it would take us too far afield to cover its necessary background material here. Suffice it to say that the algorithm is very stable, and that it is very unusual for it ever to misbehave. Most of the concepts that enter the algorithm (Householder reduction to bidiagonal form, diagonalization by QR procedure with shifts) will be discussed further in Chapter 11.

If you are as suspicious of black boxes as we are, you will want to verify for yourself that svdcmp does what we say it does. That is very easy to do: Generate an arbitrary matrix A, call the routine, and then verify by matrix multiplication that (2.6.1) and (2.6.4) are satisfied. Since these two equations are the only defining requirements for SVD, this procedure is (for the chosen A) a complete end-to-end check.
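In code, that check might look like the following sketch. It is our own test harness, not part of the routine itself; it assumes the prototype of svdcmp given at the end of this section and the matrix() and vector() allocators from the nrutil utilities, and it tests the column orthonormality of V (the corresponding test for U is entirely analogous).

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include "nr.h"       /* prototype for svdcmp */
#include "nrutil.h"   /* matrix(), vector(), free_matrix(), free_vector() */

/* End-to-end check of svdcmp, along the lines suggested in the text:
   build a random M x N matrix A, decompose it, then verify by matrix
   multiplication that A = U.W.V^T (2.6.1) and V^T.V = 1 (2.6.4).
   Indexing is 1-based, as in the Numerical Recipes utility routines. */
int main(void)
{
    int m = 5, n = 3, i, j, k;
    float **a = matrix(1, m, 1, n);     /* will be replaced by U */
    float **acopy = matrix(1, m, 1, n); /* keep A for the comparison */
    float **v = matrix(1, n, 1, n);
    float *w = vector(1, n);
    float err1 = 0.0, err2 = 0.0;

    for (i = 1; i <= m; i++)
        for (j = 1; j <= n; j++)
            acopy[i][j] = a[i][j] = 2.0 * rand() / (float) RAND_MAX - 1.0;

    svdcmp(a, m, n, w, v);              /* a now holds U */

    /* check (2.6.1): A_ij = sum_k U_ik w_k V_jk */
    for (i = 1; i <= m; i++)
        for (j = 1; j <= n; j++) {
            float sum = 0.0;
            for (k = 1; k <= n; k++) sum += a[i][k] * w[k] * v[j][k];
            err1 += fabs(sum - acopy[i][j]);
        }

    /* check (2.6.4) for V: (V^T.V)_kj = delta_kj */
    for (k = 1; k <= n; k++)
        for (j = 1; j <= n; j++) {
            float sum = 0.0;
            for (i = 1; i <= n; i++) sum += v[i][k] * v[i][j];
            err2 += fabs(sum - (k == j ? 1.0 : 0.0));
        }

    printf("reconstruction error %g, orthonormality error %g\n", err1, err2);

    free_vector(w, 1, n);
    free_matrix(v, 1, n, 1, n);
    free_matrix(acopy, 1, m, 1, n);
    free_matrix(a, 1, m, 1, n);
    return 0;
}

If svdcmp is working, both printed errors should be down at the level of the machine's floating-point precision.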
Now let us find out what SVD is good for.

SVD of a Square Matrix

If the matrix A is square, N × N say, then U, V, and W are all square matrices of the same size. Their inverses are also trivial to compute: U and V are orthogonal, so their inverses are equal to their transposes; W is diagonal, so its inverse is the diagonal matrix whose elements are the reciprocals of the elements w_j. From (2.6.1) it now follows immediately that the inverse of A is

    A^(-1) = V · [diag(1/w_j)] · U^T        (2.6.5)
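Written out in code, equation (2.6.5) is just a triple loop. The fragment below is an illustrative sketch only: the function name is ours, arrays are 1-based in the style of our utility routines, and no attempt is made to guard against zero or vanishingly small w_j, which is precisely the failure mode discussed next.

/* Illustrative sketch (not an NR routine): form the inverse of a square
   N x N matrix from its SVD, A^{-1} = V . diag(1/w_j) . U^T  (2.6.5).
   u, w, v are as returned by svdcmp (u is the N x N array that replaced A);
   ainv must be an N x N matrix allocated by the caller.  All arrays are
   1-based.  No protection against zero or tiny w_j is attempted. */
void svd_inverse(float **u, float w[], float **v, int n, float **ainv)
{
    int i, j, k;
    for (i = 1; i <= n; i++)
        for (j = 1; j <= n; j++) {
            float sum = 0.0;
            for (k = 1; k <= n; k++)
                sum += v[i][k] * (1.0 / w[k]) * u[j][k];  /* V_ik (1/w_k) (U^T)_kj */
            ainv[i][j] = sum;
        }
}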
The only thing that can go wrong with this construction is for one of the w_j's to be zero, or (numerically) for it to be so small that its value is dominated by roundoff error and therefore unknowable. If more than one of the w_j's have this problem, then the matrix is even more singular. So, first of all, SVD gives you a clear diagnosis of the situation.

Formally, the condition number of a matrix is defined as the ratio of the largest (in magnitude) of the w_j's to the smallest of the w_j's. A matrix is singular if its condition number is infinite, and it is ill-conditioned if its condition number is too large, that is, if its reciprocal approaches the machine's floating-point precision (for example, less than 10^-6 for single precision or 10^-12 for double).

For singular matrices, the concepts of nullspace and range are important. Consider the familiar set of simultaneous equations

    A · x = b        (2.6.6)

where A is a square matrix, b and x are vectors. Equation (2.6.6) defines A as a linear mapping from the vector space x to the vector space b. If A is singular, then there is some subspace of x, called the nullspace, that is mapped to zero, A · x = 0. The dimension of the nullspace (the number of linearly independent vectors x that can be found in it) is called the nullity of A.

Now, there is also some subspace of b that can be "reached" by A, in the sense that there exists some x which is mapped there.
This subspace of b is called the range of A. The dimension of the range is called the rank of A. If A is nonsingular, then its range will be all of the vector space b, so its rank is N. If A is singular, then the rank will be less than N. In fact, the relevant theorem is "rank plus nullity equals N."

What has this to do with SVD? SVD explicitly constructs orthonormal bases for the nullspace and range of a matrix. Specifically, the columns of U whose same-numbered elements w_j are nonzero are an orthonormal set of basis vectors that span the range; the columns of V whose same-numbered elements w_j are zero are an orthonormal basis for the nullspace (a short sketch of this bookkeeping in code is given below).

Now let's have another look at solving the set of simultaneous linear equations (2.6.6) in the case that A is singular.
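Here is the promised sketch. It is our own illustration, not a library routine: it computes the condition number from the returned w_j's and counts rank and nullity. Since an exact test for w_j = 0 is rarely meaningful in floating point, an arbitrary illustrative cutoff relative to the largest singular value is used to decide which w_j's count as zero.

#include <stdio.h>

/* Illustrative sketch (not an NR routine): given the singular values w[1..n]
   returned by svdcmp for a square N x N matrix, print the condition number
   and count rank and nullity.  For each j with w[j] above the cutoff,
   column j of U (which replaced A) is a basis vector of the range; for each
   j below it, column j of V is a basis vector of the nullspace.
   TOL is an arbitrary illustrative cutoff, not a recommendation. */
#define TOL 1.0e-6

void svd_diagnose(float w[], int n)
{
    int j, rank = 0, nullity = 0;
    float wmax = 0.0, wmin;

    for (j = 1; j <= n; j++)
        if (w[j] > wmax) wmax = w[j];
    wmin = wmax;
    for (j = 1; j <= n; j++)
        if (w[j] < wmin) wmin = w[j];

    if (wmin == 0.0)
        printf("condition number is formally infinite: matrix is singular\n");
    else
        printf("condition number = %g\n", wmax / wmin);

    for (j = 1; j <= n; j++) {
        if (w[j] > TOL * wmax) rank++;     /* w_j "nonzero": a range direction     */
        else                   nullity++;  /* w_j "zero":    a nullspace direction */
    }
    printf("rank = %d, nullity = %d, rank + nullity = %d (should equal N = %d)\n",
           rank, nullity, rank + nullity, n);
}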