Heath, Scientific Computing. Chapter 2: Systems of Linear Equations (excerpt).
Using three-digit decimal arithmetic to solve the system

$$\begin{bmatrix} 0.641 & 0.242 \\ 0.321 & 0.121 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0.883 \\ 0.442 \end{bmatrix},$$

Gaussian elimination with partial pivoting yields the triangular system

$$\begin{bmatrix} 0.641 & 0.242 \\ 0 & -0.000242 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0.883 \\ -0.000383 \end{bmatrix},$$

and back-substitution then gives the computed solution

$$x = \begin{bmatrix} 0.782 \\ 1.58 \end{bmatrix}.$$

The exact residual for this solution is

$$r = b - Ax = \begin{bmatrix} -0.000622 \\ -0.000202 \end{bmatrix},$$

which is as small as we can expect using only three-digit arithmetic.
Yet the exact solution for this system is easily seen to be

$$x = \begin{bmatrix} 1.00 \\ 1.00 \end{bmatrix},$$

so that the error is almost as large as the solution. The cause of this phenomenon is that the matrix is very nearly singular (its condition number is more than 4000). The division that determines $x_2$ is between two quantities that are both on the order of rounding error, and hence the result is essentially arbitrary.
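The gap between a small residual and a large error can be checked directly in floating-point arithmetic. A quick sketch using NumPy (not part of the original text; the numbers are those of the example above):

```python
import numpy as np

# The system from the example above.
A = np.array([[0.641, 0.242],
              [0.321, 0.121]])
b = np.array([0.883, 0.442])

x_hat = np.array([0.782, 1.58])   # solution computed in three-digit arithmetic
x_exact = np.array([1.00, 1.00])  # exact solution

residual = b - A @ x_hat          # small: on the order of three-digit rounding error
error = x_hat - x_exact           # large: comparable in size to the solution itself

print(np.linalg.cond(A))          # condition number exceeds 4000
```

Printing `residual` and `error` confirms the point of the example: the residual entries are around $10^{-4}$ while the error is of order one.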
Yet, by design, when this arbitrary value for $x_2$ is then substituted into the first equation, a value for $x_1$ is computed so that the first equation is satisfied. Thus, we get a small residual, but a poor solution.

2.4.2 Estimating Accuracy

In addition to being a reliable indicator of near singularity, the condition number also provides a quantitative estimate for the error in the computed solution to a linear system, as we will now see.
Let $x$ be the solution to the nonsingular linear system $Ax = b$, and let $\hat{x}$ be the solution to the system $A\hat{x} = b + \Delta b$ with a perturbed right-hand side. If we define $\Delta x = \hat{x} - x$, then we have

$$b + \Delta b = A\hat{x} = A(x + \Delta x) = Ax + A\,\Delta x.$$

Since $Ax = b$, we must have $A\,\Delta x = \Delta b$, and hence $\Delta x = A^{-1}\,\Delta b$. Now

$$b = Ax \;\Rightarrow\; \|b\| \le \|A\| \cdot \|x\|,$$

and

$$\Delta x = A^{-1}\,\Delta b \;\Rightarrow\; \|\Delta x\| \le \|A^{-1}\| \cdot \|\Delta b\|,$$

which, upon using the definition $\operatorname{cond}(A) = \|A\| \cdot \|A^{-1}\|$, yields the estimate

$$\frac{\|\Delta x\|}{\|x\|} \le \operatorname{cond}(A)\,\frac{\|\Delta b\|}{\|b\|}.$$

Thus, the condition number of the matrix determines the possible relative change in the solution due to a given relative change in the right-hand-side vector, regardless of the algorithm used to compute the solution (compare with the general notion of condition number defined in Section 1.2.5).
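This bound is easy to verify numerically for any particular perturbation. A sketch (the ill-conditioned matrix is reused from the example earlier in this section; the perturbation size and random seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

A = np.array([[0.641, 0.242],
              [0.321, 0.121]])      # nearly singular matrix from the earlier example
b = np.array([0.883, 0.442])
x = np.linalg.solve(A, b)

db = 1e-6 * rng.standard_normal(2)  # small random perturbation of the right-hand side
dx = np.linalg.solve(A, b + db) - x

# Relative change in the solution vs. the condition-number bound.
lhs = np.linalg.norm(dx) / np.linalg.norm(x)
rhs = np.linalg.cond(A) * np.linalg.norm(db) / np.linalg.norm(b)
print(lhs, rhs)
```

Here `lhs` never exceeds `rhs` (up to rounding), and for this nearly singular matrix the relative change in the solution is several thousand times the relative change in $b$.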
A similar result holds for relative changes in the entries of the matrix $A$. If $Ax = b$ and

$$(A + E)\hat{x} = b,$$

then

$$x - \hat{x} = A^{-1}(b - A\hat{x}) = A^{-1}E\hat{x},$$

so that

$$\|\Delta x\| \le \|A^{-1}\| \cdot \|E\| \cdot \|\hat{x}\|,$$

which yields the estimate

$$\frac{\|\Delta x\|}{\|\hat{x}\|} \le \operatorname{cond}(A)\,\frac{\|E\|}{\|A\|}.$$

As an alternative to the algebraic derivations just given, calculus can be used to estimate the sensitivity of linear systems.
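The matrix-perturbation bound can be checked the same way. A sketch with an arbitrary small, well-conditioned example matrix (my choice, not the book's):

```python
import numpy as np

rng = np.random.default_rng(1)

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])                # arbitrary well-conditioned example
b = np.array([1.0, 2.0])
x = np.linalg.solve(A, b)

E = 1e-8 * rng.standard_normal((2, 2))    # small random perturbation of A
x_hat = np.linalg.solve(A + E, b)

# Relative change in the solution vs. the bound cond(A) * ||E|| / ||A||.
lhs = np.linalg.norm(x - x_hat) / np.linalg.norm(x_hat)
rhs = np.linalg.cond(A) * np.linalg.norm(E, 2) / np.linalg.norm(A, 2)
print(lhs, rhs)
```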
Introducing the real-valued parameter $t$, we define $A(t) = A + tE$ and $b(t) = b + t\,\Delta b$, and consider the solution $x(t)$ to the linear system $A(t)x(t) = b(t)$. Differentiating this equation with respect to $t$, we obtain

$$A'(t)x(t) + A(t)x'(t) = b'(t),$$

so that we have

$$x'(t) = -A^{-1}(t)A'(t)x(t) + A^{-1}(t)b'(t),$$

and hence, evaluating at $t = 0$ and taking norms,

$$\frac{\|x'\|}{\|x\|} \le \|A^{-1}\| \cdot \|A'\| + \|A^{-1}\| \cdot \frac{\|b'\|}{\|x\|} \le \operatorname{cond}(A)\left(\frac{\|A'\|}{\|A\|} + \frac{\|b'\|}{\|b\|}\right).$$

Thus, we again see that the relative change in the solution is bounded by the condition number times the relative change in the problem data.

A geometric interpretation in two dimensions of these sensitivity results is that if the two straight lines defined by the two equations are nearly parallel, then their point of intersection is not sharply defined if the lines are a bit fuzzy because of rounding errors or other sources of error. If, on the other hand, the lines are far from parallel, say nearly perpendicular, then their intersection is relatively sharply defined.
These two cases are illustrated in Fig. 2.2, where the dashed lines indicate the region of uncertainty for each solid line, so that the intersection point in each case could be anywhere within the shaded parallelogram. Thus, a large condition number is associated with a large uncertainty in the solution.

[Figure 2.2: Well-conditioned and ill-conditioned linear systems.]

To summarize, if the input data are accurate to machine precision, then a reasonable estimate for the relative error in the computed solution to a linear system is given by

$$\frac{\|\hat{x} - x\|}{\|x\|} \approx \operatorname{cond}(A)\,\epsilon_{\mathrm{mach}}.$$

One simple way of interpreting these results is that the computed solution loses about $\log_{10}(\operatorname{cond}(A))$ decimal digits of accuracy relative to the accuracy of the input.
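This rule of thumb can be illustrated numerically. A sketch using the notoriously ill-conditioned Hilbert matrix (my choice of test matrix; the book's own examples differ):

```python
import numpy as np

# Hilbert matrix H[i, j] = 1 / (i + j + 1): a standard ill-conditioned test case.
n = 8
A = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])

x_exact = np.ones(n)
b = A @ x_exact                       # right-hand side with known exact solution

x_hat = np.linalg.solve(A, b)
rel_err = np.linalg.norm(x_hat - x_exact) / np.linalg.norm(x_exact)

# Expect roughly log10(cond(A)) of the ~16 available decimal digits to be lost.
digits_lost = np.log10(np.linalg.cond(A))
print(rel_err, digits_lost)
```

For $n = 8$ the condition number is on the order of $10^{10}$, so only five or six of the roughly sixteen decimal digits of double precision survive in the computed solution.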
In Example 2.10, for instance, with a condition number greater than $10^3$, we lost all of the three-digit precision available and obtained an arbitrary solution.

Before leaving the subject of assessing accuracy in terms of condition numbers, note these two caveats:

• The foregoing analysis estimates the relative error in the largest components of the solution vector. The relative error in the smaller components can be much larger, because a vector norm is dominated by the largest components of a vector. Componentwise error bounds can be obtained but are somewhat more complicated to compute, and we will not pursue this topic. Componentwise bounds are of particular interest when the system is poorly scaled.

• The condition number of a matrix is affected by the scaling of the matrix (recall Example 2.3). A large condition number can result simply from poor scaling, as well as from near singularity. Rescaling the matrix can help the former, but not the latter (see Section 2.4.3).

2.4.3 Improving Accuracy

Although the accuracy that can be expected in the solution of a linear system may seem set in concrete, accuracy can be enhanced in some cases by rescaling the system or by iteratively improving the initial computed solution. These measures are not always practicable, but they may be worth trying.

Recall from Example 2.3 that diagonal scaling of a linear system leaves the solution either unchanged (row scaling) or changed in such a way that the solution is easily recoverable (column scaling). In practice, however, scaling affects the conditioning of the system and the selection of pivots in Gaussian elimination, both of which in turn affect the accuracy of the computed solution. Thus, row scaling and column scaling of a linear system can potentially improve (or degrade) numerical stability and accuracy.

Accuracy is usually enhanced if all the entries of the matrix have about the same order of magnitude or, better still, if the uncertainties in the matrix entries are all of about the same size.
Sometimes it is obvious by inspection how to scale the matrix to accomplish such balance by the choice of measurement units for the respective variables and by weighting each equation according to its relative importance and accuracy. No general automatic technique has ever been developed, however, that produces optimal scaling in an efficient and foolproof manner. Moreover, the scaling process itself can introduce rounding errors unless care is taken (for example, by using only powers of the arithmetic base as scaling factors).

Example 2.11 Scaling. As a simple example, the linear system

$$\begin{bmatrix} 1 & 0 \\ 0 & \epsilon \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ \epsilon \end{bmatrix}$$

has condition number $1/\epsilon$ and hence is very ill-conditioned if $\epsilon$ is very small.
This ill-conditioning means that small perturbations in the input data can cause relatively large changes in the solution. For example, perturbing the right-hand side by the vector $[\,0 \;\; {-\epsilon}\,]^T$ changes the solution from $[\,1 \;\; 1\,]^T$ to $[\,1 \;\; 0\,]^T$.
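This perturbation sensitivity, and its disappearance after rescaling, can be reproduced directly. A sketch (the concrete value of $\epsilon$ is an arbitrary choice for illustration):

```python
import numpy as np

eps = 1e-10
A = np.array([[1.0, 0.0],
              [0.0, eps]])          # diag(1, eps): condition number 1/eps
b = np.array([1.0, eps])

x = np.linalg.solve(A, b)                                # solution [1, 1]
x_pert = np.linalg.solve(A, b + np.array([0.0, -eps]))   # solution jumps to [1, 0]

print(np.linalg.cond(A))            # about 1/eps

# Multiplying the second row (of both A and b) by 1/eps gives the identity system,
# which is perfectly conditioned.
A_scaled = np.array([[1.0, 0.0],
                     [0.0, 1.0]])
print(np.linalg.cond(A_scaled))     # 1
```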
If the second row is first multiplied by $1/\epsilon$, however, then the system becomes perfectly well-conditioned, and the same perturbation now produces a commensurately small change in the solution. Thus, the apparent ill-conditioning was due purely to poor scaling. Unfortunately, how to correct poor scaling for general matrices is much less obvious.

Iterative refinement is another means of potentially improving the accuracy of a computed solution. Given an approximate solution $x_1$ to the linear system $Ax = b$, compute the residual

$$r_1 = b - Ax_1.$$

Now solve the linear system

$$Az_1 = r_1$$

and take

$$x_2 = x_1 + z_1$$
as a new and "better" approximate solution, since

$$Ax_2 = A(x_1 + z_1) = Ax_1 + Az_1 = (b - r_1) + r_1 = b.$$

This process can be repeated to refine the solution successively until convergence, potentially producing a solution that is accurate to full machine precision.

Unfortunately, iterative refinement requires double the storage, since both the original matrix and its LU factorization are required (to compute the residual and to solve the subsequent systems, respectively).
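A minimal sketch of this iteration, using single precision (float32) to mimic a low-precision solve and double precision (float64) for the residuals. This is my illustration, not the book's code, and the example matrix is an arbitrary well-conditioned one; a real implementation would factor the matrix once and reuse the LU factors for each correction solve, as the text notes:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
A = np.eye(n) + 0.05 * rng.standard_normal((n, n))  # arbitrary well-conditioned matrix
x_exact = np.ones(n)
b = A @ x_exact

# Initial solve entirely in single precision.
A32 = A.astype(np.float32)
x = np.linalg.solve(A32, b.astype(np.float32)).astype(np.float64)
err_initial = np.linalg.norm(x - x_exact)

# Refinement: residual in double precision, correction solve in single precision.
# (Re-solving stands in for reusing a stored LU factorization of A32.)
for _ in range(5):
    r = b - A @ x
    z = np.linalg.solve(A32, r.astype(np.float32)).astype(np.float64)
    x = x + z

err_refined = np.linalg.norm(x - x_exact)
print(err_initial, err_refined)
```

A few iterations drive the error from single-precision level down to roughly double-precision level, illustrating how the higher-precision residual supplies the information the refinement needs.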
Moreover, for iterative refinement to produce meaningful improvement in the solution, the residual must usually be computed with higher precision than that used in computing the initial solution (recall Example 1.13).

For these reasons, iterative improvement is often impractical to use routinely, but it can still be useful in some circumstances. For example, iterative refinement can recover full accuracy for systems that are badly scaled, and can sometimes stabilize solution methods that are otherwise potentially unstable. Ironically, if the initial solution is relatively poor, then the residual may be large enough to be computed without requiring extra precision. We will return to iterative refinement later in Example 11.6.

2.5 Special Types of Linear Systems

Thus far we have assumed that the linear system has a general matrix and is dense, meaning that essentially all of the matrix entries are nonzero.
If the matrix has some special properties, then work and storage can often be saved in solving the linear system. Some examples of special properties that can be exploited include the following:

• Symmetric: $A = A^T$, i.e., $a_{ij} = a_{ji}$ for all $i, j$.

• Positive definite: $x^T A x > 0$ for all $x \ne o$.

• Band: $a_{ij} = 0$ for all $|i - j| > \beta$, where $\beta$ is the bandwidth of $A$. An important special case is a tridiagonal matrix, for which $\beta = 1$.

• Sparse: most entries of $A$ are zero.

Techniques for handling symmetric and band systems are relatively straightforward variations on Gaussian elimination for dense systems.
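As one illustration of such savings, Gaussian elimination specialized to a tridiagonal system (often called the Thomas algorithm) needs only $O(n)$ work and storage instead of $O(n^3)$ work and $O(n^2)$ storage. A sketch, assuming no pivoting is needed (e.g., the matrix is diagonally dominant); the function name and example matrix are my choices:

```python
import numpy as np

def solve_tridiagonal(lower, diag, upper, b):
    """Solve Ax = b where A has subdiagonal `lower`, main diagonal `diag`,
    and superdiagonal `upper`. No pivoting: assumes it is not needed."""
    n = len(diag)
    d = diag.astype(float).copy()
    rhs = b.astype(float).copy()
    for i in range(1, n):                    # forward elimination, O(n)
        m = lower[i - 1] / d[i - 1]
        d[i] -= m * upper[i - 1]
        rhs[i] -= m * rhs[i - 1]
    x = np.empty(n)
    x[-1] = rhs[-1] / d[-1]
    for i in range(n - 2, -1, -1):           # back-substitution, O(n)
        x[i] = (rhs[i] - upper[i] * x[i + 1]) / d[i]
    return x

# Example: the classic diagonally dominant [-1, 2, -1] tridiagonal matrix.
n = 6
x = solve_tridiagonal(-np.ones(n - 1), 2 * np.ones(n), -np.ones(n - 1), np.ones(n))

# Check against the dense representation.
A = (np.diag(2 * np.ones(n))
     + np.diag(-np.ones(n - 1), 1)
     + np.diag(-np.ones(n - 1), -1))
print(np.allclose(A @ x, np.ones(n)))   # True
```

Only the three diagonals are stored, so a tridiagonal system of any size fits in three length-$n$ vectors.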