Higham, Accuracy and Stability of Numerical Algorithms
For example, starting with [1, 1, 1]^T we are led to the vector

    [a b c]^T = [1.732  1  1.2704]^T.

(Edelman and Murakami [346, 1995] and Toh and Trefethen [1007, 1994] analyse the stability of this method of finding polynomial roots; the method is stable.) When the cubic is evaluated at the computed roots the results are of order 10^-2, whereas they are of order 10^-15 for z. Since the roots are well separated, the problem of computing them is not ill conditioned, so we can be sure that z is an accurate approximation to the exact roots.
The conclusion is that the formulae, as programmed, are numerically unstable.

Recalling the discussion for a quadratic equation in §1.8, a likely reason for the observed instability is cancellation in (24.5). Instead of always taking the "+" sign, we therefore take (24.6). When the argument of the square root is nonnegative, this formula suffers no cancellation in the subtraction; the same is true when the argument of the square root is negative, because then the square root is pure imaginary. With the use of (24.6), we were unable to locate instability using the objective function described above. However, an alternative objective function can be derived as follows.
It is reasonable to ask that the computed roots be the exact roots of a slightly perturbed cubic. Thus each computed root ẑ_i should be a root of

    (1 + Δ) x^3 + (a + Δa) x^2 + (b + Δb) x + (c + Δc),

where max(|Δ|, |Δa|/|a|, |Δb|/|b|, |Δc|/|c|) is of the order of the unit roundoff. Notice that we are allowing the leading coefficient of unity to be perturbed. Denoting the unperturbed cubic by f, we find that this condition implies that

    max_i |f(ẑ_i)| / ( |ẑ_i|^3 + |a| |ẑ_i|^2 + |b| |ẑ_i| + |c| )        (24.7)

is of order u. We therefore take the quantity in (24.7) as the function to be maximized. On applying the MDS maximizer with starting vector [1, 1, 1]^T, we obtain after 10 iterations an objective function value of 1.2 x 10^-11. The cubic coefficient vector is (to three significant figures)

    [a b c]^T = [-5.89 x 10^2  3.15 x 10^2  -1.36 x 10^1],

and the computed roots are (ignoring tiny, spurious imaginary parts)

    ẑ = [4.75 x 10^-2  4.87 x 10^-1  5.89 x 10^2].

The value of the objective function corresponding to the "exact" roots (computed as described above) is of order 10^-16 (and the value of the previous relative error objective function for the computed roots is of order 10^-14). The conclusion is that even using (24.6) the formulae for the roots are numerically unstable.
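The quantity (24.7) is straightforward to evaluate in floating point. Here is a minimal sketch for a monic cubic x^3 + a*x^2 + b*x + c; the function name and pure-Python formulation are illustrative, not the book's code:

```python
def cubic_backward_error(a, b, c, roots):
    # Quantity (24.7): the largest relative backward error over the
    # computed roots z of the monic cubic f(x) = x^3 + a*x^2 + b*x + c.
    # A value of the order of the unit roundoff means each root is the
    # exact root of a slightly perturbed cubic.
    err = 0.0
    for z in roots:
        f = z**3 + a*z**2 + b*z + c                       # residual f(z)
        scale = abs(z)**3 + abs(a)*abs(z)**2 + abs(b)*abs(z) + abs(c)
        err = max(err, abs(f) / scale)
    return err
```

For the exactly representable cubic (x-1)(x-2)(x-3) the exact roots give a zero backward error, and perturbing one root by 1e-8 produces a backward error of that same order divided by the scale factor.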
However, further theoretical analysis is needed to understand the stability fully; see Problem 24.3.

24.4. Interval Analysis

Interval analysis has been an area of extensive research since the 1960s, and it had been considered earlier by Turing and Wilkinson in the 1940s [1099, 1980, p. 104]. As the name suggests, the idea is to perform arithmetic on intervals [a, b] (b >= a). The aim is to provide as the solution to a problem an interval in which the desired result is guaranteed to lie, which may be particularly appropriate if the initial data is uncertain (and hence can be thought of as an interval).

For the elementary operations, arithmetic on intervals is defined by

    [a, b] op [c, d] = { x op y : x in [a, b], y in [c, d] },    op = +, -, *, /,

and the results are given directly by the formulae

    [a, b] + [c, d] = [a + c, b + d],
    [a, b] - [c, d] = [a - d, b - c],
    [a, b] * [c, d] = [min(ac, ad, bc, bd), max(ac, ad, bc, bd)],
    [a, b] / [c, d] = [a, b] * [1/d, 1/c],    0 not in [c, d].

We will use the notation [x] for an interval [x_1, x_2] and we define width([x]) := x_2 - x_1.

In floating point arithmetic, an interval containing fl([x] op [y]) is obtained by rounding computations on the left endpoint toward -infinity and those on the right endpoint toward +infinity (both these rounding modes are supported in IEEE arithmetic).

The success of an interval analysis depends on whether an answer is produced at all (an interval algorithm will break down with division by zero if it attempts to divide by an interval containing zero), and on the width of the interval answer.
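The operations just defined can be sketched as a toy interval type. This is a simplified model: the outward directed rounding of the endpoints that a rigorous implementation requires is omitted here:

```python
class Interval:
    """Toy interval type implementing the formulae above.
    A real implementation must round lo toward -inf and hi toward +inf
    after every operation; that directed rounding is omitted here."""
    def __init__(self, lo, hi):
        assert lo <= hi
        self.lo, self.hi = lo, hi

    def __add__(self, y):
        return Interval(self.lo + y.lo, self.hi + y.hi)

    def __sub__(self, y):
        return Interval(self.lo - y.hi, self.hi - y.lo)

    def __mul__(self, y):
        p = (self.lo * y.lo, self.lo * y.hi, self.hi * y.lo, self.hi * y.hi)
        return Interval(min(p), max(p))

    def __truediv__(self, y):
        if y.lo <= 0 <= y.hi:
            raise ZeroDivisionError("divisor interval contains zero")
        return self * Interval(1 / y.hi, 1 / y.lo)

    def width(self):
        return self.hi - self.lo
```

With x = Interval(1, 2), for instance, x - x evaluates to [-1, 1] and x / x to [1/2, 2], which is the dependency problem discussed next.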
A one-line program that prints [-infinity, infinity] would be correct, robust, and fast, but useless. Interval analysis is controversial because, in general, narrow intervals cannot be guaranteed. One reason is that when dependencies occur in a calculation, in the sense that a variable appears more than once, final interval lengths can be pessimistic. For example, if [x] = [1, 2] then

    [x] - [x] = [-1, 1],    [x]/[x] = [1/2, 2],

whereas the optimal intervals for these calculations are, respectively, [0, 0] and [1, 1].
These calculations can be interpreted as saying that there is no additive or multiplicative inverse in interval arithmetic.

An example of an algorithm for which interval arithmetic can be ineffective is GEPP, which in general gives very pessimistic error bounds and is unstable in interval arithmetic even for a well-conditioned matrix [798, 1977]. The basic problem is that the interval sizes grow exponentially. For example, in the 2 x 2 reduction the updated entry has the form [y] - m[x], and since subtracting intervals adds their widths, in the worst case

    width([y] - [x]) = width([x] - [x]) = 2 width([x]).

This type of growth is very likely to happen, unlike the superficially similar phenomenon of element growth in standard GEPP.
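The exponential width growth is easy to reproduce. The following sketch (the function names and the test matrix are mine, chosen for illustration) runs GE without pivoting on a matrix whose every entry has been widened to an interval of width 2*eps, and reports the largest width in the reduced matrix:

```python
def i_sub(x, y):  # [a, b] - [c, d] = [a - d, b - c]
    return (x[0] - y[1], x[1] - y[0])

def i_mul(x, y):
    p = (x[0]*y[0], x[0]*y[1], x[1]*y[0], x[1]*y[1])
    return (min(p), max(p))

def i_div(x, y):
    assert not (y[0] <= 0 <= y[1]), "divisor interval contains zero"
    return i_mul(x, (1 / y[1], 1 / y[0]))

def interval_ge_max_width(A, eps):
    """Run GE (no pivoting) on the interval matrix whose (i, j) entry is
    [A[i][j] - eps, A[i][j] + eps]; return the largest final width."""
    n = len(A)
    M = [[(a - eps, a + eps) for a in row] for row in A]
    for k in range(n - 1):
        for i in range(k + 1, n):
            m = i_div(M[i][k], M[k][k])
            for j in range(k, n):
                M[i][j] = i_sub(M[i][j], i_mul(m, M[k][j]))
    return max(hi - lo for row in M for (lo, hi) in row)
```

Even for a well-conditioned, diagonally dominant matrix the final widths exceed the initial 2*eps, and the growth factor increases with each elimination stage.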
The poor interval bounds are entirely analogous to the pessimistic results returned by early forward error analyses of GEPP (see §9.6). Nickel [798, 1977] states that "The interval Gauss elimination method is one of the most unfavorable cases of interval computation . . . nearly all other numerical methods give much better results if transformed to interval methods".
Interval GE is effective, however, for certain special classes of matrices, such as M-matrices.

As already mentioned, there is a large body of literature on interval arithmetic, though, as Skeel notes (in an article that advocates interval arithmetic), "elaborate formalisms and abstruse notation make much of the literature impenetrable to all but the most determined outsiders" [922, 1989]. Good sources of information include various conference proceedings and the journal Computing. The earliest book is by Moore [778, 1966], whose later book [779, 1979] is one of the best references on the subject. Nickel [798, 1977] gives an easy to read summary of research up to the mid 1970s. A more recent reference is Alefeld and Herzberger [10, 1983].

Yohe [1120, 1979] describes a Fortran 66 package for performing interval arithmetic in which machine-specific features are confined to a few modules. It is designed to work with a precompiler for Fortran called Augment [254, 1979], which allows the user to write programs as though Fortran had extended data types, in this case an INTERVAL type.
A version of the package that allows arbitrary precision interval arithmetic by incorporating Brent's multiple precision arithmetic package (see §25.9) is described in [1121, 1980]. In [1119, 1979], Yohe describes general principles of implementing nonstandard arithmetic in software, with particular reference to interval arithmetic.

Kulisch and Miranker have proposed endowing a computer arithmetic with a super-accurate inner product, that is, a facility to compute an exactly rounded inner product for any two vectors of numbers at the working precision [678, 1981], [679, 1983], [680, 1986].
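The semantics of such a facility can be modeled in software, slowly, by accumulating the inner product exactly and rounding only once at the end. A sketch (the function name is mine; the hardware proposals use a long fixed-point accumulator rather than rational arithmetic):

```python
from fractions import Fraction

def dot_exactly_rounded(x, y):
    # Binary floating point numbers convert to rationals exactly, so the
    # products and the accumulation commit no error; the only rounding
    # is the final conversion back to double precision.
    s = sum((Fraction(a) * Fraction(b) for a, b in zip(x, y)), Fraction(0))
    return float(s)
```

For x = [1e16, 1, -1e16] and y = [1, 1, 1], left-to-right floating point accumulation returns 0.0, while the exactly rounded inner product is 1.0.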
This idea has been implemented in the package ACRITH from IBM, which employs interval arithmetic [124, 1985]. Anomalies in early versions of ACRITH are described by Kahan and LeBlanc [639, 1985] and Jansen and Weidner [612, 1986]. For a readable discussion of interval arithmetic and the super-accurate inner product, see Rall [858, 1991].
Cogent arguments against adopting a super-accurate inner product are given by Demmel [284, 1991].

Fortran and Pascal compilers (Fortran SC and Pascal-XSC) and a C++ class library that support a super-accurate inner product and interval arithmetic have been developed jointly by researchers at IBM and the University of Karlsruhe [125, 1987], [661, 1993], [662, 1992]. A toolbox of routines written in Pascal-XSC for solving basic numerical problems and verifying the results is presented by Hammer, Hocks, Kulisch, and Ratz [500, 1993].

Finally, we explain why any attempt to compute highly accurate answers using interval arithmetic of a fixed precision (possibly combined with a super-accurate inner product) cannot always succeed.
Suppose we have a sequence of problems to solve, where the output of one is the input to the next: x_{i+1} = f_i(x_i), i = 1:n, for smooth functions f_i. Suppose x_1 is known exactly (interval width zero). Working in finite precision interval arithmetic, we obtain an interval [x_2 - α_2, x_2 + β_2] containing x_2.
This is used as input to the computation of f_2. Even under the favorable assumption of no rounding errors in the evaluation of f_2, we obtain an interval answer whose width must be of order |f_2'(x_2)|(α_2 + β_2) (the interval could be much bigger than f_2(x_2) - f_2(x_2 ± ε_2), depending on the algorithm used to evaluate f_2). In other words, the width of the interval containing x_3 is roughly proportional to the condition number of f_2.
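This width amplification is easy to observe numerically. As an illustrative, hypothetical setup (not from the book), propagate an interval through repeated application of f(t) = t^2, whose condition number at t = 1 is 2, and watch the width double at every stage:

```python
def propagate_widths(x, eps, steps):
    """Apply f(t) = t*t repeatedly to the interval [x - eps, x + eps]
    (f is increasing for positive t, so the endpoints map directly)
    and record the interval width after each application."""
    lo, hi = x - eps, x + eps
    widths = []
    for _ in range(steps):
        lo, hi = lo * lo, hi * hi
        widths.append(hi - lo)
    return widths
```

Starting at x = 1 the width roughly doubles at each step, matching the product-of-condition-numbers argument; for a chain of worse-conditioned functions the growth is correspondingly faster.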
When the output of the f_2 computation is fed into f_3 the interval width is multiplied by |f_3'|. The width of the final interval containing x_{n+1} is proportional to the product of the condition numbers of all the functions f_i, and if there are enough functions, or if they are badly enough conditioned, the final interval will provide no useful information. The only way to avoid such a failure is to use variable precision arithmetic or to reformulate the problem to escape the product of the condition numbers of the f_i.

24.5. Other Work

In this section we outline other work on automatic error analysis.

In the 1970s, Miller and his co-authors developed methods and software for automatically searching for numerical instability in algebraic processes [757, 1975], [760, 1978], [762, 1980]. In [757, 1975] Miller defines a quantity σ(d) that bounds, to first order, the sensitivity of an algorithm to perturbations in the data d and in the intermediate quantities that the algorithm generates. He then defines the forward stability measure ρ(d) = σ(d)/κ(d), where κ(d) is a condition number for the problem under consideration.
The algorithms to be analysed are required to contain no loops or conditional branches and are presented to Miller's Fortran software in a special numeric encoding. The software automatically computes the partial derivatives needed to evaluate ρ(d), and attempts to maximize ρ using the method of alternating directions. Miller gives several examples illustrating the scope of his software; he shows, for example, that it can identify the instability of the classical Gram-Schmidt method for orthogonalizing a set of vectors.

In [760, 1978], [761, 1978] Miller and Spooner extend the work in [757, 1975] in several ways. The algorithm to be analysed is expressed in a Fortran-like language that allows for-loops but not logical tests.
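Miller's Fortran system is not reproduced here, but the flavor of the derivative-free maximization it relies on can be sketched with a simple coordinate search, in the spirit of the method of alternating directions, which adjusts one variable at a time. Everything below is an illustrative stand-in, not Miller's code:

```python
def coordinate_search_max(f, d0, step=1.0, shrink=0.5, tol=1e-8):
    """Maximize f over the data vector by adjusting one component at a
    time; when no single-coordinate move of the current step size
    improves f, the step is shrunk.  Finds a local maximizer only."""
    d = list(d0)
    best = f(d)
    while step > tol:
        improved = False
        for i in range(len(d)):
            for delta in (step, -step):
                trial = list(d)
                trial[i] += delta
                val = f(trial)
                if val > best:
                    d, best, improved = trial, val, True
        if not improved:
            step *= shrink
    return d, best
```

In Miller's setting f would be the instability measure ρ(d) evaluated over problem data d; any smooth objective illustrates the mechanics.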