On these computers subtracting any power of 2 from the next smaller floating point number gives an answer that is either a factor of 2 too large (as in the example above, e.g., Cray X-MP or Y-MP) or is zero (Cray 2). In 1992 Cray announced that it would produce systems that use IEEE standard double precision arithmetic by 1995.
The lack of a guard digit is a serious drawback. It causes many algorithms that would otherwise work perfectly to fail some of the time (e.g., compensated summation; see §4.3). Here is an example of a result that holds only when a guard digit is used.
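To see concretely what goes wrong, here is a toy simulation of subtraction on a t-digit decimal machine (the function name, the base 10, and the precision t = 3 are illustrative choices, not anything from the text). Without a guard digit, the digits of the subtrahend shifted past position t during exponent alignment are simply discarded, so subtracting 0.999 from 1.00 gives 0.01, a factor of 10 too large; with one guard digit the result is exact.

```python
from fractions import Fraction

BASE, T = 10, 3  # a toy 3-digit decimal machine

def subtract(x, y, guard_digits=0):
    """Compute x - y (for x >= y > 0), keeping `guard_digits` extra digits
    of y after aligning it with the exponent of x; shifted-out digits are
    chopped, modeling a machine without (or with) a guard digit."""
    ex = 0  # exponent of x: BASE**ex <= x < BASE**(ex + 1)
    while Fraction(BASE) ** (ex + 1) <= x:
        ex += 1
    while Fraction(BASE) ** ex > x:
        ex -= 1
    ulp = Fraction(BASE) ** (ex - T + 1 - guard_digits)
    y_aligned = (y // ulp) * ulp  # chop y to the working grid
    # (the result here is already representable in T digits, so no final
    # rounding step is modeled)
    return x - y_aligned

x, y = Fraction(1), Fraction(999, 1000)  # 1.00 - 0.999
print(subtract(x, y, guard_digits=0))    # 1/100 : a factor of 10 too large
print(subtract(x, y, guard_digits=1))    # 1/1000: the exact difference
```

In base 2 the analogous failure gives an answer a factor of 2 too large, which is the Cray behaviour described above.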
This result holds for any base β.
Theorem 2.4 (Ferguson). Let x and y be floating point numbers for which e(x - y) ≤ min(e(x), e(y)), where e(x) denotes the exponent of x in its normalized floating point representation. If subtraction is performed with a guard digit then x - y is computed exactly (assuming x - y does not underflow or overflow).
Proof. From the condition of the theorem the exponents of x and y differ by at most 1. If the exponents are the same then fl(x - y) is computed exactly, so suppose the exponents differ by 1, which can happen only when x and y have the same sign.
Scale and interchange x and y if necessary so that 1/β ≤ y < 1 ≤ x < β, where β is the base. Now x is represented in base β as x1.x2...xt and the exact difference z = x - y is of the form

      x1 . x2 ...  xt
    - 0  . y1 ...  y(t-1) yt
    ------------------------
      z1 . z2 ...  zt  z(t+1)

But e(x - y) ≤ e(y) and y < 1, so z1 = 0. The algorithm for computing z forms z1.z2
...z(t+1) and then rounds to t digits; since z has at most t significant digits this rounding introduces no error, and thus z is computed exactly.
The next result is a corollary of the previous one but is better known. It is worth stating as a separate theorem because the conditions of the theorem are so elegant and easy to check (being independent of the base), and because this weaker theorem is sufficient for many purposes.
Theorem 2.5 (Sterbenz). Let x and y be floating point numbers with y/2 ≤ x ≤ 2y. If subtraction is performed with a guard digit then x - y is computed exactly (assuming x - y does not underflow).
Theorem 2.5 is vital in proving that certain special algorithms work.
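Theorem 2.5 is easy to check empirically in IEEE double precision arithmetic, which rounds every subtraction correctly and so in particular behaves as if a guard digit were present. The sketch below (the sample count, seed, and ranges are arbitrary choices) verifies that x - y is computed exactly whenever y/2 ≤ x ≤ 2y, using exact rational arithmetic as the reference:

```python
import random
from fractions import Fraction

# Draw random pairs with y/2 <= x <= 2y and verify that the IEEE double
# subtraction x - y equals the mathematically exact difference.
random.seed(1)
failures = 0
for _ in range(100000):
    y = random.uniform(1e-6, 1e6)
    x = random.uniform(y / 2, 2 * y)  # endpoints are exact scalings of y
    if Fraction(x) - Fraction(y) != Fraction(x - y):
        failures += 1
print(failures)  # 0: every such subtraction is exact
```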
A good example involves Heron's formula for the area A of a triangle with sides of length a, b, and c:

    A = √(s(s - a)(s - b)(s - c)),    s = (a + b + c)/2.

This formula is inaccurate for needle-shaped triangles: if a ≈ b + c then s ≈ a and the term s - a suffers severe cancellation. A way around this difficulty, devised by Kahan, is to rename a, b, and c so that a ≥ b ≥ c and then evaluate

    A = (1/4)√((a + (b + c))(c - (a - b))(c + (a - b))(a + (b - c))).    (2.7)

The parentheses are essential! Kahan has shown that this formula gives the area with a relative error bounded by a modest multiple of the unit roundoff provided that a guard digit is used in subtraction [457, 1991, Thm. 3], [634, 1990] (see Problem 2.22). If there is no guard digit, the computed result can be very inaccurate.
Kahan has made these interesting historical comments about guard digits [634, 1990]:
CRAYs are not the first machines to compute differences blighted by lack of a guard digit.
The earliest IBM '360s, from 1964 to 1967, subtracted and multiplied without a hexadecimal guard digit until SHARE, the IBM mainframe user group, discovered why the consequential anomalies were intolerable and so compelled a guard digit to be retrofitted. The earliest Hewlett-Packard financial calculator, the HP-80, had a similar problem. Even now, many a calculator (but not Hewlett-Packard's) lacks a guard digit.

2.5. Choice of Base and Distribution of Numbers

What base β is best for a floating point number system? Most modern computers use base 2. Most hand-held calculators use base 10, since it makes the calculator easier for the user to understand (how would you explain to a naive user that 0.1 is not exactly representable on a base 2 calculator?).
IBM mainframes traditionally have used base 16. Even base 3 has been tried, in an experimental machine called SETUN, built at Moscow State University in the late 1950s [1066, 1960].
Several criteria can be used to guide the choice of base. One is the impact of wobbling precision: as we saw at the end of §2.1, the spread of representation errors is smallest for small bases. Another possibility is to measure the worst-case representation error or the mean square representation error. The latter quantity depends on the assumed distribution of the numbers that are represented. Brent [144, 1973] shows that for the logarithmic distribution the worst-case error and the mean square error are both minimal for (normalized) base 2, provided that the most significant bit is not stored explicitly.
The logarithmic distribution is defined by the property that the proportion of base β numbers with leading significant digit n is

    log_β(1 + 1/n),    n = 1: β - 1.

It appears that in practice real numbers are logarithmically distributed.
In 1938, Benford [90, 1938] noticed, as had Newcomb [794, 1881] before him, that the early pages of logarithm tables showed greater signs of wear than the later ones. (He was studying dirty books!) This prompted him to carry out a survey of 20,229 “real-life” numbers, whose decimal representations he found matched the logarithmic distribution closely.
The observed logarithmic distribution of leading significant digits has not been fully explained.
Some proposed explanations are based on the assumption that the actual distribution is scale invariant, but this assumption is equivalent to the observation to be explained [1032, 1984]. Barlow [57, 1981], [58, 1981], [60, 1988] and Turner [1031, 1982], [1032, 1984] give useful insight by showing that if uniformly distributed numbers are multiplied together, then the resulting distribution converges to the logarithmic one; see also Boyle [141, 1994].
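This convergence is easy to observe numerically. The following sketch (the sample size, seed, and the choice of 50 factors per product are arbitrary) tallies the leading significant digits of products of independent uniform (0, 1) numbers and compares the observed proportions with log10(1 + 1/n):

```python
import math
import random

def leading_digit(x):
    # first significant decimal digit, read off the scientific representation
    return int(("%.17e" % abs(x))[0])

N = 100000
counts = [0] * 10
random.seed(42)
for _ in range(N):
    p = 1.0
    for _ in range(50):          # product of 50 uniform (0,1) numbers
        p *= random.random()     # (~1e-22 typically, so no underflow)
    counts[leading_digit(p)] += 1

# observed proportion vs. the logarithmic law log10(1 + 1/n)
for n in range(1, 10):
    print(n, counts[n] / N, round(math.log10(1 + 1 / n), 3))
```

The observed proportions agree with the logarithmic law to within sampling error.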
Furthermore, it is an interesting result that the leading significant digits of the numbers q^k, k = 0, 1, 2, . . . , are logarithmically distributed if q is positive and is not a rational power of 10; when q = 2 and the digit is 7 this is Gelfand's problem [829, 1981, pp.
50-51].
The nature of the logarithmic distribution is striking. For decimal numbers, the digits 1 to 9 are not equally likely to be a leading significant digit. The probabilities are as follows:

    digit        1      2      3      4      5      6      7      8      9
    probability  0.301  0.176  0.125  0.097  0.079  0.067  0.058  0.051  0.046

As an example, here is the leading significant digit distribution for the elements of the inverse of one random 100 × 100 matrix from the normal N(0, 1) distribution:

    digit        1      2      3      4      5      6      7      8      9
    proportion   0.334  0.163  0.100  0.087  0.077  0.070  0.063  0.056  0.051

For an entertaining survey of work on the distribution of leading significant digits see Raimi [856, 1976] (and also the popular article [855, 1969]).

2.6. Statistical Distribution of Rounding Errors

Most rounding error analyses, including all the ones in this book, are designed to produce worst-case bounds for the error. The analyses ignore the signs of rounding errors and are often the result of many applications of the triangle inequality and the submultiplicative inequality. Consequently, although the bounds may well give much insight into a method, they tend to be pessimistic if regarded as error estimates.
Statistical statements about the effect of rounding on a numerical process can be obtained from statistical analysis coupled with probabilistic models of the rounding errors. For example, a well-known rule of thumb is that a more realistic error estimate for a numerical method is obtained by replacing the dimension-dependent constants in a rounding error bound by their square root: thus if the bound is f(n)u, the rule of thumb says that the error is typically of order √f(n) u (see, for example, Wilkinson [1088, 1963, pp.
26, 102]).
This rule of thumb can be supported by assuming that the rounding errors are independent random variables and applying the central limit theorem. Statistical analysis of rounding errors goes back to one of the first papers on rounding error analysis, by Goldstine and von Neumann [462, 1951].
As we noted in §1.17, rounding errors are not random. See Problem 2.10 for an example of how two rounding errors cancel in one particular class of computations.
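The rule of thumb is easy to illustrate for recursive summation, whose worst-case relative error bound is roughly nu. The sketch below (the value of n, the data distribution, and the seed are arbitrary choices) measures the actual error against math.fsum, which returns the correctly rounded sum:

```python
import math
import random

random.seed(3)
u = 2.0 ** -53                  # unit roundoff for IEEE double precision
n = 10 ** 6
xs = [random.uniform(0.0, 1.0) for _ in range(n)]

s = 0.0                         # recursive (left-to-right) summation
for x in xs:
    s += x

exact = math.fsum(xs)           # correctly rounded sum, used as reference
rel_err = abs(s - exact) / exact

# far below the worst-case bound ~ n*u; typically O(sqrt(n))*u instead
print(rel_err / u)
```

The printed ratio is typically a modest multiple of √n rather than n, in line with the rule of thumb.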
Forsythe [389, 1959] points out that rounding errors do not necessarily behave like independent random variables and proposes a random form of rounding (intended for computer testing) to which statistical analysis is applicable.
Henrici [517, 1962], [518, 1963], [519, 1964] assumes models for rounding errors and then derives the probability distribution of the overall error, mainly in the context of difference methods for differential equations. Hull and Swenson [593, 1966] give an insightful discussion of probabilistic models, pointing out that “There is no claim that ordinary rounding and chopping are random processes, or that successive errors are independent. The question to be decided is whether or not these particular probabilistic models of the processes will adequately describe what actually happens” (see also the ensuing note by Henrici [520, 1966]).
Since the late 1980s Chaitin-Chatelin and her co-workers have been developing a method called PRECISE, which involves a statistical analysis of the effect on a computed solution of random perturbations in the data; see Brunet [152, 1989], Chatelin and Brunet [202, 1990], and Chaitin-Chatelin and Frayssé [190, 1996].