In real life it is good practice to believe in skewnesses only when they are several or many times as large as this.

The kurtosis is also a nondimensional quantity. It measures the relative peakedness or flatness of a distribution. Relative to what? A normal distribution, what else! A distribution with positive kurtosis is termed leptokurtic; the outline of the Matterhorn is an example. A distribution with negative kurtosis is termed platykurtic; the outline of a loaf of bread is an example. (See Figure 14.1.1.) And, as you no doubt expect, an in-between distribution is termed mesokurtic.

The conventional definition of the kurtosis is

$$ \mathrm{Kurt}(x_1 \ldots x_N) = \left\{ \frac{1}{N} \sum_{j=1}^{N} \left[ \frac{x_j - \bar{x}}{\sigma} \right]^4 \right\} - 3 \qquad (14.1.6) $$

where the −3 term makes the value zero for a normal distribution.

Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5). Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copying of machine-readable files (including this one) to any server computer, is strictly prohibited.
To order Numerical Recipes books, diskettes, or CDROMs visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to trade@cup.cam.ac.uk (outside North America).

Chapter 14. Statistical Description of Data
14.1 Moments of a Distribution: Mean, Variance, Skewness

[Figure 14.1.1: panel labels "negative (platykurtic)" and "positive (leptokurtic)".]

The standard deviation of (14.1.6) as an estimator of the kurtosis of an underlying normal distribution is $\sqrt{96/N}$. However, the kurtosis depends on such a high moment that there are many real-life distributions for which the standard deviation of (14.1.6) as an estimator is effectively infinite.

Calculation of the quantities defined in this section is perfectly straightforward. Many textbooks use the binomial theorem to expand out the definitions into sums of various powers of the data, e.g., the familiar

$$ \mathrm{Var}(x_1 \ldots x_N) = \frac{1}{N-1} \left[ \left( \sum_{j=1}^{N} x_j^2 \right) - N \bar{x}^2 \right] \approx \overline{x^2} - \bar{x}^2 \qquad (14.1.7) $$

but this can magnify the roundoff error by a large factor and is generally unjustifiable in terms of computing speed. A clever way to minimize roundoff error, especially for large samples, is to use the corrected two-pass algorithm [1]: First calculate $\bar{x}$, then calculate $\mathrm{Var}(x_1 \ldots x_N)$ by

$$ \mathrm{Var}(x_1 \ldots x_N) = \frac{1}{N-1} \left\{ \sum_{j=1}^{N} (x_j - \bar{x})^2 - \frac{1}{N} \left[ \sum_{j=1}^{N} (x_j - \bar{x}) \right]^2 \right\} \qquad (14.1.8) $$

The second sum would be zero if $\bar{x}$ were exact, but otherwise it does a good job of correcting the roundoff error in the first term.

    #include <math.h>

    void moment(float data[], int n, float *ave, float *adev, float *sdev,
        float *var, float *skew, float *curt)
    /* Given an array of data[1..n], this routine returns its mean ave,
       average deviation adev, standard deviation sdev, variance var,
       skewness skew, and kurtosis curt. */
    {
        void nrerror(char error_text[]);
        int j;
        float ep=0.0,s,p;

        if (n <= 1) nrerror("n must be at least 2 in moment");
        s=0.0;                              /* First pass to get the mean. */
        for (j=1;j<=n;j++) s += data[j];
        *ave=s/n;
        *adev=(*var)=(*skew)=(*curt)=0.0;
        /* Second pass to get the first (absolute), second, third, and
           fourth moments of the deviation from the mean. */
        for (j=1;j<=n;j++) {
            *adev += fabs(s=data[j]-(*ave));
            ep += s;
            *var += (p=s*s);
            *skew += (p *= s);
            *curt += (p *= s);
        }
        *adev /= n;
        *var=(*var-ep*ep/n)/(n-1);          /* Corrected two-pass formula. */
        *sdev=sqrt(*var);
        /* Put the pieces together according to the conventional definitions. */
        if (*var) {
            *skew /= (n*(*var)*(*sdev));
            *curt=(*curt)/(n*(*var)*(*var))-3.0;
        } else nrerror("No skew/kurtosis when variance = 0 (in moment)");
    }

Semi-Invariants

The mean and variance of independent random variables are additive: If x and y are drawn independently from two, possibly different, probability distributions, then

$$ \overline{(x + y)} = \bar{x} + \bar{y} \qquad \mathrm{Var}(x + y) = \mathrm{Var}(x) + \mathrm{Var}(y) \qquad (14.1.9) $$

With the centered moments $M_k \equiv \langle (x_i - \bar{x})^k \rangle$ of equation (14.1.10), so that, e.g., $M_2 = \mathrm{Var}(x)$, the first few semi-invariants, denoted $I_k$, are given by

$$ \begin{aligned} I_2 &= M_2 & I_3 &= M_3 \\ I_4 &= M_4 - 3M_2^2 & I_5 &= M_5 - 10 M_2 M_3 \\ I_6 &= M_6 - 15 M_2 M_4 - 10 M_3^2 + 30 M_2^3 \end{aligned} \qquad (14.1.11) $$

Notice that the skewness and kurtosis, equations (14.1.5) and (14.1.6), are simple powers of the semi-invariants,

$$ \mathrm{Skew}(x) = I_3 / I_2^{3/2} \qquad \mathrm{Kurt}(x) = I_4 / I_2^2 \qquad (14.1.12) $$

A Gaussian distribution has all its semi-invariants higher than $I_2$ equal to zero. A Poisson distribution has all of its semi-invariants equal to its mean.
For more details, see [2].

Median and Mode

The median of a probability distribution function $p(x)$ is the value $x_{\rm med}$ for which larger and smaller values of x are equally probable:

$$ \int_{-\infty}^{x_{\rm med}} p(x)\,dx = \frac{1}{2} = \int_{x_{\rm med}}^{\infty} p(x)\,dx \qquad (14.1.13) $$

The median of a distribution is estimated from a sample of values $x_1, \ldots, x_N$ by finding that value $x_i$ which has equal numbers of values above it and below it. Of course, this is not possible when N is even. In that case it is conventional to estimate the median as the mean of the unique two central values. If the values $x_j$, $j = 1, \ldots, N$ are sorted into ascending (or, for that matter, descending) order, then the formula for the median is

$$ x_{\rm med} = \begin{cases} x_{(N+1)/2}, & N \text{ odd} \\ \frac{1}{2}\left( x_{N/2} + x_{(N/2)+1} \right), & N \text{ even} \end{cases} \qquad (14.1.14) $$

If a distribution has a strong central tendency, so that most of its area is under a single peak, then the median is an estimator of the central value.
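Formula (14.1.14) can be sketched in C by sorting a copy of the sample with the standard library's qsort and picking the central value or values. The names median_by_sort and compare_doubles are hypothetical, the array is 0-based (so the book's 1-based indices shift down by one), and returning an error code instead of calling nrerror is a choice made only for this sketch.

```c
#include <stdlib.h>
#include <string.h>

/* Comparison callback for qsort: ascending order of doubles. */
static int compare_doubles(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;

    return (x > y) - (x < y);
}

/* Median of x[0..n-1], found by sorting a copy and applying (14.1.14).
   Returns 0 on success, -1 if n < 1 or if memory allocation fails.
   The caller's array is left unchanged. */
int median_by_sort(const double x[], int n, double *xmed)
{
    double *work;

    if (n < 1) return -1;
    work = malloc((size_t)n * sizeof *work);
    if (work == NULL) return -1;
    memcpy(work, x, (size_t)n * sizeof *work);
    qsort(work, (size_t)n, sizeof *work, compare_doubles);

    if (n % 2 == 1)
        *xmed = work[n / 2];                            /* x_{(N+1)/2} */
    else
        *xmed = 0.5 * (work[n / 2 - 1] + work[n / 2]);  /* x_{N/2}, x_{(N/2)+1} */

    free(work);
    return 0;
}
```

For the five values {5, 1, 1000, 3, 2} this gives a median of 3, while the mean is 202.2; for the four values {4, 1, 3, 2} it gives 2.5, the mean of the two central values.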
The median is a more robust estimator than the mean is: The median fails as an estimator only if the area in the tails is large, while the mean fails if the first moment of the tails is large; it is easy to construct examples where the first moment of the tails is large even though their area is negligible.

To find the median of a set of values, one can proceed by sorting the set and then applying (14.1.14). This is a process of order N log N. You might rightly think
Higher moments are not, in general, additive. However, certain combinations of them, called semi-invariants, are in fact additive. The centered moments of a distribution are denoted $M_k$,

$$ M_k \equiv \left\langle (x_i - \bar{x})^k \right\rangle \qquad (14.1.10) $$

CITED REFERENCES AND FURTHER READING:

Bevington, P.R. 1969, Data Reduction and Error Analysis for the Physical Sciences (New York: McGraw-Hill), Chapter 2.
Stuart, A., and Ord, J.K. 1987, Kendall's Advanced Theory of Statistics, 5th ed. (London: Griffin and Co.) [previous eds. published as Kendall, M., and Stuart, A., The Advanced Theory of Statistics], vol. 1, §10.15.
Norusis, M.J. 1982, SPSS Introductory Guide: Basic Statistics and Operations; and 1985, SPSS-X Advanced Statistics Guide (New York: McGraw-Hill).
Chan, T.F., Golub, G.H., and LeVeque, R.J. 1983, American Statistician, vol. 37, pp.