c14-7 (779582), страница 2
Текст из файла (страница 2)
For small computers, thisrestricts the usefulness of the tests to N less than several thousand.We now give computer implementations. The one-sample case is embodied in theroutine ks2d1s (that is, 2-dimensions, 1-sample). This routine calls a straightforward utilityroutine quadct to count points in the four quadrants, and it calls a user-supplied routinequadvl that must be capable of returning the integrated probability of an analytic model ineach of four quadrants around an arbitrary (x, y) point. A trivial sample quadvl is shown;realistic quadvls can be quite complicated, often incorporating numerical quadratures overanalytic two-dimensional distributions.648Chapter 14.Statistical Description of Data*d1=FMAX(*d1,fabs(fd-gd));For both the sample and the model, the distribution is integrated in each of fourquadrants, and the maximum difference is saved.}void quadct(float x, float y, float xx[], float yy[], unsigned long nn,float *fa, float *fb, float *fc, float *fd)Given an origin (x, y), and an array of nn points with coordinates xx[1..nn] and yy[1..nn],count how many of them are in each quadrant around the origin, and return the normalizedfractions.
Quadrants are labeled alphabetically, counterclockwise from the upper right. Usedby ks2d1s and ks2d2s.{unsigned long k,na,nb,nc,nd;float ff;na=nb=nc=nd=0;for (k=1;k<=nn;k++) {if (yy[k] > y) {xx[k] > x ? ++na : ++nb;} else {xx[k] > x ? ++nd : ++nc;}}ff=1.0/nn;*fa=ff*na;*fb=ff*nb;*fc=ff*nc;*fd=ff*nd;}#include "nrutil.h"void quadvl(float x, float y, float *fa, float *fb, float *fc, float *fd)This is a sample of a user-supplied routine to be used with ks2d1s.
In this case, the modeldistribution is uniform inside the square −1 < x < 1, −1 < y < 1. In general this routineshould return, for any point (x, y), the fraction of the total distribution in each of the fourquadrants around that point. The fractions, fa, fb, fc, and fd, must add up to 1. Quadrantsare alphabetical, counterclockwise from the upper right.{float qa,qb,qc,qd;qa=FMIN(2.0,FMAX(0.0,1.0-x));qb=FMIN(2.0,FMAX(0.0,1.0-y));qc=FMIN(2.0,FMAX(0.0,x+1.0));qd=FMIN(2.0,FMAX(0.0,y+1.0));*fa=0.25*qa*qb;*fb=0.25*qb*qc;*fc=0.25*qc*qd;*fd=0.25*qd*qa;}The routine ks2d2s is the two-sample case of the two-dimensional K–S test.
It also callsquadct, pearsn, and probks. Being a two-sample test, it does not need an analytic model.Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5)Copyright (C) 1988-1992 by Cambridge University Press.Programs Copyright (C) 1988-1992 by Numerical Recipes Software.Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copying of machinereadable files (including this one) to any servercomputer, is strictly prohibited. To order Numerical Recipes books,diskettes, or CDROMsvisit website http://www.nr.com or call 1-800-872-7423 (North America only),or send email to trade@cup.cam.ac.uk (outside North America).}pearsn(x1,y1,n1,&r1,&dum,&dumm);Get the linear correlation coefficient r1.sqen=sqrt((double)n1);rr=sqrt(1.0-r1*r1);Estimate the probability using the K-S probability function probks.*prob=probks(*d1*sqen/(1.0+rr*(0.25-0.75/sqen)));14.7 Do Two-Dimensional Distributions Differ?649#include <math.h>#include "nrutil.h"d1=0.0;for (j=1;j<=n1;j++) {First, use points in the first sample as oriquadct(x1[j],y1[j],x1,y1,n1,&fa,&fb,&fc,&fd);gins.quadct(x1[j],y1[j],x2,y2,n2,&ga,&gb,&gc,&gd);d1=FMAX(d1,fabs(fa-ga));d1=FMAX(d1,fabs(fb-gb));d1=FMAX(d1,fabs(fc-gc));d1=FMAX(d1,fabs(fd-gd));}d2=0.0;for (j=1;j<=n2;j++) {Then, use points in the second sample asquadct(x2[j],y2[j],x1,y1,n1,&fa,&fb,&fc,&fd);origins.quadct(x2[j],y2[j],x2,y2,n2,&ga,&gb,&gc,&gd);d2=FMAX(d2,fabs(fa-ga));d2=FMAX(d2,fabs(fb-gb));d2=FMAX(d2,fabs(fc-gc));d2=FMAX(d2,fabs(fd-gd));}*d=0.5*(d1+d2);Average the K-S statistics.sqen=sqrt(n1*n2/(double)(n1+n2));pearsn(x1,y1,n1,&r1,&dum,&dumm);Get the linear correlation coefficient for eachpearsn(x2,y2,n2,&r2,&dum,&dumm);sample.rr=sqrt(1.0-0.5*(r1*r1+r2*r2));Estimate the probability using the K-S probability function probks.*prob=probks(*d*sqen/(1.0+rr*(0.25-0.75/sqen)));}CITED REFERENCES AND FURTHER READING:Fasano, G.
and Franceschini, A. 1987, Monthly Notices of the Royal Astronomical Society,vol. 225, pp. 155–170. [1]Peacock, J.A. 1983, Monthly Notices of the Royal Astronomical Society, vol. 202, pp. 615–627.[2]Spergel, D.N., Piran, T., Loeb, A., Goodman, J., and Bahcall, J.N. 1987, Science, vol. 237,pp. 1471–1473. [3]Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5)Copyright (C) 1988-1992 by Cambridge University Press.Programs Copyright (C) 1988-1992 by Numerical Recipes Software.Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copying of machinereadable files (including this one) to any servercomputer, is strictly prohibited.
To order Numerical Recipes books,diskettes, or CDROMsvisit website http://www.nr.com or call 1-800-872-7423 (North America only),or send email to trade@cup.cam.ac.uk (outside North America).void ks2d2s(float x1[], float y1[], unsigned long n1, float x2[], float y2[],unsigned long n2, float *d, float *prob)Two-dimensional Kolmogorov-Smirnov test on two samples. Given the x and y coordinates ofthe first sample as n1 values in arrays x1[1..n1] and y1[1..n1], and likewise for the secondsample, n2 values in arrays x2 and y2, this routine returns the two-dimensional, two-sampleK-S statistic as d, and its significance level as prob.
Small values of prob show that thetwo samples are significantly different. Note that the test is slightly distribution-dependent, soprob is only an estimate.{void pearsn(float x[], float y[], unsigned long n, float *r, float *prob,float *z);float probks(float alam);void quadct(float x, float y, float xx[], float yy[], unsigned long nn,float *fa, float *fb, float *fc, float *fd);unsigned long j;float d1,d2,dum,dumm,fa,fb,fc,fd,ga,gb,gc,gd,r1,r2,rr,sqen;650Chapter 14.Statistical Description of Data14.8 Savitzky-Golay Smoothing Filtersgi =nRXcnfi+n(14.8.1)n=−nLHere nL is the number of points used “to the left” of a data point i, i.e., earlier than it, whilenR is the number used to the right, i.e., later.
A so-called causal filter would have nR = 0.As a starting point for understanding Savitzky-Golay filters, consider the simplestpossible averaging procedure: For some fixed nL = nR , compute each gi as the average ofthe data points from fi−nL to fi+nR . This is sometimes called moving window averagingand corresponds to equation (14.8.1) with constant cn = 1/(nL + nR + 1).
If the underlyingfunction is constant, or is changing linearly with time (increasing or decreasing), then nobias is introduced into the result. Higher points at one end of the averaging interval are onthe average balanced by lower points at the other end. A bias is introduced, however, ifthe underlying function has a nonzero second derivative. At a local maximum, for example,moving window averaging always reduces the function value.
In the spectrometric application,a narrow spectral line has its height reduced and its width increased. Since these parametersare themselves of physical interest, the bias introduced is distinctly undesirable.Note, however, that moving window averaging does preserve the area under a spectralline, which is its zeroth moment, and also (if the window is symmetric with nL = nR ) itsmean position in time, which is its first moment. What is violated is the second moment,equivalent to the line width.The idea of Savitzky-Golay filtering is to find filter coefficients cn that preserve highermoments. Equivalently, the idea is to approximate the underlying function within the movingwindow not by a constant (whose estimate is the average), but by a polynomial of higherorder, typically quadratic or quartic: For each point fi , we least-squares fit a polynomial to allSample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5)Copyright (C) 1988-1992 by Cambridge University Press.Programs Copyright (C) 1988-1992 by Numerical Recipes Software.Permission is granted for internet users to make one paper copy for their own personal use.















