c14-6 (779581)

File No. 779581: c14-6 (Numerical Recipes in C), 2017-12-27, StudIzba


Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5). Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copying of machine-readable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books, diskettes, or CDROMs visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to trade@cup.cam.ac.uk (outside North America).

Conclusion of the listing carried over from the preceding section (§14.5):

            sxy += xt*yt;
        }
        *r=sxy/(sqrt(sxx*syy)+TINY);
        *z=0.5*log((1.0+(*r)+TINY)/(1.0-(*r)+TINY));         /* Fisher's z transformation. */
        df=n-2;
        t=(*r)*sqrt(df/((1.0-(*r)+TINY)*(1.0+(*r)+TINY)));   /* Equation (14.5.5). */
        *prob=betai(0.5*df,0.5,df/(df+t*t));                 /* Student's t probability. */
        /* *prob=erfcc(fabs((*z)*sqrt(n-1.0))/1.4142136) */
        /* For large n, this easier computation of prob, using the short routine
           erfcc, would give approximately the same value. */
    }

14.6 Nonparametric or Rank Correlation

It is precisely the uncertainty in interpreting the significance of the linear correlation coefficient $r$ that leads us to the important concepts of nonparametric or rank correlation. As before, we are given $N$ pairs of measurements $(x_i, y_i)$. Before, difficulties arose because we did not necessarily know the probability distribution function from which the $x_i$'s or $y_i$'s were drawn.

The key concept of nonparametric correlation is this: If we replace the value of each $x_i$ by the value of its rank among all the other $x_i$'s in the sample, that is, $1, 2, 3, \ldots, N$, then the resulting list of numbers will be drawn from a perfectly known distribution function, namely uniformly from the integers between 1 and $N$, inclusive. Better than uniformly, in fact, since if the $x_i$'s are all distinct, then each integer will occur precisely once. If some of the $x_i$'s have identical values, it is conventional to assign to all these "ties" the mean of the ranks that they would have had if their values had been slightly different. This midrank will sometimes be an integer, sometimes a half-integer. In all cases the sum of all assigned ranks will be the same as the sum of the integers from 1 to $N$, namely $\tfrac{1}{2}N(N+1)$.

Of course we do exactly the same procedure for the $y_i$'s, replacing each value by its rank among the other $y_i$'s in the sample.

Now we are free to invent statistics for detecting correlation between uniform sets of integers between 1 and $N$, keeping in mind the possibility of ties in the ranks. There is, of course, some loss of information in replacing the original numbers by ranks. We could construct some rather artificial examples where a correlation could be detected parametrically (e.g., in the linear correlation coefficient $r$), but could not be detected nonparametrically. Such examples are very rare in real life, however, and the slight loss of information in ranking is a small price to pay for a very major advantage: When a correlation is demonstrated to be present nonparametrically, then it is really there! (That is, to a certainty level that depends on the significance chosen.) Nonparametric correlation is more robust than linear correlation, more resistant to unplanned defects in the data, in the same sort of sense that the median is more robust than the mean. For more on the concept of robustness, see §15.7.

As always in statistics, some particular choices of a statistic have already been invented for us and consecrated, if not beatified, by popular use. We will discuss two, the Spearman rank-order correlation coefficient ($r_s$), and Kendall's tau ($\tau$).

Spearman Rank-Order Correlation Coefficient

Let $R_i$ be the rank of $x_i$ among the other $x$'s, $S_i$ be the rank of $y_i$ among the other $y$'s, ties being assigned the appropriate midrank as described above. Then the rank-order correlation coefficient is defined to be the linear correlation coefficient of the ranks, namely,

$$ r_s = \frac{\sum_i (R_i - \overline{R})(S_i - \overline{S})}{\sqrt{\sum_i (R_i - \overline{R})^2}\,\sqrt{\sum_i (S_i - \overline{S})^2}} \tag{14.6.1} $$

The significance of a nonzero value of $r_s$ is tested by computing

$$ t = r_s \sqrt{\frac{N-2}{1 - r_s^2}} \tag{14.6.2} $$

which is distributed approximately as Student's distribution with $N - 2$ degrees of freedom. A key point is that this approximation does not depend on the original distribution of the $x$'s and $y$'s; it is always the same approximation, and always pretty good.

It turns out that $r_s$ is closely related to another conventional measure of nonparametric correlation, the so-called sum squared difference of ranks, defined as

$$ D = \sum_{i=1}^{N} (R_i - S_i)^2 \tag{14.6.3} $$

(This $D$ is sometimes denoted D**, where the asterisks are used to indicate that ties are treated by midranking.)

When there are no ties in the data, then the exact relation between $D$ and $r_s$ is

$$ r_s = 1 - \frac{6D}{N^3 - N} \tag{14.6.4} $$

When there are ties, then the exact relation is slightly more complicated: Let $f_k$ be the number of ties in the $k$th group of ties among the $R_i$'s, and let $g_m$ be the number of ties in the $m$th group of ties among the $S_i$'s. Then it turns out that

$$ r_s = \frac{1 - \dfrac{6}{N^3 - N}\left[ D + \dfrac{1}{12}\sum_k (f_k^3 - f_k) + \dfrac{1}{12}\sum_m (g_m^3 - g_m) \right]}{\left[ 1 - \dfrac{\sum_k (f_k^3 - f_k)}{N^3 - N} \right]^{1/2} \left[ 1 - \dfrac{\sum_m (g_m^3 - g_m)}{N^3 - N} \right]^{1/2}} \tag{14.6.5} $$

holds exactly. Notice that if all the $f_k$'s and all the $g_m$'s are equal to one, meaning that there are no ties, then equation (14.6.5) reduces to equation (14.6.4).

In (14.6.2) we gave a t-statistic that tests the significance of a nonzero $r_s$. It is also possible to test the significance of $D$ directly.

The expectation value of $D$ in the null hypothesis of uncorrelated data sets is

$$ \overline{D} = \frac{1}{6}(N^3 - N) - \frac{1}{12}\sum_k (f_k^3 - f_k) - \frac{1}{12}\sum_m (g_m^3 - g_m) \tag{14.6.6} $$

its variance is

$$ \mathrm{Var}(D) = \frac{(N-1)N^2(N+1)^2}{36} \left[ 1 - \frac{\sum_k (f_k^3 - f_k)}{N^3 - N} \right] \left[ 1 - \frac{\sum_m (g_m^3 - g_m)}{N^3 - N} \right] \tag{14.6.7} $$

and it is approximately normally distributed, so that the significance level is a complementary error function (cf. equation 14.5.2). Of course, (14.6.2) and (14.6.7) are not independent tests, but simply variants of the same test. In the program that follows, we calculate both the significance level obtained by using (14.6.2) and the significance level obtained by using (14.6.7); their discrepancy will give you an idea of how good the approximations are. You will also notice that we break off the task of assigning ranks (including tied midranks) into a separate function, crank.

    #include <math.h>
    #include "nrutil.h"

    void spear(float data1[], float data2[], unsigned long n, float *d, float *zd,
        float *probd, float *rs, float *probrs)

Given two data arrays, data1[1..n] and data2[1..n], this routine returns their sum-squared difference of ranks as D, the number of standard deviations by which D deviates from its null-hypothesis expected value as zd, the two-sided significance level of this deviation as probd, Spearman's rank correlation rs as rs, and the two-sided significance level of its deviation from zero as probrs.

