Regression models for data science (779323)
Text from the file
Regression Models for Data Science in R
A companion book for the Coursera Regression Models class
Brian Caffo

This book is for sale at http://leanpub.com/regmods
This version was published on 2015-08-05

This is a Leanpub book. Leanpub empowers authors and publishers with the Lean Publishing process. Lean Publishing is the act of publishing an in-progress ebook using lightweight tools and many iterations to get reader feedback, pivot until you have the right book and build traction once you do.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License

Also By Brian Caffo
Statistical inference for data science

To Kerri, Penelope, Scarlett and Bowie

Contents

Preface
  About this book
  About the cover

Introduction
  Before beginning
  Regression models
  Motivating examples
  Summary notes: questions for this book
  Exploratory analysis of Galton’s Data
  The math (not required)
  Comparing children’s heights and their parent’s heights
  Regression through the origin
  Exercises

Notation
  Some basic definitions
  Notation for data
  The empirical mean
  The empirical standard deviation and variance
  Normalization
  The empirical covariance
  Some facts about correlation
  Exercises

Ordinary least squares
  General least squares for linear equations
  Revisiting Galton’s data
  Showing the OLS result
  Exercises

Regression to the mean
  A historically famous idea, regression to the mean
  Regression to the mean
  Exercises

Statistical linear regression models
  Basic regression model with additive Gaussian errors.
  Interpreting regression coefficients, the intercept
  Interpreting regression coefficients, the slope
  Using regression for prediction
  Example
  Exercises

Residuals
  Residual variation
  Properties of the residuals
  Example
  Estimating residual variation
  Summarizing variation
  R squared
  Exercises

Regression inference
  Reminder of the model
  Review
  Results for the regression parameters
  Example diamond data set
  Getting a confidence interval
  Prediction of outcomes
  Summary notes
  Exercises

Multivariable regression analysis
  The linear model
  Estimation
  Example with two variables, simple linear regression
  The general case
  Simulation demonstrations
  Interpretation of the coefficients
  Fitted values, residuals and residual variation
  Summary notes on linear models
  Exercises

Multivariable examples and tricks
  Data set for discussion
  Simulation study
  Back to this data set
  What if we include a completely unnecessary variable?
  Dummy variables are smart
  More than two levels
  Insect Sprays
  Further analysis of the swiss dataset
  Exercises

Adjustment
  Experiment 1
  Experiment 2
  Experiment 3
  Experiment 4
  Experiment 5
  Some final thoughts
  Exercises

Residuals, variation, diagnostics
  Residuals
  Influential, high leverage and outlying points
  Residuals, Leverage and Influence measures
  Simulation examples
  Example described by Stefanski
  Back to the Swiss data
  Exercises

Multiple variables and model selection
  Multivariable regression
  The Rumsfeldian triplet
  General rules
  R squared goes up as you put regressors in the model
  Simulation demonstrating variance inflation
  Summary of variance inflation
  Swiss data revisited
  Impact of over- and under-fitting on residual variance estimation
  Covariate model selection
  How to do nested model testing in R
  Exercises

Generalized Linear Models
  Example, linear models
  Example, logistic regression
  Example, Poisson regression
  How estimates are obtained
  Odds and ends
  Exercises

Binary GLMs
  Example Baltimore Ravens win/loss
  Odds
  Modeling the odds
Characteristics
File type: PDF