Regression models for data science
File description
A PDF file from the archive "Regression models for data science", which is in the category "". It belongs to the subject "mathematical modeling" from semester 9 (semester 1 of the master's program) and can be found in the file archive of Bauman Moscow State Technical University (BMSTU). Despite this archive's direct connection to BMSTU, it can also be found in other sections. The archive is located in the section "books and methodological guidelines", under the subject "mathematical modeling", in the general files.
Text from the PDF
Regression Models for Data Science in R
A companion book for the Coursera Regression Models class
Brian Caffo

This book is for sale at http://leanpub.com/regmods
This version was published on 2015-08-05

This is a Leanpub book. Leanpub empowers authors and publishers with the Lean Publishing process. Lean Publishing is the act of publishing an in-progress ebook using lightweight tools and many iterations to get reader feedback, pivot until you have the right book and build traction once you do.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License

Also By Brian Caffo
Statistical inference for data science

To Kerri, Penelope, Scarlett and Bowie

Contents
Preface
  About this book
  About the cover
Introduction
  Before beginning
  Regression models
  Motivating examples
  Summary notes: questions for this book
  Exploratory analysis of Galton’s Data
  The math (not required)
  Comparing children’s heights and their parent’s heights
  Regression through the origin
  Exercises
Notation
  Some basic definitions
  Notation for data
  The empirical mean
  The empirical standard deviation and variance
  Normalization
  The empirical covariance
  Some facts about correlation
  Exercises
Ordinary least squares
  General least squares for linear equations
  Revisiting Galton’s data
  Showing the OLS result
  Exercises
Regression to the mean
  A historically famous idea, regression to the mean
  Regression to the mean
  Exercises
Statistical linear regression models
  Basic regression model with additive Gaussian errors.
  Interpreting regression coefficients, the intercept
  Interpreting regression coefficients, the slope
  Using regression for prediction
  Example
  Exercises
Residuals
  Residual variation
  Properties of the residuals
  Example
  Estimating residual variation
  Summarizing variation
  R squared
  Exercises
Regression inference
  Reminder of the model
  Review
  Results for the regression parameters
  Example diamond data set
  Getting a confidence interval
  Prediction of outcomes
  Summary notes
  Exercises
Multivariable regression analysis
  The linear model
  Estimation
  Example with two variables, simple linear regression
  The general case
  Simulation demonstrations
  Interpretation of the coefficients
  Fitted values, residuals and residual variation
  Summary notes on linear models
  Exercises
Multivariable examples and tricks
  Data set for discussion
  Simulation study
  Back to this data set
  What if we include a completely unnecessary variable?
  Dummy variables are smart
  More than two levels
  Insect Sprays
  Further analysis of the swiss dataset
  Exercises
Adjustment
  Experiment 1
  Experiment 2
  Experiment 3
  Experiment 4
  Experiment 5
  Some final thoughts
  Exercises
Residuals, variation, diagnostics
  Residuals
  Influential, high leverage and outlying points
  Residuals, Leverage and Influence measures
  Simulation examples
  Example described by Stefanski
  Back to the Swiss data
  Exercises
Multiple variables and model selection
  Multivariable regression
  The Rumsfeldian triplet
  General rules
  R squared goes up as you put regressors in the model
  Simulation demonstrating variance inflation
  Summary of variance inflation
  Swiss data revisited
  Impact of over- and under-fitting on residual variance estimation
  Covariate model selection
  How to do nested model testing in R
  Exercises
Generalized Linear Models
  Example, linear models
  Example, logistic regression
  Example, Poisson regression
  How estimates are obtained
  Odds and ends
  Exercises
Binary GLMs
  Example Baltimore Ravens win/loss
  Odds
  Modeling the odds