The Elements of Statistical Learning: Data Mining, Inference, and Prediction (811377), page 10


2. Overview of Supervised Learning
2.4 Statistical Decision Theory

We first consider the case of a quantitative output, and place ourselves in the world of random variables and probability spaces. Let $X \in \mathbb{R}^p$ denote a real valued random input vector, and $Y \in \mathbb{R}$ a real valued random output variable, with joint distribution $\Pr(X, Y)$. We seek a function $f(X)$ for predicting $Y$ given values of the input $X$. This theory requires a loss function $L(Y, f(X))$ for penalizing errors in prediction, and by far the most common and convenient is squared error loss: $L(Y, f(X)) = (Y - f(X))^2$. This leads us to a criterion for choosing $f$,

$$
\begin{aligned}
\mathrm{EPE}(f) &= \mathrm{E}(Y - f(X))^2 && (2.9) \\
&= \int [y - f(x)]^2 \, \Pr(dx, dy), && (2.10)
\end{aligned}
$$

the expected (squared) prediction error. By conditioning[1] on $X$, we can write EPE as

$$\mathrm{EPE}(f) = \mathrm{E}_X \mathrm{E}_{Y|X}\big([Y - f(X)]^2 \mid X\big) \tag{2.11}$$

and we see that it suffices to minimize EPE pointwise:

$$f(x) = \operatorname{argmin}_c \mathrm{E}_{Y|X}\big([Y - c]^2 \mid X = x\big). \tag{2.12}$$

The solution is

$$f(x) = \mathrm{E}(Y \mid X = x), \tag{2.13}$$

the conditional expectation, also known as the regression function.
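The pointwise minimization in (2.12) and (2.13) can be checked numerically. The sketch below is an illustration only: it assumes a hypothetical sample of Y values observed at one fixed x (the gamma distribution and the grid of candidate constants are arbitrary choices, not from the text) and looks for the constant c with the smallest average squared error.

```python
import numpy as np

def best_constant_squared_error(y, candidates):
    """Return the candidate c that minimizes the average of (y - c)^2."""
    # losses[j] = average squared error when candidates[j] is used for every y
    losses = np.mean((y[:, None] - candidates[None, :]) ** 2, axis=0)
    return candidates[np.argmin(losses)]

rng = np.random.default_rng(0)
# Hypothetical draws of Y at a single fixed x (skewed on purpose).
y_at_x = rng.gamma(shape=2.0, scale=1.5, size=2000)
grid = np.linspace(0.0, 10.0, 1001)   # candidate constants, step 0.01

c_star = best_constant_squared_error(y_at_x, grid)
print(c_star, y_at_x.mean())
```

Up to the grid resolution, the winning constant should coincide with the sample mean, which is the sample analogue of the conditional mean in (2.13).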

Thus the best prediction of $Y$ at any point $X = x$ is the conditional mean, when best is measured by average squared error.

[1] Conditioning here amounts to factoring the joint density $\Pr(X, Y) = \Pr(Y \mid X)\Pr(X)$, where $\Pr(Y \mid X) = \Pr(Y, X)/\Pr(X)$, and splitting up the bivariate integral accordingly.

The nearest-neighbor methods attempt to directly implement this recipe using the training data. At each point $x$, we might ask for the average of all those $y_i$'s with input $x_i = x$. Since there is typically at most one observation at any point $x$, we settle for

$$\hat{f}(x) = \mathrm{Ave}(y_i \mid x_i \in N_k(x)), \tag{2.14}$$

where "Ave" denotes average, and $N_k(x)$ is the neighborhood containing the $k$ points in $\mathcal{T}$ closest to $x$. Two approximations are happening here:

• expectation is approximated by averaging over sample data;
• conditioning at a point is relaxed to conditioning on some region "close" to the target point.

For large training sample size $N$, the points in the neighborhood are likely to be close to $x$, and as $k$ gets large the average will get more stable. In fact, under mild regularity conditions on the joint probability distribution $\Pr(X, Y)$, one can show that as $N, k \to \infty$ such that $k/N \to 0$, $\hat{f}(x) \to \mathrm{E}(Y \mid X = x)$.
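The estimate (2.14) can be sketched in a few lines. This is a minimal illustration, not the book's code: it assumes Euclidean distance defines "closest", and the simulated data (a sine curve plus Gaussian noise), the query point `x0` and the choice `k=50` are all hypothetical.

```python
import numpy as np

def knn_regress(x0, X_train, y_train, k):
    """Estimate f(x0) as in (2.14): average y over the k nearest training inputs."""
    dists = np.linalg.norm(X_train - x0, axis=1)   # Euclidean distance to each x_i (assumed metric)
    neighbors = np.argsort(dists)[:k]              # indices of the points in N_k(x0)
    return y_train[neighbors].mean()

rng = np.random.default_rng(1)
N = 2000
X_train = rng.uniform(-1.0, 1.0, size=(N, 1))
# Hypothetical regression function sin(3x) with additive noise.
y_train = np.sin(3.0 * X_train[:, 0]) + rng.normal(scale=0.3, size=N)

x0 = np.array([0.5])
estimate = knn_regress(x0, X_train, y_train, k=50)
print(estimate, np.sin(1.5))   # estimate versus the true conditional mean at x0
```

Both approximations listed in the text are visible here: the mean over `neighbors` stands in for the expectation, and the neighborhood stands in for conditioning exactly at `x0`.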

In light of this, why look further, since it seems we have a universal approximator? We often do not have very large samples. If the linear or some more structured model is appropriate, then we can usually get a more stable estimate than k-nearest neighbors, although such knowledge has to be learned from the data as well. There are other problems though, sometimes disastrous. In Section 2.5 we see that as the dimension $p$ gets large, so does the metric size of the k-nearest neighborhood. So settling for nearest neighborhood as a surrogate for conditioning will fail us miserably. The convergence above still holds, but the rate of convergence decreases as the dimension increases.

How does linear regression fit into this framework? The simplest explanation is that one assumes that the regression function $f(x)$ is approximately linear in its arguments:

$$f(x) \approx x^T \beta. \tag{2.15}$$

This is a model-based approach: we specify a model for the regression function. Plugging this linear model for $f(x)$ into EPE (2.9) and differentiating, we can solve for $\beta$ theoretically:

$$\beta = [\mathrm{E}(X X^T)]^{-1} \mathrm{E}(X Y). \tag{2.16}$$

Note we have not conditioned on $X$; rather we have used our knowledge of the functional relationship to pool over values of $X$. The least squares solution (2.6) amounts to replacing the expectation in (2.16) by averages over the training data.

So both k-nearest neighbors and least squares end up approximating conditional expectations by averages.
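Replacing the two expectations in (2.16) with sample averages can be written out directly. The following is a sketch under stated assumptions: the data are simulated, `beta_true` is a hypothetical coefficient vector, and the intercept is handled by a leading column of ones. The result is compared with NumPy's own least squares solver.

```python
import numpy as np

def plug_in_beta(X, y):
    """Sample version of (2.16): [average of x x^T]^{-1} [average of x y]."""
    N = X.shape[0]
    exx = X.T @ X / N   # average of x_i x_i^T, estimates E(X X^T)
    exy = X.T @ y / N   # average of x_i y_i, estimates E(X Y)
    return np.linalg.solve(exx, exy)

rng = np.random.default_rng(3)
N = 5000
features = rng.normal(size=(N, 2))
X = np.column_stack([np.ones(N), features])   # leading 1 carries the intercept
beta_true = np.array([1.0, 2.0, -0.5])        # hypothetical coefficients
y = X @ beta_true + rng.normal(scale=0.5, size=N)

beta_hat = plug_in_beta(X, y)
beta_lstsq = np.linalg.lstsq(X, y, rcond=None)[0]
print(beta_hat)
print(beta_lstsq)
```

The factor 1/N cancels between the two averages, so the plug-in estimate agrees with the ordinary least squares solution up to floating-point error.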

But the two methods differ dramatically in terms of model assumptions:

• Least squares assumes $f(x)$ is well approximated by a globally linear function.
• k-nearest neighbors assumes $f(x)$ is well approximated by a locally constant function.

Although the latter seems more palatable, we have already seen that we may pay a price for this flexibility.

Many of the more modern techniques described in this book are model-based, although far more flexible than the rigid linear model. For example, additive models assume that

$$f(X) = \sum_{j=1}^{p} f_j(X_j). \tag{2.17}$$

This retains the additivity of the linear model, but each coordinate function $f_j$ is arbitrary. It turns out that the optimal estimate for the additive model uses techniques such as k-nearest neighbors to approximate univariate conditional expectations simultaneously for each of the coordinate functions. Thus the problems of estimating a conditional expectation in high dimensions are swept away in this case by imposing some (often unrealistic) model assumptions, in this case additivity.

Are we happy with the criterion (2.11)? What happens if we replace the $L_2$ loss function with the $L_1$: $\mathrm{E}|Y - f(X)|$? The solution in this case is the conditional median,

$$\hat{f}(x) = \mathrm{median}(Y \mid X = x), \tag{2.18}$$

which is a different measure of location, and its estimates are more robust than those for the conditional mean.
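The contrast between the two loss functions in (2.18) can be illustrated with a small numerical experiment. This is a sketch only: the normal sample, the candidate grid and the single gross outlier are hypothetical choices made for the demonstration.

```python
import numpy as np

def best_constant(y, candidates, loss):
    """Return the candidate c with the smallest average loss(y - c)."""
    avg_loss = np.array([np.mean(loss(y - c)) for c in candidates])
    return candidates[np.argmin(avg_loss)]

rng = np.random.default_rng(4)
y = rng.normal(loc=2.0, scale=1.0, size=1001)   # hypothetical draws of Y at one fixed x
grid = np.linspace(-5.0, 10.0, 1501)            # candidate constants, step 0.01

c_l1 = best_constant(y, grid, np.abs)      # L1 minimizer, close to the sample median
c_l2 = best_constant(y, grid, np.square)   # L2 minimizer, close to the sample mean

# Replace one observation with a gross outlier and refit both constants.
y_out = y.copy()
y_out[0] = 1000.0
c_l1_out = best_constant(y_out, grid, np.abs)
c_l2_out = best_constant(y_out, grid, np.square)

print("L1:", c_l1, "->", c_l1_out)
print("L2:", c_l2, "->", c_l2_out)
```

The absolute-error minimizer tracks the sample median and barely moves when the outlier is introduced, whereas the squared-error minimizer tracks the sample mean and shifts by roughly the outlier's size divided by the sample size.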

$L_1$ criteria have discontinuities in their derivatives, which have hindered their widespread use. Other more resistant loss functions will be mentioned in later chapters, but squared error is analytically convenient and the most popular.

What do we do when the output is a categorical variable $G$? The same paradigm works here, except we need a different loss function for penalizing prediction errors. An estimate $\hat{G}$ will assume values in $\mathcal{G}$, the set of possible classes. Our loss function can be represented by a $K \times K$ matrix $\mathbf{L}$, where $K = \mathrm{card}(\mathcal{G})$. $\mathbf{L}$ will be zero on the diagonal and nonnegative elsewhere, where $L(k, \ell)$ is the price paid for classifying an observation belonging to class $\mathcal{G}_k$ as $\mathcal{G}_\ell$.

Most often we use the zero–one loss function, where all misclassifications are charged a single unit. The expected prediction error is

$$\mathrm{EPE} = \mathrm{E}[L(G, \hat{G}(X))], \tag{2.19}$$

where again the expectation is taken with respect to the joint distribution $\Pr(G, X)$. Again we condition, and can write EPE as

$$\mathrm{EPE} = \mathrm{E}_X \sum_{k=1}^{K} L[\mathcal{G}_k, \hat{G}(X)] \Pr(\mathcal{G}_k \mid X) \tag{2.20}$$

[Figure: Bayes Optimal Classifier. The plot did not survive text extraction.]
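Minimizing the sum inside (2.20) separately at each x, by the same pointwise argument used for (2.12), suggests picking the class with the smallest expected loss under the posterior probabilities. The sketch below illustrates that rule under assumptions: the three-class posterior vector and the asymmetric loss entry are hypothetical, and classes are indexed from 0 as is usual in Python.

```python
import numpy as np

def min_expected_loss_class(posterior, L):
    """Pick the class g that minimizes sum_k L[k, g] * Pr(class k | x)."""
    expected_loss = posterior @ L   # entry g = sum_k posterior[k] * L[k, g]
    return int(np.argmin(expected_loss))

K = 3
zero_one = np.ones((K, K)) - np.eye(K)   # zero on the diagonal, one unit elsewhere
posterior = np.array([0.5, 0.3, 0.2])    # hypothetical Pr(G_k | X = x) at one x

# Under zero-one loss the expected loss of class g is 1 - posterior[g],
# so the rule reduces to choosing the most probable class.
choice_zero_one = min_expected_loss_class(posterior, zero_one)

# Hypothetical asymmetric loss: labelling a class-2 observation as class 0 costs 10.
costly = zero_one.copy()
costly[2, 0] = 10.0
choice_costly = min_expected_loss_class(posterior, costly)

print(choice_zero_one, choice_costly)
```

With the zero–one matrix the most probable class (index 0) is selected. Once one kind of mistake becomes expensive, the expected losses become 2.3, 0.7 and 0.8, and the choice moves to class 1 even though class 0 is still the most probable.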

