The Elements of Statistical Learning: Data Mining, Inference, and Prediction
[Figure 2.2: scatter plot of the two-class training data, fit by the 15-nearest-neighbor rule.]

FIGURE 2.2. The same classification example in two dimensions as in Figure 2.1. The classes are coded as a binary variable (BLUE = 0, ORANGE = 1) and then fit by 15-nearest-neighbor averaging as in (2.8). The predicted class is hence chosen by majority vote amongst the 15-nearest neighbors.

In Figure 2.2 we see that far fewer training observations are misclassified than in Figure 2.1. This should not give us too much comfort, though, since in Figure 2.3 none of the training data are misclassified.
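The 15-nearest-neighbor fit described in the caption can be sketched in a few lines of Python. This is only an illustration of the averaging-and-majority-vote rule referred to as (2.8), not code from the book; the function and variable names are my own.

```python
import numpy as np

def knn_predict(X_train, y_train, X_query, k=15):
    """k-nearest-neighbor classification of each row of X_query.

    y_train holds the binary coding of the classes (BLUE = 0, ORANGE = 1).
    The fitted value at a query point is the average of the k nearest
    responses, and the predicted class is 1 when that average exceeds 1/2,
    i.e. the majority vote among the k neighbors.
    """
    preds = np.empty(len(X_query), dtype=int)
    for i, x in enumerate(X_query):
        dists = np.sum((X_train - x) ** 2, axis=1)   # squared Euclidean distances
        nearest = np.argsort(dists)[:k]              # indices of the k closest points
        preds[i] = int(y_train[nearest].mean() > 0.5)
    return preds
```

With k = 15 this gives a fit of the kind shown in Figure 2.2; setting k = 1 gives the 1-nearest-neighbor fit of Figure 2.3.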
A little thought suggests that for k-nearest-neighbor fits, the error on the training data should be approximately an increasing function of k, and will always be 0 for k = 1. An independent test set would give us a more satisfactory means for comparing the different methods.

It appears that k-nearest-neighbor fits have a single parameter, the number of neighbors k, compared to the p parameters in least-squares fits.
Although this is the case, we will see that the effective number of parameters of k-nearest neighbors is N/k and is generally bigger than p, and decreases with increasing k. To get an idea of why, note that if the neighborhoods were nonoverlapping, there would be N/k neighborhoods and we would fit one parameter (a mean) in each neighborhood.

It is also clear that we cannot use sum-of-squared errors on the training set as a criterion for picking k, since we would always pick k = 1! It would seem that k-nearest-neighbor methods would be more appropriate for the mixture Scenario 2 described above, while for Gaussian data the decision boundaries of k-nearest neighbors would be unnecessarily noisy.
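The two quantitative claims above, zero training error at k = 1 and roughly N/k effective parameters, are easy to check numerically. The sketch below is my own illustration under the stated assumptions (0/1 class coding, Euclidean distance), not code from the text.

```python
import numpy as np

def knn_training_error(X, y, k):
    """Misclassification rate of a k-NN classifier evaluated on its own
    training set; each point is among its own k nearest neighbors."""
    n_wrong = 0
    for i, x in enumerate(X):
        dists = np.sum((X - x) ** 2, axis=1)
        nearest = np.argsort(dists)[:k]              # includes point i itself
        pred = int(y[nearest].mean() > 0.5)
        n_wrong += int(pred != y[i])
    return n_wrong / len(X)

# For k = 1 the nearest neighbor of each training point is the point itself,
# so knn_training_error(X, y, 1) is exactly 0 (barring duplicate inputs).

# Effective number of parameters N/k for a training sample of size N = 200:
N = 200
for k in (1, 5, 15, 50):
    print(k, N / k)   # 200.0, 40.0, 13.33..., 4.0 -- decreases as k increases
```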
[Figure 2.3: same data, panel title "1-Nearest Neighbor Classifier".]

FIGURE 2.3. The same classification example in two dimensions as in Figure 2.1. The classes are coded as a binary variable (BLUE = 0, ORANGE = 1), and then predicted by 1-nearest-neighbor classification.

2.3.3 From Least Squares to Nearest Neighbors

The linear decision boundary from least squares is very smooth, and apparently stable to fit. It does appear to rely heavily on the assumption that a linear decision boundary is appropriate.
In language we will develop shortly, it has low variance and potentially high bias.

On the other hand, the k-nearest-neighbor procedures do not appear to rely on any stringent assumptions about the underlying data, and can adapt to any situation. However, any particular subregion of the decision boundary depends on a handful of input points and their particular positions, and is thus wiggly and unstable—high variance and low bias.

Each method has its own situations for which it works best; in particular linear regression is more appropriate for Scenario 1 above, while nearest neighbors are more suitable for Scenario 2. The time has come to expose the oracle! The data in fact were simulated from a model somewhere between the two, but closer to Scenario 2. First we generated 10 means m_k from a bivariate Gaussian distribution N((1, 0)^T, I) and labeled this class BLUE.
Similarly, 10 more were drawn from N((0, 1)^T, I) and labeled class ORANGE. Then for each class we generated 100 observations as follows: for each observation, we picked an m_k at random with probability 1/10, and then generated a N(m_k, I/5), thus leading to a mixture of Gaussian clusters for each class. Figure 2.4 shows the results of classifying 10,000 new observations generated from the model.

[Figure 2.4: Test Error versus k (Number of Nearest Neighbors), with a second axis showing Degrees of Freedom N/k; legend: Linear, Train, Test, Bayes.]

FIGURE 2.4. Misclassification curves for the simulation example used in Figures 2.1, 2.2 and 2.3. A single training sample of size 200 was used, and a test sample of size 10,000. The orange curves are test and the blue are training error for k-nearest-neighbor classification. The results for linear regression are the bigger orange and blue squares at three degrees of freedom. The purple line is the optimal Bayes error rate.
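For concreteness, the data-generating mechanism just described could be simulated along the following lines. This is a hedged sketch, not the authors' code; the random seed, function names, and array layout are my own choices.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only

# 10 BLUE means from N((1, 0)^T, I) and 10 ORANGE means from N((0, 1)^T, I)
means_blue = rng.multivariate_normal([1.0, 0.0], np.eye(2), size=10)
means_orange = rng.multivariate_normal([0.0, 1.0], np.eye(2), size=10)

def draw_class(means, n=100):
    """Draw n observations: pick one of the class means uniformly at random
    (probability 1/10 each), then sample from N(m_k, I/5) around it."""
    picks = rng.integers(0, len(means), size=n)
    noise = rng.multivariate_normal([0.0, 0.0], np.eye(2) / 5.0, size=n)
    return means[picks] + noise

X = np.vstack([draw_class(means_blue), draw_class(means_orange)])
y = np.concatenate([np.zeros(100, dtype=int), np.ones(100, dtype=int)])  # BLUE = 0, ORANGE = 1
```

Applying least squares and the k-nearest-neighbor rule from the earlier sketch to (X, y) over a grid of k values gives a comparison of the kind summarized in Figure 2.4.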
We compare the results for least squares and those for k-nearest neighbors for a range of values of k.

A large subset of the most popular techniques in use today are variants of these two simple procedures. In fact 1-nearest-neighbor, the simplest of all, captures a large percentage of the market for low-dimensional problems. The following list describes some ways in which these simple procedures have been enhanced:

• Kernel methods use weights that decrease smoothly to zero with distance from the target point, rather than the effective 0/1 weights used by k-nearest neighbors.

• In high-dimensional spaces the distance kernels are modified to emphasize some variable more than others.

• Local regression fits linear models by locally weighted least squares, rather than fitting constants locally.

• Linear models fit to a basis expansion of the original inputs allow arbitrarily complex models.

• Projection pursuit and neural network models consist of sums of nonlinearly transformed linear models.

2.4 Statistical Decision Theory

In this section we develop a small amount of theory that provides a framework for developing models such as those discussed informally so far.