TABLE 11.1. Test set performance of five different neural networks on a handwritten digit classification example (Le Cun, 1989).

          Network Architecture     Links   Weights   % Correct
  Net-1:  Single layer network      2570      2570       80.0%
  Net-2:  Two layer network         3214      3214       87.0%
  Net-3:  Locally connected         1226      1226       88.5%
  Net-4:  Constrained network 1     2266      1132       94.0%
  Net-5:  Constrained network 2     5194      1060       98.4%

Local connectivity makes each unit responsible for extracting local features from the layer below, and reduces considerably the total number of weights.
With many more hidden units than Net-2, Net-3 has fewer links and hence weights (1226 vs. 3214), and achieves similar performance.

Net-4 and Net-5 have local connectivity with shared weights. All units in a local feature map perform the same operation on different parts of the image, achieved by sharing the same weights. The first hidden layer of Net-4 has two 8 × 8 arrays, and each unit takes input from a 3 × 3 patch just like in Net-3. However, the units in a single 8 × 8 feature map all share the same set of nine weights (but each has its own bias parameter). This forces the extracted features in different parts of the image to be computed by the same linear functional, and consequently these networks are sometimes known as convolutional networks.
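As a concrete illustration of this weight sharing, the following is a minimal NumPy sketch (ours, not code from the book) of one such feature map: every unit applies the same nine weights to its own 3 × 3 patch of the image and adds its own bias. The stride-2 spacing, one-pixel zero padding, and tanh activation are assumptions chosen so that a 16 × 16 input image yields an 8 × 8 array of units, as in the first hidden layer of Net-4.

```python
import numpy as np

def shared_weight_feature_map(image, weights, bias, stride=2):
    """One feature map in which every unit applies the same 3x3 weights
    to its own local patch of the input image.

    image   : (16, 16) input, e.g. a ZIP-code digit
    weights : (3, 3) weights shared by all units in the map
    bias    : (8, 8) per-unit bias (each unit keeps its own bias)
    """
    padded = np.pad(image, 1)   # zero-pad so every stride-2 location has a 3x3 patch
    out = np.empty((8, 8))
    for i in range(8):
        for j in range(8):
            patch = padded[i * stride : i * stride + 3,
                           j * stride : j * stride + 3]
            # Same linear functional everywhere, plus the unit's own bias.
            out[i, j] = np.tanh(np.sum(weights * patch) + bias[i, j])
    return out

rng = np.random.default_rng(0)
img = rng.standard_normal((16, 16))          # stand-in for a digit image
w = rng.standard_normal((3, 3)) * 0.1        # the nine shared weights
b = np.zeros((8, 8))                         # one bias per unit
fmap = shared_weight_feature_map(img, w, b)  # -> (8, 8) feature map
```

Because all 64 units of the map reuse the same nine weights, the number of weights grows much more slowly than the number of links, which is the pattern visible for Net-4 and Net-5 in Table 11.1.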
The second hidden layer of Net-4 has no weight sharing, and is the same as in Net-3. The gradient of the error function R with respect to a shared weight is the sum of the gradients of R with respect to each connection controlled by the weights in question.
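In symbols (our notation, not the book's): if a single shared weight w_s ties together a set S of connections, each of which would carry its own weight w_c in an unconstrained network, then

\[
\frac{\partial R}{\partial w_s} \;=\; \sum_{c \in \mathcal{S}} \left. \frac{\partial R}{\partial w_c} \right|_{w_c = w_s},
\]

so back-propagation can proceed exactly as for an unconstrained network, with the tied gradient contributions summed afterwards.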
Table 11.1 gives the number of links, the number of weights and the optimal test performance for each of the networks. We see that Net-4 has more links but fewer weights than Net-3, and superior test performance. Net-5 has four 4 × 4 feature maps in the second hidden layer, each unit connected to a 5 × 5 local patch in the layer below. Weights are shared in each of these feature maps. We see that Net-5 does the best, having errors of only 1.6%, compared to 13% for the "vanilla" network Net-2. The clever design of network Net-5, motivated by the fact that features of handwriting style should appear in more than one part of a digit, was the result of many person-years of experimentation.
This and similar networks gave better performance on ZIP code problems than any other learning method at that time (early 1990s). This example also shows that neural networks are not a fully automatic tool, as they are sometimes advertised. As with all statistical models, subject matter knowledge can and should be used to improve their performance.

This network was later outperformed by the tangent distance approach (Simard et al., 1993) described in Section 13.3.3, which explicitly incorporates natural affine invariances.
At this point the digit recognition datasets became test beds for every new learning procedure, and researchers worked hard to drive down the error rates. As of this writing, the best error rates on a large database (60,000 training, 10,000 test observations), derived from standard NIST databases (the National Institute of Standards and Technology maintains large databases, including handwritten character databases; http://www.nist.gov/srd/), were reported to be the following (Le Cun et al., 1998):

• 1.1% for tangent distance with a 1-nearest neighbor classifier (Section 13.3.3);

• 0.8% for a degree-9 polynomial SVM (Section 12.3);

• 0.8% for LeNet-5, a more complex version of the convolutional network described here;

• 0.7% for boosted LeNet-4. Boosting is described in Chapter 10. LeNet-4 is a predecessor of LeNet-5.

Le Cun et al. (1998) report a much larger table of performance results, and it is evident that many groups have been working very hard to bring these test error rates down.
They report a standard error of 0.1% on the error estimates, which is based on a binomial average with N = 10,000 and p ≈ 0.01. This implies that error rates within 0.1–0.2% of one another are statistically equivalent. Realistically the standard error is even higher, since the test data has been implicitly used in the tuning of the various procedures.
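As a quick check on the 0.1% figure (our calculation, not from the text), the binomial standard error of an error-rate estimate based on N independent test cases is

\[
\widehat{\mathrm{SE}} \;=\; \sqrt{\frac{p(1-p)}{N}} \;\approx\; \sqrt{\frac{0.01 \times 0.99}{10{,}000}} \;\approx\; 0.001 \;=\; 0.1\%.
\]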
11.8 Discussion

Both projection pursuit regression and neural networks take nonlinear functions of linear combinations ("derived features") of the inputs. This is a powerful and very general approach for regression and classification, and has been shown to compete well with the best learning methods on many problems.

These tools are especially effective in problems with a high signal-to-noise ratio and settings where prediction without interpretation is the goal. They are less effective for problems where the goal is to describe the physical process that generated the data and the roles of individual inputs.
Each input enters into the model in many places, in a nonlinear fashion. Some authors (Hinton, 1989) plot a diagram of the estimated weights into each hidden unit, to try to understand the feature that each unit is extracting. This is limited, however, by the lack of identifiability of the parameter vectors α_m, m = 1, ..., M. Often there are solutions with α_m spanning the same linear space as the ones found during training, giving predicted values that are roughly the same.
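A small numerical illustration of this non-identifiability (a sketch of ours, not from the text): for a single-hidden-layer network with an odd activation such as tanh, permuting the hidden units, or flipping the sign of any α_m together with its output weight, changes the parameters but not the fitted function.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((5, 4))        # five observations, four inputs

alpha = rng.standard_normal((3, 4))    # hidden-unit weight vectors alpha_m, m = 1, 2, 3
beta = rng.standard_normal(3)          # output-layer weights, one per hidden unit

def predict(X, alpha, beta):
    # f(x) = sum_m beta_m * tanh(alpha_m' x): a single-hidden-layer network
    return np.tanh(X @ alpha.T) @ beta

# Reorder the hidden units and flip every sign: the parameters differ,
# but the predictions are identical.
perm = np.array([2, 0, 1])
alpha2, beta2 = -alpha[perm], -beta[perm]

print(np.allclose(predict(X, alpha, beta), predict(X, alpha2, beta2)))  # True
```

Plots or summaries of the individual α_m therefore have to be read with care, since an equally good fit can present the weights in a quite different arrangement.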
Some authors suggest carrying out a principal component analysis of these weights, to try to find an interpretable solution. In general, the difficulty of interpreting these models has limited their use in fields like medicine, where interpretation of the model is very important.

There has been a great deal of research on the training of neural networks. Unlike methods like CART and MARS, neural networks are smooth functions of real-valued parameters. This facilitates the development of Bayesian inference for these models. The next section discusses a successful Bayesian implementation of neural networks.

11.9 Bayesian Neural Nets and the NIPS 2003 Challenge

A classification competition was held in 2003, in which five labeled training datasets were provided to participants.
It was organized for a Neural Information Processing Systems (NIPS) workshop. Each of the data sets constituted a two-class classification problem, with different sizes and from a variety of domains (see Table 11.2). Feature measurements for a validation dataset were also available.

Participants developed and applied statistical learning procedures to make predictions on the datasets, and could submit predictions to a website on the validation set for a period of 12 weeks. With this feedback, participants were then asked to submit predictions for a separate test set and they received their results. Finally, the class labels for the validation set were released and participants had one week to train their algorithms on the combined training and validation sets, and submit their final predictions to the competition website.
A total of 75 groups participated, with 20 and 16 eventually making submissions on the validation and test sets, respectively.

There was an emphasis on feature extraction in the competition. Artificial "probes" were added to the data: these are noise features with distributions resembling the real features, but independent of the class labels. The percentage of probes that were added to each dataset, relative to the total set of features, is shown in Table 11.2. Thus each learning algorithm had to figure out a way of identifying the probes and downweighting or eliminating them.
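To make the idea of probes concrete, here is a small sketch (ours; the challenge organizers used their own constructions) that appends probe columns to a feature matrix by permuting real feature columns across the cases. Permutation preserves each probe's marginal distribution while destroying any association with the class labels.

```python
import numpy as np

def add_probes(X, n_probes, rng):
    """Append noise 'probe' features that mimic the marginal distributions
    of real features but are independent of the class labels."""
    n, p = X.shape
    mimic = rng.integers(0, p, size=n_probes)      # which real feature each probe copies
    probes = np.column_stack(
        [rng.permutation(X[:, j]) for j in mimic]  # permute values across cases
    )
    return np.hstack([X, probes])

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))             # 100 cases, 20 real features
X_aug = add_probes(X, n_probes=10, rng=rng)    # 30 columns; the last 10 are probes
```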
A number of metrics were used to evaluate the entries, including the percentage correct on the test set, the area under the ROC curve, and a combined score that compared each pair of classifiers head-to-head.

TABLE 11.2. NIPS 2003 challenge data sets. The column labeled p is the number of features. For the Dorothea dataset the features are binary. N_tr, N_val and N_te are the number of training, validation and test cases, respectively.

  Dataset    Domain                Feature Type        p   Percent Probes   N_tr   N_val   N_te
  Arcene     Mass spectrometry     Dense          10,000               30    100     100    700
  Dexter     Text classification   Sparse         20,000               50    300     300   2000
  Dorothea   Drug discovery        Sparse        100,000               50    800     350    800
  Gisette    Digit recognition     Dense            5000               30   6000    1000   6500
  Madelon    Artificial            Dense             500               96   2000     600   1800

The results of the competition are very interesting and are detailed in Guyon et al. (2006). The most notable result: the entries of Neal and Zhang (2006) were the clear overall winners. In the final competition they finished first in three of the five datasets, and were 5th and 7th on the remaining two datasets.

In their winning entries, Neal and Zhang (2006) used a series of preprocessing feature-selection steps, followed by Bayesian neural networks, Dirichlet diffusion trees, and combinations of these methods.