The Elements of Statistical Learning. Data Mining, Inference, and Prediction (811377), page 43

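Similarly $\langle K(\cdot, x_i), K(\cdot, x_j)\rangle_{\mathcal{H}_K} = K(x_i, x_j)$ (the reproducing property of $\mathcal{H}_K$), and hence

$$J(f) = \sum_{i=1}^{N}\sum_{j=1}^{N} K(x_i, x_j)\,\alpha_i \alpha_j \tag{5.51}$$

for $f(x) = \sum_{i=1}^{N} \alpha_i K(x, x_i)$.

In light of (5.50) and (5.51), (5.48) reduces to a finite-dimensional criterion

$$\min_{\alpha}\; L(\mathbf{y}, \mathbf{K}\alpha) + \lambda\,\alpha^T \mathbf{K}\alpha. \tag{5.52}$$

We are using a vector notation, in which $\mathbf{K}$ is the $N \times N$ matrix with $ij$th entry $K(x_i, x_j)$ and so on.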
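The step from the reproducing property to (5.51) and (5.52) can be written out term by term. This is a sketch that assumes the penalty in (5.48) is the squared norm $J(f) = \|f\|^2_{\mathcal{H}_K}$, which is not restated in this excerpt; the index $k$ is introduced here only for illustration.

$$
\begin{aligned}
J(f) &= \Big\langle \sum_{i=1}^{N} \alpha_i K(\cdot, x_i),\; \sum_{j=1}^{N} \alpha_j K(\cdot, x_j) \Big\rangle_{\mathcal{H}_K} \\
     &= \sum_{i=1}^{N}\sum_{j=1}^{N} \alpha_i \alpha_j \,\langle K(\cdot, x_i), K(\cdot, x_j)\rangle_{\mathcal{H}_K} \\
     &= \sum_{i=1}^{N}\sum_{j=1}^{N} \alpha_i \alpha_j \, K(x_i, x_j) \;=\; \alpha^T \mathbf{K} \alpha, \\
f(x_k) &= \sum_{i=1}^{N} \alpha_i K(x_k, x_i) \;=\; (\mathbf{K}\alpha)_k, \qquad k = 1, \ldots, N.
\end{aligned}
$$

The last line is why the loss enters (5.52) only through the vector $\mathbf{K}\alpha$ of fitted values.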

Simple numerical algorithms can be used to optimize (5.52). This phenomenon, whereby the infinite-dimensional problem (5.48) or (5.49) reduces to a finite-dimensional optimization problem, has been dubbed the kernel property in the literature on support-vector machines (see Chapter 12).

There is a Bayesian interpretation of this class of models, in which $f$ is interpreted as a realization of a zero-mean stationary Gaussian process, with prior covariance function $K$.

The eigen-decomposition produces a series of orthogonal eigen-functions $\phi_j(x)$ with associated variances $\gamma_j$. The typical scenario is that "smooth" functions $\phi_j$ have large prior variance, while "rough" $\phi_j$ have small prior variances. The penalty in (5.48) is the contribution of the prior to the joint likelihood, and penalizes more those components with smaller prior variance (compare with (5.43)).

For simplicity we have dealt with the case here where all members of $\mathcal{H}$ are penalized, as in (5.48).

More generally, there may be some components in $\mathcal{H}$ that we wish to leave alone, such as the linear functions for cubic smoothing splines in Section 5.4. The multidimensional thin-plate splines of Section 5.7 and tensor product splines fall into this category as well. In these cases there is a more convenient representation $\mathcal{H} = \mathcal{H}_0 \oplus \mathcal{H}_1$, with the null space $\mathcal{H}_0$ consisting of, for example, low degree polynomials in $x$ that do not get penalized.

The penalty becomes $J(f) = \|P_1 f\|$, where $P_1$ is the orthogonal projection of $f$ onto $\mathcal{H}_1$. The solution has the form $f(x) = \sum_{j=1}^{M} \beta_j h_j(x) + \sum_{i=1}^{N} \alpha_i K(x, x_i)$, where the first term represents an expansion in $\mathcal{H}_0$. From a Bayesian perspective, the coefficients of components in $\mathcal{H}_0$ have improper priors, with infinite variance.

5.8.2 Examples of RKHS

The machinery above is driven by the choice of the kernel $K$ and the loss function $L$. We consider first regression using squared-error loss.

In this case (5.48) specializes to penalized least squares, and the solution can be characterized in two equivalent ways corresponding to (5.49) or (5.52):

$$\min_{\{c_j\}_1^\infty} \sum_{i=1}^{N} \Big( y_i - \sum_{j=1}^{\infty} c_j \phi_j(x_i) \Big)^2 + \lambda \sum_{j=1}^{\infty} \frac{c_j^2}{\gamma_j} \tag{5.53}$$

an infinite-dimensional, generalized ridge regression problem, or

$$\min_{\alpha}\; (\mathbf{y} - \mathbf{K}\alpha)^T (\mathbf{y} - \mathbf{K}\alpha) + \lambda\,\alpha^T \mathbf{K}\alpha. \tag{5.54}$$

The solution for $\alpha$ is obtained simply as

$$\hat{\alpha} = (\mathbf{K} + \lambda \mathbf{I})^{-1} \mathbf{y}, \tag{5.55}$$

and

$$\hat{f}(x) = \sum_{j=1}^{N} \hat{\alpha}_j K(x, x_j). \tag{5.56}$$

The vector of $N$ fitted values is given by

$$\hat{\mathbf{f}} = \mathbf{K}\hat{\alpha} = \mathbf{K}(\mathbf{K} + \lambda \mathbf{I})^{-1} \mathbf{y} \tag{5.57}$$

$$\hat{\mathbf{f}} = (\mathbf{I} + \lambda \mathbf{K}^{-1})^{-1} \mathbf{y}. \tag{5.58}$$

The estimate (5.57) also arises as the kriging estimate of a Gaussian random field in spatial statistics (Cressie, 1993). Compare also (5.58) with the smoothing spline fit (5.17) on page 154.

Penalized Polynomial Regression

The kernel $K(x, y) = (\langle x, y\rangle + 1)^d$ (Vapnik, 1996), for $x, y \in \mathbb{R}^p$, has $M = \binom{p+d}{d}$ eigen-functions that span the space of polynomials in $\mathbb{R}^p$ of total degree $d$.

For example, with $p = 2$ and $d = 2$, $M = 6$ and

$$K(x, y) = 1 + 2x_1 y_1 + 2x_2 y_2 + x_1^2 y_1^2 + x_2^2 y_2^2 + 2 x_1 x_2 y_1 y_2 \tag{5.59}$$

$$K(x, y) = \sum_{m=1}^{M} h_m(x) h_m(y) \tag{5.60}$$

with

$$h(x)^T = (1,\ \sqrt{2}x_1,\ \sqrt{2}x_2,\ x_1^2,\ x_2^2,\ \sqrt{2}x_1 x_2). \tag{5.61}$$

One can represent $h$ in terms of the $M$ orthogonal eigen-functions and eigenvalues of $K$,

$$h(x) = \mathbf{V} \mathbf{D}_\gamma^{1/2} \phi(x), \tag{5.62}$$

where $\mathbf{D}_\gamma = \mathrm{diag}(\gamma_1, \gamma_2, \ldots, \gamma_M)$, and $\mathbf{V}$ is $M \times M$ and orthogonal.

Suppose we wish to solve the penalized polynomial regression problem

$$\min_{\{\beta_m\}_1^M} \sum_{i=1}^{N} \Big( y_i - \sum_{m=1}^{M} \beta_m h_m(x_i) \Big)^2 + \lambda \sum_{m=1}^{M} \beta_m^2. \tag{5.63}$$

Substituting (5.62) into (5.63), we get an expression of the form (5.53) to optimize (Exercise 5.16).

The number of basis functions $M = \binom{p+d}{d}$ can be very large, often much larger than $N$.
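The link between the kernel form (5.55), (5.57) and the explicit basis form (5.63) can be checked numerically for the case $p = 2$, $d = 2$. The following is a minimal sketch using NumPy on synthetic data; the function names `poly_kernel` and `poly_features`, the random inputs, and the value of `lam` are assumptions made for illustration and do not come from the text.

```python
import numpy as np

def poly_kernel(X, Y, d=2):
    """Polynomial kernel K(x, y) = (<x, y> + 1)^d for all pairs of rows."""
    return (X @ Y.T + 1.0) ** d

def poly_features(X):
    """Explicit basis h(x) of (5.61); valid only for p = 2 and d = 2."""
    x1, x2 = X[:, 0], X[:, 1]
    r2 = np.sqrt(2.0)
    return np.column_stack(
        [np.ones(len(X)), r2 * x1, r2 * x2, x1**2, x2**2, r2 * x1 * x2]
    )

# Synthetic data (an assumption; any inputs in R^2 would do).
rng = np.random.default_rng(0)
N, lam = 20, 0.5
X = rng.normal(size=(N, 2))
y = rng.normal(size=N)

K = poly_kernel(X, X)   # N x N kernel matrix
H = poly_features(X)    # N x M basis matrix, M = 6

# (5.59)-(5.60): the kernel is the inner product of the basis vectors.
kernel_matches = np.allclose(K, H @ H.T)

# Kernel form: alpha_hat from (5.55), fitted values from (5.57).
alpha_hat = np.linalg.solve(K + lam * np.eye(N), y)
fit_kernel = K @ alpha_hat

# Basis form: ridge solution of (5.63) in the M-dimensional feature space.
beta_hat = np.linalg.solve(H.T @ H + lam * np.eye(H.shape[1]), H.T @ y)
fit_basis = H @ beta_hat

fits_match = np.allclose(fit_kernel, fit_basis)
print(kernel_matches, fits_match)
```

Here the kernel route solves an $N \times N$ system and the basis route an $M \times M$ system; the two give the same fitted values, so the cheaper one can be chosen depending on whether $N$ or $M$ is smaller.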

Equation (5.55) tells us that if we use the kernel representation for the solution function, we have only to evaluate the kernel $N^2$ times, and can compute the solution in $O(N^3)$ operations.

This simplicity is not without implications. Each of the polynomials $h_m$ in (5.61) inherits a scaling factor from the particular form of $K$, which has a bearing on the impact of the penalty in (5.63). We elaborate on this in the next section.

[Figure: "Radial Kernel in $\mathbb{R}^1$"; vertical axis $K(\cdot, x_m)$, horizontal axis $X$.]

FIGURE 5.13. Radial kernels $k_k(x)$ for the mixture data, with scale parameter $\nu = 1$.

The kernels are centered at five points $x_m$ chosen at random from the 200.

Gaussian Radial Basis Functions

In the preceding example, the kernel is chosen because it represents an expansion of polynomials and can conveniently compute high-dimensional inner products. In this example the kernel is chosen because of its functional form in the representation (5.50).

The Gaussian kernel $K(x, y) = e^{-\nu\|x - y\|^2}$ along with squared-error loss, for example, leads to a regression model that is an expansion in Gaussian radial basis functions,

$$k_m(x) = e^{-\nu\|x - x_m\|^2}, \quad m = 1, \ldots, N, \tag{5.64}$$

each one centered at one of the training feature vectors $x_m$. The coefficients are estimated using (5.54).

Figure 5.13 illustrates radial kernels in $\mathbb{R}^1$ using the first coordinate of the mixture example from Chapter 2. We show five of the 200 kernel basis functions $k_m(x) = K(x, x_m)$.

Figure 5.14 illustrates the implicit feature space for the radial kernel with $x \in \mathbb{R}^1$.

We computed the $200 \times 200$ kernel matrix $\mathbf{K}$, and its eigen-decomposition $\Phi \mathbf{D}_\gamma \Phi^T$. We can think of the columns of $\Phi$ and the corresponding eigenvalues in $\mathbf{D}_\gamma$ as empirical estimates of the eigen expansion (5.45).² Although the eigenvectors are discrete, we can represent them as functions on $\mathbb{R}^1$ (Exercise 5.17). Figure 5.15 shows the largest 50 eigenvalues of $\mathbf{K}$. The leading eigenfunctions are smooth, and they are successively more wiggly as the order increases. This brings to life the penalty in (5.49), where we see the coefficients of higher-order functions get penalized more than lower-order ones.
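The eigen-decomposition described above can be reproduced in a few lines. This is a sketch under stated assumptions: the inputs `x` are synthetic standard normal draws standing in for the first coordinate of the mixture data (which is not available in this excerpt), and the helper `sign_changes` is a hypothetical, crude proxy for how "wiggly" an empirical eigenfunction is.

```python
import numpy as np

# Synthetic one-dimensional inputs (an assumption, not the mixture data).
rng = np.random.default_rng(1)
N, nu = 200, 1.0
x = rng.normal(size=N)

# Gaussian kernel matrix with entries K(x_i, x_j) = exp(-nu * (x_i - x_j)^2).
K = np.exp(-nu * (x[:, None] - x[None, :]) ** 2)

# Symmetric eigen-decomposition K = Phi D_gamma Phi^T.
# eigh returns eigenvalues in ascending order, so reverse to put the largest first.
gamma, Phi = np.linalg.eigh(K)
gamma, Phi = gamma[::-1], Phi[:, ::-1]

reconstruction_ok = np.allclose(Phi @ np.diag(gamma) @ Phi.T, K)

def sign_changes(v, x):
    """Count sign changes of eigenvector v when its entries are ordered by x."""
    s = np.sign(v[np.argsort(x)])
    return int(np.sum(s[1:] != s[:-1]))

# Column l of Phi estimates the l-th eigenfunction at the N observations.
for l in range(5):
    print(l, gamma[l], sign_changes(Phi[:, l], x))
```

Plotting a column of `Phi` against `x` gives the discrete picture of an eigenfunction; the sign-change count is only a rough summary and becomes unreliable for the very small eigenvalues, where the eigenvectors are dominated by rounding error.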

The right panel in Figure 5.14 shows the correspond…

² The $\ell$th column of $\Phi$ is an estimate of $\phi_\ell$, evaluated at each of the $N$ observations. Alternatively, the $i$th row of $\Phi$ is the estimated vector of basis functions $\phi(x_i)$, evaluated at the point $x_i$. Although in principle there can be infinitely many elements in $\phi$, our estimate has at most $N$ elements.

[Figure residue: panels titled "Orthonormal Basis $\Phi$" and "Feature Space $\mathcal{H}$"; a log-scale axis labeled "Eigenvalue".]

FIGURE 5.14.
