The Elements of Statistical Learning: Data Mining, Inference, and Prediction (811377), page 36


, 3. Except in special cases, we would typically prefer the third panel, which is also piecewise linear, but restricted to be continuous at the two knots. These continuity restrictions lead to linear constraints on the parameters; for example, $f(\xi_1^-) = f(\xi_1^+)$ implies that $\beta_1 + \xi_1\beta_4 = \beta_2 + \xi_1\beta_5$. In this case, since there are two restrictions, we expect to get back two parameters, leaving four free parameters.

A more direct way to proceed in this case is to use a basis that incorporates the constraints:

$$h_1(X) = 1, \quad h_2(X) = X, \quad h_3(X) = (X - \xi_1)_+, \quad h_4(X) = (X - \xi_2)_+,$$

where $t_+$ denotes the positive part.
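To see why this basis builds in the two continuity constraints, a short worked expansion may help. It assumes the unrestricted fit is written as $\beta_m + \beta_{m+3}X$ in region $m$ (consistent with the constraint quoted above), and it uses $\theta_1, \ldots, \theta_4$ as illustrative names, not taken from the text, for the coefficients of $h_1, \ldots, h_4$:

$$
f(X) = \theta_1 + \theta_2 X + \theta_3 (X - \xi_1)_+ + \theta_4 (X - \xi_2)_+ =
\begin{cases}
\theta_1 + \theta_2 X, & X < \xi_1, \\
(\theta_1 - \theta_3 \xi_1) + (\theta_2 + \theta_3) X, & \xi_1 \le X < \xi_2, \\
(\theta_1 - \theta_3 \xi_1 - \theta_4 \xi_2) + (\theta_2 + \theta_3 + \theta_4) X, & X \ge \xi_2.
\end{cases}
$$

Matching terms gives $\beta_1 = \theta_1$, $\beta_4 = \theta_2$, $\beta_2 = \theta_1 - \theta_3\xi_1$ and $\beta_5 = \theta_2 + \theta_3$, so $\beta_2 + \xi_1\beta_5 = \theta_1 + \xi_1\theta_2 = \beta_1 + \xi_1\beta_4$ for any choice of the $\theta$'s. The same cancellation occurs at $\xi_2$, so the four $\theta$'s are exactly the four free parameters.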

The function $h_3$ is shown in the lower right panel of Figure 5.1. We often prefer smoother functions, and these can be achieved by increasing the order of the local polynomial. Figure 5.2 shows a series of piecewise-cubic polynomials fit to the same data, with increasing orders of continuity at the knots.

5. Basis Expansions and Regularization

FIGURE 5.1. [Panels: Piecewise Constant; Piecewise Linear; Continuous Piecewise Linear; Piecewise-linear Basis Function.] The top left panel shows a piecewise constant function fit to some artificial data. The broken vertical lines indicate the positions of the two knots $\xi_1$ and $\xi_2$. The blue curve represents the true function, from which the data were generated with Gaussian noise. The remaining two panels show piecewise linear functions fit to the same data: the top right unrestricted, and the lower left restricted to be continuous at the knots. The lower right panel shows a piecewise-linear basis function, $h_3(X) = (X - \xi_1)_+$, continuous at $\xi_1$. The black points indicate the sample evaluations $h_3(x_i)$, $i = 1, \ldots, N$.

5.2 Piecewise Polynomials and Splines

FIGURE 5.2. [Panels: Discontinuous; Continuous; Continuous First Derivative; Continuous Second Derivative.] A series of piecewise-cubic polynomials, with increasing orders of continuity.

The function in the lower right panel is continuous, and has continuous first and second derivatives at the knots. It is known as a cubic spline. Enforcing one more order of continuity would lead to a global cubic polynomial. It is not hard to show (Exercise 5.1) that the following basis represents a cubic spline with knots at $\xi_1$ and $\xi_2$:

$$
\begin{aligned}
h_1(X) &= 1, & h_3(X) &= X^2, & h_5(X) &= (X - \xi_1)_+^3, \\
h_2(X) &= X, & h_4(X) &= X^3, & h_6(X) &= (X - \xi_2)_+^3.
\end{aligned}
\tag{5.3}
$$

There are six basis functions corresponding to a six-dimensional linear space of functions.

A quick check confirms the parameter count: (3 regions) × (4 parameters per region) − (2 knots) × (3 constraints per knot) = 6.

More generally, an order-$M$ spline with knots $\xi_j$, $j = 1, \ldots, K$ is a piecewise-polynomial of order $M$, and has continuous derivatives up to order $M - 2$.
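The counting argument above extends to any order and number of knots. The sketch below is only an illustration of that arithmetic; `spline_dimension` is a hypothetical helper name, not something from the text or from any library.

```python
def spline_dimension(order: int, n_knots: int) -> int:
    """Count the free parameters of an order-M spline with K knots.

    Hypothetical helper for illustration. It mirrors the counting argument:
    (K + 1) regions x M coefficients, minus K knots x (M - 1) constraints.
    """
    regions = n_knots + 1
    coefficients_per_region = order    # a polynomial of order M has M coefficients
    constraints_per_knot = order - 1   # continuity of derivatives 0, ..., M - 2
    return regions * coefficients_per_region - n_knots * constraints_per_knot


# Two knots, as in the examples of this section.
for order, label in [(1, "piecewise constant"),
                     (2, "continuous piecewise linear"),
                     (4, "cubic spline")]:
    print(label, spline_dimension(order, n_knots=2))
```

Algebraically the count simplifies to $M + K$.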

A cubic spline has $M = 4$. In fact the piecewise-constant function in Figure 5.1 is an order-1 spline, while the continuous piecewise linear function is an order-2 spline. Likewise the general form for the truncated-power basis set would be

$$
\begin{aligned}
h_j(X) &= X^{j-1}, & j &= 1, \ldots, M, \\
h_{M+\ell}(X) &= (X - \xi_\ell)_+^{M-1}, & \ell &= 1, \ldots, K.
\end{aligned}
$$

It is claimed that cubic splines are the lowest-order spline for which the knot-discontinuity is not visible to the human eye. There is seldom any good reason to go beyond cubic splines, unless one is interested in smooth derivatives.
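The general truncated-power basis above is easy to evaluate directly. The following is a minimal sketch in plain Python; the function name `truncated_power_basis` is hypothetical, and the treatment of the order-1 case as an indicator $I(X \ge \xi_\ell)$ is an assumption about the convention for $(X - \xi_\ell)_+^0$.

```python
def truncated_power_basis(x: float, knots: list[float], order: int = 4) -> list[float]:
    """Evaluate h_1(x), ..., h_{M+K}(x) for an order-M truncated-power basis.

    Hypothetical helper for illustration only.
    """
    M = order
    # Global polynomial part: h_j(x) = x^(j-1), j = 1, ..., M.
    poly = [x ** (j - 1) for j in range(1, M + 1)]
    if M == 1:
        # Assumption: (x - xi)_+^0 is read as the indicator of x >= xi.
        trunc = [1.0 if x >= xi else 0.0 for xi in knots]
    else:
        # Knot part: h_{M+l}(x) = (x - xi_l)_+^(M-1), l = 1, ..., K.
        trunc = [max(x - xi, 0.0) ** (M - 1) for xi in knots]
    return poly + trunc


# Cubic spline basis (M = 4) with knots at 1 and 3, evaluated at x = 2.
print(truncated_power_basis(2.0, [1.0, 3.0], order=4))
```

With `order=4` and two knots this reproduces the six functions of (5.3); with `order=2` it gives the four-function continuous piecewise-linear basis.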

In practice the most widely used orders are $M = 1$, $2$ and $4$.

These fixed-knot splines are also known as regression splines. One needs to select the order of the spline, the number of knots and their placement. One simple approach is to parameterize a family of splines by the number of basis functions or degrees of freedom, and have the observations $x_i$ determine the positions of the knots. For example, the expression bs(x, df=7) in R generates a basis matrix of cubic-spline functions evaluated at the $N$ observations in x, with the $7 - 3 = 4$¹ interior knots at the appropriate percentiles of x (the 20th, 40th, 60th and 80th). One can be more explicit, however; bs(x, degree=1, knots = c(0.2, 0.4, 0.6)) generates a basis for linear splines, with three interior knots, and returns an $N \times 4$ matrix.

Since the space of spline functions of a particular order and knot sequence is a vector space, there are many equivalent bases for representing them (just as there are for ordinary polynomials). While the truncated power basis is conceptually simple, it is not too attractive numerically: powers of large numbers can lead to severe rounding problems.
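The degrees-of-freedom rule for knot placement described above can be sketched in a few lines. This is not the R implementation; `interior_knots` is a hypothetical helper, and both the knot count (df minus degree, minus one more if a constant term is included) and the percentile interpolation rule (Python's "inclusive" method) are assumptions chosen to match the 7 − 3 = 4 example in the text.

```python
import statistics


def interior_knots(x, df, degree=3, intercept=False):
    """Place interior knots at equally spaced percentiles of x.

    Hypothetical helper. Assumption: the number of interior knots is
    df - degree, reduced by one if a constant term is included.
    """
    n_knots = df - degree - (1 if intercept else 0)
    if n_knots < 1:
        return []
    # n_knots cut points split the data into n_knots + 1 equal-probability groups.
    return statistics.quantiles(x, n=n_knots + 1, method="inclusive")


x = list(range(101))               # toy data: 0, 1, ..., 100
print(interior_knots(x, df=7))     # four knots, at the 20th, 40th, 60th, 80th percentiles
```

Tying the knots to percentiles of the data means that denser regions of x receive more knots, which is the point of letting the observations determine the positions.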

The B-spline basis, described in the Appendix to this chapter, allows for efficient computations even when the number of knots $K$ is large.

¹ A cubic spline with four knots is eight-dimensional. The bs() function omits by default the constant term in the basis, since terms like this are typically included with other terms in the model.

5.2.1 Natural Cubic Splines

We know that the behavior of polynomials fit to data tends to be erratic near the boundaries, and extrapolation can be dangerous. These problems are exacerbated with splines. The polynomials fit beyond the boundary knots behave even more wildly than the corresponding global polynomials in that region. This can be conveniently summarized in terms of the pointwise variance of spline functions fit by least squares (see the example in the next section for details on these variance calculations). Figure 5.3 compares the pointwise variances for a variety of different models. The explosion of the variance near the boundaries is clear, and inevitably is worst for cubic splines.

FIGURE 5.3. [Plot of pointwise variances against X; curves: Global Linear, Global Cubic Polynomial, Cubic Spline - 2 knots, Natural Cubic Spline - 6 knots.] Pointwise variance curves for four different models, with $X$ consisting of 50 points drawn at random from $U[0, 1]$, and an assumed error model with constant variance. The linear and cubic polynomial fits have two and four degrees of freedom, respectively, while the cubic spline and natural cubic spline each have six degrees of freedom. The cubic spline has two knots at 0.33 and 0.66, while the natural spline has boundary knots at 0.1 and 0.9, and four interior knots uniformly spaced between them.

A natural cubic spline adds additional constraints, namely that the function is linear beyond the boundary knots. This frees up four degrees of freedom (two constraints each in both boundary regions), which can be spent more profitably by sprinkling more knots in the interior region. This tradeoff is illustrated in terms of variance in Figure 5.3. There will be a price paid in bias near the boundaries, but assuming the function is linear near the boundaries (where we have less information anyway) is often considered reasonable.

A natural cubic spline with $K$ knots is represented by $K$ basis functions. One can start from a basis for cubic splines, and derive the reduced basis by imposing the boundary constraints.

For example, starting from the truncated power series basis described in Section 5.2, we arrive at (Exercise 5.4):

$$N_1(X) = 1, \quad N_2(X) = X, \quad N_{k+2}(X) = d_k(X) - d_{K-1}(X), \tag{5.4}$$

where

$$d_k(X) = \frac{(X - \xi_k)_+^3 - (X - \xi_K)_+^3}{\xi_K - \xi_k}. \tag{5.5}$$

Each of these basis functions can be seen to have zero second and third derivative for $X \ge \xi_K$.

5.2.2 Example: South African Heart Disease (Continued)

In Section 4.4.2 we fit linear logistic regression models to the South African heart disease data.

Here we explore nonlinearities in the functions using natural splines. The functional form of the model is

$$\operatorname{logit}[\Pr(\text{chd} \mid X)] = \theta_0 + h_1(X_1)^T \theta_1 + h_2(X_2)^T \theta_2 + \cdots + h_p(X_p)^T \theta_p, \tag{5.6}$$

where each $\theta_j$ is a vector of coefficients multiplying its associated vector of natural spline basis functions $h_j$.

We use four natural spline bases for each term in the model. For example, with $X_1$ representing sbp, $h_1(X_1)$ is a basis consisting of four basis functions.
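Each $h_j$ in (5.6) is a vector of natural spline basis functions, which can be evaluated from (5.4) and (5.5). The sketch below does this in plain Python. The name `natural_spline_basis`, the `include_constant` flag and the toy knots are all hypothetical; dropping the constant term so that five knots yield four functions is an assumption about how the four-function basis is obtained, since the constant is already covered by $\theta_0$.

```python
def natural_spline_basis(x, knots, include_constant=True):
    """Evaluate the natural cubic spline basis N_1, ..., N_K of (5.4)-(5.5) at x.

    Hypothetical helper. `knots` must be strictly increasing.
    """
    K = len(knots)
    xi_K = knots[-1]

    def pos3(t):
        # Truncated cube (t)_+^3.
        return max(t, 0.0) ** 3

    def d(k):
        # d_k(x) from (5.5); k is 1-based to match the text.
        xi_k = knots[k - 1]
        return (pos3(x - xi_k) - pos3(x - xi_K)) / (xi_K - xi_k)

    # N_1 = 1, N_2 = x, N_{k+2} = d_k - d_{K-1} for k = 1, ..., K - 2.
    basis = [1.0, x] + [d(k) - d(K - 1) for k in range(1, K - 1)]
    return basis if include_constant else basis[1:]


knots = [0.0, 0.25, 0.5, 0.75, 1.0]   # toy knots, not from the heart disease data
h = natural_spline_basis(0.6, knots, include_constant=False)
print(len(h))                          # number of basis functions per predictor
```

A quick sanity check is that every basis function is linear beyond the last knot: its second difference over equally spaced points to the right of $\xi_K$ should vanish up to rounding error.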
