By substituting (12.10)–(12.12) into (12.9), we obtain the Lagrangian (Wolfe) dual objective function

L_D = \sum_{i=1}^N \alpha_i - \frac{1}{2} \sum_{i=1}^N \sum_{i'=1}^N \alpha_i \alpha_{i'} y_i y_{i'} x_i^T x_{i'},        (12.13)

which gives a lower bound on the objective function (12.8) for any feasible point. We maximize L_D subject to 0 ≤ α_i ≤ C and ∑_{i=1}^N α_i y_i = 0. In addition to (12.10)–(12.12), the Karush–Kuhn–Tucker conditions include the constraints

\alpha_i\,[\,y_i(x_i^T\beta + \beta_0) - (1 - \xi_i)\,] = 0,        (12.14)
\mu_i \xi_i = 0,        (12.15)
y_i(x_i^T\beta + \beta_0) - (1 - \xi_i) \ge 0,        (12.16)

for i = 1, …, N. Together these equations (12.10)–(12.16) uniquely characterize the solution to the primal and dual problem.

From (12.10) we see that the solution for β has the form

\hat\beta = \sum_{i=1}^N \hat\alpha_i y_i x_i,        (12.17)

with nonzero coefficients α̂_i only for those observations i for which the constraints in (12.16) are exactly met (due to (12.14)). These observations are called the support vectors, since β̂ is represented in terms of them alone.
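The dual (12.13) is a box-constrained quadratic program, so it can be handed to a generic solver. The following is a minimal sketch, not the text's implementation, assuming NumPy/SciPy and a small synthetic two-class data set; SLSQP enforces the bounds 0 ≤ α_i ≤ C and the equality constraint ∑_i α_i y_i = 0.

```python
import numpy as np
from scipy.optimize import minimize

# Small synthetic two-class problem (illustrative only, not the mixture data).
rng = np.random.default_rng(0)
N, C = 40, 1.0
X = np.vstack([rng.normal(-1.0, 1.0, (N // 2, 2)),
               rng.normal(+1.0, 1.0, (N // 2, 2))])
y = np.r_[-np.ones(N // 2), np.ones(N // 2)]

# Q[i, i'] = y_i y_i' x_i^T x_i', so L_D(alpha) = sum(alpha) - 0.5 * alpha^T Q alpha.
Q = (y[:, None] * X) @ (y[:, None] * X).T

def neg_dual(alpha):
    # Minimizing -L_D is equivalent to maximizing the Wolfe dual (12.13).
    return 0.5 * alpha @ Q @ alpha - alpha.sum()

res = minimize(neg_dual, np.zeros(N),
               jac=lambda a: Q @ a - 1.0,                            # gradient of -L_D
               bounds=[(0.0, C)] * N,                                # 0 <= alpha_i <= C
               constraints={"type": "eq", "fun": lambda a: a @ y},   # sum_i alpha_i y_i = 0
               method="SLSQP")
alpha_hat = res.x   # nonzero only for the support vectors
```

Specialized QP or SMO-type solvers are used in practice; the sketch is only meant to make the structure of (12.13) concrete.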
Among these support points, some will lie on the edge of the margin (ξ̂_i = 0), and hence from (12.15) and (12.12) will be characterized by 0 < α̂_i < C; the remainder (ξ̂_i > 0) have α̂_i = C. From (12.14) we can see that any of these margin points (0 < α̂_i, ξ̂_i = 0) can be used to solve for β_0, and we typically use an average of all the solutions for numerical stability.

Maximizing the dual (12.13) is a simpler convex quadratic programming problem than the primal (12.9), and can be solved with standard techniques (Murray et al., 1981, for example).

Given the solutions β̂_0 and β̂, the decision function can be written as

\hat G(x) = \mathrm{sign}[\hat f(x)] = \mathrm{sign}[x^T\hat\beta + \hat\beta_0].        (12.18)

The tuning parameter of this procedure is the cost parameter C.
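Continuing the sketch above (again illustrative, reusing the same synthetic data and α̂), β̂ follows from (12.17), β̂_0 is averaged over the margin points with 0 < α̂_i < C, and (12.18) gives the classifier:

```python
eps = 1e-6
beta_hat = (alpha_hat * y) @ X                          # (12.17): beta_hat = sum_i alpha_i y_i x_i

# Margin points (0 < alpha_i < C, hence xi_i = 0) satisfy y_i (x_i^T beta + beta_0) = 1,
# so each gives beta_0 = y_i - x_i^T beta; average them for numerical stability.
on_margin = (alpha_hat > eps) & (alpha_hat < C - eps)
beta0_hat = float(np.mean(y[on_margin] - X[on_margin] @ beta_hat))

def G_hat(x):
    """Decision rule (12.18): the sign of the fitted linear function."""
    return np.sign(x @ beta_hat + beta0_hat)
```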
12.2.2 Mixture Example (Continued)

Figure 12.2 shows the support vector boundary for the mixture example of Figure 2.5 on page 21, with two overlapping classes, for two different values of the cost parameter C. The classifiers are rather similar in their performance. Points on the wrong side of the boundary are support vectors. In addition, points on the correct side of the boundary but close to it (in the margin) are also support vectors. The margin is larger for C = 0.01 than it is for C = 10,000. Hence larger values of C focus attention more on (correctly classified) points near the decision boundary, while smaller values involve data further away. Either way, misclassified points are given weight, no matter how far away. In this example the procedure is not very sensitive to choices of C, because of the rigidity of a linear boundary.

The optimal value for C can be estimated by cross-validation, as discussed in Chapter 7. Interestingly, the leave-one-out cross-validation error can be bounded above by the proportion of support points in the data.
The reason is that leaving out an observation that is not a support vector will not change the solution. Hence these observations, being classified correctly by the original boundary, will be classified correctly in the cross-validation process. However, this bound tends to be too high, and not generally useful for choosing C (62% and 85%, respectively, in our examples).
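As a rough illustration of the last two points, one could compare a cross-validated error estimate with the support-vector fraction (the upper bound on leave-one-out error) over a few values of C. The sketch below assumes scikit-learn's SVC as a stand-in linear solver and reuses the synthetic X, y from the earlier sketch.

```python
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

for C_try in (0.01, 1.0, 10_000.0):
    clf = SVC(kernel="linear", C=C_try).fit(X, y)
    cv_error = 1.0 - cross_val_score(clf, X, y, cv=5).mean()
    sv_fraction = clf.support_.size / len(y)   # upper bound on leave-one-out error
    print(f"C = {C_try:>8}: CV error = {cv_error:.3f}, "
          f"support-vector fraction = {sv_fraction:.3f}")
```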
[Figure 12.2: linear support vector boundary for the mixture data. Training Error: 0.270, Test Error: 0.288, Bayes Error: 0.210.]