The Elements of Statistical Learning: Data Mining, Inference, and Prediction
In this case LDA uses too many parameters, which are estimated with high variance, and its performance suffers. In cases such as this we need to restrict or regularize LDA even further.

³ This study predated the emergence of SVMs.

In the remainder of this chapter we describe a class of techniques that attend to all these issues by generalizing the LDA model. This is achieved largely by three different ideas.

The first idea is to recast the LDA problem as a linear regression problem. Many techniques exist for generalizing linear regression to more flexible, nonparametric forms of regression. This in turn leads to more flexible forms of discriminant analysis, which we call FDA. In most cases of interest, the regression procedures can be seen to identify an enlarged set of predictors via basis expansions.
FDA amounts to LDA in this enlarged space, the same paradigm used in SVMs.

In the case of too many predictors, such as the pixels of a digitized image, we do not want to expand the set: it is already too large. The second idea is to fit an LDA model, but penalize its coefficients to be smooth or otherwise coherent in the spatial domain, that is, as an image. We call this procedure penalized discriminant analysis or PDA.
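Schematically, penalizing the coefficients replaces ordinary least squares with a roughness-penalized criterion of the form below, where $\Omega$ is a penalty matrix chosen so that $\beta^{T}\Omega\beta$ is large when the coefficient image varies roughly across neighboring pixels, and $\lambda \ge 0$ controls the degree of smoothing ($\Omega$ and $\lambda$ are the usual penalty notation; they are not defined in this excerpt):

$$
\min_{\beta}\; \|y - X\beta\|^{2} \;+\; \lambda\, \beta^{T} \Omega\, \beta .
$$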
With FDA itself, the expanded basis set is often so large that regularization is also required (again as in SVMs). Both of these can be achieved via a suitably regularized regression in the context of the FDA model.

The third idea is to model each class by a mixture of two or more Gaussians with different centroids, but with every component Gaussian, both within and between classes, sharing the same covariance matrix. This allows for more complex decision boundaries, and allows for subspace reduction as in LDA. We call this extension mixture discriminant analysis or MDA.

All three of these generalizations use a common framework by exploiting their connection with LDA.
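For concreteness, the mixture model behind MDA gives class densities of the form below, where $R_k$ is the number of components in class $k$, the mixing proportions $\pi_{kr}$ sum to one within each class, and $\phi(\cdot\,; \mu, \mathbf{\Sigma})$ denotes the Gaussian density with shared covariance $\mathbf{\Sigma}$ (standard mixture notation, not introduced in this excerpt):

$$
P(X \mid G = k) \;=\; \sum_{r=1}^{R_k} \pi_{kr}\, \phi(X;\, \mu_{kr}, \mathbf{\Sigma}) .
$$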
12.5 Flexible Discriminant Analysis

In this section we describe a method for performing LDA using linear regression on derived responses. This in turn leads to nonparametric and flexible alternatives to LDA. As in Chapter 4, we assume we have observations with a qualitative response $G$ falling into one of $K$ classes $\mathcal{G} = \{1, \ldots, K\}$, each having measured features $X$. Suppose $\theta : \mathcal{G} \mapsto \mathbb{R}^{1}$ is a function that assigns scores to the classes, such that the transformed class labels are optimally predicted by linear regression on $X$: if our training sample has the form $(g_i, x_i),\ i = 1, 2, \ldots, N$, then we solve

$$
\min_{\beta,\,\theta}\ \sum_{i=1}^{N} \bigl(\theta(g_i) - x_i^{T}\beta\bigr)^{2},
\qquad (12.52)
$$

with restrictions on $\theta$ to avoid a trivial solution (mean zero and unit variance over the training data).
This produces a one-dimensional separation between the classes. More generally, we can find up to $L \le K - 1$ sets of independent scorings for the class labels, $\theta_1, \theta_2, \ldots, \theta_L$, and $L$ corresponding linear maps $\eta_\ell(X) = X^{T}\beta_\ell$, $\ell = 1, \ldots, L$, chosen to be optimal for multiple regression in $\mathbb{R}^{p}$.
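As a concrete illustration of criterion (12.52), here is a minimal numerical sketch (not from the book; the function name and the alternating scheme are our own). Fixing the scores $\theta$, the minimizing $\beta$ is an ordinary least-squares fit; fixing $\beta$, the constrained optimum over $\theta$ is the vector of class-wise means of the fitted values, re-standardized to mean zero and unit variance over the training sample so as to rule out the trivial constant solution. Classes are coded $0, \ldots, K-1$ here rather than $1, \ldots, K$.

```python
import numpy as np

def optimal_scores_1d(X, g, K, n_iter=50, seed=0):
    """Alternating least-squares sketch of the optimal-scoring
    criterion (12.52): minimize over (beta, theta) the sum of
    (theta(g_i) - x_i^T beta)^2, with theta constrained to have
    mean zero and unit variance over the training data."""
    g = np.asarray(g)
    N = len(g)
    Xc = np.column_stack([np.ones(N), X])       # prepend an intercept column
    theta = np.random.default_rng(seed).standard_normal(K)
    for _ in range(n_iter):
        y = theta[g]                            # transformed labels theta(g_i)
        beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)  # beta-step: OLS fit
        fitted = Xc @ beta
        # theta-step: class-wise means of the fitted values ...
        theta = np.array([fitted[g == k].mean() for k in range(K)])
        # ... re-standardized over the training sample, enforcing the
        # mean-zero / unit-variance restriction that rules out the
        # trivial constant solution
        scores = theta[g]
        theta = (theta - scores.mean()) / scores.std()
    return theta, beta
```

In practice (12.52) need not be solved by iteration: regressing a matrix of class-indicator responses on $X$ and taking a leading eigen-decomposition of the fit yields the scores in closed form, and the loop above behaves essentially as a power iteration for that eigenproblem.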